I uploaded the M12 GUID list and the master sets list in case you want to test the GUID extraction stuff. They're located here: http://octgn.gamersjudgement.com/OCTGN/
The individual set lists are going to be arranged in 3 columns: number | name | GUID. The filename will always be the 3-letter code for the set.
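For anyone wiring this up, here is a rough sketch of reading one of the individual set lists. It assumes one card per line, pipe-delimited exactly as described (number | name | GUID); the names `CardEntry` and `parse_set_list` are made up for illustration and are not part of the extractor.

```python
from typing import NamedTuple


class CardEntry(NamedTuple):
    number: str  # kept as text; assumption: not every number is purely numeric
    name: str
    guid: str


def parse_set_list(text: str) -> list[CardEntry]:
    """Parse the contents of a per-set list file (e.g. M12.txt)."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3:
            raise ValueError(f"expected number|name|GUID, got: {line!r}")
        entries.append(CardEntry(*parts))
    return entries
```

Raising on a malformed line is a deliberate choice here, since a silently skipped card would end up with a freshly generated GUID later on.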
Now the sets.txt file is a little more complex. This is going to be a master list of all sets in OCTGN, and contains the data necessary to construct the top part of a set file (set GUIDs, names, packaging info). The 3-letter code in this file will match the name of the individual set lists.
Each set entry has eleven properties. I'm going to list them in order:
1) 3-letter code for the set
2) The set's full name
3) The set's GUID
4) The set's recommended version number
5) GUID of the main booster pack
6) Frequency of opening a mythic rare
7) Frequency of opening a rare
8) Number of uncommons in the booster
9) Number of commons in the booster
10) Number of basic lands in the booster
11) GUID of the unlimited basic land pack (if the set has one)
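A possible way to load one sets.txt entry, following the eleven properties in the order listed above. The delimiter is not stated here, so this sketch assumes the same pipe separator as the individual set lists, one set per line. It also assumes the two frequency fields can be kept as plain text, since their exact format isn't specified. `SetInfo` and `parse_set_line` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SetInfo:
    code: str                      # 1) 3-letter code
    name: str                      # 2) full name
    guid: str                      # 3) set GUID
    version: str                   # 4) recommended version number
    booster_guid: str              # 5) main booster pack GUID
    mythic_frequency: str          # 6) kept as text (format is an assumption)
    rare_frequency: str            # 7) kept as text (format is an assumption)
    uncommons: int                 # 8)
    commons: int                   # 9)
    basic_lands: int               # 10)
    land_pack_guid: Optional[str]  # 11) None when the set has no land pack


def parse_set_line(line: str) -> SetInfo:
    fields = [field.strip() for field in line.rstrip("\r\n").split("|")]
    if len(fields) == 10:
        fields.append("")  # treat a missing 11th field as "no land pack"
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {line!r}")
    return SetInfo(
        code=fields[0],
        name=fields[1],
        guid=fields[2],
        version=fields[3],
        booster_guid=fields[4],
        mythic_frequency=fields[5],
        rare_frequency=fields[6],
        uncommons=int(fields[7]),
        commons=int(fields[8]),
        basic_lands=int(fields[9]),
        land_pack_guid=fields[10] or None,
    )
```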
You can keep the // for the OCTGN stuff. With our conventions, the Split cards (Invasion), Flip cards (Kamigawa), and double-faced cards (Innistrad) will all have the same text structure: TEXT A, then a line containing only //, then TEXT B.
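As a small illustration of the TEXT A / // / TEXT B layout for split, flip, and double-faced cards, the sketch below joins two rules texts and splits them back apart. The function names are hypothetical, and it assumes a plain newline separates the three parts before any XML escaping is applied.

```python
SEPARATOR = "\n//\n"  # the // sits on its own line between the two texts


def join_faces(text_a: str, text_b: str) -> str:
    """Combine both halves of a split/flip/double-faced card."""
    return f"{text_a}{SEPARATOR}{text_b}"


def split_faces(text: str) -> tuple[str, str]:
    """Inverse of join_faces; raises if the text has no // line."""
    first, found, second = text.partition(SEPARATOR)
    if not found:
        raise ValueError("no // separator line found")
    return first, second
```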
I just noticed that the line breaks are being extracted as +#xd;+#xa; when they should be &'s instead of +'s
scratch that, I realized this was my fault and got it to work properly.
Some additional things I thought of though:
1) Is there a way to extract the tokens from a set along with the cards? In our sets, we include the tokens in the XML file, and in the RELS the location looks like "/tokens/T1.jpg" (basically we number the tokens as T1, T2, T3, etc)
2) When you choose to save the XML/RELS sets, can you make the default filename be the set's 3-letter code instead of the full name? The reason being that set names with spaces and special characters don't work well with OCTGN.
3) The rules text on Level Up creatures is extracted a little weird... I was hoping it'd look similar to Planeswalker text, formatted like so:
Level Up {3} (__reminder text__)
[Level 2-4]: First Strike (4 / 4)
[Level 5+]: Double Strike (5 / 5)
4) If possible, I'd like the text for the double-faced cards in Innistrad (and the flip cards from Kamigawa) to extract exactly like Split cards do, with the two texts separated by a line containing only //.
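To make the Level Up layout requested in item 3 concrete, here is a minimal formatter sketch. How the extractor actually represents a leveler internally is unknown, so `LevelBand` and `format_level_up` are purely illustrative and assume the level ranges, abilities, and power/toughness are already available as separate values.

```python
from typing import NamedTuple, Optional


class LevelBand(NamedTuple):
    low: int
    high: Optional[int]  # None means open-ended, rendered as "5+"
    abilities: str
    power: str
    toughness: str


def format_level_up(cost: str, reminder: str, bands: list[LevelBand]) -> str:
    """Render a leveler's rules text, one line per level band."""
    lines = [f"Level Up {cost} ({reminder})"]
    for band in bands:
        if band.high is None:
            span = f"{band.low}+"
        else:
            span = f"{band.low}-{band.high}"
        lines.append(
            f"[Level {span}]: {band.abilities} ({band.power} / {band.toughness})"
        )
    return "\n".join(lines)
```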
actually, I was thinking of creating a "sets.txt" file which stores information regarding the sets themselves (full name, 3-letter codename, set GUID, packaging data), so that the extractor can figure out if the set has land packs, if there's mythics, etc.
So maybe, for each set it could look in this file to see if the set already exists... if it does, then it'll load the correct set's GUID database and start matching GUIDs. If it doesn't, then it'll assume a brand-new set and generate new GUIDs. (Maybe it can generate the GUID database text file for that set as well, so I can quickly upload it afterwards).
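The "reuse if the set exists, otherwise generate" idea might look something like this. It is only a sketch: `assign_guids` and the shape of `known_sets` (set code mapped to a number-to-GUID dict) are assumptions, and new GUIDs come from Python's `uuid.uuid4()`, which may or may not match how the extractor generates them.

```python
import uuid


def assign_guids(set_code, cards, known_sets):
    """Return 'number|name|GUID' lines for one set.

    cards:      iterable of (number, name) pairs from the extraction
    known_sets: {set_code: {number: guid}} loaded from the published lists
    """
    known = known_sets.get(set_code, {})  # empty for a brand-new set
    lines = []
    for number, name in cards:
        # Reuse the published GUID when there is one, else make a new one.
        guid = known.get(number) or str(uuid.uuid4())
        lines.append(f"{number}|{name}|{guid}")
    return lines
```

The returned lines double as the GUID database text for that set, so the same output could be saved and uploaded afterwards. Note that a card missing from an otherwise known set also gets a fresh GUID here, which is probably something to warn about rather than accept silently.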
I've run into an issue with compiling the text files though. There are cards that exist in multiple versions in sets (like the basic lands, and stuff like the Urza lands). We'll need some way to differentiate the different versions of these cards. For that reason, I'm going to add the multiverseID to the text file as well, so it'll be "1|Karn Liberated|XXXXXXX". The issue is that our old gatherer extractor had issues with MultiverseIDs for a lot of the old sets, as such we don't have them for anything in the old card frame. So I'm not sure what strategy I'm going to use to obtain all these...
EDIT: I'm doing collector number instead of multiverseID, since many promo cards don't have IDs.
@Dresden:
I was talking about GUIDs specifically for OCTGN's use. I'm not aware of any other currently-available MTG applications that use GUIDs in a similar fashion, nor do I know of any GUID database of MTG cards available. You're welcome to use ours once I get the database published.
The issue is that the extractor currently has no way of knowing what the correct pre-determined GUIDs are for sets that already exist in OCTGN. So instead, it simply generates new ones. This works fine when we want to use the extractor to generate brand-new sets, but for updating existing sets we don't want to create new GUIDs.
@Chaudakh:
I don't have the GUID list ready yet, I wanted to check with you first to see if it was a good idea. In order for it to work, I have to make sure my list has the correct spellings for all the cards, and the easiest way is to extract gatherer data, so we're in a bit of a paradox here. I think we can bypass this if you add some sort of verification tool to the extraction process: when it tries to obtain a GUID, if the card doesn't exist in the GUID list (i.e. spelling or punctuation is wrong), it gives an alert message. This way I can update the GUID list as I go along, and eventually I'll be able to find all the mistakes.
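The verification step could be as simple as collecting every extracted name that is missing from the GUID list, along with near-miss suggestions. This sketch uses Python's standard `difflib.get_close_matches` for the suggestions; `find_unmatched` is a hypothetical name and the 0.6 cutoff is just difflib's default, not a tuned value.

```python
import difflib


def find_unmatched(card_names, guid_by_name):
    """Map each extracted name with no GUID to its closest known spellings."""
    known = list(guid_by_name)
    report = {}
    for name in card_names:
        if name not in guid_by_name:
            # Up to three candidates, best match first; may be an empty list.
            report[name] = difflib.get_close_matches(name, known, n=3, cutoff=0.6)
    return report
```

An empty report means every card matched; anything else is the list to show in the alert message so the GUID list can be corrected as you go.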
While we're on the topic of the GUID list, do you have any preferences on the formatting of the files? I want to make sure they are formatted as efficiently as possible, taking into account things like server strain and bandwidth as well. Would it be easier to have one massive file, or break them up into individual sets? My current plan is to have individual sets, and format them like so:
NPH.txt
[noparse]
New Phyrexia|2566f33a-8472-4ee7-b37b-abbd24a1a5a5
Karn Liberated|001d9638-76f4-4827-85d1-adfb1e4ef95e
Apostle's Blessing|aee11b03-23c3-4512-be24-e55c36436ba9
[/noparse]
now that I've got a stable (and super fast!) internet connection, I want to get the MTG oracle text updates done, so I've been devising a strategy for both short-term and long-term OCTGN set maintenance.
I mentioned earlier that the biggest obstacle is the GUIDs. Generating new ones is fine if we want to export newly-released sets, but when it comes to updating the rules text on pre-existing sets, we have to make sure the GUIDs are the same. If we generate new GUIDs for cards in a set each time we want to release a patch, then people will essentially have to recreate their decks every single time a patch is released (and it'll mess up a lot of the autoscripts as well).
I have two ideas that would fix this problem. Unfortunately, both of them involve changing the OCTGN export settings in the extractor:
1. I publish a "GUID database" of every card from every set released on OCTGN to the internet. This would consist of a simple text webpage for each set, listing the names and corresponding GUIDs for each of the cards. What the extractor would need to do is download these text files, and whenever it would otherwise generate a GUID for a card, it instead looks up the card's preexisting GUID in the list.
2. Instead of exporting to the OCTGN XML format, we have an option to export as an OCTGN Excel spreadsheet, arranging the information into cell rows instead of the XML structure (but maintaining all the special rules I mentioned, like &#xd;&#xa; for line breaks, &quot; for quotations, etc). I can then quickly copy-paste the data into a spreadsheet on my computer which already lists card names and GUIDs, then simply write a =CONCATENATE function to compile each card into the XML format. Then I can simply copy the XML lines and replace the ones in the current set files.
I would like to see #1 happen because I've already gotten several requests from other developers to publish an online GUID database (for things like deck converters), so it's a foundation that will already exist. However, exporting as an Excel spreadsheet would also work and I'm sure several non-OCTGN users would like this feature. In fact, if we had an Excel export option, then all the OCTGN-related stuff wouldn't be necessary anymore (except for the special text rules mentioned above, but those could be amalgamated as "export options").
Also, for the OCTGN export stuff, it'd be really useful if we had the ability to export multiple sets at once. It gets a little tedious having to do them one at a time.
I'm at work right now and can't check, but did you get rid of the XML filtering I mentioned earlier for stuff like the AE character? I know we asked a while back to add a filter to remove these special characters, however we did some testing and all those special characters are indeed supported by XML and octgn, so the only things that need to be specifically filtered are the & symbol, quotations and line breaks.
Also, I found a bug last night... cards with single colored mana costs (such as Ancestral Recall) do not show their mana symbol in the "Cost" column. Their costs are empty, but their converted mana costs still show up as 1. Colorless 1 mana cards show their symbols as normal, though.
As Gaspare mentioned, we're going to go back to numbering the card images by alphabetical order instead of collector number order (since the slightlymagic hi-res scans have the card name as their file name and it's a lot of work to rename them to collector number), so the filename paths in the rels export should be numbered based on card name.
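Numbering by card name rather than collector number could be sketched as below. `number_by_name` is a hypothetical helper, and the case-insensitive sort is an assumption; the ordering would have to match however the slightlymagic scan files actually sort on disk. Returning pairs instead of a dict keeps repeated names (basic lands, for example) as separate entries.

```python
def number_by_name(names):
    """Assign 1-based image numbers in alphabetical order of card name."""
    ordered = sorted(names, key=str.casefold)  # stable, case-insensitive
    return [(index, name) for index, name in enumerate(ordered, start=1)]
```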
Also, I wanted to clarify my earlier message about special characters. OCTGN/XML indeed supports special characters like Æ or ö, so if you've added replacement scripts to convert them to basic letters, then we don't need that anymore. The ONLY replacements that need to occur are &quot; for double quotations, &amp; for & symbols, and &#xd;&#xa; for line breaks.
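Those three replacements (ampersands to &amp;amp;, double quotes to &amp;quot;, line breaks to &amp;#xd;&amp;#xa;) might be applied like this. `escape_for_octgn` is a hypothetical name. The one thing that matters is doing the ampersand first, otherwise the ampersands introduced by the other two replacements would get escaped again.

```python
def escape_for_octgn(text: str) -> str:
    """Apply only the three replacements; other characters pass through."""
    text = text.replace("&", "&amp;")   # must run before the others
    text = text.replace('"', "&quot;")
    # Normalise Windows/old-Mac line endings so each break is escaped once.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\n", "&#xd;&#xa;")
```

Characters like Æ and ö are left untouched, as described above. This deliberately mirrors only the replacements named in the post; it does not escape <, which XML also reserves, so that would need handling if it ever shows up in card text.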
Also, for the flip cards from Kamigawa, they can be treated the same way as the split cards.
I did some further testing and OCTGN supports full UTF-8 encoding, so characters such as the AE character can be used in xmls. In fact, I'm pretty sure all special characters except for & and " can be used.
We have other developers who have asked for this kind of database for things like deck converters, so I plan on keeping it updated as often as possible.
We typically release incomplete sets and incremental patches as new sets are spoiled, so we'll likely have the entire GUID database for new sets online by the time the set arrives on gatherer. This actually makes it easier on us as well, as we used to type out the entire card information by hand as the cards were revealed (since we didn't have good extractors). Now we can simply record the name and GUID, and have the extractor fill in the rest once the set's released.
oh btw, you can take the OCTGN export stuff out of the code if you wish, we don't use that format for our sets anymore.
TEXT A
//
TEXT B