I've gotten a fair bit of experimenting done, and I'm getting good numbers on my training costs. Very good. Scary good. Good enough that my first thought is that something is wrong rather than that the net is special. I first thought it might be overfitting, but these are small nets with more regularization than the char-rnn one. Once my current one finishes training I'll start generating cards to get a better idea of what's happening, but in the meantime I'm at 0.13 training loss and dropping.
0.13!?! What infernal contract did you sign to make that happen, because I want in on it!
And do check for overfitting. There could be swarms of look-alikes and subtle variations; I'm not sure. But the prospect is exciting all the same.
EDIT: In any case, I'm interested to see how far we can push things while not specializing or tailoring the architecture (by that, I mean, without having to continually provide context information, I assume you haven't done that yet).
EDIT(2): More important question: How much extra regularization magic was needed? I'd love to consult the relevant literature.
Except there's a rule that states a card cannot be put into a hidden zone that isn't its owner's.
400.3. If an object would go to any library, graveyard, or hand other than its owner's, it goes to its owner's corresponding zone.
But 101.1 says that if a card contradicts a rule, the card takes precedence. So if 400.3 says 'You can't put an opponent's card into your hand' but a card says 'Put an opponent's card into your hand', isn't 400.3 superseded in that instance?
Tonight is a Friday, so obviously it's time to go through the 3700+ cards generated by the 0.8-temperature bias network! I hear other people go out and party, but this is way more fun. And this time, rather than digging for multicolour gems, I'll be going through one colour at a time.
Starting with black!
Accurity Mast1B
Enchantment - Aura (Common)
Enchant creature
Whenever enchanted creature becomes tapped, add R to your mana pool for each aura attached to it.
Metalcraft — Accurity Mast has indestructible as long as it has a % counter on it.
At the beginning of your upkeep, put a % counter on Accurity Mast.
Remove three % counters from Accurity Mast: Add B to your mana pool.
~~~I'm impressed by the number of coherent abilities here. It's definitely not a common-level card, but eh. The % counters have a source, a passive use and an active use. It's also incredibly undercosted.
Bodowor Harvest2B
Enchantment (Rare)
Whenever a creature dies, put the top four cards of your library into your graveyard.
~~~The ultimate Sultai enchantment? Especially with Exploit, this can really provide a boost for Delve. I'd think the Chromantiflayer decks would love this.
Captain Hero2B
Creature - Human Wizard (Common)
When Captain Hero enters the battlefield, target opponent puts the top X cards of his or her library into his or her graveyard.
2/1
~~~Yeah, that name. I just had to share that name.
Disruptive Diefelit5BB
Creature - Demon (Mythic Rare)
Flying
At the beginning of your upkeep, flip a coin. If you lose the flip, exile Disruptive Diefelit.
Whenever a permanent an opponent controls deals combat damage to a player, that player draws a card.
5/5
~~~This could even have some red in it, with the coin flip, but I think the demon flavour works well enough for that here. I think it's terrific that this demon turned a bad thing (taking damage) into a good thing (drawing a card). It's pretty spot-on flavour-wise.
Sleeper's Edemoction3B
Enchantment (Rare)
Players can't play lands with the same name as a card in hand.
~~~Wow. One land at a time, then? Topdecking lands is the name of the game now.
That's black done. Might do other colours tonight; it takes a while to go through hundreds of cards manually.
edit: Mono-blue really has nothing exciting. Mostly "return from your graveyard to your hand" cards (which is black's thing), simple counterspells which are just boring, and, oddly enough, too many direct damage spells. This one is interesting, though:
Spawning Loyalty1U
Instant (Uncommon)
Counter target spell unless its controller pays 2. If you do, you gain control of target permanent.
~~~Vastly undercosted, but interesting. I'll counter your spell and nick your Nissa, please and thank you.
Wild Prey3U
Instant (Common)
Counter target spell unless its controller pays X, where X is the number of creature cards in its controller’s hand.
~~~A nice variation on a counterspell, and it's unique too. But best of all, X worked out!
@Tiir Hey don't tease us like that, give a bit more explanation about what you did! lol
He doesn't want to get us all excited by boasting about successes he hasn't proven yet. He'll need to dig through a dump of cards generated by the network to see what's really going on. Remember, if the network is too good, it's possible that something is wrong, haha. Still, I'm excited to see what the results look like either way.
BUT, with an external context, many fun things can happen: THE RNN CAN LEARN TO USE THAT AS EXTERNAL MEMORY. Has such a thing ever been tried? The basic idea's simple:
In the training text, there are tags that mean you start recording (we'd put such a tag before the first usage of a counter name, for instance). When e.g. [#a is met, context field a (a certain number S of characters) is flushed; then, as each following character is met, it is added there. After the o of [#apupo, field a contains "pupo" and continues to store characters until it's full. Later the net can use the contents of the field to output the correct counter name.
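The recording scheme could be prototyped outside the net first. Here's a minimal sketch: the `[#a` tag syntax and the fixed width S come from the description above, while the function name and field names are made up for illustration.

```python
S = 8  # fixed context-field width (an arbitrary choice for the sketch)

def scan(text, field_names="ab"):
    """Return the final contents of each context field after scanning text.

    A tag like "[#a" flushes field "a" and starts recording the following
    characters into it, until the field holds S characters.
    """
    fields = {name: "" for name in field_names}
    recording = None  # name of the field currently being filled, if any
    i = 0
    while i < len(text):
        if text[i:i + 2] == "[#" and i + 2 < len(text) and text[i + 2] in fields:
            name = text[i + 2]
            fields[name] = ""   # flush the field
            recording = name
            i += 3
            continue
        if recording is not None and len(fields[recording]) < S:
            fields[recording] += text[i]
        i += 1
    return fields

print(scan("xx[#apupo counters yy"))  # field a holds "pupo cou" (8 chars)
```

A real implementation would live inside the training loop, with the field contents fed to the net as extra inputs at each step; this only demonstrates the tag-triggered flush-and-record behaviour.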
Yes, actually! There was a work not long ago by Google's Deepmind on "Neural Turing Machines" (NTMs) (link here: http://arxiv.org/pdf/1410.5401v2.pdf), which are neural networks that have permanent external memory that they can use as a scratchpad. In fact, the authors of that paper suggest that
"Additionally, NTM with a feedforward controller learns faster than NTM with an LSTM controller. These two results suggest that NTM's external memory is a more effective way of maintaining the data structure than LSTM's internal state. NTM also generalises much better to longer sequences than LSTM [...]"
That is, you can have a simple-minded network that doesn't have internal memory like our LSTM model, but it writes down everything it sees, and it can still give good results. I'm not completely convinced by that statement. There might be a better way of hybridizing the two strategies, whereby the LSTM holds onto short-term things and files long-term information away in the NTM-style external memory.
But keep in mind that this paper only came out like 10 months ago, so this is all very new, unexplored territory. In May, Facebook researchers put out a similar work describing what they call "Memory Networks" (link here: http://arxiv.org/pdf/1410.3916.pdf), which they applied to the task of question answering. In short, the network has to read a text, record any relevant information about that text in external memory, and then the network has to be able to answer questions about that text on demand. And it does oh so well:
Bilbo travelled to the cave. Gollum dropped the ring there. Bilbo took the ring.
Bilbo went back to the Shire. Bilbo left the ring there. Frodo got the ring.
Frodo journeyed to Mount-Doom. Frodo dropped the ring there. Sauron died.
Frodo went back to the Shire. Bilbo travelled to the Grey-havens. The End.
Where is the ring? A: Mount-Doom
Where is Bilbo now? A: Grey-havens
Where is Frodo now? A: Shire
And when the text doesn't tell it exactly the info that it needs, it can fill in missing information using its knowledge of any previous texts that it studied:
Fred went to the kitchen. Fred picked up the milk. Fred travelled to the office.
Where is the milk ? A: office
Where does milk come from ? A: milk come from cow
What is a cow a type of ? A: cow be female of cattle
Where are cattle found ? A: cattle farm become widespread in brazil
What does milk taste like ? A: milk taste like milk
What does milk go well with ? A: milk go with coffee
Where was Fred before the office ? A: kitchen
Okay, so it's not perfect. Cows exist in places other than Brazil, "milk tastes like milk" is tautological, and the network has a very creative interpretation of English grammar. But it's still amazing to me because it started with no prior knowledge of English, let alone cows or offices. This sort of stuff is eventually going to replace (low-level) call centers and helpdesks. We're reaching the point where you can throw a manual at a machine and within minutes it's able to answer questions about the inner workings of cars and computers.
So yes, external memory is definitely a very exciting prospect. For our purposes, there's evidence that suggests we could generate Magic cards more efficiently using such a strategy, lol.
EDIT: For now I'd suggest a strategy where we supply the context continually rather than leaving it up to the network to record things in external memory, primarily because the span over which the network has to remember stuff is very small. We'd want it to flush its external memory after each card to keep old ideas from creeping into new cards. On the other hand, external memory would make it easier to make cards that follow a particular theme, because it can keep revisiting the theme in different cards. That in turn forms the basis for things like fully automated set construction. So yeah, I can see why we'd use it in the long run, but in the short term, I think there are easier solutions that will deliver the results we need.
1) I would love to see a memory network fed the Comp Rules, then take the rules advisor test.
2) X spells can be regularized by defining X in the rules text of every card. Simply add the clause, "where X is the amount of mana paid," to spells with X in their cost. Now the network will see clearly that all Xs must be defined.
3) P/T and mana cost are very important, relative to their length. They are also very complex to determine correct values. A network striving for "good enough" may deliberately sacrifice accuracy on such complex fields in order to get better at easily predicted things like "enters the battl??????". We can prevent such calculated laziness by strongly penalizing and rewarding P/T and mana cost. If correctly assessing the string "3/4" was worth as much fitness as an entire valid body of rules text, the network would be forced to tackle the problem to achieve fitness.
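Suggestion 2 above amounts to a preprocessing pass over the corpus. A rough sketch, assuming each card's rules text and mana cost are available as plain strings; the sentence-matching heuristic (attach the clause to the first sentence mentioning X) is my own guess at a workable rule, not anything the thread has settled on.

```python
import re

X_CLAUSE = ", where X is the amount of mana paid"

def regularize_x(card_text, mana_cost):
    """If the mana cost contains X, append the defining clause to the
    first rules-text sentence that mentions X (a rough heuristic)."""
    if "X" not in mana_cost or X_CLAUSE in card_text:
        return card_text
    # Insert the clause just before the period of the first sentence using X.
    return re.sub(r"(\b[^.]*\bX\b[^.]*)\.", r"\1" + X_CLAUSE + ".",
                  card_text, count=1)

print(regularize_x("Deal X damage to target creature.", "XR"))
# -> Deal X damage to target creature, where X is the amount of mana paid.
```

Running this over the corpus before training would give the network a consistent "every X is defined" pattern to learn; the clause could be stripped back out of generated cards afterwards.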
With a great deal of generous help from Talcos, I'm getting close to having a version of the graveyard-themed set built out. I'm still not quite certain when the draft itself will happen--maybe week of Labor Day--but it's been good fun taking things this far! A few impressions from the process:
The side effects of our current "priming" process kinda bug me. I would rather just dump more cards freeform and sift using MSE to get the things we need. Dumping cards is far less computationally intensive than training new networks, and MSE makes finding the cards you need quite painless.
One caveat to that: at least on my computer, MSE filtering gets bogged down for card sets larger than a few megabytes. It's easier in the end to work with several smaller dump files than, say, one 20MB file.
For future set skeletons, I'd want to have a mana curve checklist along with type/rarity/etc. It's difficult to gauge that on the fly.
Current nets are not very good with Threshold. They do, however, produce some delightful Slivers, if we were to make a block featuring them.
I miss RNN-generated names when working with dumps that don't feature them. Until we have a good process for feeding a card back to a net and getting a name out, I think I'll stick with named cards, never mind the computational gains of leaving them off.
Some favorites going into the set! (Though tweaking may occur.) Talcos named "Relentless Haymaker," "X" is a nameless one yet to be christened, and the rest are RNN-named.
Relentless Haymaker (common) 4R
Artifact Creature ~ Scarecrow
Bloodthirst 1
Trample
Delve
(3/3)
~~~~~~~~
Spined Gelissa (uncommon) 3B
Creature ~ Horror
Whenever an opponent casts a red spell, put a +1/+1 counter on Spined Gelissa.
Threshold ~ as long as seven or more cards are in your graveyard, Spined Gelissa enters the battlefield with two +1/+1 counters on it.
B, remove a +1/+1 counter from Spined Gelissa: put a 2/2 black zombie creature token onto the battlefield.
(3/3)
~~~~~~~~
Airn the Anabandar (uncommon) 6R
Creature ~ Giant
Whenever Airn the Anabandar becomes blocked, prevent all combat damage that would be dealt to Airn the Anabandar by target creature this turn.
(4/4)
~~~~~~~~
x (uncommon) 4R
Enchantment ~ Aura
Enchant Creature
Enchanted creature gets +10/+11 and has haste.
Whenever enchanted creature attacks and isn't blocked, destroy it.
Delve
Whenever a source you control deals damage to an opponent, you may untap it and remove it from combat.
~~~~~~~~
Marsh Leader (rare) 1W
Creature ~ Human Monk
Flash
When Marsh Leader enters the battlefield, put a +1/+1 counter on target creature you control.
When Marsh Leader leaves the battlefield, return it to the battlefield under your control at the beginning of the next end step.
(2/1)
~~~~~~~~
Knight of the Hean Master (rare) 7RR
Creature ~ Avatar
Kicker R
If Knight of the Hean Master was kicked, it enters the battlefield with a +1/+1 counter on it and with "2U: target creature blocks this turn if able."
Threshold ~ 2W, T: put a +1/+1 counter on target creature.
(7/7)
~~~~~~~~
Yep, the MSE sets are used to handling card counts of normal set size (e.g. 200-300). Give it 30,000 cards and of course it'll get bogged down; it simply wasn't designed for that. I can handle about 4,000 cards in MSE on my PC without any slowdown, so that might be an ideal size.
@Elseleth What's the problem with Talcos' net's Threshold cards?
Well, part of the problem was the fact that I gave him dumps from my network trained only on modern cards. I forgot to take into consideration the fact that Odyssey block was not in modern, and the only modern legal card with threshold is the Time Spiral printing of Mystic Enforcer. So when I asked the network for threshold cards, it just made up new meanings on the spot for every single card (my bad).
Meanwhile, the previous networks did decently with threshold, but because the "seven or more cards in your graveyard" language sometimes comes before the relevant text (see Mystic Enforcer) and sometimes comes after (e.g. Barbarian Ring), the machine can inadvertently drop the language. That part's not so bad though because the intent is still clear (the text looks like the original templating of the threshold mechanic). The other problem is that threshold shows up on different card types, and even when a creature card starts with "as long as you have seven or more cards in your graveyard", it can potentially end with text like "@ deals 2 damage to target creature or player", which would only make sense on an instant or sorcery.
In any case, we appreciate the contributions. At this point Elseleth has chosen cards for every slot, but some cards may end up getting replaced if they can't be tweaked to fit the cost curve for the set, so he and I may end up borrowing some of these, we'll see.
We're still making our initial pass over the cards, seeing where we stand. I want to make it through this first round of tweaking and selection before we share it with y'all in a beautified form. The reason is that while design by committee can be a successful process, things will go more quickly and smoothly if we start with a product that's at least half-way done. Not much longer though. I'm sure Elseleth will be eager to share his work with you.
@Elseleth What's the problem with Talcos' net's Threshold cards?
I had a bugload of Threshold creatures with "Threshold ~ @ deals 2 damage to target creature or player." When the mechanic had something appropriate, it was usually boring, like "@ gets +2/+2". But then, I don't have your regex and grep skills, so sifting's a more eye-crossing process for me!
Please please, pretty please record the draft game! YouTube, or even Twitch for live commentary, would be awesome. Heh, for heaven's sake, you could give Twitch a deck and have a democracy of watchers play a round.
For numerous reasons, that's not too likely to happen. You'll have to settle for a blog post, maybe with some pictures, I'm afraid!
On that note, there could be a nice social experiment: you stage four drafts.
Ambitious! I'd need more help on the ground to pull something like that off. Outing myself, here, I run mainly in tabletop RPG circles, not Magic. So I've struggled even to rustle up enough people to run a single draft, much less four where the participants don't overlap!
Got inspired to try rolling my own net architecture. I'm not totally sure what it's capable of (in particular, I'm not sure if it's capable of anything interesting), but I think it could have potential if anyone wants to mess with a different method of generation (and I actually put together the stuff I need for it).
Right now, it's written in pure Python. The basic concept is that the propagation of information through the network varies according to the structure of what was passed in. The idea is to have an order-agnostic input setup that takes partial information, and fills in the details. No directionality to memory, optimized for whispering. So far, I've only established that it can learn answers, and you have to train it on, at least in the simple case I've got, the questions you want answered in order to answer them. I think what I have right now is lacking some crucial randomness in the unspecified data, but I'm not sure how to inject that just yet.
By way of example, if I had a version of this network that could understand "1/1 Creature for G", a given iteration of the net would always produce the same creature for that query.
So yeah, not sure if this can be scaled up, but it might represent an alternative method of generation, if it goes anywhere.
Take a survey during and after games. That way you could gauge whether players can differentiate between RNN and real cards, or whether there is some "uncanny valley" effect. Naturally you should randomize the names and flavor texts (to prevent name recognition), although if we get better with those, the RNN could still hold up if the players are told it's a survey by WotC on a new set. Since there are books and such tied to the MTG sets, one could also ask what the story of those books might be, based on the flavor texts, names, and mechanics.
I guess one could make a few further observations, but I am not a student of psychology.
Edit: Talcos, IIRC you worked at a university or something like that; maybe there is a department that would be interested in this.
I like the idea! However, the paperwork involved in obtaining human test subjects is tedious. I'm very thankful to work in a field where I can do most everything by simulation, haha.
But all the same, I would like to get feedback from those who draft the set.
Got inspired to try rolling my own net architecture. I'm not totally sure what it's capable of (in particular, I'm not sure if it's capable of anything interesting), but I think it could have potential if anyone wants to mess with a different method of generation (and I actually put together the stuff I need for it).
Right now, it's written in pure Python. The basic concept is that the propagation of information through the network varies according to the structure of what was passed in. The idea is to have an order-agnostic input setup that takes partial information, and fills in the details. No directionality to memory, optimized for whispering. So far, I've only established that it can learn answers, and you have to train it on, at least in the simple case I've got, the questions you want answered in order to answer them. I think what I have right now is lacking some crucial randomness in the unspecified data, but I'm not sure how to inject that just yet.
By way of example, if I had a version of this network that could understand "1/1 Creature for G", a given iteration of the net would always produce the same creature for that query.
So yeah, not sure if this can be scaled up, but it might represent an alternative method of generation, if it goes anywhere.
I'd be very interested in seeing how that turns out. Did you draw inspiration for your NN architecture from any previous works? I'm interested in the details.
---
Elseleth's set is looking pretty good. It'll need some more tweaking and balancing, but I'm sure he'll make it available to y'all soon enough.
Having a fun draft with machine-generated cards organized into a set was my end goal from day one. Two months and 16 days later and it looks like we'll achieve it.
After that happens (sometime in September, it looks like), I'll probably end up scaling back my commitments to the project for the present. Not for lack of interest, mind you, but my machines will be running experiments for me practically 24/7, and I'll be busy writing my dissertation. That, and a new semester started today, so I'll have undergraduates to help mentor and so on. All that said, I'm able and willing to offer advice and support to anyone who continues the work; my figurative/literal door is always open for those seeking consultation. More improvements can definitely be made. For card generation, there are lots of new and interesting architectures to consider that could dramatically improve upon our current capabilities. Meanwhile, I expect image generation capabilities to improve in the near future, and that opens up new possibilities as well.
For those of you who have been avid readers thus far, this is just the beginning, and I don't just mean for Magic either. The sorts of things we've been discussing are having, or will have, a tremendous impact on a diverse range of industries: transportation, manufacturing, healthcare, finance, law, and so on. And yes, game design and development too. It's a very exciting time to be alive.
Traceback (most recent call last):
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/eventlet/wsgi.py", line 454, in handle_one_response
result = self.application(self.environ, start_response)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1836, in __call__
return self.wsgi_app(environ, start_response)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/engineio/middleware.py", line 34, in __call__
return self.wsgi_app(environ, start_response)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1820, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1403, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/_compat.py", line 33, in reraise
raise value
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1817, in wsgi_app
response = self.full_dispatch_request()
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1477, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1381, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/_compat.py", line 33, in reraise
raise value
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1475, in full_dispatch_request
rv = self.dispatch_request()
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1461, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/croxis/src/mtgai/app/main/views.py", line 263, in card_select
extra_template_data['urls'] = convert_to_urls(session['cardtext'], cardsep=session['cardsep'])
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/werkzeug/local.py", line 368, in <lambda>
__getitem__ = lambda x, i: x._get_current_object()[i]
KeyError: 'cardtext'
this is what I get when I just press Generate on Croxis' website. Something is wronk and someone is German (Werkzeug?)
Fixed! Also added hardcast's snapshot
I'd be very interested in seeing how that turns out. Did you draw inspiration for your NN architecture from any previous works? I'm interested in the details.
Further experimentation (not hard to have, this was basically a weekend project) reveals that the network is stupidly sensitive to initial conditions. Can't really hope to scale it up until it's capable of producing acceptable results for any seed on my toy data-set. Got some ideas for how to do this (basically, I need to weight some of the errors in the middle of back propagation), but I'm going to wait until I get off work to try it out.
The internals are inspired/copy-pasted from this public domain code. The big difference in doing things is, instead of having a fixed set of layers, I've got a matrix of weights that the code rearranges into a network with three hidden layers based on the shape of the input. The network is a basic multi-layer perceptron that trains by back-propagation, but the weights determined during training could end up anywhere in the network, depending on the input.
The end goal, if I can get it into a state where scaling up looks plausible, is to customize the network somewhat to the training set: given structured data, produce a linear fixed-width encoding (I'm not confident this is practical, but I want to try), and instantiate a network big enough to handle the encoded data. At some point it'll make sense to convert this stuff to NumPy, but I need a decent algorithm first. What I have now is... it works sometimes? Better than just using the random module directly?
The big challenge is I'm getting some kind of mis-fitting situation where the network just puts things together wrong. Maybe what I need to do is try customizing the weights to the problem manually, and see if that gives me any insight into finding them programmatically.
Gotcha. I'd say what you're using will suffice for now, to get a feel for what works and what doesn't. I might recommend looking into a Python library like scikit-learn or PyBrain, those sorts of things. Libraries like that have a learning curve, but they're helpful because they're heavily optimized and they simplify the task of designing the underlying algorithm.
I used recurrent neural networks instead of simple multi-layer perceptrons primarily because the length of the input can vary. I'm not familiar with the idea of dynamically reshaping a simple feed-forward network in the way that you're suggesting. It seems like it would end up being too chaotic, and the results may end up being very disjointed. At least, from my understanding of what you've explained. Also, what exactly are you trying to get as the output for your network? I'm not entirely clear.
As for a fixed-width encoding (I assume you mean an encoding of an entire card), I'm not sure what you'd do for that. For reference, here's a link to a graph of the distribution of lengths of cards encoded in hardcast_sixdrop's input format. Cards are usually around 140-ish characters in length, but there's a significant amount of variance. If you're set on the idea, I'm sure there are ways.
Supertypes, types, and subtypes can be encoded as a very long vector (e.g. there are 228 creature types in Magic, so have 228 inputs that you can set to one or zero depending on whether the creature satisfies the type). Mana cost can be expressed as a series of numbers that count the number of each symbol that is present. If we ignore names for now, that leaves the body text, which is the primary contributor to the variance in card length. The question is how do we encode that information in a fixed amount of space? You could use padding to make all body text of the same length, not sure how well that'd go over though (hint: cut Dance of the Dead from your input corpus). There are lossy ways of encoding the information too.
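To make the shape of such an encoding concrete, here's a toy sketch along those lines. The type list and symbol set are abbreviated stand-ins (not the real 228 creature types), and the padded-character-code treatment of body text is just one of the lossy options mentioned above.

```python
# Toy fixed-width card encoder (illustrative lists, not the real ones).
CREATURE_TYPES = ["Human", "Wizard", "Demon", "Giant", "Horror"]
MANA_SYMBOLS = ["W", "U", "B", "R", "G"]
BODY_WIDTH = 140  # roughly the typical encoded card length

def encode(card):
    # Multi-hot vector over known creature types.
    types = [1 if t in card["types"] else 0 for t in CREATURE_TYPES]
    # Count of each coloured mana symbol in the cost string.
    cost = [card["cost"].count(s) for s in MANA_SYMBOLS]
    # Body text padded/truncated to a fixed width, stored as character codes.
    body = card["text"][:BODY_WIDTH].ljust(BODY_WIDTH, "\0")
    chars = [ord(c) for c in body]
    return types + cost + chars

vec = encode({"types": ["Human", "Wizard"], "cost": "2BB", "text": "Flying"})
print(len(vec))  # 150: same width regardless of the card
```

The point is only that every card maps to a vector of identical length; what to do about generic mana costs, missing fields, and genuinely long body text are the open questions from the paragraph above.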
EDIT: Oh, by the way, I was listening to some artificially-composed music this morning. The design of the underlying algorithm is interesting because it first tries to generate an overarching structure for the music to obey, which means that you get things like chord progression quite easily. Not that I think that we would want to generate cards in this way, but it does suggest a model for 100% automated set construction (the idea being that the network produces its own design document and then fills it in accordingly, with no human intervention). Just a fun thought.
Traceback (most recent call last):
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/eventlet/wsgi.py", line 454, in handle_one_response
result = self.application(self.environ, start_response)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1836, in __call__
return self.wsgi_app(environ, start_response)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/engineio/middleware.py", line 34, in __call__
return self.wsgi_app(environ, start_response)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1820, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1403, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/_compat.py", line 33, in reraise
raise value
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1817, in wsgi_app
response = self.full_dispatch_request()
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1477, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1381, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/_compat.py", line 33, in reraise
raise value
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1475, in full_dispatch_request
rv = self.dispatch_request()
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/flask/app.py", line 1461, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/croxis/src/mtgai/app/main/views.py", line 263, in card_select
extra_template_data['urls'] = convert_to_urls(session['cardtext'], cardsep=session['cardsep'])
File "/home/croxis/src/mtgai/venv/lib/python3.4/site-packages/werkzeug/local.py", line 368, in <lambda>
__getitem__ = lambda x, i: x._get_current_object()[i]
KeyError: 'cardtext'
This is what I get when I just press Generate on Croxis' website. Something is wronk, and someone is German (Werkzeug?)
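A guess at the cause, judging by the last frame of the trace: the view does `session['cardtext']` before anything has been generated. A minimal sketch of the usual defensive fix, with a plain dict standing in for `flask.session` (the actual fix in Croxis' code may differ, and the default values here are made up):

```python
# The trace ends in KeyError: 'cardtext', i.e. the view indexes the
# session before any cards have been generated. Flask's session behaves
# like a dict, so .get() with a fallback avoids the crash.
session = {}  # stand-in for flask.session on a fresh visit

cardtext = session.get("cardtext", "")    # "" instead of KeyError
cardsep = session.get("cardsep", "\n\n")  # hypothetical default separator

print(repr(cardtext))  # -> ''
```

In the real view you would then branch on the empty value and send the user back to the generation form instead of rendering card URLs.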
Fixed! Also added hardcast's snapshot
Thanks, but now the progress bar goes to 100 in about a second, and no card text or raw text is displayed at all.
Bleh. I was thinking over the kind of stuff I want to happen inside the box in my mental model that just says "NN magic happens here", and it became clear to me that what I'd actually like to see is some kind of reversible vectorized format for cards, that we can then do stuff like PCA to. There are various puzzles to this (Is it possible to encode creature types such that "Treefolf" is possible? Do I handle the absence of a number distinctly from 0? If so, how? How do I handle something that is usually a number, not being a number?) but I find these questions more interesting than "Your network is basically an elaborate coinflip. Cry? Y/Y"
Working out the structure of my crazy network was fun, and it's good knowledge to have in case I ever need to throw together something more legit for some reason, but if my network was the answer to anything, I never found the question.
And temp 0.7 is too high actually, not enough training yet
Grim the Past {4}{b}
Planeswalker 3 -- Liliana
+1: Target opponent reveals his or her hand. You choose a non-Land card from it. That player discards that card.
-2: Target player discards a card. If you do, return the top card of his or her library into his or her graveyard. Destroy that creature at the beginning of the next cleanup step.
-6: Put a 1/1 green Saproling creature token onto the battlefield.
-2: Destroy target creature. Its controller loses 4 life.
-6: You get an emblem with “At the beginning of your upkeep, you may have it deal 1 damage to target creature or player.”
-2: Put a 2/2 black Vampire creature token onto the battlefield that's a copy of that creature.
-6: You get an emblem with “At the beginning of your end step, return that card to the battlefield under your control at the beginning of the next end step.”
-2: Look at the top 4 cards of your library, then put them back in any order.
-7: You may look at a card with converted mana cost 3 or less.
-2: Target creature gets -2/-2 until end of turn.
-8: You get an emblem with “Whenever a player casts a green spell, you may put a +1/+1 counter on Grim the Past.”
-2: You gain 5 life.
-2: Gain control of target creature until end of turn. Untap that creature. It gains [Haste] until end of turn.
-7: You get an emblem with “At the beginning of your end step, if that player didn't attach a creature would deal damage to you this turn, you may top the barined this turn. If you do, untap it.”
I'm very interested to hear how you got this result. As a rule, the networks that I have trained almost never produce planeswalkers with more than two abilities because the network tries to keep card lengths close to the average. How is it that you're getting so many abilities on a single card? Is it that you have some kind of end marker on the ability which prompts the network to follow up with another (like with words such as fuse/transform), or what?
I was thinking over the kind of stuff I want to happen inside the box in my mental model that just says "NN magic happens here", and it became clear to me that what I'd actually like to see is some kind of reversible vectorized format for cards, that we can then do stuff like PCA to. There are various puzzles to this (Is it possible to encode creature types such that "Treefolf" is possible? Do I handle the absence of a number distinctly from 0?
You could try an autoencoder approach. The underlying concept is strongly related to PCA.
If so, how? How do I handle something that is usually a number, not being a number?) but I find these questions more interesting than "Your network is basically an elaborate coinflip. Cry? Y/Y"
Working out the structure of my crazy network was fun, and it's good knowledge to have in case I ever need to throw together something more legit for some reason, but if my network was the answer to anything, I never found the question.
Don't feel bad! It happens. Testing out crazy ideas that may or may not pan out is part of the process.
Testing out crazy ideas that may or may not pan out is part of the process.
I tend to feel that this kind of unbounded and unconstrained research is what usually leads to creative, innovative breakthroughs, even if at the time of discovery their utility might be a tad inscrutable.
The Wikipedia page for autoencoders offhandedly lists everything that did go wrong with my network, or that I was afraid would go wrong. (Except for the sensitivity to initial conditions. Not sure what happened there, exactly.) What I was trying to do could more or less be summarized as a "variable" autoencoder. Hm. If I can get a proper fixed-width encoding (not a vectorization, just an encoding), it should be possible to rip the useless fancy stuff out of my net and turn it into an autoencoder with just one hidden layer.
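For reference, the single-hidden-layer version is small enough to sketch directly. This is a generic linear autoencoder trained on random toy data (my own illustration, not the network discussed above); with a linear hidden layer and squared error, the subspace it learns is the one PCA finds, which is why the two concepts are so closely related:

```python
# One-hidden-layer linear autoencoder on toy data (numpy sketch).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))   # toy data: 200 samples, 8 features
X -= X.mean(axis=0)             # center, as PCA does

W_enc = rng.normal(scale=0.1, size=(8, 3))  # 8 features -> 3 hidden units
W_dec = rng.normal(scale=0.1, size=(3, 8))  # 3 hidden units -> 8 features

mse0 = float(np.mean((X @ W_enc @ W_dec - X) ** 2))  # error before training

lr = 0.1
for _ in range(1000):
    H = X @ W_enc                       # encode
    err = H @ W_dec - X                 # reconstruction error
    # gradients of mean squared error (constant factor folded into lr)
    g_dec = (H.T @ err) / len(X)
    g_enc = (X.T @ (err @ W_dec.T)) / len(X)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

mse = float(np.mean((X @ W_enc @ W_dec - X) ** 2))
# mse should now sit well below mse0, approaching the best rank-3 fit
```

The interesting part for the inscrutable-hidden-layer problem: even when the reconstruction is good, `W_enc` is only determined up to an invertible mixing with `W_dec`, so the hidden units need not look like clean principal components individually.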
Hm. For a start, I might try just giving it all creature types. See if I can find an encoding for that which works with stuff like Treefolf.
ETA: Running the autoencoder against my toy data is kind of weird, because the output is about right, but the hidden layer is doing inscrutable things.
I feel like we're constantly bitten by Clever Hans effects; there are side effects we don't think of all over the place. Why is X 0% when Z is 50+%? Unless it was just a very unlucky random variation, Y was much worse with bias than without; why would that be?
Good point. It cheats at every opportunity. So while for some features the network may possess genuine understanding, for others it's just taking advantage of various cues.
Another idea: since the net's so bad at rote learning, we might as well put several copies of each card in a row and see what happens. Yes, why not encourage bleeding over? I bet this will be fun.
I'm afraid that copying data like that has a tendency of encouraging overfitting.
ETA: Running the autoencoder against my toy data is kind of weird, because the output is about right, but the hidden layer is doing inscrutable things.
That's the magic of it, isn't it? Immense power, but a lack of interpretability.
EDIT: Fun (short) story. I was training a network to analyze computer programs earlier today, and the possible outputs were "error" or "none", depending on whether an error was present or not in the input program. I gave it a very tricky program that looked like it had an error but in fact had none, and the output the network gave was "erone". As in, it was caught between the two decisions and decided to make up a third category. How clever of it.
EDIT: Fun (short) story. I was training a network to analyze computer programs earlier today, and the possible outputs were "error" or "none", depending on whether an error was present or not in the input program. I gave it a very tricky program that looked like it had an error but in fact had none, and the output the network gave was "erone". As in, it was caught between the two decisions and decided to make up a third category. How clever of it.
I'm not sure whether that should be deeply disturbing, or highly comforting.
EDIT: Fun (short) story. I was training a network to analyze computer programs earlier today, and the possible outputs were "error" or "none", depending on whether an error was present or not in the input program. I gave it a very tricky program that looked like it had an error but in fact had none, and the output the network gave was "erone". As in, it was caught between the two decisions and decided to make up a third category. How clever of it.
I'm not sure whether that should be deeply disturbing, or highly comforting.
I find that terribly entertaining. Kind of like Wall-E and the spork.
EDIT: Fun (short) story. I was training a network to analyze computer programs earlier today, and the possible outputs were "error" or "none", depending on whether an error was present or not in the input program. I gave it a very tricky program that looked like it had an error but in fact had none, and the output the network gave was "erone". As in, it was caught between the two decisions and decided to make up a third category. How clever of it.
I'm not sure whether that should be deeply disturbing, or highly comforting.
I find that terribly entertaining. Kind of like Wall-E and the spork.
Everyone's so focused right now on whether automation will steal their jobs that they're overlooking the rapid (and successful) development of autonomous killing machines. It's completely off most people's radar. Not the world governments and their militaries, of course. There's an arms race afoot.
It does make me question sometimes whether the work I've done for the US government may have inadvertently enabled them. I mean, my work is peaceful in nature, I try to make software more reliable. But more reliable software includes more reliable killing machines, so there's that. During my summer at the labs, I once overheard two scientists talking about their research, and one of them said to the other, "You see, that's our problem. The [Department of Defense] used to be satisfied with megadeaths, but now they want gigadeaths."
I pretended not to hear them, but I thought to myself, "Why?". Seriously, any weapon that could take out a China-sized population in one go would probably permanently cripple the planet and leave the survivors envious of the dead, so I didn't really see the point. A few days later, outside of the compound's defensive perimeter, there were anti-nuclear-war protesters who were stretched out along the main road, having outlined their own bodies with chalk, impeding the flow of traffic. I used to think that the people on the outside were overreacting, but after hearing what the people on the inside were saying, I'm much more sympathetic to their concerns.
After that, I started to question why it was that people like me were getting funding from the directorate. They say they support "science and technology in the national interest", but what do they really want? What is the end goal?
Don't get me wrong, I enjoyed my work there immensely, and I got to see some of the most advanced technology known to mankind (that was very cool), but I have no plans to ever return.
EDIT: To those of you who do work in such places, don't take this as me chastising any of you. I'm sure your work is all well and good, and the pay is amazing. It's just that I don't want to have my Wikipedia page years from now say "Talcos was instrumental in the development of murderbots, which overtook malaria as the leading cause of infant mortality in West Africa in the early 2030s." Not my cup of tea.
Yes, but don't forget I'm using a rather small net that shows no signs of ever getting into an overfit. So it may be that it would autowhisper for a few cards in a row, then go to something unrelated. Would I get some of the advantages of the repeater string?
It seems like an easier plan would be to up the learning rate slightly, so that it learns more from each card, rather than duplicating the cards (which I'd imagine would achieve a similar effect). If you're not seeing any signs of overfitting, you might be safe increasing the learning rate ever so slightly.
I wonder if there's not a bit too much of the idea that "bigger is better" in the deep learning community. When everything's ripe, sure, probably, but look how much we can learn from a smaller net! As we're still groping in the dark it's interesting to do lots of quick experiments and compare results. I think it's a legitimate and probably fruitful research question: can we characterize more of the rnn's behavior at a smaller size regarding various input or hyperparameter changes? Like, are there statistical aspects of later dumps that depend on the random seed used for training? Do you get the same results if you increase seq_length as training goes (I want faster training, so smaller seq_length first would be desirable, but I was afraid it would miss even more on the quotes for instance)? Do you get better results with higher dropout, and likewise can you up it as training goes? Etc, etc. Just ideas in the wild I'd try right away if my CPU wasn't on something else.
[...]
With faster training I could train different types of nets quicker to compare results. Obviously the net could train faster than what I just did; and also, can we push it a bit more at the end of training? Better results in 50 epochs max look achievable from the tables.
It's probably easier to characterize, yes. But I'll warn you that there's a huge gap in our theoretical knowledge when it comes to describing these sorts of systems outside of what we can learn experimentally, on a case by case basis. A lot of the answers I can give you are based mostly on my intuition and my hands-on experience rather than any carefully crafted theoretical insights like I'm accustomed to providing.
But your results are very interesting. I'm wondering what the upper bound is on what we can achieve with the tech we have, and how much farther we could get with some better, more tailored approaches. I'll be doing some experiments with Neural Turing Machines soon, to see how they compare with LSTM networks when it comes to problem solving tasks, and if it looks like it can hold on to long-term dependencies better, I might see how well they handle Magic cards.
0.13!?! What infernal contract did you sign to make that happen, because I want in on it!
And make sure about the overfitting. There could be swarms of look-alikes and subtle variations, not sure. But the prospect is exciting all the same.
EDIT: In any case, I'm interested to see how far we can push things while not specializing or tailoring the architecture (by that, I mean, without having to continually provide context information, I assume you haven't done that yet).
EDIT(2): More important question: How much extra regularization magic was needed? I'd love to consult the relevant literature.
My LinkedIn profile... thing (I have one of those now!).
My research team's webpage.
The mtg-rnn repo and the mtg-encode repo.
400.3. If an object would go to any library, graveyard, or hand other than its owner's, it goes to its owner's corresponding zone.
But 101.1 says that if a card contradicts a rule, the card takes precedence. So if 400.3 says 'You can't put an opponent's card into your hand' but a card says 'Put an opponent's card into your hand', isn't 400.3 superseded in that instance?
Starting with black!
Accurity Mast 1B
Enchantment - Aura (Common)
Enchant creature
Whenever enchanted creature becomes tapped, add R to your mana pool for each aura attached to it.
Metalcraft — Accurity Mast has indestructible as long as it has a % counter on it.
At the beginning of your upkeep, put a % counter on Accurity Mast.
Remove three % counters from Accurity Mast: Add B to your mana pool.
~~~I'm impressed by the number of coherent abilities here. It's definitely not a common-level card, but eh. The % counters have a source, a passive use and an active use. It's also incredibly undercosted.
Bodowor Harvest 2B
Enchantment (Rare)
Whenever a creature dies, put the top four cards of your library into your graveyard.
~~~The ultimate Sultai enchantment? Especially with Exploit, this can really provide a boost for Delve. I'd think the Chromantiflayer decks would love this.
Captain Hero 2B
Creature - Human Wizard (Common)
When Captain Hero enters the battlefield, target opponent puts the top X cards of his or her library into his or her graveyard.
2/1
~~~Yeah, that name. I just had to share that name.
Disruptive Diefelit 5BB
Creature - Demon (Mythic Rare)
Flying
At the beginning of your upkeep, flip a coin. If you lose the flip, exile Disruptive Diefelit.
Whenever a permanent an opponent controls deals combat damage to a player, that player draws a card.
5/5
~~~This could even have some red in it, with the coin flip, but I think the demon flavour works well enough for that here. I think it's terrific that this demon turned a bad thing (taking damage) into a good thing (drawing a card). It's pretty spot-on flavour-wise.
Sleeper's Edemoction 3B
Enchantment (Rare)
Players can't play lands with the same name as a card in hand.
~~~Wow. One land at a time, then? Topdecking lands is the name of the game now.
That's black done. Might do other colours tonight; it takes a while to go through hundreds of cards manually.
edit: Mono-blue really has nothing exciting. Mostly 'return to hand' cards from the graveyard (which is black's thing), simple counterspells which are just boring, and, oddly enough, too many direct damage spells. This one is interesting, though:
Spawning Loyalty 1U
Instant (Uncommon)
Counter target spell unless its controller pays 2. If you do, you gain control of target permanent.
~~~Vastly undercosted, but interesting. I'll counter your spell and nick your Nissa, please and thank you.
Wild Prey 3U
Instant (Common)
Counter target spell unless its controller pays X, where X is the number of creature cards in its controller’s hand.
~~~A nice variation on a counterspell, and it's unique too. But best of all, X worked out!
He doesn't want to get us all excited by boasting about successes he hasn't proven yet. He'll need to dig through a dump of cards generated by the network to see what's really going on. Remember, if the network is too good, it's possible that something is wrong, haha. Still, I'm excited to see what the results look like either way.
Yes, actually! There was a paper not long ago from Google DeepMind on "Neural Turing Machines" (NTMs) (link here: http://arxiv.org/pdf/1410.5401v2.pdf), which are neural networks that have permanent external memory that they can use as a scratchpad. In fact, the authors of that paper suggest that
That is, you can have a simple-minded network that lacks internal memory like our LSTM model but writes down everything it sees, and it can still give good results. I'm not completely convinced about that claim. There might be a better way of harmonizing the two strategies, whereby the LSTM holds onto short-term things and files long-term information away in the NTM's external memory.
But keep in mind that this paper only came out like 10 months ago, so this is all very new, unexplored territory. In May, Facebook researchers put out a similar work describing what they call "Memory Networks" (link here: http://arxiv.org/pdf/1410.3916.pdf), which they applied to the task of question answering. In short, the network has to read a text, record any relevant information about that text in external memory, and then the network has to be able to answer questions about that text on demand. And it does oh so well:
And when the text doesn't tell it exactly the info that it needs, it can fill in missing information using its knowledge of any previous texts that it studied:
Okay, so it's not perfect. Cows exist in places other than Brazil, "milk tastes like milk" is tautological, and the network has a very creative interpretation of English grammar. But it's still amazing to me because it started with no prior knowledge of English, let alone cows or offices. This sort of stuff is eventually going to replace (low-level) call centers and helpdesks. We're reaching the point where you can throw a manual at a machine and within minutes it's able to answer questions about the inner workings of cars and computers.
So yes, external memory is definitely a very exciting prospect. For our purposes, there's evidence that suggests we could generate Magic cards more efficiently using such a strategy, lol.
EDIT: For now I'd suggest using a strategy where we supply the context continually rather than leaving it up to the network to record things in external memory, primarily because the span over which the network has to remember stuff is very small. We'd want it to flush its external memory after each card to keep old ideas from creeping into new cards. On the other hand, external memory would make it easier to make cards that follow a particular theme, because it can keep revisiting the theme in different cards. That in turn forms the basis for things like fully automated set construction. So yeah, I can see why we'd use it in the long run, but in the short term, I think there are easier solutions that will deliver the results we need.
2) X spells can be regularized by defining X in the rules text of every card. Simply add the clause, "where X is the amount of mana paid," to spells with X in their cost. Now the network will see clearly that all Xs must be defined.
3) P/T and mana cost are very important relative to their length, and determining correct values for them is also very complex. A network striving for "good enough" may deliberately sacrifice accuracy on such complex fields in order to get better at easily predicted things like "enters the battl??????". We can prevent such calculated laziness by penalizing and rewarding P/T and mana cost much more strongly. If correctly assessing the string "3/4" were worth as much fitness as an entire valid body of rules text, the network would be forced to tackle the problem to achieve fitness.
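Suggestion 2 is mechanical enough to sketch as a preprocessing pass. The field layout below (separate cost and text strings) is hypothetical, not the actual mtg-encode format:

```python
# Hypothetical preprocessing pass for suggestion 2: append an explicit
# definition of X to any card whose mana cost contains an {X} symbol
# and whose rules text actually uses X.
import re

def define_x(mana_cost: str, rules_text: str) -> str:
    if "{X}" in mana_cost and re.search(r"\bX\b", rules_text):
        return rules_text.rstrip(".") + ", where X is the amount of mana paid."
    return rules_text

print(define_x("{X}{R}", "~ deals X damage to target creature."))
# -> ~ deals X damage to target creature, where X is the amount of mana paid.
print(define_x("{2}{R}", "~ deals 2 damage to target creature."))
# -> ~ deals 2 damage to target creature.
```

Run over the whole corpus before training, every X the network ever sees would come with its definition attached, so it has a consistent pattern to learn.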
4R
Artifact Creature ~ Scarecrow
Bloodthirst 1
Trample
Delve
(3/3)
~~~~~~~~
Spined Gelissa (uncommon)
3B
Creature ~ Horror
Whenever an opponent casts a red spell, put a +1/+1 counter on Spined Gelissa.
Threshold ~ as long as seven or more cards are in your graveyard, Spined Gelissa enters the battlefield with two +1/+1 counters on it.
B, remove a +1/+1 counter from Spined Gelissa: put a 2/2 black zombie creature token onto the battlefield.
(3/3)
~~~~~~~~
Airn the Anabandar (uncommon)
6R
Creature ~ Giant
Whenever Airn the Anabandar becomes blocked, prevent all combat damage that would be dealt to Airn the Anabandar by target creature this turn.
(4/4)
~~~~~~~~
x (uncommon)
4R
Enchantment ~ Aura
Enchant Creature
Enchanted creature gets +10/+11 and has haste.
Whenever enchanted creature attacks and isn't blocked, destroy it.
Delve
Whenever a source you control deals damage to an opponent, you may untap it and remove it from combat.
~~~~~~~~
Marsh Leader (rare)
1W
Creature ~ Human Monk
Flash
When Marsh Leader enters the battlefield, put a +1/+1 counter on target creature you control.
When Marsh Leader leaves the battlefield, return it to the battlefield under your control at the beginning of the next end step.
(2/1)
~~~~~~~~
Knight of the Hean Master (rare)
7RR
Creature ~ Avatar
Kicker R
If Knight of the Hean Master was kicked, it enters the battlefield with a +1/+1 counter on it and with "2U: target creature blocks this turn if able."
Threshold ~ 2W, T: put a +1/+1 counter on target creature.
(7/7)
~~~~~~~~
Well, part of the problem was the fact that I gave him dumps from my network trained only on modern cards. I forgot to take into consideration the fact that Odyssey block was not in modern, and the only modern legal card with threshold is the Time Spiral printing of Mystic Enforcer. So when I asked the network for threshold cards, it just made up new meanings on the spot for every single card (my bad).
Meanwhile, the previous networks did decently with threshold, but because the "seven or more cards in your graveyard" language sometimes comes before the relevant text (see Mystic Enforcer) and sometimes comes after (e.g. Barbarian Ring), the machine can inadvertently drop the language. That part's not so bad though because the intent is still clear (the text looks like the original templating of the threshold mechanic). The other problem is that threshold shows up on different card types, and even when a creature card starts with "as long as you have seven or more cards in your graveyard", it can potentially end with text like "@ deals 2 damage to target creature or player", which would only make sense on an instant or sorcery.
In any case, we appreciate the contributions. At this point Elseleth has chosen cards for every slot, but some cards may end up getting replaced if they can't be tweaked to fit the cost curve for the set, so he and I may end up borrowing some of these, we'll see.
We're still making our initial pass over the cards, seeing where we stand. I want to make it through this first round of tweaking and selection before we share it with y'all in a beautified form. The reason is that while design by committee can be a successful process, things will go more quickly and smoothly if we start with a product that's at least half-way done. Not much longer though. I'm sure Elseleth will be eager to share his work with you.
For numerous reasons, that's not too likely to happen. You'll have to settle for a blog post, maybe with some pictures, I'm afraid!
By way of example, if I had a version of this network that could understand "1/1 Creature for G", a given iteration of the net would always produce the same creature for that query.
I like the idea! However, the paperwork involved in obtaining human test subjects is tedious. I'm very thankful to work in a field where I can do most everything by simulation, haha.
But all the same, I would like to get feedback from those who draft the set.
And we are all indebted to you for your contributions.
I'd be very interested in seeing how that turns out. Did you draw inspiration for your NN architecture from any previous works? I'm interested in the details.
---
Elseleth's set is looking pretty good. It'll need some more tweaking and balancing, but I'm sure he'll make it available to y'all soon enough.
Having a fun draft with machine-generated cards organized into a set was my end goal from day one. Two months and 16 days later and it looks like we'll achieve it.
After that happens (sometime in September, looks like), I'll probably end up scaling back my commitments to the project for the present. Not for lack of interest, mind you, but my machines will be running experiments for me practically 24/7, and I'll be busy writing my dissertation. That and a new semester has started today and I'll have undergraduates to help mentor and so on. All that having been said, I'm able and willing to offer advice and support to anyone who continues the work; my figurative/literal door is always open for those seeking consultation. More improvements can definitely be made. For card generation, there are lots of new and interesting architectures to consider that could dramatically improve upon our current capabilities. Meanwhile, I expect image generation capabilities to improve in the near future, and that opens up new possibilities as well.
For those of you who have been avid readers thus far, this is just the beginning. I don't just mean for Magic either. The sorts of things we've been discussing are having or will have a tremendous impact on diverse range of industries including transportation, manufacturing, healthcare, finance, law, etc. And yes, game design and development too. It's a very exciting time to be alive.
Further experimentation (not hard to have, this was basically a weekend project) reveals that the network is stupidly sensitive to initial conditions. Can't really hope to scale it up until it's capable of producing acceptable results for any seed on my toy data-set. Got some ideas for how to do this (basically, I need to weight some of the errors in the middle of back propagation), but I'm going to wait until I get off work to try it out.
The end goal, if I can get it into a state where scaling up looks plausible, is to customize the network somewhat to the training set: given structured data, produce a linear fixed-width encoding (I'm not confident this is practical, but I want to try), and instantiate a network big enough to handle the encoded data. At some point, it'll make sense to convert this stuff to Numpy, but I need a decent algorithm first. What I have now is... it works sometimes? Better than just using the random module directly?
The big challenge is I'm getting some kind of mis-fitting situation where the network just puts things together wrong. Maybe what I need to do is try customizing the weights to the problem manually, and see if that gives me any insight into finding them programmatically.
I used recurrent neural networks instead of simple multi-layer perceptrons primarily because the length of the input can vary. I'm not familiar with the idea of dynamically reshaping a simple feed-forward network in the way that you're suggesting. It seems like it would end up being too chaotic, and the results may end up being very disjointed. At least, from my understanding of what you've explained. Also, what exactly are you trying to get as the output for your network? I'm not entirely clear.
As for a fixed-width encoding (I assume you mean an encoding of an entire card), I'm not sure what you'd do for that. For reference, here's a link to a graph of the distribution of lengths of cards encoded in hardcast_sixdrop's input format. Cards are usually around 140-ish characters in length, but there's a significant amount of variance. If you're set on the idea, I'm sure there are ways.
Supertypes, types, and subtypes can be encoded as a very long vector (e.g. there are 228 creature types in Magic, so have 228 inputs that you can set to one or zero depending on whether the creature satisfies the type). Mana cost can be expressed as a series of numbers that count the number of each symbol that is present. If we ignore names for now, that leaves the body text, which is the primary contributor to the variance in card length. The question is how do we encode that information in a fixed amount of space? You could use padding to make all body text of the same length, not sure how well that'd go over though (hint: cut Dance of the Dead from your input corpus). There are lossy ways of encoding the information too.
EDIT: Oh, by the way, I was listening to some artificially-composed music this morning. The design of the underlying algorithm is interesting because it first tries to generate an overarching structure for the music to obey, which means that you get things like chord progression quite easily. Not that I think that we would want to generate cards in this way, but it does suggest a model for 100% automated set construction (the idea being that the network produces its own design document and then fills it in accordingly, with no human intervention). Just a fun thought.
Working out the structure of my crazy network was fun, and it's good knowledge to have in case I ever need to throw together something more legit for some reason, but if my network was the answer to anything, I never found the question.
Oh, don't worry. If I have anything useful to contribute, I'll be sure to let you know.
I'm very interested to hear how you got this result. As a rule, the networks that I have trained almost never produce planeswalkers with more than two abilities because the network tries to keep card lengths close to the average. How is it that you're getting so many abilities on a single card? Is it that you have some kind of end marker on the ability which prompts the network to follow up with another (like with words such as fuse/transform), or what?
You could try an autoencoder approach. The underlying concept is strongly related to PCA.
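To make the PCA connection concrete: a linear autoencoder with tied weights has, as its optimal solution, projection onto the top principal directions. A sketch using SVD directly rather than gradient training (variable names here are my own):

```python
import numpy as np

# A linear autoencoder's best possible solution is PCA: encode by
# projecting onto the top-k principal directions, decode by mapping back.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X = X - X.mean(axis=0)            # centre the data first

k = 2
U, S, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:k]                        # encoder weights, shape (k, 5)

codes = X @ W.T                   # the "hidden layer", shape (200, k)
X_hat = codes @ W                 # decoded reconstruction, shape (200, 5)

# Eckart-Young: no rank-k linear map can reconstruct better than this;
# the squared error equals the sum of the discarded singular values squared.
err = np.linalg.norm(X - X_hat) ** 2
print(err, np.sum(S[k:] ** 2))
```

A trained nonlinear autoencoder can of course do better than this bound, which is exactly why the hidden layer is harder to interpret.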
Don't feel bad! It happens. Testing out crazy ideas that may or may not pan out is part of the process.
I tend to feel that this kind of unbounded and unconstrained research is what usually leads to creative, innovative breakthroughs, even if at the time of discovery their utility might be a tad inscrutable.
Hm. For a start, I might try just giving it all creature types. See if I can find an encoding for that which works with stuff like Treefolk.
ETA: Running the autoencoder against my toy data is kind of weird, because the output is about right, but the hidden layer is doing inscrutable things.
Good point. It cheats at every opportunity. So while for some features the network may possess genuine understanding, for others it's just taking advantage of various cues.
I'm afraid that copying data like that tends to encourage overfitting.
That's the magic of it, isn't it? Immense power, but a lack of interpretability.
EDIT: Fun (short) story. I was training a network to analyze computer programs earlier today, and the possible outputs were "error" or "none", depending on whether or not the input program contained an error. I gave it a very tricky program that looked like it had an error but in fact had none, and the output the network gave was "erone". As in, it was caught between the two decisions and decided to make up a third category. How clever of it.
I find that terribly entertaining. Kind of like Wall-E and the spork.
And this is why neural networks on missiles would be good/bad/entertaining: https://plus.google.com/u/0/118230849166467604082/posts/S7B1FCJUaMB
Actually, that's a thing now. Intelligent missiles. Intelligent bullets. Not to mention the numerous war automata being developed for land, air, and sea.
Everyone's so focused right now on whether automation will steal their jobs that they're overlooking the rapid (and successful) development of autonomous killing machines. It's completely off most people's radar. Not the world governments and their militaries, of course. There's an arms race afoot.
It does make me question sometimes whether the work I've done for the US government may have inadvertently enabled them. I mean, my work is peaceful in nature, I try to make software more reliable. But more reliable software includes more reliable killing machines, so there's that. During my summer at the labs, I once overheard two scientists talking about their research, and one of them said to the other, "You see, that's our problem. The [Department of Defense] used to be satisfied with megadeaths, but now they want gigadeaths."
I pretended not to hear them, but I thought to myself, "Why?". Seriously, any weapon that could take out a China-sized population in one go would probably permanently cripple the planet and leave the survivors envious of the dead, so I didn't really see the point. A few days later, outside of the compound's defensive perimeter, there were anti-nuclear-war protesters who were stretched out along the main road, having outlined their own bodies with chalk, impeding the flow of traffic. I used to think that the people on the outside were overreacting, but after hearing what the people on the inside were saying, I'm much more sympathetic to their concerns.
After that, I started to question why it was that people like me were getting funding from the directorate. They say they support "science and technology in the national interest", but what do they really want? What is the end goal?
Don't get me wrong, I enjoyed my work there immensely, and I got to see some of the most advanced technology known to mankind (that was very cool), but I have no plans to ever return.
EDIT: To those of you who do work in such places, don't take this as me chastising any of you. I'm sure your work is all well and good, and the pay is amazing. It's just that I don't want to have my Wikipedia page years from now say "Talcos was instrumental in the development of murderbots, which overtook malaria as the leading cause of infant mortality in West Africa in the early 2030s." Not my cup of tea.
---
But anyway, Magic cards!
It seems like an easier plan would be to up the learning rate slightly, so that the network learns more from each card, rather than duplicating the cards (which I'd imagine would achieve a similar effect). If you're not seeing any signs of overfitting, you might be safe to increase the learning rate ever-so-slightly.
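The equivalence is easy to check on a toy problem: for a sum-of-losses gradient, duplicating a training example k times gives exactly the same gradient as weighting that example's loss by k, which is what a per-example learning-rate bump amounts to. A sketch with linear least squares (the setup here is illustrative, not the card-training configuration):

```python
import numpy as np

# Duplicating an example vs. weighting its gradient: same full-batch step.
rng = np.random.default_rng(1)
X, y = rng.normal(size=(10, 3)), rng.normal(size=10)
w = np.zeros(3)

def grad(Xb, yb, weights):
    """Weighted gradient of the summed squared loss at w."""
    residual = Xb @ w - yb
    return Xb.T @ (weights * residual)

# Duplicate example 0 once (so it appears twice in the batch)...
X_dup = np.vstack([X, X[:1]])
y_dup = np.concatenate([y, y[:1]])
g_dup = grad(X_dup, y_dup, np.ones(11))

# ...versus weighting example 0 by 2 in the original batch.
weights = np.ones(10)
weights[0] = 2.0
g_weighted = grad(X, y, weights)

print(np.allclose(g_dup, g_weighted))  # True
```

A global learning-rate increase scales every example's contribution, not just the duplicated cards', so it's only a rough substitute; but it avoids the overfitting pressure that comes from the network seeing literal copies.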
It's probably easier to characterize, yes. But I'll warn you that there's a huge gap in our theoretical knowledge when it comes to describing these sorts of systems outside of what we can learn experimentally, on a case-by-case basis. A lot of the answers I can give you are based mostly on my intuition and my hands-on experience rather than the carefully crafted theoretical insights I'm accustomed to providing.
But your results are very interesting. I'm wondering what the upper bound is on what we can achieve with the tech we have, and how much farther we could get with some better, more tailored approaches. I'll be doing some experiments with Neural Turing Machines soon, to see how they compare with LSTM networks when it comes to problem solving tasks, and if it looks like it can hold on to long-term dependencies better, I might see how well they handle Magic cards.