And another victory. That's 3-0 for AlphaGo; two more rounds to go, but that does mean that AlphaGo is officially the winner of the challenge. The folks at DeepMind say their next great challenge is to make an AI that can play Starcraft competitively. My only question is how far down the line Magic: The Gathering is. Unlike chess, Go, or Starcraft, Magic's game and metagame are both continuously evolving. I think that solving Magic in a meaningful way will require good transfer learning. But we'll cross that bridge when we get to it.
Anyway, I ran into a run-time memory error about five epochs into the training yesterday. Since it worked for the first five epochs, it probably has something to do with the code not cleaning up after itself. Or something like that. I'll get it figured out.
And I almost have everything working for neural-doodle. Just a small compatibility issue. I'll let y'all know how that goes.
EDIT: I think I have doodle working. Doing some test runs to see how the images turn out.
My only question is how far down the line Magic: The Gathering is. Unlike chess, Go, or Starcraft, Magic's game and metagame are both continuously evolving. I think that solving Magic in a meaningful way will require good transfer learning. But we'll cross that bridge when we get to it.
Yeah. The ruleset for Magic is much more complex and fuzzier, game states are more variable, and the value evaluation for each action is probably a few orders of magnitude more complex than in Go because of all that. And that's to say nothing of the fact that Magic is a game of incomplete information with a component of randomness, which makes things even more complex.
I think we've got a ways to go, still.
On the left is a painting by Monet, and on the right is a color-coded segmentation of that image into its constituent pieces. This is what we call semantic segmentation. This is actually a common step in image processing when you want to figure out what's what in an image. There are automated ways of doing this, though systems like that are usually trained to work with photographs of the real world rather than paintings.
Next, I found a picture of some disembodied hands online. I decided to have some fun with them. I labeled them with colors corresponding to those in the semantic segmentation of the Monet image.
Then I applied the semantic style transfer. This is like neural style transfer (and neural-doodle can be used like the neural-style package), but now we can opt to direct the layering of the style onto the content. For this, I discarded the content altogether, but you get the idea. It extracts patches from the original Monet and reshapes them as needed to create new art that follows the direction that I want.
What I've given is a very unambitious example. There's much more that could be done with this sort of thing; I need to do more testing with this. What's most interesting is the possibility of using automatic semantic segmentation to guide style transfer.
And I'm looking into fixing that training issue that I had with the card image generation. I'll get that fixed though.
P.S. I'm not sure what an "imprion" is.
Could there be a way to set up a neural network to play a game of Magic? And/or build its own deck?
Oh absolutely. Now, to get a bot that can play well, that's another story, and it depends heavily on how we approach the problem and the data that we have. Same goes for deck-building. In many ways, these are highly related topics, but I'll focus on just the gameplay aspect for now.
As we've seen with bots like AlphaGo and Giraffe, it's possible to integrate a neural-network-style approach into a conventional framework. You can have a "game tree" of sorts, where player A makes a move, player B responds to A, A responds to B responding to A, and so on, branching out into all the different possible realities (taking into account randomness and unknown information). A bot looks at each state and says "what is the value of getting to that game state?", and then tries to steer the game towards the path that is most likely to lead to victory. For this, you need an evaluation function, a way of comparing game states, and oftentimes, these are hard-coded.
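As a toy illustration of that idea (with a hypothetical GameState and evaluate function; a real engine searches much deeper and has to handle chance and hidden information), the core loop is something like this:

# A one-ply lookahead sketch. GameState, legal_actions, apply, and evaluate
# are hypothetical stand-ins, not any particular engine's API.
def choose_action(state, evaluate):
    best_action, best_value = None, float("-inf")
    for action in state.legal_actions():
        value = evaluate(state.apply(action))  # "what is the value of getting to that game state?"
        if value > best_value:
            best_action, best_value = action, value
    return best_action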
For example, if you go into the source code for Forge, you'll find all kinds of evaluation rules that the AI uses. For example:
public static int evaluateCreature(final Card c) {
    // power and toughness hold the creature's current stats;
    // their definitions are elided from this excerpt.
    int value = 100;
    if (c.isToken()) {
        value = 80; // tokens should be worth less than actual cards
    }
    /* ... */
    value += power * 15;
    value += toughness * 10;
    value += c.getCMC() * 5;
    // Evasion keywords
    if (c.hasKeyword("Flying")) {
        value += power * 10;
    }
    /* ... */
    if (c.hasKeyword("Unblockable")) {
        value += power * 10;
    }
    /* ... */
    return value;
}
So let's say that the AI has a Doom Blade in hand that it wants to cast, and it sees the opponent has a Storm Crow and a token that is a copy of Tidal Kraken. The bot would like to take out the biggest, most pressing threat.
According to Forge's creature evaluation function, Storm Crow is worth 155 points, and the Tidal Kraken token is worth 270 points. Killing the token would maximize the opponent's losses and minimize the bot's, so the bot kills the Kraken.
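(For what it's worth, the Crow's 155 falls right out of the snippet above: 100 base + 1×15 power + 2×10 toughness + 2×5 CMC + 1×10 for Flying = 155. The token's 270 depends on terms elided from the excerpt, so don't expect to reproduce it from just what's shown here.)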
In this case, the choice was very clear cut (though some experts might argue that the Storm Crow was inherently more threatening than the Tidal Kraken), but there are limitations to this kind of approach. First, it's not a very empirical approach; we just arbitrarily decided that each point of power was worth 15 points and each point of toughness was worth 10. Second, hand-crafted solutions tend to be very brittle and render the AI unable to respond well to unforeseen situations.
For example, if the AI has an unanswerable Moat on the board, that reduces the Kraken's value to 176 points (I counted), which is still greater than the Crow's. Now, it's possible that there are guards in the code that I didn't see that would prevent the bad decision from being made, but that adds more layers of complexity, and more opportunities for failure.
One way of incorporating machine learning into this process is to replace parts of the evaluation function with learned models (e.g. neural networks). What's nice about these sorts of systems is that their responses have a stronger empirical basis and, in the case of neural networks, can be very flexible in the face of totally unforeseen situations. Of course, adding in magical black boxes can create other issues, like a lack of interpretability. A common example I bring up is when you have a visual question answering system like [url=http://cloudcv.org/vqa/]this one[/url], operating on [url=http://cloudcv.org/app/media/pictures/vqaDemo/COCO_test2014_0000004518761453763791_36.jpg]this image[/url]. We ask the system the following questions:
Q: "What game is being played?" A: Tennis (confidence 99%)
Q: "What color is the man's shirt?" A: White (confidence 25%, blue in second place with confidence 18%)
Q: "What is the sex of the person?" A: Female (confidence 65%)
Q: "Is his wife cheating on him?" A: No (confidence 96%)
In the first two cases, we get back answers that are correct, and that we can verify using the image. We can even understand why the bot thought the shirt might be blue. In the third case, the bot makes a mistake, but we can forgive it because humans are one of the least sexually dimorphic species on the planet.
But why is the bot so sure that the tennis player's wife is faithful to him? He's always away at tournaments, all he thinks about is improving his game and not his relationship, and she feels oppressed always standing in the shadow of his success. We don't know why the bot is so convinced that she'd be happy in such a terrible relationship. On the bright side, at least it was able to respond with an answer to a question that it was never asked before. That shows flexibility... even if sometimes we don't get interpretability.
---
I need to go back and finish fixing that memory issue with the image generator training process, but it might have to wait. I have a 10 AM meeting tomorrow with someone who has come to give a lecture, a 1:30 PM meeting with an economist interested in doing population projections for our state, and a 3 PM meeting with my team. I have some preparations to make. Oh, and I just got word that another one of my papers was accepted for publication in an international journal, so I have to go celebrate.
But I promise I will return to it! And I'll release all the code needed to do the image generation on your own (including the most recent trained model).
P.S. I hear that AlphaGo lost a round against its human opponent. Interesting!
maplesmall - I second psycrow11's suggestion to add the flip text, maybe with something like // in between the two sides' text. It would also be helpful if the image of the card expanded larger on hovering over or clicking the image itself. Also, also, not sure where you are on link-throughs to TCG, but Abbey Gargoyles didn't work when I clicked on it. Great work though!
AlphaGo's loss in round 4 is really interesting, as it illuminates the differences in what it means to play and understand a game for a human and for a computer, even one backed by a deep NN. (I'm relying heavily on what the pro commentators were saying, especially on the American Go Association channel, so credit where it's due and errors are all mine.)
During game 3, and especially after it was looking lost, Lee Sedol spent a lot of time testing AlphaGo's responses, especially how it handles situations called ko: a no-infinite-loops rule that, in practice, means that you have to make a bigger threat somewhere else on the board before being able to take back the capture of a single stone. Computer tree search methods often have trouble with ko because it effectively doubles the number of moves required to look ahead, and because moves that are ordinarily not worthwhile can be valuable as ko threats. In game 4, as I understand the commentary, Lee Sedol then created a situation in the upper center part of the board in which the better sequences for AlphaGo all involved creating a ko, and it looks like AlphaGo was unable to see far enough down through the ko to have confidence about the result and instead played to avoid it. Once AlphaGo realized that it was behind, it started playing some baldly bad moves -- like "this only works if my opponent doesn't make the obvious answer" moves. This, too, is a characteristic failure mode for Go AI: when it doesn't have statistical confidence in any particular move, then all moves that don't immediately lose the game are equal.
If this were a real match between pro players, they would have spent time leading up to it studying each other's styles, and so in a sense AlphaGo was cheating: it has studied Lee Sedol's games as part of its training set, but there were no game records available for Lee Sedol to study. The fact that he was able to dissect the play style in just three games is an amazing feat in itself and the kind of thing the AlphaGo team said they didn't even attempt to implement.
To bring this back to Magic, what it means is that the state of the art represented in AlphaGo would be able to tune the parameters that calculate the worth of a particular game state and even discover deep calculations that human players don't immediately see, but a human player would have the opportunity to probe the AI to find out what it prefers and adjust to match. In a game with such a deep appreciation for the metagame, I don't think an AI is going to be capable of sustained competitive play until there's been further revolutions in AI techniques of comparable magnitude to what brought us from Deep Blue to AlphaGo.
AlphaGo's loss in round 4 is really interesting, as it illuminates the differences in what it means to play and understand a game for a human and for a computer, even one backed by a deep NN. (I'm relying heavily on what the pro commentators were saying, especially on the American Go Association channel, so credit where it's due and errors are all mine.)
Thank you for summarizing the commentary! I was too busy to follow the games very closely; what you've shared is very informative for me.
To bring this back to Magic, what it means is that the state of the art represented in AlphaGo would be able to tune the parameters that calculate the worth of a particular game state and even discover deep calculations that human players don't immediately see, but a human player would have the opportunity to probe the AI to find out what it prefers and adjust to match. In a game with such a deep appreciation for the metagame, I don't think an AI is going to be capable of sustained competitive play until there's been further revolutions in AI techniques of comparable magnitude to what brought us from Deep Blue to AlphaGo.
I think that's why Demis Hassabis of DeepMind said that their next target could be something like Starcraft. Mind you, beating the Koreans at all of their favorite games isn't the mission of DeepMind, but a game like Starcraft would be an interesting testbed for metagame analysis. And even if DeepMind doesn't pursue that right away, I assure you that others are already thinking along similar lines, just as Facebook and Google's DeepMind have been concurrently working on mastering Go.
I think there are two parts to that analysis. The first is being able to identify a metagame, and the other part is being able to independently come up with possible metagames. These two reinforce each other, of course - generative models and predictive models being two sides of the same coin. I'm convinced that if we can overcome the challenge of metagame analysis for a game with fixed elements like Starcraft, we can do it for a game with evolving elements like Magic.
---
I'm in the process of figuring out how to organize all the image-generating scripts for release. I'd like to set everything up so you can just install the necessary packages, and play with it right away. Some things could be streamlined. I'll see about having all that ready in a day or two.
The best place to start would likely be with drafting. It's a much easier task to train for because the decisions involved are always the same, while an AI made to play the game would have to do several different tasks. It's also a much smaller dataset, as you only need to record the collector's number of each card picked, the order they're picked in, and the player picking - everything else can be extrapolated from that. The format would look like this (a small parsing sketch follows the example):
:player: :collector's number:
And be stored in chronological order.
1 83
2 26
...
8 153
And then player one's next pick.
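Reading such a log back into per-player pick lists would be as simple as something like this (a minimal sketch; "picks.txt" is a hypothetical file name):

# Parse a draft log of "player collector_number" lines into picks per player.
from collections import defaultdict

def load_picks(path):
    picks = defaultdict(list)  # player id -> collector's numbers, in pick order
    with open(path) as f:
        for line in f:
            player, card = line.split()
            picks[int(player)].append(int(card))
    return picks

picks = load_picks("picks.txt")  # e.g. picks[1] == [83, ...]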
We have plenty of data available in the form of completed decks and their records that show what a good or bad deck looks like, so we can use a competitive approach with eight "player" networks that draft, and then a discriminator that compares them to player-drafted decks to determine a "winner."
That's... really interesting. With smaller (2- or 3-player) drafts, you'd even have the opportunity to encourage the AI to model the opponent's intentions based on what cards were taken.
EDIT: Forgive my ignorance here, but is there a fairly large set of results of draft tournaments that could be used as training data? There's no absolute measure of quality of a drafted deck; you'd need to score it in pairings against the other drafted decks (via an NN that compares the decklists, as you describe).
But we want the AI to be capable of making choices that humans might not. So training merely on the collector's numbers would have the AI seeing one player pick green and red cards, while another player picks green and blue. If the AI just looked at the collector's numbers, it would be unlikely to realize that three (or four) colours is probably not as smart as two, and so pick the red and blue cards. Especially since it can't even tell whether a card is red, green, or red/green.
Forgive my ignorance, but when I follow the install tutorial I get to this point: "luarocks install nngraph"
And hit this snag: "Your user does not have write permissions in[...]"
Halp?
EDIT: I tried "sudo chmod -R ugo+rw /usr/local" and now I just receive "Error no results matching query" when I try "luarocks install nngraph"
EDIT 2:
Went through this workaround, no idea what it does, just found it on google (after much, much searching):
Strange. Well, at least you found a workaround. I'm not sure why it wasn't working for you; the permissions problem I understand, but not the "no results matching query" problem.
and replace the X's with the numbers in the name of a checkpoint. But what numbers are those? The one after epoch? But there's only one number after epoch and two sets of X's. :S
The numbers themselves aren't especially important; you just need to feed in a checkpoint file, preferably the most recent one, whichever that is. In the filename, the number after "epoch" is the epoch, and the second number is the validation loss at that point.
What would decreasing the dropout to 45% do? Just make it more like the existing ones, or totally like the existing ones? And how would one do this?
Dropout is a regularization technique whereby we randomly disable a certain percentage of neurons at each time step during the training. The idea is that it prevents excessive co-adaptation of neurons that is symptomatic of overfitting.
As a purely hypothetical example, imagine that you had a neuron (or group of neurons) that received signals from a bunch of other neurons in previous layers and fires if it is expected that a creature card will have a mana-producing ability, based on the text preceding the body. The majority of cards we train on that match that description are green, and many of those are elves, and many are druids. During training, when we ask it to predict the body text of Llanowar Elves, Joraga Treespeaker, Leaf Gilder, etc. it receives the signals
[creature, green, elf, druid]
and the outcome is "makes mana". At first, it makes mistakes, but over time, through backpropagation, it connects the dots and weights the incoming connections such that it will fire when it sees an "elf druid creature that is green". But not all elf druids make mana, and plenty of creatures that make mana aren't elf druids. If it learns overly specific rules that give it the best performance on the training set, it's less likely to do well on the test set and, in general, less likely to give us interesting cards. What dropout does is randomly disable some neurons and their connections, so instead of seeing all those inputs, it sees inputs like
[creature, green, elf]
[green, elf, druid]
[creature, elf, druid]
[elf, creature]
[creature, green]
[druid]
and in doing so, we're forcing the network to separate out the factors that cause creatures to make mana. Green cards can make mana, creatures can make mana, elves can make mana, druids can make mana, etc. In effect, a network with dropout of 25% acts like an optimal combination of "thinner" networks that are 75% of its size.
Obviously there are limits. If the dropout is greater than 50%, then the network can't count on any neuron being reliable, and this can hurt performance.
Dropout is set using the "-dropout" parameter when you call the training script. If you call the training script with "-dropout 0.25", you get 25% dropout, and so forth.
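If it helps to see it concretely, here's a minimal numpy sketch of (inverted) dropout applied to a layer's inputs; this is just an illustration, not the actual Torch internals:

import numpy as np

def dropout(activations, rate, training=True):
    # During training, zero out roughly `rate` of the activations at random,
    # and rescale the survivors so the expected sum is unchanged at test time.
    if not training or rate == 0.0:
        return activations
    keep_prob = 1.0 - rate
    mask = np.random.rand(*activations.shape) < keep_prob
    return activations * mask / keep_prob

signals = np.array([1.0, 1.0, 1.0, 1.0])  # e.g. [creature, green, elf, druid]
print(dropout(signals, rate=0.25))        # a random ~25% of the inputs vanish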
But we want the AI to be capable of making choices that humans might not. So training merely on the collector's numbers would have the AI seeing one player pick green and red cards, while another player picks green and blue. If the AI just looked at the collector's numbers, it would be unlikely to realize that three (or four) colours is probably not as smart as two, and so pick the red and blue cards. Especially since it can't even tell whether a card is red, green, or red/green.
The AI will have access to the full set and be able to "see" that info.
Hit my first checkpoint!
...
And another problem.
/home/user/torch/install/bin/luajit: cannot open <cv/custom_format-256//lm_lstm_epoch12.61_0.3874.t7> in mode w at /home/user/torch/pkg/torch/lib/TH/THDiskFile.c:640
Whatever that means.
stack traceback:
[C]: at 0x7f48b4c00e80
[C]: in function 'DiskFile'
/home/user/torch/install/share/lua/5.1/torch/File.lua:385: in function 'save'
train.lua:363: in main chunk
[C]: in function 'dofile'
...aker/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405d70
Do those directories exist? The cv subdirectory, followed by the custom_format-256 directory inside it? If not, you could create them with "mkdir -p cv/custom_format-256". That might be why you are having an issue.
---
Earlier in the week, I said I'd release a Magic-art-image-generation model and script for playing with it. Then I got hit by an avalanche of work. On the bright side, it's Spring break next week, so that frees up my schedule somewhat. I'm going to look into retraining the system to run with tags (images are passed to the network along with tags like "red" or "goblin"; that way, the tags give us a gentler way of controlling the output image than operating on the latent space directly).
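Roughly speaking, the conditioning just means the tags ride along with the generator's input. A minimal sketch (hypothetical tag vocabulary and shapes, not the actual model):

import numpy as np

TAGS = ["red", "goblin", "mountain"]  # hypothetical vocabulary

def generator_input(latent_dim, active_tags):
    # Concatenate a random latent vector with a multi-hot tag vector;
    # the generator then learns to respect the tags it is fed.
    z = np.random.randn(latent_dim)
    tag_vec = np.array([1.0 if t in active_tags else 0.0 for t in TAGS])
    return np.concatenate([z, tag_vec])

x = generator_input(latent_dim=100, active_tags={"red", "goblin"})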
Oh, and, as usual, the techniques that I've been using are already obsolete (probably). I just read a fascinating paper by Wang and Gupta entitled "Generative Image Modeling using Style and Structure Adversarial Networks". The novel aspect of their work is that they split the generator network into two parts. The first is a structure generator that creates the geometry of the image (surface normals). Then that geometry is passed to the style generator, which handles textures, lighting, appearance, etc.
Right now, with the setup we've got, style and structure are intermixed. That's why when I change the lighting of a scene, the geometry warps and shifts. Wang and Gupta are able to control these two aspects independently.
For the training data, you'd have to have surface normals for the artwork, but that's actually quite feasible. There are trained models out there for predicting the geometry of 2D images (example). Those systems are usually trained on photographs, but I suspect that they could work decently on artwork. Murk Dwellers no, Rafiq of the Many yes.
Not that I'll mess with that at the moment, but it's nice to know that these options exist.
EDIT: Oh, almost forgot! A team from Google made yet another interesting breakthrough in a paper entitled "One-Shot Generalization in Deep Generative Models". One-shot learning is where you have one or two images of a thing and you then have to arrive at an understanding of what that thing is (or in this case, be able to generate new and interesting versions of it). It's sort of a "holy grail" in computer vision. They're one step closer to achieving that. I've attached an example where the system is presented with a strange alphabet that it has never been trained on, and has to learn to write letters in that alphabet from just those single examples. They did another similar test with faces. Just a few faces, make new faces, that sort of thing. And it does decently, actually. Not bad. There's a lot more that needs to be done in this area, but the pay-off could be great. An example application would be to show a picture of an apple to a robot that has never seen apples before and say "This is an apple. Go find me more of these things. Bring them back to me. Kthxbai."
Since you're pasting a lot of output, you should run ./decode.py with the -f flag on your output file to make it have forum tags, so that it's real easy to see what you've pasted.
When I run training, does it start afresh, or build off of what it's already learnt?
It can do either. You can either use an existing checkpoint as a starting point, or you can start all over. To start training from an existing checkpoint, use the "-init_from" parameter followed by the path to the checkpoint that you want to use as a starting point.
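For example, picking up from the checkpoint filename that appeared earlier in the thread (substitute whichever of your checkpoints is most recent, plus whatever other flags you normally train with):

th train.lua -init_from cv/custom_format-256/lm_lstm_epoch12.61_0.3874.t7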
|5creature|4|6human berserker|7|8&^^/&^|9{WW}, sacrifice @: target creature gets -&^/-&^ until end of turn.\{^^^RR}: put a % counter on @.\countertype % fide|3{^^RR}|0O|1gorgon camph|
The % represents a counter type, which is defined elsewhere in the card. In the case of the card you generated, it should be read as...
Gorgon Camph
2R
Creature - Human Berserker (Common)
W, Sacrifice Gorgon Camph: target creature gets -1/-1 until end of turn.
3R: Put a Fide counter on Gorgon Camph.
2/1
How delightfully bizarre.
Interesting. I wonder what ongoing training at varied dropout values would do.
One way to find out I guess. (or two if you consider asking someone, but where's the fun in that?)
Another thing to test is to vary the way the representation is organized. The networks I train use a fixed order for all the fields like these:
|undead lior||creature||homunculus|N|&^^/&^|{GU^}|{^^UU}, T: put target creature on top of its owner's library.|
|orochi agent||creature||snake archer|N|&^^/&^^^|{^^GG^}|reach \flash\{WW}: put a +&^/+&^ counter on @.|
|fleetfire elemental||creature||elemental|O|&^^^^/&^|{^^RR^}|haste\{^BB^}: @ gets +&^^/+&^^ until end of turn.|
|master of the master||creature||human knight|A|&^^/&^^|{^WW^^}|first strike, protection from white and from black\as long as @ is on the stack, each creature has "{^WW}, T: @ deals &^ damage to target creature or player."|
You can control how the input is organized by passing different parameters to the mtgencode program.
By the way, Master of the Master is... a very unusual card.
Master of the Master
3W
Creature - Human Knight (Rare)
First strike, protection from white and from black.
As long as Master of the Master is on the stack, each creature has "1W, T: This creature deals 1 damage to target creature or player."
2/2
Are there any cards that resemble it?
Anyway, on the subject of dropout, note that overfitting (where the network copies the cards it studies too closely) can come in many forms. For example, this is what I call a pseudo-clone:
Mesa Enchantress
GW
Creature - Human Druid (Rare)
Whenever an enchantment is put into a graveyard from the battlefield, you may put a +1/+1 counter on Mesa Enchantress.
1/2
It has Mesa Enchantress's name and Femeref Enchantress's cost, body, and trigger clause. Just something to look out for when you are comparing results taken from networks trained under different conditions.
I'm interested in modifying this to help me generate random monsters for D&D (5e), or at least generate new special abilities, as I'm already able to generate very good base stat blocks. Is the code locked into the given json format and fields, or could it be tweaked for any input/output? If so, where would the changes need to be made?
First off, I really enjoy the random monsters you can already create. That's some pretty nifty programming right there.
Second off, I'm not the resident expert here, but the key thing for good neural networks is training, training, training. You need a good base data set to even consider trying to train a network on it. For example, the ~16,000 Magic cards in existence right now are considered a 'small' data set. I tried making a Haiku-generating network and I scraped together a few tens of thousands of Haikus, and that didn't give the best results (though certainly it produced some real gems). So to make monsters, you'd have to train it... on a lot of monsters. I know there have been plenty of monster manuals out there, but they probably don't even come to more than 2000 creatures total. Not to mention through the editions of D&D there have been many reprints of monsters. And older editions had different templating which would have to be taken into account (4e to 5e alone would be a nightmare).
So, short answer... no, this code isn't locked into anything. But getting the training data and preparing it is the onerous part. The neural network parameters could probably be tweaked to adjust for longer blocks of text (the average magic card is a few lines, average monster block is considerably more) but that's not tricky. Tricky is getting existing monsters from whatever source you choose and sanitizing/preparing that data for input (to the point where for magic cards we have the stupendous mtgencode library made by hardcast_sixdrop to do that for us, and that's from ONE source only).
However, since you can already generate monsters en masse, if you're happy with those being the sort of thing the neural network generates, you can just use your current program's output as training fodder for the network. Bear in mind, lack of diversity in training will make it harder for the network to come up with off-the-wall monsters by itself (unless you increase the 'temperature' of the sampling, which increases randomness and creativity but also increases the chances of mangled text like 'high priest of aoutgarigmpaswemkfpa').
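For the curious, "temperature" is just a rescaling of the network's output distribution before sampling. A minimal sketch (numpy, with hypothetical scores):

import numpy as np

def sample_with_temperature(logits, temperature=1.0):
    # Low temperature sharpens the distribution (safe, repetitive output);
    # high temperature flattens it (creative output, more mangled text).
    scaled = np.asarray(logits) / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three next characters
sample_with_temperature(logits, temperature=0.5)  # conservative
sample_with_temperature(logits, temperature=1.5)  # adventurous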
Hope that helps, and I'll wait at this point for one of the actual experts to chime in and give their opinion, because neural-network generated D&D monsters would be absolutely fantastic.
I'm interested in modifying this to help me generate random monsters for D&D (5e), or at least generate new special abilities, as I'm already able to generate very good base stat blocks. Is the code locked into the given json format and fields, or could it be tweaked for any input/output? If so, where would the changes need to be made?
Took me longer to reply than I had intended, but hello and welcome! I had a look at your source code, and I'm most impressed! I admire deeply nested procedural generators; I can tell there's a lot of attention paid to detail.
I concur with everything that our resident expert (he's modest) maplesmall has told you. There are challenges with using character-level generative models, and it's not always the best fit for every situation. At the very least, I'm not sure that I'd recommend it as an end-to-end solution for you. However, there are plenty of ways that you could incorporate ML tools and techniques into your generation process (piece-wise).
For example, a lot of the stats can be drawn from learned distributions if needed. That being said, you've spent a lot of time and energy calibrating your stat-purchasing model, and I see no reason to throw that out if it works well for you.
Abilities could be generated in a way similar to what we do, though I might recommend a word-level model rather than our character-level one. Or perhaps even some kind of clause-level model, so it'd be like generating an abstract syntax tree of sorts... but I don't know whether there are any good implementations of that sort available online.
Now, description text: that's something you're going to be hard-pressed to do in a purely procedural way (unless you want it to come off as very artificial sounding). A generative architecture like ours could churn out monster description text just fine (and you have plenty of data for that), but it wouldn't deliver what you want, because the generated text would not be conditioned upon or keyed to the monsters; you'd get fascinating but arbitrary text attached to your monsters. Instead, you'd want to use a conditional neural language model, like the one used here. But instead of picture in -> text out, it'd be monster in -> text out. Same sort of idea. Of course, the technology for that is still maturing, so the results will probably be something of a mixed bag.
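One cheap way to approximate that conditioning with the character-level tools already discussed in this thread (my own suggestion, not the method from the linked work) is to prepend each monster's attributes to its description in the training data, mtgencode-style, so the model learns text given attributes:

# Hypothetical sketch: flatten a monster record into a conditioned training line.
def to_training_line(monster):
    header = "|".join([monster["name"], monster["type"], monster["cr"]])
    return header + "|" + monster["description"]

to_training_line({"name": "owlbear", "type": "monstrosity", "cr": "3",
                  "description": "a horrid cross of owl and bear..."})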
If you'd like, you can PM me or send me an email at rmmilewi (at) gmail (dot) com and we can talk more about this in detail. I can see about directing you towards the resources that you'd need.
By the way, while we're on the subject, Alec Radford just put out a lecture entitled "Deep Advances in Generative Modeling". It's a 40 minute presentation that covers virtually all the algorithms that we've been talking about in this thread and then some. If you skip ahead to 34:50, you can see some unpublished results where they condition an image generator on text.
-----
As for me, I've just about got the art-generation-conditioned-on-card-vectors thing coded, but I'm having to wait to run it. Right now I'm training a bunch of LSTM networks to do population forecasting for my state on behalf of some economists and census folks. But once that's done, I'll be sure to train a new image generation model. Once I get that working, I can see about writing some scripts that'll integrate the card and art generation processes.
EDIT: On a related note, I just got some new hardware in, so that may mean that I'll be able to run Magic experiments in parallel with others. That'll speed things up, lol.
Not me, that's for sure. I recognize some of the names though, like Grefenstette. It looks like the Google DeepMind folks found a way to improve upon our work. I had recommended the use of a bidirectional LSTM, but I've been too busy to get around to testing that. The use of code compression is novel, however. I'm happy that I was able to attract people with adequate funding/time to investigate this topic in more detail.
I'm gonna have to take a look at their implementation whenever that becomes available. I might e-mail them about that later. The numbers look really good.
Thank you for sharing this! This is most helpful.
EDIT(1): I sent an e-mail to the lead authors with my congratulations and inquiries.
EDIT(2): I had a lovely conversation with the folks at DeepMind. Their work has a lot of fun tricks that I could co-opt for our purposes. They're not sure about a timeline for releasing their source code though, as there's a bureaucratic process for all that. I'd have to re-implement it. Which is fine.
EDIT(3): By the way, when I get the art renders working at higher resolutions (and one way or another, I will eventually), I'm going to have so much fun with it. I've found that, when I choose to guide the art generation process with an example, if it doesn't recognize an object in the scene, it tends to replace it with something else that conforms to the geometry. But it's not an exact match, it has artistic freedom (e.g. adding in feet, tail feathers, a mouth, and a dot for an eye). I think this process closely resembles what surrealist André Breton called "pure psychic automatism".
Oh, and I also attached a sample output that I made for one Thraximundar_ of reddit. The art was created by interpolation/inspiration from the art of semantically similar cards. Bit hard to make out, but like I said, that's one of the things I'm working on. The flavor text I kinda cheated on simply because I produced ten of them and hand-picked the best one; I'll need to work out an automated system for that so I don't contaminate RoboRosewater's cards with my ideas, lol.
@Talcos, looking forward to awesome images in 44 hours
https://www.youtube.com/watch?v=qUAmTYHEyM8