I'll be getting back to training some NNs myself when I figure out how best to optimize... CPU through VM is proving to be too damn slow (I also wanna have 512 rnn-size 3-layer training happening in a few seconds), but GPU through VM is impossible (without upgrading my motherboard + cpu) and neural network software on Windows that uses CUDA seems to be a thing that does not exist. If I can't find any other solutions I'll have to dual-boot into Ubuntu so it accesses my GPU natively, but... I really don't like that option.
Alright, it's training. Estimated completion time in 23 hours.
I have no idea at all what will happen. The format looks pretty gnarly and isn't very human readable:
|4|7|1energy tap|3{UU}|6|5sorcery|9tap target untapped creature you control. if you do, add {XX} to your mana pool, where X is that creature's converted mana cost.|8
|5creature|3{RR}|9@ attacks each turn if able.\at the beginning of your upkeep, put a +&^/+&^ counter on @. then you may pay {XX}, where X is the number of +&^/+&^ counters on it. if you don't, tap @ and it deals X damage to you.|4|1primordial ooze|8&^/&^|6ooze|7
|3{^^WW}|6human wizard|5creature|9kicker {GG^} and/or {^^UU} \when @ enters the battlefield, if it was kicked with its {^GG} kicker, destroy target creature with flying.\when @ enters the battlefield, if it was kicked with its {^^UU} kicker, draw two cards.|1sunscape battlemage|8&^^/&^^|7|4
|6|1pyroblast|4|9[&^ = uncast target spell if it's blue. = destroy target permanent if it's blue.]|8|5instant|3{RR}|7
I could get rid of the leading '|' characters since the field separators indicate where fields start (I did get rid of the closing one), but keeping them gives the structure a bit more consistency.
My scripts for encoding have gotten much, much better - you should be able to generate this format using the code in my github by tweaking a few command line arguments, no manual hacking of the code necessary. I still have to work on the decoding side, and the documentation. All in good time.
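Roughly, the idea looks like this in Python (a toy sketch - the digit labels and field names here are illustrative, not the exact mapping in my scripts):

```python
import random

# Toy sketch of a labeled-field encoder in the spirit of the format above.
# The digit-to-field mapping and field names are made up for illustration.
LABELS = {'name': '1', 'supertypes': '2', 'cost': '3', 'loyalty': '4',
          'types': '5', 'subtypes': '6', 'rarity': '7', 'pt': '8', 'text': '9'}

def encode(card, shuffle=True):
    fields = list(LABELS)
    if shuffle:
        random.shuffle(fields)  # each copy gets a random field order
    # every field is emitted (possibly empty) with its label prefix,
    # separated by '|'; no closing '|' at the end
    return '|' + '|'.join(LABELS[f] + card.get(f, '') for f in fields)

out = encode({'name': 'pyroblast', 'types': 'instant', 'cost': '{RR}'},
             shuffle=False)
```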
EDIT:
Oh, a thought about Ubuntu. You could try making a live CD / live USB stick, picking the 'try ubuntu' option, installing CUDA and Torch, and training from there. I don't know if that would work or not. I've seen some options for live USBs to keep files / installed software on them, and you should have enough space on like an 8GB usb thumb drive. Of course the disk will be very slow, and I have no idea what Cuda would do if you tried to install it on a live USB... but it could be worth a try.
Man, been a few days since I was able to do an update. Long story short I've got set up with some photoshop templates and I'm currently working at converting a bunch of cards into proper printable proxies. Eventually I figure I'll set up a dropbox or something where other people can download them as well. Here are a few examples so far:
Two things: wowza. Sinder the Tempest is awesome, I'd totally play her. Did the neural net really generate something so great by itself? Also, is her casting cost meant to be one blue and one green? If so, that means the frame colours are off.
Which leads me to thing 2: for creating cards, you don't need to make custom photoshop templates, just download Magic Set Editor 2. It produces flawless cards.
Edit: @hardcast; what parameters are you training with, and what hardware?
Intel i7-3770K @ 4.4GHz
32 GB RAM @ 1866
GeForce GTX Titan, 6GB RAM, EVGA Superclocked Signature edition (it's the very original)
SSD and some 2TB drives in RAID 1 (don't do this on Ubuntu...)
Gigabyte Micro ATX mobo
The Most Hilariously Tiny Case You Ever Saw a Noctua NH-D14 in.
Dual boot Windows 8.1 / Ubuntu 14.04 LTS.
After spending the best part of two hours digging through this thread, I can safely say that this is one of the most awesomely interesting and hilarious projects here. My flatmate just got home and found me slumped deep in my armchair, deliriously giggling at some of your creations. My personal favorites were the crazy ones posted by ParadoxRegained. I nearly peed twice.
What are the minimum requirements to get this running to some extent?
For instance, I've got an old Acer Chromebook I could get Ubuntu running on, don't use it for much anymore (screen's too small for much work).
Underwhelming specs: Intel Celeron 2955U dual-core @ 1.40 GHz, 11.6" HD (1366 x 768, 16:9) display, Intel HD Graphics with shared memory, 2 GB DDR3L SDRAM, 16 GB SSD.
Why only 2 epochs for the entire training? And why those particular values for batch size and sequence length?
I'm only doing two epochs because the dataset has every card duplicated 100 times in random orders and with randomized field orders. So, it's actually more similar to doing 200 epochs.
Those particular batch and sequence lengths just barely fit in the Titan's RAM. In fact it looks like system variation caused it to run out of RAM and crash... Continuing from the last checkpoint, hopefully it doesn't happen again.
EDIT: oh, and with that chromebook you'll probably be most limited by the RAM... You could probably fit a size 128 network on there but not a whole lot larger.
Ahh I see. Does the dataset duplication have any advantages over just one set?
What exactly defines a batch and sequence length, too? I haven't been able to find good definitions of those. I assume sequence length should be roughly the average character count of a card?
Two things: wowza. Sinder the Tempest is awesome, I'd totally play her. Did the neural net really generate something so great by itself? Also, is her casting cost meant to be one blue and one green? If so, that means the frame colours are off.
Whoops you're right, I did totally get the frame colors off. I had looked at Theros block for an example, but I guess I should have looked for the dual-color ones in Nyx. I'll get on that.
All of the cards that I've posted earlier were generated by the RNN. With the art, I'm cleaning up some of the actual wordings to make them legal/more playable. Sinder has always been a legendary creature - god, but I did add "Indestructible" myself; for some reason my RNN doesn't like to make gods indestructible. All of those abilities were generated by the RNN, but the text was not as "legal". For example, I think the original text for her tap ability was something like "Return a creature you control to target player's hand" and the top part was "when a creature is returned from the battlefield to your hand, reveal the top card of your library, put that creature onto the battle".
'Diamond Herging' and 'Sci, the Dragonglass' are both 100% RNN generated cards; I only modified the casting cost because previously Sci was a 1-drop.
The RNN does come up with some really unique abilities all by itself, it just needs a bit of editing by a human being who understands how to properly word the text to be compliant with magic rulings.
Just to let you know, I uploaded a second version of my sampling script to Google Drive for use with hardcast_sixdrop's input format (assuming fixed order). You can download it here. I'll also put a link up on the original post. This sampling script is much like the first, except now you can set the loyalty cost of planeswalkers with the "-loyalty" option, and I split the "-bodytext" option into a "-bodytext_prepend" and a "-bodytext_append" option, so that you can both prepend and append text to the body. This is useful for cards with multiple, connected clauses. The append function works by having the program detect the closing "|" symbol indicating the end of the body text, then rolling back the state of the network to the previous timestep, then priming at that point. Examples:
options: -types "creature" -subtypes "elemental" -bodytext_prepend "when @ enters the battlefield" -bodytext_append "\\evoke "
Riddshup 4R
Creature - Elemental
When Riddshup enters the battlefield, return target instant or sorcery card from your graveyard to your hand.
Evoke 4U
4/4
#I can't promise that the evoke will be useful (considering in this case that it's the same price as the creature), but it's useful to aid in our understanding of what the network is learning about various mechanics.
options: -types "instant" -bodytext_prepend "kicker" -bodytext_append "if @ was kicked,"
Carrion Ward 2R
Instant
Kicker - sacrifice a creature
Target creature gets +2/+2. If Carrion Ward was kicked, that creature gets -2/-2 instead.
#Similarly, it's interesting to see where the network goes with abilities like kicker. I will warn you that there are far too many instants generated by the network that want to put +1/+1 counters on instants and sorceries when they (never) enter the battlefield. Some improvements could be made there.
options: -types "sorcery" -bodytext_append "\\hellbent ~"
Land Strike 3R
Sorcery
Land Strike deals 5 damage to target creature. It's controller discards a card.
Hellbent - If you have no cards in hand, you may pay 1. If you do, put a 1/1 white spirit creature token with flying onto the battlefield.
#Color discipline is still weak in the version of the network that I've been testing, though I haven't had a chance to look at the most recent ones (though I definitely will as soon as I have my GPU set up as I'd like).
I'm waiting on hardcast_sixdrop's most recent experiments to see the effects of his recent modifications to his input format, and we'll see how to move forward from there. The recent batch of GPU-trained networks have been turning out very well (better than the one that generated the cards you see above), and I'll have that GPU-to-CPU script perfected some time early next week, so others who do not have access to appropriate hardware can sample from those networks.
Ahh I see. Does the dataset duplication have any advantages over just one set?
Well, this is interesting. Let me give a quick explanation of how it works.
When the NN training code reads the data set, before doing anything else, it first partitions it into a training set and a validation set. These are totally separate: if you have data in the validation set, you'll never train on it. Ever. Because otherwise it wouldn't be validation data.
The idea behind duplicating the data is that we might want to teach the NN that things are allowed to vary: for example, when randomizing the mana costs we want it to learn that {^RR} and {RR^} are the same thing. So we put in multiple cards that are otherwise identical, but have randomized mana costs. Then it can learn to associate whatever other pattern is in each of those cards with each variation of the mana cost, and hopefully make the connection that the mana costs are equivalent despite the order.
The same idea applies much more to the new format, which allows any ordering of the fields but distinguishes them with label characters. There are about 40k possible orderings for 8 fields, but that's too much data, so I chose 100 at random and called that good enough. The NN will see many cards with the order mixed up but the fields the same, and hopefully learn that the order doesn't matter while the labels do.
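A sketch of how picking those orderings might look (field names are placeholders, and 8! = 40320, which is where the "about 40k" comes from):

```python
import random

# Pick n distinct random field orders out of the 8! = 40320 possibilities.
# Field names are placeholders, not the real encoder's labels.
FIELDS = ['name', 'cost', 'types', 'subtypes', 'pt', 'loyalty', 'rarity', 'text']

def random_orderings(n=100):
    seen = set()
    while len(seen) < n:
        # random.sample of the full list is a random permutation
        seen.add(tuple(random.sample(FIELDS, len(FIELDS))))
    return [list(o) for o in seen]

orders = random_orderings()
```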
But, but, but, this wreaks havoc on the validation process. Since we duplicated the data, the validation set (5% of the data by default) is actually (well, probably - it's random, so it could be anything) 5 field-randomized copies of the base dataset, not a subset of the base dataset that we avoided training on. So the validation loss is going to be really low, as observed. I believe it's lower than the training loss because dropout - and I use fairly heavy dropout - is applied during training but not during validation. Someone tell me what's actually going on if you know.
There are tradeoffs. On the one hand, we don't really care about validation numbers as long as we get good cards, so this approach is essentially the same as setting the validation set to nothing and just training on as much data as possible. The validation loss is still measuring something though - it tells us how good the network is at figuring out that the order doesn't matter. If it's a perfect student, the validation loss should go to 0. We just have to be careful not to use it as a measure of overfitting - in fact, low validation loss is a pretty good indicator of overfitting.
What exactly defines a batch and sequence length, too? I haven't been able to find good definitions of those. I assume sequence length should be roughly the average character count of a card?
seq_length is how many characters are in a 'batch,' and batch_size is how many batches it trains on in parallel. I don't know if batch_size affects the results of the training or just the speed. Basically what it does is chop the data up into batches of size batch_size * seq_length, and then use those as the units it reports per-batch stats like training loss for. I try to use a long seq length to prevent cards from being chopped up, but it doesn't make as big a difference as you might think.
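As a rough sketch of that chopping (simplified - the real Lua batcher differs in details):

```python
# Simplified sketch of the chopping described above: the data becomes
# chunks of batch_size * seq_length characters; any ragged tail is dropped.
def make_batches(text, batch_size, seq_length):
    chunk = batch_size * seq_length
    usable = len(text) - (len(text) % chunk)  # drop the ragged tail
    return [text[i:i + chunk] for i in range(0, usable, chunk)]

batches = make_batches('x' * 1000, batch_size=4, seq_length=50)  # 5 chunks of 200
```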
Understanding exactly what the batching code does is probably a valuable endeavor, as it would make the whole data duplication issue go away. What we want is for the batcher to understand that cards are delimited objects, break the data into cards after reading it, partition the card set to get a validation set, then do the duplication and randomization of costs / field orders dynamically as it creates batches of whole cards. I mean to look at this at some point, but I'm not a lua expert so it will take some time. And I really have no idea how the batching code works - it could require the batches to be precisely aligned to make the matrix math it's doing work, which would complicate things.
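What I have in mind would look something like this sketch (function and parameter names are hypothetical, not actual repo code):

```python
import random

# Sketch of the batcher I'd want: partition *cards* into training and
# validation sets first, then duplicate and randomize only the training
# cards. Names here are hypothetical, not the actual repo code.
def split_then_duplicate(raw, val_frac=0.05, copies=100, sep='\n\n'):
    cards = raw.split(sep)
    random.shuffle(cards)
    n_val = max(1, int(len(cards) * val_frac))
    val, train = cards[:n_val], cards[n_val:]
    train = [c for c in train for _ in range(copies)]  # duplicate AFTER splitting
    random.shuffle(train)
    return train, val
```

This way the validation loss would measure generalization to unseen cards again, instead of memorization of field-shuffled copies.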
First, Hi! My first post here and I've been a casual magic player off and on all the way back to 4th edition.
VirtualBox does support using hardware gpu. It requires installing some drivers into the guest os and flicking a switch in the virtual machine options. This should allow for cuda in the guest os. Installing linux to a usb stick, either as a live usb or a real install, is another option.
Any thought to packaging all of this up as a docker container? (Or whatever your favourite container format is.) It would make distribution a lot simpler as everything, including char-rnn and hardcast's scripts, is pre-installed. Heck, some checkpoints can be included as well. Docker has a Windows client as well; it is basically a Linux virtual machine, though I don't know what kind of virtual machine. A docker/whatever container also has some interesting cloud computing possibilities too *taps fingers together*.
It might also be possible to install Torch on Windows using cygwin, although a docker container would probably be more sane.
At the top of all my outputs I have the following text:
^[[0musing OpenCL on GPU 0...^[[0m
Using NVIDIA Corporation platform: NVIDIA CUDA
Using device: GeForce GTX 660
^[[0mcreating an LSTM...^[[0m
^[[0mmissing seed text, using uniform probability over first character^[[0m
^[[0m--------------------------^[[0m
lensary creature tokens onto the battlefield.|
Some debug output and the end of a card. Is this normal or is something odd happening on my end?
@hardcast_sixdrop, it looks like you are just doing string manipulation with your encoder/decoder. Have you thought about using a dict or object as an intermediary format? It would open up converting the output to all sorts of formats (forums, json, yaml, html, heaven forbid even xml). It could even be set up to recognize and properly handle old nn card formats. I have selfish reasons to request this too.
I'm thinking of writing a simple website, probably in flask and python, that can take nn-generated cards, render them as images (does MTGCardsmith have an API?), let the user select how many (including 0) of the cards they want, and render a pdf for the user to print out and make their cube/deck/pack war/fancy hat. I would also really like to use Talcos's idea and script to generate customized sets from a selection of brains, but that would be down the road. (Is it possible to use sample.lua over multiple brains at once? Say one brain developed tromple, another developed a new card type, etc.? My impression is NNs do not work that way?)
And some of my more humorous cards that I've generated from my own training (psst, how do I do click to show? I'll edit this post when I find out!). Looks like there is a new card type:
|shizo, long of junger||planemarker||creature||goblin scout|2/1|{1, R}|haste
whenever a player casts an artifact spell, you may pay {3}. if you control a red permanent you control. you gain 5 life for each zombion on the battlefield.|
# A new card type? a Planemarker
|sinscentuke charminfusion||sorcery||||{Blue}|each player skips his or her library in any order.|
|creet of hilled||creature||snion|1/1|{2 colorless and 1 black or blue}|whenever creet of hilled or another enchantment spell, creet of hilled deals 2 damage to you unless you pins into his or her graveyard.|
#Pin like a tack, pen, or wrestling?
|genomul pose||artifact||||{3 colorless}|genomul pose enters the battlefield tapped.\Tap: add {Blue} to your mana pool.|
#One of the more sane ones
|zameshot||enchantment||aura||{1,Blue}|enchant creature\whenever equipped creature dies, put two 1/1 white bird creature tokens with trample onto the battlefield. if {Black} was spent to cast it, copy target instant or sorcery spell this turn.|
# 1/1 birds with trample you say?
|shrike liege||creature||elf scout|0/4|{2, Black}|whenever a player casts a spell, look at the top of your library until you reveal it. exile one of those cards. if itn's gains haste, that player reveals three cards, then discards those cards.|
#Nonsense but I liked where it was going
|corrust dragon||creature||dragon|5/5|{5, R, R}|flying\at the beginning of each player's upkeep, if there [they?] are tied for least power, you gain life equal to its converted mana cost.|
# Another one where I liked where it was going
|robing faaries||creature||dragon|4/4|{3, W}|flying\
{R, R}, Tap: prevent the next 1 damage that would be dealt by target creature this turn.
{W}, Tap: put a 2/2 black bat creature token with defender until end of turn.|
# Dragon faaries in a bath robe...
|agorius blood||instant||||{1, Blue, Black}|spells you cast cost 1 less to cast if you target player.|
# I don't know my cards but I have not seen a mechanic like this before
|bioshif*** myr||artifact creature||construct|2/3|{^^^^}|impry {4, R, R} \when bioshif*** myr is turned face up, all lands you control gain flying until end of turn.|
# Yes, that is the f word. Flying lands seem to fit the name....
|phyrexian the goat||creature||giant warrior|6/6|{6, Green}|whenever phyrexian the goat blocks or becomes blocked, it gets +3/+3 until end of turn.|
|hellkite cathan||creature||giant|7/7|{6, G, G}|whenever hellkite cathan attacks, creatures without flying can't be blocked this turn.|
# I win!
|red spider||creature||sliver|2/2|{2, Blue}|all sliver creatures have defender.
{2, Blue}: target creature you control gets +1/+1 until end of turn.
draw a card.|
# Not a shabby anti sliver card if there is a card to transfer control to another player.
Do you have instructions as to how to get CUDA going inside a VirtualBox vm? I've searched the entire internet for such a thing but not found it. Well, I have, but that's for PCI Passthrough which I can't do because my cpu doesn't support VT-d.
Alternatively, how would I set up Ubuntu on a stick so that it could run CUDA as well?
Phyrexian goat is hilarious, by the way. Also, click to show uses the spoiler tags. Also also, that text at the top of the output is normal, I'm told.
First, Hi! My first post here and I've been a casual magic player off and on all the way back to 4th edition.
VirtualBox does support using hardware gpu. It requires installing some drivers into the guest os and flicking a switch in the virtual machine options. This should allow for cuda in the guest os. Installing linux to a usb stick, either as a live usb or a real install, is another option.
Any thought to packaging all of this up as a docker container? (Or whatever your favourite container format is.) It would make distribution a lot simpler as everything, including char-rnn and hardcast's scripts, is pre-installed. Heck, some checkpoints can be included as well. Docker has a Windows client as well; it is basically a Linux virtual machine, though I don't know what kind of virtual machine. A docker/whatever container also has some interesting cloud computing possibilities too *taps fingers together*.
Hi! Welcome!
Yes, some kind of docker container or other form of self-contained package is definitely worth looking into in the near future. We're still trying to get the network performing as we would like, and once we've figured more about that, I'll turn my attention to packaging everything up for ease of use. But if you want to get a head start on that, go right ahead, haha.
I'm thinking of writing a simple website, probably in flask and python, that can take nn-generated cards, render them as images (does MTGCardsmith have an API?), let the user select how many (including 0) of the cards they want, and render a pdf for the user to print out and make their cube/deck/pack war/fancy hat. I would also really like to use Talcos's idea and script to generate customized sets from a selection of brains, but that would be down the road. (Is it possible to use sample.lua over multiple brains at once? Say one brain developed tromple, another developed a new card type, etc.? My impression is NNs do not work that way?)
I'd eventually like a mapping function from content to reasonable art. That's actually well within the realm of possibility at this point. The art can either be fixed or generated on the fly. In the latter case, something something convolutional neural networks something something deepdream-esque iterative creation. But that's down the road for me.
And you can sample over as many specialized networks as you'd like, but highly specialized networks are prone to overfitting. That is, they do one thing very well, and only in very limited, repetitive ways. We've been developing specialized networks for the purpose of expanding the input corpus for a general purpose one (e.g. I made a post about a network that only generated planeswalkers a few days ago.)
Thinking about the X problem, it seems to me that we have two very different formats of X abilities:
The first is where we have an X in the cost. This might be long distance (X in the cost of a spell) or short distance (X in the cost of an ability). An example would be Fireball. However, this doesn't have to appear in the same line as the ability - Abandon Hope is an example of this.
The second is where we have an X value explained after the ability. An example would be Accelerated Mutation.
The problem here is that the RNN sometimes will put an X in, thinking of the second type, and then forget and use the first type instead, failing to define it. Additionally, there are cards with more than 2 X effects with the same X: Æther Tide. I'm thinking that the solution (if possible) would be to reword all of the second type of X abilities to read like the first type. For example, Accelerated Mutation would read:
Where X is the highest converted mana cost among creatures you control, target creature gets +X/+X until end of turn.
A little weird for us, but the RNN would always be confident of ending the card after it puts in a use for the X, but never before. We'd just want to flip those two phrases; Akroan Hoplite should still start with "Whenever Akroan Hoplite attacks, [...]".
This wouldn't solve Aspect of Wolf (At least not its errata'd text), but it should help for most cases, at least enough for it to use X abilities mostly correctly.
Another possibility would be to replace X that's used in costs by Y. This would have to be done manually and it would be easier to replace this type of X since you can search for it better.
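A sketch of what that rewording pass might look like (a hypothetical preprocessing step, not an existing script):

```python
import re

# Hypothetical preprocessing pass: move a trailing 'where X is ...'
# definition to the front of the sentence, so X is defined before use.
def front_load_x_clause(text):
    m = re.match(r"(?P<body>.*?),? where (?P<xdef>x is [^.]*)\.", text)
    if m:
        return f"where {m.group('xdef')}, {m.group('body')}."
    return text  # no trailing X definition; leave unchanged

reworded = front_load_x_clause(
    "target creature gets +x/+x until end of turn, "
    "where x is the highest converted mana cost among creatures you control."
)
```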
VirtualBox does support using hardware gpu. It requires installing some drivers into the guest os and flicking a switch in the virtual machine options. This should allow for cuda in the guest os. Installing linux to a usb stick, either as a live usb or a real install, is another option.
I was not aware of this, having tried very hard to set it up a few years ago and failing miserably. Can you provide a link to some instructions?
@hardcast_sixdrop, it looks like you are just doing string manipulation with your encoder/decoder. Have you thought about using a dict or object as an intermediary format? It would open up converting the output to all sorts of formats (forums, json, yaml, html, heaven forbid even xml). It could even be set up to recognize and properly handle old nn card formats. I have selfish reasons to request this too.
Currently the unscrambler is a total hack and about to be replaced. The encoder (and the future unscrambler) is intended to take either json or any of my encoded formats and produce an internal Card python object that knows everything about the card, and can be used for data analysis or smart reencoding. There is a lot of code hiding in lib/. It mostly works, but everything is still in active development. I'm planning to convert the unscrambler over tonight so there's a more or less fully featured toolchain. Adding the ability to output xml / whatever would be straightforward.
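The eventual shape is something like this (a hypothetical sketch, not the actual class in lib/):

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical shape of the intermediary object: parse once into a Card,
# then emit whatever output format you like. Not the actual lib/ class.
@dataclass
class Card:
    name: str = ''
    cost: str = ''
    types: str = ''
    text: str = ''

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    def to_forum(self) -> str:
        return f"{self.name} {self.cost}\n{self.types}\n{self.text}"

c = Card(name='Pyroblast', cost='{R}', types='Instant',
         text="Counter target spell if it's blue.")
```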
I'm planning to support a few human readable formats for copying onto the forum. If anybody has a recommendation for something more formal (like MSE formats or MTGCardSmith) that would be really cool too. Then we just need to link up another NN to do the art and we can make full visual spoilers automatically.
I looked into it some more and it looks like CUDA only works with VirtualBox on Linux hosts. However it does work in passthrough mode if one uses Hyper-V (Windows 8 Pro or Enterprise versions have it) instead. OpenCL might be a different story.
What's your source for this information? I'm desperate to get CUDA working in a VM, so if you know how to make that happen (or know someone who has) I'd like to know... though I was pretty sure I couldn't do it because of the VT-d instructions missing on my CPU. Regarding the chapter on hardware acceleration, I don't think that counts as being able to use CUDA drivers, because I installed the CUDA drivers on an old VM a few days back, and I had to make a fresh one since that toasted the old VM completely.
Cool. Installing torch on Windows is the right way to go anyway.
I did my own digging, and I couldn't find anything that claimed Hyper-V could do passthrough for GPUs. There are plenty of claims to the contrary. VirtualBox might be able to do it, but only from a Linux host, which completely defeats the purpose because then you could just install Torch natively. For good GPU virtualization it looks like you have to go to enterprise solutions like Citrix Xenserver + NVIDIA stuff, which I doubt is free.
Training is coming along for the reordered field dataset. I'll post some samples from it once I finish the card unscrambler.
Installing Torch was the easy part (since it was an installer). Turns out the training script wants two dependencies that are missing from the install, and I have no idea how to add them in. One is luafilesystem, which I've got installed on my 'other' luarocks install. The other is nngraph, which apparently does graph stuff. I wonder whether the training script would fail if I removed that dependency... it doesn't seem necessary, since we don't make graphs?
It looks like it's required. nngraph does computation-graph math for nn, which is probably needed for basic functionality; it doesn't actually draw graphs from what I am seeing.
I'll be getting back to training some NNs myself when I figure out how best to optimize... CPU through VM is proving to be too damn slow (I also wanna have 512 rnn-size 3-layer training happening in a few seconds), but GPU through VM is impossible (without upgrading my motherboard + cpu) and neural network software on Windows that uses CUDA seems to be a thing that does not exist. If I can't find any other solutions I'll have to dual-boot into Ubuntu so it accesses my GPU natively, but... I really don't like that option.
I have no idea at all what will happen. The format looks pretty gnarly and isn't very human readable:
I could get rid of the leading '|' characters since the field separators indicate where fields start (I did get rid of the closing one), but this gives the structure a bit more consistency.
My scripts for encoding have gotten much, much better - you should be able to generate this format using the code in my github by tweaking a few command line arguments, no manual hacking of the code necessary. I still have to work on the decoding side, and the documentation. All in good time.
EDIT:
Oh, a thought about Ubuntu. You could try making a live CD / live USB stick, picking the 'try ubuntu' option, installing CUDA and Torch, and training from there. I don't know if that would work or not. I've seen some options for live USBs to keep files / installed software on them, and you should have enough space on like an 8GB usb thumb drive. Of course the disk will be very slow, and I have no idea what Cuda would do if you tried to install it on a live USB... but it could be worth a try.
Which leads me to thing 2; for creating cards, you don't need to make custom photoshop templates, just download Magic Set Editor 2. It produces flawless cards.
Edit: @hardcast; what parameters are you training with, and what hardware?
32 GB RAM @ 1866
GeForce GTX Titan, 6GB RAM, EVGA Superclocked Signature edition (it's the very original)
SSD and some 2TB drives in RAID 1 (don't do this on Ubuntu...)
Gigabyte Micro ATX mobo
The Most Hilariously Tiny Case You Ever Saw a Noctua NH-D14 in.
Dual boot Windows 8.1 / Ubuntu 14.04 LTS.
Why only 2 epochs for the entire training? And why those particular values for batch size and sequence length?
Godspeed, you guys.
UR Mizzix of the Izmagnus ~~~ Build your own win-condition: Finite Spellslinging
UR Brudiclad, Telchor Engineer ~~~ We are the Borg. We will add your biological and technological distinctiveness to our own.
WUB Oloro, Ageless Ascetic ~~~ A Guide to dying slowly
UBR Marchesa, the Black Rose ~~~ Marchesa's undying Marionettes
RGW Mayael the Anima ~~~ All Hail the Big Chungus
GWU Chulane, Teller of Tales ~~~ Permanents Only ETB Shenanigans
BGU Sidisi, Brood Tyrant ~~~ Sidisi's Restless Servants
WUBRG The Ur-Dragon ~~~ Dragons eat your face
For instance, I've got an old Acer Chromebook I could get Ubuntu running on, don't use it for much anymore (screen's too small for much work).
Underwhelming Specs: Intel Celeron 2955U Dual-core 1.40 GHz 11.6" HD (1366 x 768) 16:9 Intel HD Graphics with Shared Memory 2 GB, DDR3L SDRAM 16 GB SSD
I'm only doing two epochs because the dataset has every card duplicated 100 times in random orders and with randomized field orders. So, it's actually more similar to doing 200 epochs.
Those particular batch and sequence lengths just barely fit in the Titan's RAM. In fact it looks system variation caused it to run out of RAM and crash... Continuing from last checkpoint, hopefully it doesn't happen again.
EDIT: oh, and with that chromebook you'll probably be most limited by the RAM... You could probably fit a size 128 network on there but not a whole lot larger.
What exactly defines a batch and sequence length, too? I haven't been able to find good definitions of those. I assume sequence length should be roughly the average character count of a card?
Whoops you're right, I did totally get the frame colors off. I had looked at Theros block for an example, but I guess I should have looked for the dual-color ones in Nyx. I'll get on that.
All of the cards that I've posted earlier were all generated by RNN. With the art, I'm cleaning up some of the actual wordings to make them legal/more playable. Sinder has always been a legendary creature - god, but I did add "Indestructable" myself; for some reason my RNN doesn't like to make gods Indestructable. All of those abilities were generated by the RNN, but the text was not as "legal". For example, I think the original text for her tap ability was something like "Return a creature you control to target player's hand" and the top part was "when a creature is returned from the battlefield to your hand, reveal the top card of your library, put that creature onto the battle".
'Diamond Herging' and 'Sci, the Dragonglass' are both 100% RNN generated cards; I only modified the casting cost because previously Sci was a 1-drop.
The RNN does come up with some really unique abilities all by itself; it just needs a bit of editing by a human being who understands how to properly word the text so it complies with Magic's rules.
options: -types "creature" -subtypes "elemental" -bodytext_prepend "when @ enters the battlefield" -bodytext_append "\\evoke "
Riddshup
4R
Creature - Elemental
When Riddshup enters the battlefield, return target instant or sorcery card from your graveyard to your hand.
Evoke 4U
4/4
#I can't promise that the evoke will be useful (considering in this case that it's the same price as the creature), but it's useful to aid in our understanding of what the network is learning about various mechanics.
options: -types "instant" -bodytext_prepend "kicker" -bodytext_append "if @ was kicked,"
Carrion Ward
2R
Instant
Kicker - sacrifice a creature
Target creature gets +2/+2. If Carrion Ward was kicked, that creature gets -2/-2 instead.
#Similarly, it's interesting to see where the network goes with abilities like kicker. I will warn you that there are far too many instants generated by the network that want to put +1/+1 counters on instants and sorceries when they (never) enter the battlefield. Some improvements could be made there.
options: -types "sorcery" -bodytext_append "\\hellbent ~"
Land Strike
3R
Sorcery
Land Strike deals 5 damage to target creature. Its controller discards a card.
Hellbent - If you have no cards in hand, you may pay 1. If you do, put a 1/1 white spirit creature token with flying onto the battlefield.
#Color discipline is still weak in the version of the network that I've been testing, though I haven't had a chance to look at the most recent ones (though I definitely will as soon as I have my GPU set up as I'd like).
I'm waiting on hardcast_sixdrop's most recent experiments to see the effects of his recent modifications to his input format, and we'll see how to move forward from there. The recent batch of GPU-trained networks have been turning out very well (better than the one that generated the cards you see above), and I'll have that GPU-to-CPU script perfected some time early next week, so others who do not have access to appropriate hardware can sample from those networks.
My LinkedIn profile... thing (I have one of those now!).
My research team's webpage.
The mtg-rnn repo and the mtg-encode repo.
Well, this is interesting. Let me give a quick explanation of how it works.
When the NN training code reads the data set, before doing anything else, it first partitions it into a training set and a validation set. These are totally separate: if data is in the validation set, you'll never train on it. Ever. Because otherwise it wouldn't be validation data.
The idea behind duplicating the data is that we might want to teach the NN that things are allowed to vary: for example, when randomizing the mana costs we want it to learn that {^RR} and {RR^} are the same thing. So we put in multiple cards that are otherwise identical, but have randomized mana costs. Then it can learn to associate whatever other pattern is in each of those cards with each variation of the mana cost, and hopefully make the connection that the mana costs are equivalent despite the order.
The same idea applies much more to the new format, which allows any ordering of the fields but distinguishes them with label characters. There are about 40k possible orderings for 8 fields, but that's too much data, so I choose 100 at random and call that good enough. The NN will see many cards with the order mixed up but the fields the same, and hopefully learn that the order doesn't matter while the labels do.
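For the curious, picking those 100 orderings is simple enough to sketch in Python (the field labels and function here are illustrative stand-ins, not the actual encoder code):

```python
import itertools
import random

# Hypothetical single-character field labels; the real encoder defines its own.
FIELDS = list("12345678")

def pick_orderings(fields, k, seed=0):
    """Sample k distinct field orderings out of the n! possibilities."""
    rng = random.Random(seed)
    # 8! = 40320 orderings, small enough to materialize outright
    all_orders = list(itertools.permutations(fields))
    return rng.sample(all_orders, k)

orders = pick_orderings(FIELDS, 100)
```

Each card then gets emitted once per sampled ordering, with its labels intact, so the network sees the same content under 100 different permutations.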
But, but, but, this wreaks havoc on the validation process. Since we duplicated the data, the validation set (5% of the data by default) is actually (well, probably, it's random so it could be anything) 5 field-randomized versions of the base dataset, not a subset of the base dataset that we avoided training on. So the validation loss is going to be really low, as observed. I believe it's lower than the training loss because of dropout - I use fairly large dropout during training, but dropout isn't applied during validation. Someone tell me what's actually going on if you know.
There are tradeoffs. On the one hand, we don't really care about validation numbers as long as we get good cards, so this approach is essentially the same as setting the validation set to nothing and just training on as much data as possible. The validation loss is still measuring something though - it tells us how good the network is at figuring out that the order doesn't matter. If it's a perfect student, the validation loss should go to 0. We just have to be careful not to use it as a measure of overfitting - in fact, low validation loss is a pretty good indicator of overfitting.
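Here's a toy demonstration of why the duplicated data leaks into a naive split (the function, card names, and numbers are stand-ins for the real batcher; the arithmetic is the point):

```python
import random

def naive_split(examples, val_frac=0.05, seed=0):
    # char-rnn style: shuffle the stream and slice it,
    # with no awareness that examples are duplicated
    ex = examples[:]
    random.Random(seed).shuffle(ex)
    n_val = max(1, int(len(ex) * val_frac))
    return ex[n_val:], ex[:n_val]

cards = [f"card{i}" for i in range(100)]
dataset = [c for c in cards for _ in range(100)]  # 100 copies of each card
train, val = naive_split(dataset)
leaked = sum(1 for v in val if v in set(train))
# with this seed, every validation example also appears in training
```

With 100 copies of every card, essentially every card in the 5% validation slice also has copies in the other 95%, which is exactly why the validation loss looks so good.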
seq_length is how many characters are in a 'batch,' and batch_size is how many batches it trains on in parallel. I don't know if batch_size affects the results of the training or just the speed. Basically what it does is chop the data up into batches of size batch_size * seq_length, and then use those as the units for reporting per-batch stats like training loss. I try to use a long seq_length to prevent cards from being chopped up, but it doesn't make as big a difference as you might think.
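As a concrete sketch of that chopping (the function name and numbers are illustrative, not char-rnn's actual code):

```python
def batch_plan(data_len, batch_size, seq_length):
    """char-rnn style: the stream is cut into chunks of
    batch_size * seq_length characters, one chunk per training step."""
    chunk = batch_size * seq_length
    n_batches = data_len // chunk
    used = n_batches * chunk
    # any trailing remainder that doesn't fill a chunk is dropped
    return n_batches, data_len - used

n, dropped = batch_plan(1_000_000, 50, 200)  # 100 batches, nothing dropped
```

This is also why a long seq_length only helps so much: cards still get split wherever the chunk boundaries happen to fall.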
Understanding exactly what the batching code does is probably a valuable endeavor, as it would make the whole data duplication issue go away. What we want is for the batcher to understand that cards are delimited objects, break the data into cards after reading it, partition the card set to get a validation set, then do the duplication and randomization of costs / field orders dynamically as it creates batches of whole cards. I mean to look at this at some point, but I'm not a lua expert so it will take some time. And I really have no idea how the batching code works - it could require the batches to be precisely aligned to make the matrix math it's doing work, which would complicate things.
VirtualBox does support using hardware gpu. It requires installing some drivers into the guest os and flicking a switch in the virtual machine options. This should allow for cuda in the guest os. Installing linux to a usb stick, either as a live usb or a real install, is another option.
Any thought to packaging all of this up as a docker container? (Or whatever your favourite container format is.) It would make distribution a lot simpler, as everything, including char-rnn and hardcast's scripts, would be preinstalled. Heck, some checkpoints could be included as well. Docker has a windows client too; it is basically a linux virtual machine, though I don't know what kind of virtual machine. A docker/whatever container also has some interesting cloud computing possibilities too *taps fingers together*.
It might also be possible to install torch on Windows using cygwin, although a docker container would probably be more sane.
At the top of all my outputs I have the following text:
Some debug output and the end of a card. Is this normal or is something odd happening on my end?
@hardcast_sixdrop, it looks like you are just doing string manipulation with your encoder/decoder. Have you thought about using a dict or object as an intermediary format? It would open up converting the output to all sorts of formats (forums, json, yaml, html, heaven forbid even xml). Could even set it up to recognize and properly handle old nn card formats. I have selfish reasons to request this too
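For what it's worth, the dict intermediary is cheap to sketch. Here's a toy decoder for one pipe-delimited, label-prefixed card line; the label-to-field mapping here is made up for illustration, since the real encoder defines its own labels:

```python
import json

# Hypothetical label -> field-name mapping; NOT the encoder's actual table.
LABELS = {"1": "name", "3": "cost", "4": "loyalty", "5": "type",
          "6": "subtype", "7": "pt", "8": "rarity", "9": "text"}

def decode_card(line):
    """Turn one label-prefixed, pipe-delimited card into a dict,
    regardless of the order the fields appear in."""
    card = {}
    for field in line.strip("|").split("|"):
        if not field:
            continue  # empty field: nothing to record
        label, value = field[0], field[1:]
        card[LABELS.get(label, label)] = value
    return card

card = decode_card("|5instant|3{RR}|1pyroblast|9uncast target spell.|")
print(json.dumps(card))  # and from a dict, yaml/html/xml are all one step away
```

Once everything round-trips through a dict (or a Card object), supporting old encodings is just a matter of adding more decoders that target the same intermediate.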
I'm thinking of writing a simple website, probably in flask and python, that can take nn generated cards, render them as images (does MTGCardsmith have an API?), let the user select how many (including 0) of the cards they want, and render a pdf for the user to print out and make their cube/deck/pack war/fancy hat. I would also really like to use Talcos' idea and script to generate customized sets from a selection of brains, but that would be down the road. (Is it possible to use sample.lua over multiple brains at once? Say one brain developed tromple, another developed a new card type, etc? My impression is nns do not work that way?)
And some of my more humorous cards that I've generated from my own training (psst, how do I do click to show? I'll edit this post when I find out!). Looks like there is a new card type:
|shizo, long of junger||planemarker||creature||goblin scout|2/1|{1, R}|haste
whenever a player casts an artifact spell, you may pay {3}. if you control a red permanent you control. you gain 5 life for each zombion on the battlefield.|
# A new card type? a Planemarker
|sinscentuke charminfusion||sorcery||||{Blue}|each player skips his or her library in any order.|
|creet of hilled||creature||snion|1/1|{2 colorless and 1 black or blue}|whenever creet of hilled or another enchantment spell, creet of hilled deals 2 damage to you unless you pins into his or her graveyard.|
#Pin like a tack, pen, or wrestling?
|mheru elecrea||creature||griffin|2/1|{2 Colorless, 1 Black}|Tap: destroy target nonblack beast permanent or sorcery spell.|
|genomul pose||artifact||||{3 colorless}|genomul pose enters the battlefield tapped.\Tap: add {Blue} to your mana pool.|
#One of the more sane ones
|zameshot||enchantment||aura||{1,Blue}|enchant creature\whenever equipped creature dies, put two 1/1 white bird creature tokens with trample onto the battlefield. if {Black} was spent to cast it, copy target instant or sorcery spell this turn.|
# 1/1 birds with trample you say?
|shrike liege||creature||elf scout|0/4|{2, Black}|whenever a player casts a spell, look at the top of your library until you reveal it. exile one of those cards. if itn's gains haste, that player reveals three cards, then discards those cards.|
#Nonsense but I liked where it was going
|corrust dragon||creature||dragon|5/5|{5, R, R}|flying\at the beginning of each player's upkeep, if there [they?] are tied for least power, you gain life equal to its converted mana cost.|
# Another one where I liked where it was going
|robing faaries||creature||dragon|4/4|{3, W}|flying\
{R, R}, Tap: prevent the next 1 damage that would be dealt by target creature this turn.
{W}, Tap: put a 2/2 black bat creature token with defender until end of turn.|
# Dragon faaries in a bath robe...
|agorius blood||instant||||{1, Blue, Black}|spells you cast cost 1 less to cast if you target player.|
# I don't know my cards but I have not seen a mechanic like this before
|bioshif*** myr||artifact creature||construct|2/3|{^^^^}|impry {4, R, R} \when bioshif*** myr is turned face up, all lands you control gain flying until end of turn.|
# Yes, that is the f word. Flying lands seem to fit the name....
|phyrexian the goat||creature||giant warrior|6/6|{6, Green}|whenever phyrexian the goat blocks or becomes blocked, it gets +3/+3 until end of turn.|
|hellkite cathan||creature||giant|7/7|{6, G, G}|whenever hellkite cathan attacks, creatures without flying can't be blocked this turn.|
# I win!
|red spider||creature||sliver|2/2|{2, Blue}|all sliver creatures have defender.
{2, Blue}: target creature you control gets +1/+1 until end of turn.
draw a card.|
# Not a shabby anti sliver card if there is a card to transfer control to another player.
Alternately, how would I set up Ubuntu on a stick so that it could run CUDA as well?
Phyrexian goat is hilarious, by the way. Also, click to show uses the spoiler tags. Also also, that text at the top of the output is normal, I'm told.
Hi! Welcome!
Yes, some kind of docker container or other form of self-contained package is definitely worth looking into in the near future. We're still trying to get the network performing as we would like, and once we've figured more about that, I'll turn my attention to packaging everything up for ease of use. But if you want to get a head start on that, go right ahead, haha.
That's normal. The network begins in a random position, finishes a garbage card, and then gets into its rhythm and starts generating complete cards.
I'd eventually like a mapping function from content to reasonable art. That's actually well within the realm of possibility at this point. The art can either be fixed or generated on the fly. In the latter case, something something convolutional neural networks something something deepdream-esque iterative creation. But that's down the road for me.
And you can sample over as many specialized networks as you'd like, but highly specialized networks are prone to overfitting. That is, they do one thing very well, and only in very limited, repetitive ways. We've been developing specialized networks for the purpose of expanding the input corpus for a general purpose one (e.g. I made a post about a network that only generated planeswalkers a few days ago.)
Another possibility would be to replace X that's used in costs by Y. This would have to be done manually and it would be easier to replace this type of X since you can search for it better.
I was not aware of this, having tried very hard to set it up a few years ago and failing miserably. Can you provide a link to some instructions?
Currently the unscrambler is a total hack and about to be replaced. The encoder (and the future unscrambler) is intended to take either json or any of my encoded formats and produce an internal Card python object that knows everything about the card, and can be used for data analysis or smart reencoding. There is a lot of code hiding in lib/. It mostly works, but everything is still in active development. I'm planning to convert the unscrambler over tonight so there's a more or less fully featured toolchain. Adding the ability to output xml / whatever would be straightforward.
I'm planning to support a few human readable formats for copying onto the forum. If anybody has a recommendation for something more formal (like MSE formats or MTGCardSmith) that would be really cool too. Then we just need to link up another NN to do the art and we can make full visual spoilers automatically.
For virtualbox if you want to test it anyways:
https://www.virtualbox.org/manual/ch04.html#idp96235792
https://www.virtualbox.org/manual/ch04.html#guestadd-3d
edit: I FOUND A WINDOWS TORCH VERSION. https://groups.google.com/forum/#!searchin/torch7/windows/torch7/A5XUU4u9Tjw/yvSJCBuLj4oJ. Scroll down to the bottom for installer. Just installed it, gonna try to get it working the same way as it does on Ubuntu.
I did my own digging, and I couldn't find anything that claimed Hyper-V could do passthrough for GPUs. There are plenty of claims to the contrary. VirtualBox might be able to do it, but only from a Linux host, which completely defeats the purpose because then you could just install Torch natively. For good GPU virtualization it looks like you have to go to enterprise solutions like Citrix Xenserver + NVIDIA stuff, which I doubt is free.
Training is coming along for the reordered field dataset. I'll post some samples from it once I finish the card unscrambler.