Quote from Talcos »
To be fair, you are expanding the network more than you think, because the network units at one layer are fully connected to the network units in the next layer, so if you have N nodes at each layer, that's N^2 new connections and N^2 new weights that we now have to take into consideration.
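A quick back-of-the-envelope sketch of the scaling Talcos describes (plain Python; the function name is mine, and this counts only the layer-to-layer weight matrix, ignoring biases and recurrent/gate weights):

```python
def layer_to_layer_weights(n):
    # A layer of n units fully connected to the next layer's n units
    # adds an n x n weight matrix: n**2 new weights to train.
    return n * n

print(layer_to_layer_weights(300))  # 90,000 new weights
print(layer_to_layer_weights(600))  # 360,000 -- doubling width quadruples the count
```

So bumping rnn_size doesn't grow the work linearly; each extra layer or doubling of width multiplies the weight count, which is why training slows down more than expected.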
And worse how, exactly?
EDIT: Just a thought, have you tried using dropout to help train the network? Like, if you do -dropout 0.5, then on each pass we disable half of the network units. It's a very powerful regularization technique.
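For anyone curious what -dropout 0.5 actually does under the hood, here's a minimal sketch of (inverted) dropout in plain Python/NumPy. This is an illustration, not char-rnn's actual Torch code; the function name and signature are mine:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p during training,
    and scale the survivors by 1/(1-p) so the expected activation is unchanged.
    At test time (training=False) the input passes through untouched."""
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask
```

Because a random half of the units is silenced on every pass, no single unit can be relied on, which pushes the network toward redundant, more general features instead of memorizing the training cards.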
Worse as in: outside of those, there's tons more gibberish and meaningless cards whose text just trails off halfway through.
I'll try dropout in a future run - still playing with parameters. I didn't expect it to be that much of an expansion, but it makes sense. I'll get to that after I tweak the VirtualBox settings per:
Quote from SirSpunky »Some performance tips for people running VirtualBox on Windows:
Settings -> Storage -> Controller: SATA -> Check "Use Host I/O Cache". This alone made my generation twice as fast. However, I read that it will increase the risk of data corruption in your virtual OS, just so you know. I haven't had any problems so far though.
Settings -> System -> Processor -> Processor(s) -> Increase to max (within green color). Requires 64-bit Linux installation.
To be able to install 64-bit Linux in VirtualBox, you must enable virtualization in your normal BIOS. Using a 64-bit OS should also provide a speed boost.
There are other settings that you can experiment with, like Base Memory (I put mine at 2048 MB), Video Memory and 3D Acceleration (I enabled it and set video memory to max 128 MB), and if you have your virtual HD on an SSD you should probably check "Solid-state Drive" in Storage settings for your virtual HD.
On my 4 year old Intel i5-2500K, I'm getting around 0.3 sec/batch on default settings (128 rnn_size, 2 num_layers) on a virtual Ubuntu.
And that's likely the issue - I allocated plenty of RAM to the VM, but it looks like I'm still on a single processor when there are 4 available. Virtualization and 64-bit are enabled as well (to make sure I could use all the memory I wanted). I'll try that after this current run.
Edit: Killed the run and restarted: 3 layers went from 18s to 12s per batch at rnn_size 600 with 4 CPUs and host I/O cache enabled.
Not seeing any performance increase with dropout enabled. I'm reducing rnn_size and getting 3s/batch at 300 with 3 layers. I'll see how this goes over the next day or so.