NVIDIA Apex
Scarecrow 1123 wrote a trainer for AllenNLP that uses NVIDIA's Apex package to enable mixed-precision training.
The full gist is available here.
This is a copy of the trainer provided.
I find that my models train successfully more often if I specify "O1" instead of "O2" for Amp's opt_level. O1 runs only a whitelisted set of operations in half precision and keeps everything else in FP32.
This copy of the trainer already has that change made.
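For context, the core Amp calls that such a trainer wraps look roughly like the sketch below. Only amp.initialize and amp.scale_loss are the actual Apex API; the toy model, optimizer, and training loop are placeholders.

import torch
from apex import amp

# Toy model and optimizer; any PyTorch model is handled the same way.
model = torch.nn.Linear(10, 2).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# "O1" patches a whitelist of ops (matmuls, convolutions, etc.) to run in
# half precision and leaves everything else in FP32; "O2" casts nearly the
# whole model to FP16, which is the mode that was less reliable for me.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

for _ in range(10):
    optimizer.zero_grad()
    inputs = torch.randn(32, 10).cuda()
    loss = model(inputs).sum()
    # Loss scaling keeps small FP16 gradients from underflowing to zero.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()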
To use it during training, include a snippet like this in your training JSON config:
{
    // ....
    "trainer": {
        "type": "fp16-trainer",
        "mixed_precision": true,
        // other options
    }
    // ....
}
and make sure the trainer module lives in a package that you include with --include-package, as sketched below.
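For a rough idea of how that fits together, here is a minimal registration sketch. The package, module, and class names are hypothetical, and the exact registrable base class and import paths depend on your AllenNLP version (the ones shown are the 0.9-era ones); the real trainer body comes from the gist.

# my_library/fp16_trainer.py -- hypothetical package and module names
from allennlp.training.trainer import Trainer
from allennlp.training.trainer_base import TrainerBase  # 0.9-era import path

@TrainerBase.register("fp16-trainer")
class FP16Trainer(Trainer):
    # What matters for the JSON config above is the registered
    # name "fp16-trainer"; the implementation is omitted here.
    ...

# Then train with something like:
#   allennlp train my_config.jsonnet -s output_dir --include-package my_library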
For a BERT model I was training, a single GTX 1070 ran out of VRAM without Apex configured. With Apex configured, the model used only about 4.5 GB. There was no discernible penalty in the number of epochs required, though I haven't investigated that thoroughly.