Sunday, 22 September 2019

How to train a model on large batches when the GPU can’t hold more than a few samples

How can you train a model on large batches when the GPU can’t hold more than a few samples?
There are several options:
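  - Gradient Accumulation: run several small batches, sum their gradients, and update the weights only once per group
  - Gradient Checkpointing: keep only some activations during the forward pass and recompute the rest during the backward pass, trading extra compute for less memory
  - Distributed training: split the batch across several GPUs or machines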
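
To make the first option concrete, here is a minimal sketch of gradient accumulation in plain Python. It is not tied to any framework: the one-variable linear model, the helper names (`grad_mse`, `accumulated_grad`, `train`), and the hyperparameters are all assumptions chosen for illustration. The point it tries to show is that summing micro-batch gradients, each weighted by its share of the full batch, gives the same gradient as one large batch would.

```python
def grad_mse(w, b, xs, ys):
    """Gradient of the mean squared error of y = w*x + b over one batch."""
    n = len(xs)
    gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return gw, gb


def accumulated_grad(w, b, xs, ys, micro_batch_size):
    """Build the large-batch gradient from micro-batches that fit in memory."""
    n = len(xs)
    gw_acc, gb_acc = 0.0, 0.0
    for start in range(0, n, micro_batch_size):
        mx = xs[start:start + micro_batch_size]
        my = ys[start:start + micro_batch_size]
        gw, gb = grad_mse(w, b, mx, my)
        # Weight each micro-batch by its share of the full batch so the
        # accumulated value is the mean over all samples, even when the
        # last micro-batch is smaller than the others.
        share = len(mx) / n
        gw_acc += gw * share
        gb_acc += gb * share
    return gw_acc, gb_acc


def train(xs, ys, micro_batch_size, lr=0.01, steps=200):
    """Gradient descent with one weight update per accumulated large batch."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw, gb = accumulated_grad(w, b, xs, ys, micro_batch_size)
        w -= lr * gw
        b -= lr * gb
    return w, b


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    ys = [2.0 * x + 1.0 for x in xs]
    print(grad_mse(0.0, 0.0, xs, ys))             # full batch of 8
    print(accumulated_grad(0.0, 0.0, xs, ys, 2))  # four micro-batches of 2
```

In a deep learning framework the same idea usually amounts to calling the backward pass on several small batches before a single optimizer step. Note that layers depending on batch statistics, such as batch normalization, still see only the small batches, so the equivalence is not exact for them.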
