February 2020

https://www.proxmox.com/en/training/video-tutorials/item/bond-configuration

InfiniBand notes

  • InfiniBand is a good fit for multi-node GPU training because it supports RDMA. With GPUDirect RDMA, the GPUs on each node can exchange data without first copying it from GPU memory into system RAM.
  • QDR:
    • 40 Gbps per port
    • About 1 microsecond latency
    • $40 for a dual-port card
  • Is bonding of InfiniBand interfaces not possible?
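
A rough sanity check on the QDR figure above: the 40 Gbps number is the signalling rate of a 4x port, and QDR uses 8b/10b line encoding, so the usable data rate is lower. The sketch below assumes a standard 4x link width; the constant and function names are made up for illustration.

```python
# Back-of-the-envelope QDR InfiniBand throughput for one port.
LANES = 4                      # assumption: standard 4x link width
SIGNAL_GBPS_PER_LANE = 10      # QDR signalling rate per lane
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b encoding: 8 data bits per 10 line bits


def qdr_data_rate_gbps(lanes=LANES):
    """Return the usable data rate in Gbit/s after encoding overhead."""
    return lanes * SIGNAL_GBPS_PER_LANE * ENCODING_EFFICIENCY


if __name__ == "__main__":
    raw = LANES * SIGNAL_GBPS_PER_LANE
    usable = qdr_data_rate_gbps()
    print(f"raw signalling: {raw} Gbps")
    print(f"usable data:    {usable:.0f} Gbps (~{usable / 8:.0f} GB/s)")
```

So a single QDR port tops out at about 32 Gbps of payload before any protocol or PCIe overhead, which is the number to compare against when estimating real transfer speeds.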

Purchases

  • Bought 3 quad-port Intel NICs for $35 on eBay. This should help with LizardFS bandwidth.
  • Bought an M40 GPU with 24 GB of VRAM.
    • Needs a fan and the 3D-printed bracket for mounting fans on the passively cooled card.