All Tags

#multi-gpu

3 posts tagged with "multi-gpu"

Tensor vs Pipeline Parallelism: When Each Wins

Tensor parallelism cuts latency by splitting layers across GPUs. Pipeline parallelism increases throughput by splitting the model into stages. Choose based on your constraint.

Adding GPUs Without Linear Speedup

Four GPUs don't give you 4x throughput. Communication overhead, load imbalance, and synchronization eat into gains. Know the scaling curve before you buy.