Attention That Fits in Memory
Standard attention needs O(n²) memory. Memory-efficient variants need O(n). Same output, 10x less peak memory.
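The core trick can be sketched in a few lines of NumPy (a minimal illustration of query chunking, not the actual xformers implementation, which also chunks keys with an online softmax): process queries in slices so only a `(chunk, n)` strip of the score matrix ever exists, instead of the full `(n, n)` matrix, while producing bit-identical output.

```python
import numpy as np

def attention(q, k, v):
    # Standard attention: materializes the full (n, n) score matrix.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))  # stable softmax
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def chunked_attention(q, k, v, chunk=64):
    # Memory-efficient variant: each iteration only holds a
    # (chunk, n) slice of scores, so peak memory scales with n,
    # not n^2. The result is exactly the same.
    out = np.empty((q.shape[0], v.shape[1]))
    for i in range(0, q.shape[0], chunk):
        out[i:i + chunk] = attention(q[i:i + chunk], k, v)
    return out

rng = np.random.default_rng(0)
q = rng.standard_normal((256, 32))
k = rng.standard_normal((256, 32))
v = rng.standard_normal((256, 32))
assert np.allclose(attention(q, k, v), chunked_attention(q, k, v))
```

With sequence length `n = 256` and `chunk = 64`, the peak score buffer shrinks from 256x256 to 64x256; the savings grow with longer sequences.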