SuperSpaceEye t1_iwht6hf wrote on November 15, 2022 at 7:12 PM

Reply to comment by Jordan117 in ELI5: Why such a big difference in compute cost for different types of media? by Jordan117

Two different tasks. Language model in SD just encodes text to some abstract representation that diffusion part of the model then uses. Text-to-text model such as GPT-J does different task which is much harder. Also, GPT-J is 6B parameters, which will only take like 12GB or VRAM, not hundreds.

Jordan117 OP t1_iwhtnxu wrote on November 15, 2022 at 7:15 PM

Thanks for the clarification, I must have misread an older post talking about CPU memory requirements instead of GPU.