
SuperSpaceEye t1_iwht6hf wrote

Two different tasks. The language model in SD just encodes text into some abstract representation that the diffusion part of the model then uses. A text-to-text model such as GPT-J does a different task, which is much harder. Also, GPT-J is 6B parameters, which only takes about 12GB of VRAM, not hundreds.
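The 6B-parameters-to-12GB figure comes from simple arithmetic: weights stored in half precision take 2 bytes per parameter. A minimal sketch (assuming fp16 weights and ignoring activation/framework overhead, which adds more in practice):

```python
# Rough VRAM estimate for holding a model's weights in memory.
# Assumption (not stated in the thread): fp16 inference, 2 bytes per parameter.

def weights_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

print(weights_gb(6e9))     # GPT-J 6B in fp16 -> 12.0
print(weights_gb(6e9, 4))  # same model in fp32 -> 24.0
```

In fp32 the same model would need roughly twice that, which is why half precision is the usual choice for inference.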

3

Jordan117 OP t1_iwhtnxu wrote

Thanks for the clarification. I must have misread an older post that was talking about CPU memory requirements rather than GPU.

2