LetterRip t1_itchnjl wrote
Reply to comment by cygn in [R] Scaling Instruction-Finetuned Language Models - Flan-PaLM- Google 2022 - 75.2% on five-shot MMLU / Forecasters expected this SOTA would need until 2024! - Public checkpoints! by Singularian2501
I assume you mean 24 GB of VRAM? DeepSpeed, with enough CPU RAM and offloading to disk as needed, might let you run it. Note that 540B parameters is more than 2 TB at float32. Even at 8-bit you are looking at roughly 540 GB, while consumer hardware typically tops out at 128 GB of RAM, so the vast majority of the model would have to live on disk. The size can probably be reduced a lot by combining quantization and compression, but you will either have to do that work yourself or wait until someone else does.
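For concreteness, here is the back-of-the-envelope arithmetic behind those figures as a minimal Python sketch. The 540B parameter count comes from the comment above; the `weight_footprint_gb` helper is purely illustrative, not from any library:

```python
# Back-of-the-envelope size of the raw model weights at different precisions.
# Excludes activations, KV cache, and optimizer state, which add on top of this.

def weight_footprint_gb(n_params: float, bytes_per_param: float) -> float:
    """Raw weight size in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

n_params = 540e9  # 540B parameters, as discussed above

for label, nbytes in [("float32", 4), ("float16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{label:>8}: ~{weight_footprint_gb(n_params, nbytes):,.0f} GB")

# float32: ~2,160 GB  (a bit over 2 TB)
# float16: ~1,080 GB
#    int8:   ~540 GB
#    int4:   ~270 GB
```

The "mapping to disk" part is what DeepSpeed calls ZeRO-Infinity, i.e. offloading parameters to CPU RAM and NVMe, so in principle it works, just very slowly at this scale.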
farmingvillein t1_itefjav wrote
> Note that 540B parameters is more than 2 TB for float 32
They only provide checkpoints up to the 11B model, however (unless I'm reading things wrong), so this is a moot point at the moment.
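(By the same arithmetic as above, the 11B checkpoint is roughly 44 GB at float32 and ~11 GB at 8-bit, so it is far more tractable on a single 24 GB card.)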