
ortegaalfredo t1_je5urre wrote

I run a Discord server with all the models. Currently only 30B and 65B, because nobody uses the smaller LLMs.

Even if superficially they can both answer questions, on complex topics 65B is much better than 30B, and 7B doesn't even compare.

11

machineko t1_je86hwt wrote

What GPUs are you using to run them? Are you using any compression (e.g., quantization)?

1

ortegaalfredo t1_jegn9zu wrote

2x RTX 3090. The 65B model runs in int4, the 30B in int8 (int8 is required for LoRA).

2
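For reference, a minimal sketch of what the 30B int8 + LoRA setup described above might look like, assuming the Hugging Face transformers/peft/bitsandbytes stack; the checkpoint name and LoRA hyperparameters below are illustrative placeholders, not the commenter's actual configuration:

```python
# Sketch: load a LLaMA-style checkpoint in int8 across two GPUs
# and attach LoRA adapters for fine-tuning. Assumes transformers,
# peft, and bitsandbytes are installed; model name is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

model_name = "huggyllama/llama-30b"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)

# load_in_8bit quantizes the weights to int8 via bitsandbytes;
# device_map="auto" shards the layers across both 3090s.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map="auto",
    torch_dtype=torch.float16,
)

# Prepare the int8 model for training and attach LoRA adapters,
# so only the small adapter matrices receive gradient updates.
model = prepare_model_for_int8_training(model)
lora_config = LoraConfig(
    r=8,                                   # adapter rank (illustrative)
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

The 65B int4 case would typically go through a dedicated int4 loader (e.g., a GPTQ-quantized checkpoint) rather than this 8-bit path, which is why LoRA training is tied to the int8 30B model here.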