Submitted by deck4242 t3_125q87z in MachineLearning
ortegaalfredo t1_je5urre wrote
I run a Discord with all the models. Currently only 30B and 65B, because nobody uses the smaller LLMs.
Even if superficially they both can answer questions, on complex topics 65B is much better than 30B, which in turn doesn't even compare with 7B.
deck4242 OP t1_je5zprk wrote
Any chance I could get access to your Discord to try out the 65B?
machineko t1_je86hwt wrote
What GPUs are you using to run them? Are you using any compression (i.e. quantization)?
ortegaalfredo t1_jegn9zu wrote
2x 3090. The 65B is using int4, the 30B is using int8 (required for LoRA).
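To see why those quantization choices fit on 2x 3090 (48 GB of VRAM total), here's a rough back-of-the-envelope sketch of weight memory alone (my own illustration, not the poster's setup; it ignores activations, KV cache, and quantization overhead):

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: int) -> float:
    """Approximate model weight footprint in decimal GB."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

# 65B at int4: ~32.5 GB of weights -> needs both 24 GB cards
print(weight_memory_gb(65, 4))   # 32.5
# 30B at int8: ~30 GB of weights -> also spans both cards,
# but int8 keeps enough precision for LoRA fine-tuning
print(weight_memory_gb(30, 8))   # 30.0
# 65B at int8 (~65 GB) or fp16 (~130 GB) would not fit, hence int4
```

So int4 is roughly the only way to squeeze 65B onto two 3090s, while 30B at int8 just fits.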
nirehtylsotstniop t1_je95z0i wrote
Can I join the Discord?