Hello

What are we talking about in terms of diminishing returns between the two models?

Does the 65B really improve much?

Bonus question: how can I train the 7B model on a specific field on my own computer (tailoring it to my needs)?

Comments

RedditLovingSun t1_je5m80p wrote on March 29, 2023 at 4:09 PM

I guess it would be interesting to see whether the performance difference gets wider or narrower after self-instruct optimizations like Alpaca.

I run a Discord with all the models. Currently only 30B and 65B, because nobody uses the smaller LLMs.

Even if superficially both can answer questions, on complex topics 65B is much better than 30B, so 7B does not even compare.

Any chance I could get access to your Discord to try out the 65B?

What GPUs are you using to run them? Are you using any compression (i.e. quantization)?

Can I join the Discord?

2x RTX 3090. 65B is running in int4, 30B is running in int8 (required for LoRA).
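For a rough sense of why those precisions fit on that hardware, here is a back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: it uses the nominal parameter counts from the model names (65 and 30 billion), assumes 24 GB of VRAM per RTX 3090, and counts weight storage only, ignoring activations, the KV cache, and framework overhead. The helper name `weight_gb` is made up for illustration.

```python
# Rough weight-memory estimate. Assumptions: nominal parameter counts,
# weights only (no activations, KV cache, or framework overhead).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Assumption: two RTX 3090 cards with 24 GB of VRAM each.
TOTAL_VRAM_GB = 2 * 24


def weight_gb(params_billion: float, dtype: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    configs = [("65B", 65, "int4"), ("30B", 30, "int8"), ("65B", 65, "fp16")]
    for name, params, dtype in configs:
        gb = weight_gb(params, dtype)
        fits = gb < TOTAL_VRAM_GB
        print(f"{name} {dtype}: ~{gb:.1f} GB of weights, fits in {TOTAL_VRAM_GB} GB: {fits}")
```

Under these assumptions the 65B weights in int4 and the 30B weights in int8 both come in under the combined 48 GB, while 65B in fp16 would not. Real usage will be higher once context and overhead are included.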
