Submitted by deck4242 t3_125q87z in MachineLearning
Hello
What kind of diminishing returns are we talking about between the 2 models?
Does the 65B really improve things a lot?
Bonus question: how can I train the 7B model on a specific field on my own computer (to tailor it to my needs)?
RedditLovingSun t1_je5m80p wrote
Significantly better
I guess it would be interesting to see whether the performance gap gets wider or narrower after self-instruct optimizations like Alpaca.