Submitted by ortegaalfredo t3_11kr20f in MachineLearning
ortegaalfredo OP t1_jbov7dl wrote
Reply to comment by SpaceCockatoo in [R] Created a Discord server with LLaMA 13B by ortegaalfredo
Tried the 8-bit; the 4-bit doesn't work for me yet for some reason.
Problem is, those are very slow: about 1 token/s, versus the 100 tokens/s I'm getting with 13B.