Hands0L0 t1_jck1yvf wrote

Reply to comment by Akimbo333 in Those who know... by Destiny_Knight

I got the 30B model running on a 3090 machine, but the number of tokens it can return is very limited.

1

Akimbo333 t1_jck2koh wrote

Oh ok. How many tokens are returned?

1

Hands0L0 t1_jck3lfv wrote

Depends on the prompt size, which is going to dictate the quality of the return. Around 300 tokens?

1

Akimbo333 t1_jck53wv wrote

Well, actually, that's not bad! At roughly 0.75 English words per token, that's about 225 words, which is a decent paragraph or two. That's a good amount for a chatbot! What do you think?

2

Hands0L0 t1_jck5cyd wrote

Considering you can explore context with ChatGPT and Bing over multiple returns, not exactly. Here you need to hit it on your first attempt.

2

Akimbo333 t1_jck73ph wrote

Well, you could always ask it to continue the sentence.

2

Hands0L0 t1_jck7ifi wrote

Not if there is a token limit.

I'm sorry, I don't think I was being clear. The token limit is tied to VRAM. You can load the 30B model on a 3090, but it swallows up 20 of the 24 GB of VRAM for the model and prompt alone. That leaves you 4 GB for returns.

2
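
To put rough numbers on the "4 GB for returns" point, here is a minimal back-of-the-envelope sketch in Python. It assumes the leftover VRAM is spent on the attention key/value cache, stored in 16-bit floats, and it uses a hypothetical 60-layer model with a hidden size of 6656 (figures in the range of 30B-class models, but assumptions here, not something stated in this thread). The function names are made up for illustration.

```python
# Rough KV-cache budget: how many tokens fit in the VRAM left after loading the model.
# All model figures used below are assumptions for illustration, not measurements.

GIB = 1024 ** 3  # bytes in one gibibyte


def kv_bytes_per_token(n_layers: int, hidden_size: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache one token: one key and one value vector per layer."""
    return 2 * n_layers * hidden_size * bytes_per_value


def max_cached_tokens(free_vram_bytes: int, n_layers: int, hidden_size: int,
                      bytes_per_value: int = 2) -> int:
    """Upper bound on the number of tokens whose keys/values fit in free VRAM."""
    return free_vram_bytes // kv_bytes_per_token(n_layers, hidden_size, bytes_per_value)


if __name__ == "__main__":
    free = 4 * GIB  # the 4 GB left over on the 24 GB card
    print("bytes per token:", kv_bytes_per_token(60, 6656))
    print("token upper bound:", max_cached_tokens(free, 60, 6656))
```

Under these assumptions each token costs about 1.5 MiB of cache. Treat the result as an optimistic ceiling: activation buffers, framework overhead, and memory fragmentation all eat into the same space, so the usable number in practice can be much lower.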

Akimbo333 t1_jcka9ef wrote

Oh ok. So you can't make it keep talking?

1

Hands0L0 t1_jckbm7h wrote

No, because the predictive text needs the entire conversation history as context to predict what to say next, and the only place to store that history is in VRAM. If you run out of VRAM, you run out of room for returns.

2
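
The trade-off described above, where history and reply share one fixed budget, can be sketched in a few lines of Python. This is only an illustration: `count_tokens` is a crude whitespace stand-in for a real tokenizer, and dropping the oldest turns is one common workaround that nobody in this thread claims to be using. All names are hypothetical.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def room_for_reply(history: List[str], budget: int) -> int:
    """Tokens left for the next reply once the history has been counted."""
    used = sum(count_tokens(turn) for turn in history)
    return max(budget - used, 0)


def trim_oldest(history: List[str], budget: int, reserve: int) -> List[str]:
    """Drop the oldest turns until at least `reserve` tokens remain for the reply."""
    kept = list(history)  # copy so the caller's list is not modified
    while kept and room_for_reply(kept, budget) < reserve:
        kept.pop(0)
    return kept


if __name__ == "__main__":
    chat = ["how many tokens are returned", "about 300 tokens", "can it keep talking"]
    print(room_for_reply(chat, budget=16))
    print(trim_oldest(chat, budget=16, reserve=8))
```

The cost of trimming is that the model forgets whatever was dropped, which is why it only partly answers the "keep talking" question.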

Akimbo333 t1_jckc9iu wrote

Damn! There's gotta be a better way to store conversations!!! Maybe one day

1

Hands0L0 t1_jcknz03 wrote

Study CS and come up with a solution, and you can be very rich.

1