martianunlimited

martianunlimited t1_jegwj6v wrote

ChatGPT (and other GPT-3.5-based transformers) has 175 billion parameters and wouldn't even fit on a dedicated RTX 4090. And before you ask "why not just run a smaller model": the performance of ChatGPT is highly dependent on its size (which is why people outside the machine learning community don't hear much about GPT-1 and GPT-2). And while there are efforts to make the model smaller (see Alpaca), you would still need a top-of-the-line GPU to fit even these smaller models, taking it away from the thing people are more concerned about: the graphics.
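As a rough back-of-the-envelope check (a minimal Python sketch; the parameter count and byte widths are assumptions, not measured figures):

```python
# Rough VRAM estimate for serving a 175B-parameter model (illustrative only).
params = 175e9          # assumed GPT-3.5-class parameter count
bytes_per_param = 2     # fp16 weights; fp32 would double this, int4 would quarter it
weights_gb = params * bytes_per_param / 1e9

print(f"Weights alone: ~{weights_gb:.0f} GB")   # ~350 GB, before activations / KV cache
print("RTX 4090 VRAM: 24 GB")                    # nowhere close
```

Even aggressive 4-bit quantization only brings that down to roughly 90 GB, still several times a 4090's 24 GB.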

So the practical implementation of incorporating ChatGPT into games would be to send each chat prompt to a server and suffer a whole lot of latency waiting for the response. It's possible, but it wouldn't be a good gaming experience. Wait (at least) 10 years, until consumer-grade hardware has the capacity of today's datacenter-grade hardware (assuming we don't hit the end of Moore's law first), and then you might find it more commonplace.
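A minimal sketch of that server round trip (the endpoint URL and payload shape below are hypothetical, purely for illustration):

```python
# Hypothetical round trip to a remote chat model; the endpoint is made up.
import time
import requests

def npc_reply(player_line: str) -> tuple[str, float]:
    start = time.perf_counter()
    resp = requests.post(
        "https://example.com/api/npc-chat",           # placeholder endpoint
        json={"prompt": player_line, "max_tokens": 64},
        timeout=10,
    )
    latency = time.perf_counter() - start
    return resp.json().get("text", ""), latency

# reply, latency = npc_reply("Where can I find the blacksmith?")
# Anything much above ~100 ms per exchange is very noticeable in-game.
```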

3

martianunlimited t1_je3lmsp wrote

Relevant publication: https://cdn.openai.com/papers/gpt-4.pdf

I can take comfort in knowing that while GPT-4 scores 10 percentile points better than me on GRE Verbal, I still score (slightly) better than GPT-4 on GRE Quantitative and very similarly on GRE Writing. (English is not my first language.)

Side note: I am surprised how poorly GPT-4 does in AP English Language and AP English Lit; I thought that, as a large language model, it would have an advantage on those sorts of questions. (Sorry, not an American, I could be misunderstanding what exactly is being tested in those subjects.)

2

martianunlimited t1_j9sh43x wrote

Not exactly what you are asking, but there is this paper on scaling laws which describes (assuming the training data is representative of the distribution), at least for large language models, how the performance of transformers scales with the amount of data, and compares it to other network architectures: https://arxiv.org/pdf/2001.08361.pdf. We don't have anything similar for other types of data.
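For a feel of what those scaling laws look like, here is a minimal sketch of the power-law form the paper studies; the exponent and constant below are illustrative placeholders, not the fitted values the authors report:

```python
# Illustrative power-law scaling curve L(N) = (N_c / N) ** alpha
# (alpha and N_c are placeholder values, not the paper's fitted constants).
alpha = 0.08        # assumed exponent
N_c = 1e14          # assumed constant (parameters)

for n_params in [1e8, 1e9, 1e10, 1e11]:
    loss = (N_c / n_params) ** alpha
    print(f"{n_params:.0e} params -> predicted loss ~{loss:.2f}")

# Loss falls smoothly but slowly: each 10x in parameters shaves off a roughly
# constant multiplicative factor, which is why model size matters so much.
```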

1

martianunlimited t1_j8tm2qd wrote

Here is an ELI5 explanation of why we use noise and conditionally denoise it with the text encoder: look at the clouds, and I tell you that I see an elephant in the clouds. It is easier to imagine the elephant in the clouds than if I tell you to imagine an elephant on a blank piece of white paper.

(The less-ELI5 explanation is that the entropy going from noise to an image is lower than that of going from a uniform image.) If you want to see that for yourself, with a bit of programming knowledge you can write your own diffuser pipeline that skips the noise-adding stage and try img2img from a blank image (it's literally just ~3 lines of edits).
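A minimal sketch of that experiment with the Hugging Face diffusers library (the model id and parameters here are assumptions; the actual ~3-line edit to skip the noising step would go inside the pipeline's latent-preparation code):

```python
# Sketch: img2img from a blank white image vs. starting from (mostly) noise.
# Assumes the Hugging Face `diffusers` library; model id is an assumption.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

blank = Image.new("RGB", (512, 512), "white")   # the "blank piece of paper"

# strength=1.0 replaces the image almost entirely with noise (the usual case);
# lowering it keeps more of the blank canvas and the result degrades noticeably.
for strength in (1.0, 0.3):
    image = pipe(prompt="an elephant", image=blank, strength=strength).images[0]
    image.save(f"elephant_strength_{strength}.png")
```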

(Side note: someone brought up a similar question, but in a different vein: removing the random seed.)

1

martianunlimited t1_j5qjagc wrote

Hopefully... but I always worry about what would happen if a Carrington Event (https://en.wikipedia.org/wiki/Carrington_Event) struck Earth in the modern era. We are so dependent on devices that are sensitive to electromagnetic interference that it's hard to imagine how difficult it would be to replace our transformers if we were caught unaware by such an event.

13