
EternalNY1 t1_jecwmbt wrote

>I've been in IT for 30+ years.

Same here, and remote for over a decade, far before the pandemic.

These return-to-office policies are especially absurd in IT, as literally everything I do is logged.

Every line of code I check in, pull request I complete, comment I make in our item tracker, timestamps on when I log into servers, exactly what I'm doing on said servers, discussions in Teams and Slack, emails ... all day long, every day.

If they think I'm sitting around watching Netflix on the couch all day, they can simply look in our DevOps system and see all the lines of code I've committed.

Makes no sense.

19

EternalNY1 t1_je94zko wrote

If you want what I'd consider to be hands-down the best explanation of how it works, I'd read Stephen Wolfram's article. It's long (may take up to an hour) and somewhat dense in parts, but it explains fully how it works, including the training and everything else.

What Is ChatGPT Doing … and Why Does It Work?

The amazing thing is they've looked "inside" GPT-3 and have discovered mysterious patterns related to language that they have no explanation for.

The patterns look like this ... they don't understand the clumping of information yet.

So any time someone says "it just fills in the next likely token", that is a drastic oversimplification. The researchers themselves don't fully understand some of the emergent behavior it is showing.
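
For the curious, here is roughly what that "next likely token" step looks like mechanically, sketched with the open GPT-2 weights via Hugging Face's transformers library (GPT-3's weights aren't public, so GPT-2 is just a stand-in here):

```python
# Minimal sketch of greedy next-token prediction, using GPT-2 as a stand-in
# for GPT-3 (whose weights are not public). Requires: pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The cat sat on the"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits      # shape: (1, seq_len, vocab_size)

next_token_logits = logits[0, -1]         # scores for the *next* token only
probs = torch.softmax(next_token_logits, dim=-1)
next_id = int(torch.argmax(probs))        # greedy pick; sampling is also common

print(tokenizer.decode(next_id), float(probs[next_id]))
```

That loop of "compute a distribution, pick a token, append, repeat" is the literal surface mechanism, but it says nothing about why the model's internal representations organize language the way they do, which is the part the researchers can't yet explain.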

2

EternalNY1 t1_j2fkdrq wrote

It is extremely impressive at translating code between languages, code it has never seen before. And obviously that is not data that has been scraped from a web page.

However, a lot of people point out it doesn't write complex code that well. The thing is, it's not advertised as a tool to help software engineers. The fact that it can write code as well as it does is impressive enough to me.

1

EternalNY1 t1_j2cd4hm wrote

They still estimate $87,000 per year on the low end to operate it on AWS for the 175-billion-parameter model.

I am assuming that is just the cost to train it, though, so it would be a "one time" cost incurred each time you decided to retrain.

Not exactly cheap, but something that can be budgeted for by larger companies.

I asked it specifically how many GPUs it uses, and it replied with:

>For example, the largest version of GPT-3, called "GPT-3 175B," is trained on hundreds of GPUs and requires several dozen GPUs for inference.
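
As a very rough sanity check on that $87,000 figure (back-of-envelope only; the per-GPU hourly rate below is a hypothetical placeholder, not an actual AWS price):

```python
# Back-of-envelope check on the quoted $87,000/year low-end estimate.
# The per-GPU hourly rate is a hypothetical placeholder, not a real AWS quote.
annual_cost = 87_000                 # USD per year, low-end figure quoted above
hours_per_year = 24 * 365

hourly_spend = annual_cost / hours_per_year
print(f"Implied spend: ${hourly_spend:.2f}/hour")          # ~ $9.93/hour

assumed_gpu_rate = 3.00              # USD per GPU-hour (assumed, for illustration)
implied_gpus = hourly_spend / assumed_gpu_rate
print(f"Implied GPUs running 24/7: ~{implied_gpus:.1f}")   # ~3.3 at that rate
```

Either way, it's in budget-line-item territory for a larger company rather than pocket change.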

57