__Maximum__ t1_jdkepie wrote
Reply to comment by mxby7e in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
Also, it's very shady for a company called OpenAI. They claimed they became for-profit because they needed the money to grow, but these restrictions just show that they are filthy liars who only care about keeping power and making profit. I'm sure they already have a strategy for getting around that 30B cap, just like they planned to steal money and talent by calling themselves a non-profit first.
__Maximum__ t1_jdkdtp2 wrote
ClosedAI is feeding off of our data. If we start using/supporting Open Assistant instead, it will beat ChatGPT in a month or two.
__Maximum__ t1_jdd2hjg wrote
Reply to comment by brownmamba94 in [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
Under which license?
__Maximum__ t1_jc5xeqw wrote
Reply to [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
The title is bs, OP.
Also, in terms of writing code it's not even close; it feels more like GPT-2 level.
__Maximum__ OP t1_jbdr6zj wrote
Reply to comment by Taenk in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Not quite. Assuming a fixed compute budget, if you have a model with 1B parameters, then use a dataset of about 20B tokens. Look at the figures in the Chinchilla paper; they demonstrate it nicely.
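A rough back-of-the-envelope sketch of that ~20-tokens-per-parameter rule of thumb (the exact ratio shifts with the compute budget; the helper below is just illustrative, not taken from the paper):

```python
# Rough illustration of the Chinchilla rule of thumb (~20 training tokens per parameter).
# The exact ratio depends on the compute budget; these numbers are only illustrative.

def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal number of training tokens for a given model size."""
    return n_params * tokens_per_param

for n_params in (1e9, 7e9, 70e9):
    tokens = chinchilla_optimal_tokens(n_params)
    print(f"{n_params / 1e9:.0f}B params -> ~{tokens / 1e9:.0f}B tokens")
```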
__Maximum__ OP t1_jbdqy5c wrote
Reply to comment by adt in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Thanks for the links. Looks like RoBERTa did not gain a lot from the additional training, only minor improvements, but yeah, it was a tiny model. How was this not a good lesson? Why did people need Chinchilla? Maybe it's just that gathering a lot of data comes easily, so people collect as much as possible, even though they know they will do at most one epoch over it.
__Maximum__ OP t1_jbbi89l wrote
Reply to comment by _Arsenie_Boca_ in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Until looking at the loss no longer gets you excited?
__Maximum__ OP t1_jbb5bzm wrote
Reply to comment by CKtalon in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Right, I just noticed that the LLaMA paper says they didn't fix their compute budget. Thanks. I wonder if there is a small architecture that is trained until convergence.
Submitted by __Maximum__ t3_11l3as6 in MachineLearning
__Maximum__ t1_j8eduow wrote
Reply to comment by TLfanbasit in [D] What ML dev tools do you wish you'd discovered earlier? by TikkunCreation
You can save the page
__Maximum__ t1_j4nqvzw wrote
I think what you are asking is how long until researchers find a way to expand the model's input size? Because there are ways you can get it to write a book right now. For example, you can tell it to write the first page, then a summary of it. Then feed the summary back in and let it write another page. And so on.
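A minimal sketch of that page-plus-running-summary loop, assuming a placeholder `generate(prompt)` function standing in for whatever LLM API you use (the function names and prompts here are hypothetical):

```python
# Minimal sketch of the page-plus-running-summary loop described above.
# `generate` is a placeholder for whatever LLM call you use (e.g. a chat completion request).

def generate(prompt: str) -> str:
    # Plug in your LLM API call here.
    raise NotImplementedError

def write_book(premise: str, n_pages: int) -> list[str]:
    pages = []
    summary = premise
    for _ in range(n_pages):
        # Write the next page conditioned only on the running summary.
        page = generate(f"Story so far (summary): {summary}\n\nWrite the next page:")
        # Refresh the summary so the next iteration still fits in the context window.
        summary = generate(f"Update the summary of the story so far, given this new page:\n{page}")
        pages.append(page)
    return pages
```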
As for increasing the input size, if no new approach comes around, it will still grow as transformers get more efficient, which they will. Look up FlashAttention, for example.
__Maximum__ t1_j338wq2 wrote
Most of this sub.
__Maximum__ t1_j338ipi wrote
The way we define consciousness requires an entity to have self-awareness; whether that's a byproduct or not is irrelevant. Also, it cannot self-reflect. I'm not sure how much this relates to consciousness. Someone educated, please enlighten us.
__Maximum__ t1_j2n9cnv wrote
Everyone is being sarcastic here. In reality, we all pray while checking the loss every 20 seconds.
__Maximum__ t1_j136j97 wrote
Reply to comment by master3243 in [R] Nonparametric Masked Language Modeling - MetaAi 2022 - NPM - 500x fewer parameters than GPT-3 while outperforming it on zero-shot tasks by Singularian2501
How about BIG-bench?
__Maximum__ t1_j013myn wrote
Reply to comment by AGI_69 in I think this post will be monumentally important for some of you to read. Put it in your brain, think about it, and get ready for the next few years. If you are part of this Subreddit; You are forward thinking, you're already ahead of the curve, you will have one shot to be at an advantage. NOW. by AdditionalPizza
>Even for /r/singularity
You savage!
Seriously though, I posted two polls to understand what backgrounds users of this sub come from, and both got removed. But this post is completely fine.
__Maximum__ t1_izrv6x1 wrote
Reply to [P] I made a command-line tool that explains your errors using ChatGPT (link in comments) by jsonathan
Cool idea, but I can see this version wasting my time, especially if I don't pass the code along with the error. I could see it being very useful with their Davinci coding model, though right now that's expensive. Let's hope Stability AI or someone else publishes an open-source model that is as good as OpenAI's.
__Maximum__ t1_iyziqwo wrote
Reply to comment by Jeffy29 in OpenAI ChatGPT [R] by Sea-Photo5230
If the assignment is a multiplication task meant to teach multiplication, then, of course, using a calculator is cheating. However, if the task includes multiplication but the goal of the assignment has nothing to do with multiplication, then sure, go ahead and use a calculator, because it will get you there faster.
These systems, on the other hand, are very capable. You can actually ask them to do a whole assignment, or important parts of it, and often they will do it. Some models already scored better than average on multiple tasks. I can imagine that in a year or two it will be rather easy to access systems like these, but ones that are much better than average at all tasks. Let's see what GPT-4 brings us; according to rumours, it will be here by February at the latest.
__Maximum__ t1_iyrp93n wrote
Reply to comment by enjinseoul in OpenAI ChatGPT [R] by Sea-Photo5230
The idea of assignments is to force yourself to solve a problem, not to tell someone else (AI or not) to solve it for you. You can call it whatever you want, but it's cheating.
__Maximum__ t1_iyrkzyc wrote
Reply to comment by Nik_uson in [D] This neural network was generated by a neural network by Nik_uson
Okay, but the code that you uploaded, do you think it has value in itself, or do you just want to demonstrate that this is possible?
__Maximum__ t1_iyqwaqi wrote
Okay, I haven't noticed anything interesting in the architecture. Have I missed something? Have you benchmarked it?
This was one of the first things I tried with GPT-3, and for me, it produced simple, generic networks.
__Maximum__ t1_jdlqolz wrote
Reply to comment by plottwist1 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
It's community-driven, so they are actually open.