__Maximum__ t1_jdkepie wrote
Reply to comment by mxby7e in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
Also, it's very shady for a company called OpenAI. They claimed they became for-profit because they needed the money to grow, but these restrictions just show that they are filthy liars who only care about keeping power and making profit. I'm sure they already have a strategy for getting around that 30B cap, just like they planned to steal money and talent by calling themselves a non-profit first.
__Maximum__ t1_jdkdtp2 wrote
ClosedAI is feeding off of our data. If we start using/supporting Open Assistant instead, it will beat ChatGPT in a month or two.
__Maximum__ t1_jdd2hjg wrote
Reply to comment by brownmamba94 in [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
Under which license?
__Maximum__ t1_jc5xeqw wrote
Reply to [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
The title is bs, OP.
Also, in terms of writing code it's not even close; it feels more like GPT-2 level.
__Maximum__ OP t1_jbdr6zj wrote
Reply to comment by Taenk in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Not quite. Assuming a fixed compute budget, if you have a model with 1B parameters, then use a dataset of about 20B tokens. Look at the figures in the Chinchilla paper; they demonstrate it nicely.
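A rough back-of-the-envelope sketch of that ~20-tokens-per-parameter rule of thumb (the exact ratio shifts with the compute budget; the helper below is just illustrative, not taken from the paper):

```python
# Rough illustration of the Chinchilla rule of thumb (~20 training tokens per parameter).
# The exact ratio depends on the compute budget; these numbers are only illustrative.

def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal number of training tokens for a given model size."""
    return n_params * tokens_per_param

for n_params in (1e9, 7e9, 70e9):
    tokens = chinchilla_optimal_tokens(n_params)
    print(f"{n_params / 1e9:.0f}B params -> ~{tokens / 1e9:.0f}B tokens")
```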
__Maximum__ OP t1_jbdqy5c wrote
Reply to comment by adt in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Thanks for the links. Looks like RoBERTa did not gain a lot from the additional training, only minor improvements, but yeah, it was a tiny model. How was this not a good lesson? Why did people need Chinchilla? Maybe it's just that gathering a lot of data comes easily, so people collect as much as possible, even though they know they will do at most one epoch over it.
__Maximum__ OP t1_jbbi89l wrote
Reply to comment by _Arsenie_Boca_ in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Until looking at the loss no longer gets you excited?
__Maximum__ OP t1_jbb5bzm wrote
Reply to comment by CKtalon in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Right, I just noticed that the LLaMA paper says they didn't fix their compute budget. Thanks. I wonder if there is a small architecture that is trained until convergence.
Submitted by __Maximum__ t3_11l3as6 in MachineLearning
__Maximum__ t1_j8eduow wrote
Reply to comment by TLfanbasit in [D] What ML dev tools do you wish you'd discovered earlier? by TikkunCreation
You can save the page
__Maximum__ t1_j4nqvzw wrote
I think what you are asking is how long until researchers find a way to expand the model's input size? Because there are ways you can get it to write a book right now. For example, you can tell it to write the first page, then a summary of it. Then feed the summary back in and let it write another page. And so on.
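A minimal sketch of that page-plus-running-summary loop, assuming a placeholder `generate(prompt)` function standing in for whatever LLM API you use (the function names and prompts here are hypothetical):

```python
# Minimal sketch of the page-plus-running-summary loop described above.
# `generate` is a placeholder for whatever LLM call you use (e.g. a chat completion request).

def generate(prompt: str) -> str:
    # Plug in your LLM API call here.
    raise NotImplementedError

def write_book(premise: str, n_pages: int) -> list[str]:
    pages = []
    summary = premise
    for _ in range(n_pages):
        # Write the next page conditioned only on the running summary.
        page = generate(f"Story so far (summary): {summary}\n\nWrite the next page:")
        # Refresh the summary so the next iteration still fits in the context window.
        summary = generate(f"Update the summary of the story so far, given this new page:\n{page}")
        pages.append(page)
    return pages
```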
As for increasing the input size, if no new approach comes around, it will still grow as transformers get more efficient, which they will. Look up FlashAttention, for example.
__Maximum__ t1_j338wq2 wrote
Most of this sub.
__Maximum__ t1_j338ipi wrote
The way we define consciousness requires an entity to have self-awareness; whether that's a byproduct or not is irrelevant. Also, it cannot self-reflect. I'm not sure how much this relates to consciousness. Someone educated, please enlighten us.
__Maximum__ t1_j2n9cnv wrote
Everyone is being sarcastic here. In reality, we all pray while checking the loss every 20 seconds.
__Maximum__ t1_j136j97 wrote
Reply to comment by master3243 in [R] Nonparametric Masked Language Modeling - MetaAi 2022 - NPM - 500x fewer parameters than GPT-3 while outperforming it on zero-shot tasks by Singularian2501
How about BIG-bench?
__Maximum__ t1_j013myn wrote
Reply to comment by AGI_69 in I think this post will be monumentally important for some of you to read. Put it in your brain, think about it, and get ready for the next few years. If you are part of this Subreddit; You are forward thinking, you're already ahead of the curve, you will have one shot to be at an advantage. NOW. by AdditionalPizza
>Even for /r/singularity
You savage!
Seriously though, I posted two polls to understand what backgrounds users of this sub come from, and both got removed. But this post is completely fine.
__Maximum__ t1_izrv6x1 wrote
Reply to [P] I made a command-line tool that explains your errors using ChatGPT (link in comments) by jsonathan
Cool idea, but I can see this version wasting my time, especially if I don't pass the code along with the error. I could see it being very useful with their Davinci coding model, though right now that's expensive. Let's hope Stability AI or someone else publishes an open-source model that is as good as OpenAI's.
__Maximum__ t1_iyziqwo wrote
Reply to comment by Jeffy29 in OpenAI ChatGPT [R] by Sea-Photo5230
If the assignment is a multiplication task meant to teach multiplication, then, of course, using a calculator is cheating. However, if the task includes multiplication but the goal of the assignment has nothing to do with multiplication, then sure, go ahead and use a calculator, because it will get you there faster.
These systems, on the other hand, are very capable. You can actually ask them to do a whole assignment, or important parts of it, and often they will do it. Some models already scored better than average on multiple tasks. I can imagine that in a year or two it will be rather easy to access systems like these, but ones that are much better than average at all tasks. Let's see what GPT-4 brings us; according to rumours, it will be here by February at the latest.
__Maximum__ t1_iyrp93n wrote
Reply to comment by enjinseoul in OpenAI ChatGPT [R] by Sea-Photo5230
The idea of assignments is to force yourself to solve a problem, not to tell someone else (AI or not) to solve it for you. You can call it whatever you want, but it's cheating.
__Maximum__ t1_iyrkzyc wrote
Reply to comment by Nik_uson in [D] This neural network was generated by a neural network by Nik_uson
Okay, but the code that you uploaded, do you think it has value in itself, or do you just want to demonstrate that this is possible?
__Maximum__ t1_iyqwaqi wrote
Okay, I haven't noticed anything interesting in the architecture. Have I missed something? Have you benchmarked it?
This was one of the first things I tried with GPT-3, and for me, it produced simple, generic networks.
__Maximum__ t1_jdlqolz wrote
Reply to comment by plottwist1 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
It's community-driven, so they are actually open.