FirstOrderCat
FirstOrderCat t1_jdt4oks wrote
Reply to comment by TheStartIs2019 in [D] GPT4 and coding problems by enryu42
on some unrelated benchmark
FirstOrderCat t1_jdpqp5l wrote
Reply to comment by Vegetable-Skill-9700 in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
I do not know; it is hard to say whether you will be able to create a sufficient dataset for your case.
FirstOrderCat t1_jdlrsxv wrote
Reply to comment by BellyDancerUrgot in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
The first and/or most powerful AGI will likely be closed and owned by a corporation.
FirstOrderCat t1_jdldywl wrote
Reply to comment by BellyDancerUrgot in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
I think people are amazed by the speed of progress. OpenAI got $10B in funding, they built a strong team, and they can keep expanding the system with the missing components.
FirstOrderCat t1_jdl9uwf wrote
GPT is cool because it memorized lots of info, so you can retrieve it quickly. An LM's memorization ability likely depends on its size.
FirstOrderCat t1_jcsjgws wrote
Reply to comment by thesupernoodle in Best GPUs for pretraining roBERTa-size LLMs with a $50K budget, 4x RTX A6000 v.s. 4x A6000 ADA v.s. 2x A100 80GB by AngrEvv
They don't have the A6000 Ada yet.
FirstOrderCat t1_jcsjdve wrote
Reply to Best GPUs for pretraining roBERTa-size LLMs with a $50K budget, 4x RTX A6000 v.s. 4x A6000 ADA v.s. 2x A100 80GB by AngrEvv
> Based on my study, A6000 ADA has comparable performance to A100 on DL benchmarks. Is this A100 80GB spec a good choice?
It looks like you answered your question yourself: 4x A6000 Ada will give you the best performance.
FirstOrderCat t1_j9p55p9 wrote
Reply to comment by iamascii in What. The. ***k. [less than 1B parameter model outperforms GPT 3.5 in science multiple choice questions] by Destiny_Knight
It still doesn't answer the second question.
FirstOrderCat t1_j9ma8lr wrote
Reply to comment by sticky_symbols in What are your thoughts on Eliezer Yudkowsky? by DonOfTheDarkNight
> Asimov's rules don't work
You're jumping to another topic. The initial discussion was that Asimov's rules brought much more awareness, and you can't point to similar material results from Yudkowsky.
FirstOrderCat t1_j9m8bhd wrote
Reply to comment by sticky_symbols in What are your thoughts on Eliezer Yudkowsky? by DonOfTheDarkNight
> Not that it wouldn't have happened without him but might've taken many more years to ramp up the same amount.
What happened, exactly? What are the material results of his research?
I think Asimov, with his rules, produced an earlier and much stronger impact.
> I'm now a professional in the field of AGI safety
Lol, you adding "AGI" makes my BS detector beep extremely loudly.
Which AGI exactly are you testing for safety?
FirstOrderCat t1_j9m6fj2 wrote
Reply to comment by sticky_symbols in What are your thoughts on Eliezer Yudkowsky? by DonOfTheDarkNight
>but those didn't convince anyone to take it seriously
Lol, I totally got the idea that a rogue robot could start killing humans long before I learned of Yudkowsky's existence.
> Yudkowsky did.
Could you support your hand-waving with any verifiable evidence?
FirstOrderCat t1_j9l8xn9 wrote
Reply to comment by EndTimer in What. The. ***k. [less than 1B parameter model outperforms GPT 3.5 in science multiple choice questions] by Destiny_Knight
Yes, and then reproduce the results from both papers and check the code to see that nothing creative is happening in the datasets or during training; there are far more claims in academia than one has time to verify.
FirstOrderCat t1_j9l4bho wrote
Reply to comment by EndTimer in What. The. ***k. [less than 1B parameter model outperforms GPT 3.5 in science multiple choice questions] by Destiny_Knight
What is IMG for GPT there, then?
How come GPT performed better without seeing the context than with the text context?..
FirstOrderCat t1_j9krjxb wrote
Reply to comment by EndTimer in What. The. ***k. [less than 1B parameter model outperforms GPT 3.5 in science multiple choice questions] by Destiny_Knight
>which requires substantially fewer parameters while scoring higher, even in text-only domains.
Which tests in the paper cover text-only domains?
FirstOrderCat t1_j9ja0i4 wrote
Reply to comment by Destiny_Knight in What. The. ***k. [less than 1B parameter model outperforms GPT 3.5 in science multiple choice questions] by Destiny_Knight
Multimodal here means the questions contain pictures.
So it's obvious GPT would underperform, since it doesn't work with pictures, lol?..
FirstOrderCat t1_j9j9qlg wrote
Reply to comment by sticky_symbols in What are your thoughts on Eliezer Yudkowsky? by DonOfTheDarkNight
Which field? AI danger awareness? That was in the Terminator movie.
FirstOrderCat t1_j9ifrw4 wrote
Reply to comment by sticky_symbols in What are your thoughts on Eliezer Yudkowsky? by DonOfTheDarkNight
I don't know much about his practical achievements in this area.
FirstOrderCat t1_j9hxjzo wrote
Reply to comment by sticky_symbols in What are your thoughts on Eliezer Yudkowsky? by DonOfTheDarkNight
I argued with him on Hacker News, and he gets very reactive when he reads something he doesn't like.
FirstOrderCat t1_j97o5hw wrote
Reply to comment by bass6c in What’s up with DeepMind? by BobbyWOWO
I think my point still stands:
- Google hasn't shipped an LLM as a product yet, and is now forced to catch up because it lost the innovation race (even if you think they're not interested, lol)
- OpenAI has already shipped multiple generations of LLM products
FirstOrderCat t1_j97jovv wrote
Reply to comment by bass6c in What’s up with DeepMind? by BobbyWOWO
> Palm beat GPT models in every major beachmark.
PaLM is much larger, which makes it harder to run in production serving many users' requests, so it is an example of an enormous waste of resources.
Also, current NLP benchmarks are not reliable, simply because models can be pretrained on them and you can't verify this.
FirstOrderCat t1_j97i6py wrote
Reply to comment by bass6c in What’s up with DeepMind? by BobbyWOWO
> Most of the technologies being used by openai are either from Google or from Deepmind.
That just indicates that Google and DeepMind create theoretical concepts but can't execute them into a finished product.
FirstOrderCat t1_j6ieqjp wrote
Reply to Parsel: A (De-)compositional Framework for Algorithmic Reasoning with Language Models - Stanford University Eric Zelikman et al - Beats prior code generation sota by over 75%! by Singularian2501
> Beats prior code generation sota by over 75%!
but on a different metric: pass@50 vs. pass@8x16.
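For context, here's a rough sketch of the standard unbiased pass@k estimator (in the style used by code-generation papers); the sample counts are made up, just to illustrate that pass@50 and pass@8x16 measure different things and aren't directly comparable.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn from n generated samples of which c are correct, passes."""
    if n - c < k:
        return 1.0  # every possible draw of k samples contains a correct one
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical numbers: 3 correct out of 100 generated samples.
print(pass_at_k(100, 3, 8))   # chance of a hit within 8 attempts
print(pass_at_k(100, 3, 50))  # chance of a hit within 50 attempts -- much higher
```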
FirstOrderCat t1_j5aojpd wrote
Reply to comment by Low-Mood3229 in [R] Is there a way to combine a knowledge graph and other types of data for ML purposes? by Low-Mood3229
I am serious: a knowledge graph is just a set of triples, which can be stored in a relational DB, and the same goes for your other data; then you can join them, as in the sketch below.
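A minimal sketch of what I mean, using sqlite3 with made-up table and column names: store the graph as (subject, predicate, object) triples and join them against whatever tabular features you have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Knowledge graph as plain (subject, predicate, object) triples.
cur.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
# Any other data keyed by the same entity ids.
cur.execute("CREATE TABLE features (entity TEXT, feature REAL)")

cur.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
])
cur.executemany("INSERT INTO features VALUES (?, ?)", [
    ("alice", 0.7),
    ("bob", 0.2),
])

# Join the graph with the tabular data to build one row per entity for ML.
rows = cur.execute("""
    SELECT t.subject, t.predicate, t.object, f.feature
    FROM triples t
    JOIN features f ON f.entity = t.subject
""").fetchall()
print(rows)
```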
FirstOrderCat t1_j58hvki wrote
Reply to [R] Is there a way to combine a knowledge graph and other types of data for ML purposes? by Low-Mood3229
SQL join?..
FirstOrderCat t1_jdt55dh wrote
Reply to comment by imaginethezmell in [D] GPT4 and coding problems by enryu42
You can have it; the question is what the accumulated errors in the final result will be.
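A back-of-the-envelope illustration of what I mean by accumulated errors (the per-step accuracy is an arbitrary illustrative number, and steps are assumed independent): even a small per-step error rate compounds quickly over a chain of generated steps.

```python
# Toy model: each step in a generated solution is independently correct
# with probability p; the whole chain is correct only if every step is.
p = 0.95  # made-up per-step accuracy, for illustration only
for n_steps in (1, 5, 10, 20):
    print(n_steps, round(p ** n_steps, 3))
# 1 -> 0.95, 5 -> ~0.774, 10 -> ~0.599, 20 -> ~0.358
```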