Comments

maskedpaki t1_j7nkucz wrote

They weren't lying about 2023 lol

95

Savings-Juice-9517 t1_j7okamx wrote

Honestly, I’ve not seen progress like this in my lifetime

42

Cryptizard t1_j7p13yv wrote

You could have said that any year before now, and will be able to say it every year in the future. That is kind of the point of exponential growth.

25

p3opl3 t1_j7pa9cm wrote

Really?

2019-2021 .. was pretty lax if you ask me..

6

Cryptizard t1_j7pc4vi wrote

DALLE was released in 2021 and GPT-3 in 2020. If posts around here are anything to go by they were kind of a big deal.

15

p3opl3 t1_j7r3sjq wrote

That's a fair point .. as I mentioned in my other reply.. releases are an OK indicator of progress, though. Technically GPT-3 was already well past the development stage before 2020...

aaaand DALLE, I don't know how much of an advancement that is.. like, is it not surprising that a tiny startup releasing their first version of Stable Diffusion.. dominated the AI communities.. just because it's open source? There were definitely important releases..

But this year has literally only seen 6 weeks, right? ... Some pretty big moves being made already.. it's exciting.

0

BadassGhost t1_j7pov2k wrote

2019 was GPT-2 which rocked the boat. 2020 was GPT-3 which sank the boat. Those were partially responsible for kicking off this whole scaling up of transformers

There was also LaMDA in 2021, and I'm sure many other big events in that period that I'm forgetting

5

p3opl3 t1_j7r3693 wrote

That's actually a fair point.. although those models had been invented way before 2019.. release date isn't development or discovery date, right? It's like GPT-4.. that's already existed for well over 2 years now, right.. it's just not "ready" yet.

Stable Diffusion 3 is literally microsecond level response time now.. it's insane.

Honestly.. I think the big breakthroughs.. aren't going to be in AI.. they're going to be in UI/UX and how people are going to bootstrap these models for building something actually useful.

0

easy_c_5 t1_j7pb4sx wrote

Apparently you haven't seen any of the uncountable JavaScript libraries released during that time, or the hundreds of startups tackling similar subjects, or the tens of thousands of research papers on distributed systems, animation, etc. (just because there are non-groundbreaking research papers in the list above too).

The list above is nothing groundbreaking, just copies upon copies of the same stuff we've had for quite a while, plus productizing it.

The real summary of the past months and days:

The good: AI is going public.

The bad: we still don't have any real clue on how to get to AGI.

The worse: AI is getting regulated and people are fighting back.

3

neo101b t1_j7pnw40 wrote

The technological progression on a graph vs. time might as well be a vertical line straight up.

4

Neurogence t1_j7p686u wrote

I'm excited about all the news, but do we have any usable AI products that have already been released? Not just announced?

2

Sashinii t1_j7no4pu wrote

AI is already changing the world and we're not even at proto-AGI yet.

85

BadassGhost t1_j7pprrd wrote

I feel like an unrestricted LLM-powered chatbot is pretty close to proto-AGI. OpenAI is basically lobotomizing ChatGPT to avoid headlines about it claiming sentience or emotions or making controversial statements, so it's not much to go off of.

We haven't been able to play with PaLM or any next-gen versions of it (Flan-PaLM and U-PaLM), but the benchmark comparisons between that and others seem enormous. If you build PaLM with an embedded dataset and cross-attention like Retro, I think that would probably be proto-AGI.

And then the next step from there to actual AGI would be making a multi-modal version of that, like Gato. The only missing ingredient there is getting the model to use one modality to inform about other modalities, which they did not achieve with Gato but are supposedly actively working on

22

neo101b t1_j7pnq7l wrote

What we had 1 year ago compared to what we have now, is simply amazing.

It feels like AI technology is growing faster than any other technology that's been developed; it's going faster than warp 10.

7

Pro_RazE OP t1_j7nj2vi wrote

It's crazy all this happened in a week. I have missed/not added hundreds of papers and probably missed a lot of other updates, so this isn't everything.

55

[deleted] t1_j7oqhpz wrote

There's so much news, it's getting hard to keep up. Thank you so much for doing this!

11

grossexistence t1_j7ofvq1 wrote

At this rate, Proto-AGI will be here by the end of the year or the first half of 2024.

24

squareOfTwo t1_j7ooiz6 wrote

Just no, the rate is still too damn slow for that. Most of the "progress" is just training with yet-unused data (human-written text for GPT, text-image pairs for the Stable Diffusions of this world, etc.). This will end soon if no high-quality data is left to train on. The end of "scale" is near.

11

zendonium t1_j7ovw92 wrote

But surely that's all it takes? The human brain is just a multimodal network that processes language, visual, audio, and a bunch of other stuff.

Pay 10,000 Kenyans $2 a day to get more training data on more senses and train more networks. We'll have narrow AGIs in almost all areas. Just needs putting together with some clever insight from some genius.

6

Cryptizard t1_j7p26uc wrote

If that was true then we could just train a model on all the AI research we have and get a “narrow AGI” that makes AI models. Singularity next week. Unfortunately, that is not how it is.

4

visarga t1_j7q4313 wrote

If they make GPT-N much larger, it will take longer and cost more to train. Then we can only afford a few trials. Whether they are selected by humans or AI makes little difference. It's going to be a crapshoot anyway; nobody knows which experiment is gonna win. The slow experimentation loop is one reason not even AGI can speed things up every time.

2

Maksitaxi t1_j7ojz2i wrote

What do you think proto-AGI would look like?

5

turnip_burrito t1_j7okin2 wrote

Probably a box.

Maybe painted black.

And able to understand enough concepts to write improved versions of some of its own code if we asked it to.

Maybe can write some new math proofs in a short and human readable way.

Maybe multimodal.

Large short term memory context window.

Able to update its model in real time for incoming new information.

Maybe running on more specialized hardware, or neuromorphic chips.

19

ecnecn t1_j7pfwdr wrote

>Probably a box.
>
>Maybe painted black.

hey, you signed a Non-Disclosure Agreement on this!

3

kaleNhearty t1_j7purfu wrote

Generative transformer models are not AGI, not even close. We're going to have to come up with some new methodology to handle multi-modality, or maybe some kind of synthesis between several different models, before we see some kind of proto-AGI, and that's decades away IMO.

2

Hands0L0 t1_j7pdcrw wrote

Not until we understand our own brains

0

controltheweb t1_j7paze2 wrote

Image to Text:

AI Progress of February 2023, Week 1 (1 Feb - 7 Feb) by pro_raze

  1. Over 1 million researchers have used DeepMind's AlphaFold Protein Structure Database
  2. Google AI releases the Flan-T5 language model collection
  3. Meta AI trained blind AI agents that can navigate similarly to blind humans
  4. ChatGPT Plus announced for $20 per month with waitlist (US only for now) - ChatGPT users topped 100 million in January
  5. Microsoft announces Teams Premium powered by GPT-3.5
  6. Perplexity Ask (AI search engine) available as a Chrome extension
  7. Microsoft boosts Viva Sales with a new GPT seller experience (integration)
  8. AudioLDM text-to-audio generation available on Hugging Face to use
  9. Meta releases a 30B-param "OPT-IML" model fine-tuned on 2000 tasks
  10. Google AI open-sourced Vizier: a scaled black-box optimization system
  11. Dreamix: Video Diffusion Models are General Video Editors
  12. SceneDreamer: Generating 3D Scenes From 2D Image Collections
  13. SceneScape: Text-Driven Consistent Scene Generation
  14. RobustNeRF: basically improves the quality of NeRFs
  15. OpenAI's new paper: a proof of concept for using AI-assisted human feedback to scale the supervision of ML systems
  16. DeepMind paper: Accelerating Large Language Model Decoding with Speculative Sampling (2-2.5x speedup)
  17. Amazon AI: Multimodal-CoT outperforms GPT-3.5 by 16% (75.17% -> 91.68%) on ScienceQA and even surpasses human performance
  18. Sundar Pichai announced: LaMDA language model within "coming weeks and months"
  19. AutumnSynth synthesizes the source code of a 2D video game from seconds of play
  20. Nvidia paper: enabling simulated characters to perform scene-interaction tasks in a natural/lifelike manner
  21. Poe, a ChatGPT-like bot, launched from the creators of Quora. They are also making an API for it. Currently iOS only.
  22. Google invests $300 million in Anthropic (done in 2022, reported now)
  23. BLIP-2 demo available on Hugging Face: an LLM that can understand images
  24. Humata.ai launched: basically ChatGPT for your own files
  25. Bing + GPT integration images leaked
  26. Google's new real-time tracking of wildfire boundaries using satellite imagery
  27. LAION AI introduces Open Assistant: a chatbot project that understands tasks, interacts with third-party systems, and retrieves information dynamically (open source)
  28. Apple CEO Tim Cook says AI will eventually 'affect every product and service we have'
  29. Epic-Sounds, a large-scale dataset of actions that sound, released
  30. Stable Attribution announced - a tool that lets anyone find the human creators behind AI-generated images
  31. TEXTure presented - a novel method for text-guided generation, editing, and transfer of textures for 3D shapes
  32. Tune-A-Video available to use and also open-sourced (turns AI-generated images into gifs or videos)
  33. Filechat.io now available - ChatGPT for your own data with no limits (with premium tier)
  34. BioGPT-Large by Microsoft now available on Hugging Face to try
  35. Google announces Bard, powered by LaMDA, coming soon as an AI conversational service. It will be integrated with Search.
  36. Microsoft announces a surprise event for tomorrow, with Bing ChatGPT expected (Feb 7)
  37. "Language Models Secretly Perform Gradient Descent as Meta-Optimizers" paper - on in-context learning, the ability for LLMs to learn new abilities from examples in a prompt alone
  38. Apple to hold in-person 'AI summit' event for employees at Steve Jobs Theater
  39. Seek AI introduces DeepCuts, the AI SQL app that lets you explore your Spotify data with natural language
  40. KickResume's AI Resume Builder can rewrite, format, and grade a resume
  41. Introducing Polymath: the open-source tool that converts any music library into a sample library with machine learning
  42. Microsoft & OpenAI announce: Bing and Edge + AI, a new way to search starts today
15

Glad_Laugh_5656 t1_j7oo5q1 wrote

To play devil's advocate: you could compile a list of AI achievements this long (even if not as impressive as this one) every week, and knowing that dampens the impressiveness of this list just a bit.

Not to mention the list is seriously inflated by headlines that aren't actually advancements.

Not to say that we didn't see progress last week (of course we did), but I kinda get the feeling you're making it seem bigger than it actually was.

7

Pro_RazE OP t1_j7owb28 wrote

I'm not making it bigger than what it was. It's actually bigger than that.

7

challengethegods t1_j7oyt2e wrote

[insert reference to the 1000+ ML papers published every week]
"but wait, nobody told me which ones to read? slow day"

4

idranh t1_j7oe0q9 wrote

Incredible

5

Iunaml t1_j7oum8l wrote

2023? Trillions parameters neural networks?

And we get a god damn JPEG of a bulletpoint list upvoted here?

A bulletpoint list that's literally the list of the titles of the most upvoted threads of last week??

What kind of dystopia are we already in?

5

challengethegods t1_j7ovaa4 wrote

You got something against jpegs?
- Over 1 million researchers have used DeepMind's AlphaFold Protein Structure Database
- Google AI releases the Flan-T5 language model collection
- Meta AI trained blind AI agents that can navigate similarly to blind humans
- ChatGPT Plus announced for $20 per month with waitlist (US only for now)
- ChatGPT users topped 100 million in January
- Microsoft announces Teams Premium powered by GPT-3.5
- Perplexity Ask (AI search engine) available as a Chrome extension
- Microsoft boosts Viva Sales with a new GPT seller experience (integration)
- AudioLDM text-to-audio generation available on Hugging Face to use
- Meta releases a 30B-param "OPT-IML" model fine-tuned on 2000 tasks
- Google AI open-sourced Vizier: a scaled black-box optimization system
- Dreamix: Video Diffusion Models are General Video Editors
- SceneDreamer: Generating 3D Scenes From 2D Image Collections
- SceneScape: Text-Driven Consistent Scene Generation
- RobustNeRF: basically improves the quality of NeRFs
- OpenAI's new paper: a proof of concept for using AI-assisted human feedback to scale the supervision of ML systems
- DeepMind paper: Accelerating Large Language Model Decoding with Speculative Sampling (2-2.5x speedup)
- Amazon AI: Multimodal-CoT outperforms GPT-3.5 by 16% (75.17% -> 91.68%) on ScienceQA and even surpasses human performance
- Sundar Pichai announced: LaMDA language model within "coming weeks and months"
- AutumnSynth synthesizes the source code of a 2D video game from seconds of play
- Nvidia paper: enabling simulated characters to perform scene-interaction tasks in a natural/lifelike manner
- Poe, a ChatGPT-like bot, launched from the creators of Quora. They are also making an API for it. Currently iOS only.
- Google invests $300 million in Anthropic (done in 2022, reported now)
- BLIP-2 demo available on Hugging Face: an LLM that can understand images
- Humata.ai launched: basically ChatGPT for your own files
- Bing + GPT integration images leaked
- Google's new real-time tracking of wildfire boundaries using satellite imagery
- LAION AI introduces Open Assistant: a chatbot project that understands tasks, interacts with third-party systems, and retrieves information dynamically (open source)
- Apple CEO Tim Cook says AI will eventually 'affect every product and service we have'
- Epic-Sounds, a large-scale dataset of actions that sound, released
- Stable Attribution announced - a tool that lets anyone find the human creators behind AI-generated images
- TEXTure presented - a novel method for text-guided generation, editing, and transfer of textures for 3D shapes
- Tune-A-Video available to use and also open-sourced (turns AI-generated images into gifs or videos)
- Filechat.io now available - ChatGPT for your own data with no limits (with premium tier)
- BioGPT-Large by Microsoft now available on Hugging Face to try
- Google announces Bard, powered by LaMDA, coming soon as an AI conversational service. It will be integrated with Search.
- Microsoft announces a surprise event for tomorrow, with Bing ChatGPT expected (Feb 7)
- "Language Models Secretly Perform Gradient Descent as Meta-Optimizers" paper - on in-context learning, the ability for LLMs to learn new abilities from examples in a prompt alone
- Apple to hold in-person 'AI summit' event for employees at Steve Jobs Theater
- Seek AI introduces DeepCuts, the AI SQL app that lets you explore your Spotify data with natural language
- KickResume's AI Resume Builder can rewrite, format, and grade a resume
- Introducing Polymath: the open-source tool that converts any music library into a sample library with machine learning
- Microsoft & OpenAI: Bing and Edge + AI, a new way to search starts today
- some guy used his self-programming discord bot to grab this list from a jpeg
ftfy

12

Iunaml t1_j7owoa9 wrote

You think it's normal on the internet to share text by using JPEG? A lossy format for text?

Good on you for spending CPU to convert an image to text, real efficiency here.

−6

diviludicrum t1_j7p6w06 wrote

Oh boo hoo, everything sucks, bla bla. Tell it to your therapist after your bad vibes book club.

2

Iunaml t1_j7peqnt wrote

Yeah you really radiate good vibes instead.

2

challengethegods t1_j7oxmgv wrote

Personally I think the jpeg is more useful than text, but I can just as easily convert the text to jpeg so realistically idgaf - it does make it easier to save/share as jpeg, but slightly harder to copy/paste specific lines from it for a search, as example. pro/con I guess, but also as a jpeg the entire list shows from any view, meaning the "look at this big list" aspect is clarified regardless if someone cares to read past the first few lines. However, a jpeg is not as easily indexed by crawlerbots, which might have some unintended effects down the line. On the other hand, a jpeg can have any background color and select its own font which allows its creator to have greater control over the way that it's viewed, but this could be seen as a downside for someone that does not agree with their artistic vision. That being said, a jpeg also has the benefit of...
[I can do this forever lol]

1

Iunaml t1_j7pexov wrote

Text is more easily searchable, editable, and accessible than a jpeg. Jpeg images are not easily indexed by search engines, which can hurt their discoverability. Text can be easily copied, pasted, and edited, while the text in a jpeg cannot be edited and is limited in terms of accessibility. Additionally, a jpeg may not display correctly on all devices, whereas text can be viewed on any device with compatible software. These advantages make text the preferred format for sharing information over jpeg images.

2

proteo73 t1_j7okv5z wrote

Ok i need to contact my Planet ...

4

ecnecn t1_j7pg118 wrote

Too late, we have an AGI Defense Force ready before your invasion fleet can arrive!

1

da_k1ngslaya t1_j7oznl1 wrote

I need AI to make this pretty for me

4

trovaleve t1_j7on2ha wrote

Stable attribution?

3

trovaleve t1_j7ouqa9 wrote

I might be misunderstanding something, but isn't that BS? I appreciate your efforts either way.

5

Pro_RazE OP t1_j7ov10n wrote

Yes, I actually read more about it later. It uses CLIP to find similar images in the LAION dataset, that's it. I forgot to remove it from the list.

4

trovaleve t1_j7owfxe wrote

I've never actually put any thought into how image similarity tests work. I wonder if that's how Google's reverse image search works.

2

Pro_RazE OP t1_j7ox1lr wrote

Let's say you generate an image of a cat. CLIP can convert what is in the image into words and then use them to find similar images that were in the LAION dataset, which Stable Diffusion uses. So if it was, let's say, an orange cat, it can find similar images of that cat that were used in the training. Without those original pictures, Stable Diffusion couldn't generate pictures of an orange cat (poor example, I know lol). It is not always accurate. And also, generated images are always different from the original ones. But one recent paper kinda proved that wrong (it very rarely happens).

I hope this helps.

3
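The lookup described above boils down to nearest-neighbor search in an embedding space: embed the generated image, embed the dataset images, and rank by cosine similarity. A minimal sketch of that ranking step, using toy 4-dimensional vectors in place of real CLIP embeddings (the cat/landscape labels are purely hypothetical placeholders):

```python
import numpy as np

def nearest(query_emb, corpus_embs):
    # Normalize to unit length so a dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_embs / np.linalg.norm(corpus_embs, axis=1, keepdims=True)
    sims = c @ q
    order = np.argsort(-sims)  # indices sorted by descending similarity
    return order, sims[order]

# Toy 4-dim "embeddings" standing in for real CLIP image vectors.
corpus = np.array([
    [1.0, 0.0, 0.0, 0.0],  # e.g. an orange cat photo from the dataset
    [0.0, 1.0, 0.0, 0.0],  # e.g. an unrelated landscape
    [0.9, 0.1, 0.0, 0.0],  # another cat-like image
])
query = np.array([1.0, 0.05, 0.0, 0.0])  # the generated image

order, sims = nearest(query, corpus)  # order[0] is the closest match
```

In a real pipeline the vectors would come from a CLIP image encoder run over the LAION images, and the ranking would use an approximate-nearest-neighbor index rather than a brute-force dot product, but the similarity logic is the same.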

enkae7317 t1_j7opwl8 wrote

Great list, but curious if you have either the article links or some sort of sources for all of these.

3

Pro_RazE OP t1_j7ovjf8 wrote

I can provide you the sources. I just don't have links for everything mentioned, but screenshots of articles and tweets. Just mention one and I'll link it here.

2

crua9 t1_j7pkknd wrote

Just a heads up, this isn't easy to read and it might be best to copy and paste it into the post.

3

Pro_RazE OP t1_j7pkyjz wrote

Yes, from next time I'll post in that format with links included.

3

crua9 t1_j7pmlij wrote

Thanks. Also thanks for making this list. I didn't know about half of this.

3

Sotamiro t1_j7s5clz wrote

One day there will be a list this big for a 1 hour period instead of 1 week

3

Baturinsky t1_j7p2ih7 wrote

Could you please do this as a text with references?

2

Pro_RazE OP t1_j7p6wp0 wrote

Not possible for now, as it would take a lot of time, but in upcoming updates I'll make sure it's in that format too.

2

dutch665 t1_j7p5aqi wrote

Coming this summer

2

dontpet t1_j7qtlhd wrote

I'm a casual reader on this sub with a systems engineering background and don't understand most of those headlines. I do understand a few and the implications of those are huge.

Let's hope humanity somehow manages to adapt to this.

Our current laws and governance structures are so slow they won't be able to do much. It will be like shooting at a bunny based on where it was a week ago.

2

SantoshiEspada t1_j7s3ael wrote

I'd add Runway's Gen-1

2

Pro_RazE OP t1_j7soxij wrote

Missed that one. There's just so much coming 😅

2

SantoshiEspada t1_j7u7vxb wrote

Couple of years ago I used to track progress on arxiv but then it became impossible.
Back then, this week's progress would have felt like a year or more.
You made a great list here! Thank you

2

r0cket-b0i t1_j7ox3c3 wrote

Misleading: this is similar to tracking product progress by the number of features shipped. Any crypto startup still alive can give you a 20-item bullet list of things done in the past month, but how much of that actually changed anything....

Unfortunately, on this list only the AlphaFold item is significant.

−1

InitialCreature t1_j7p0tmx wrote

I dunno, being able to synthesize game code from a few seconds of play is huge; someone who likes how your guns handle in your shooter can take that and tweak it for their own game. Are we trademarking game mechanics yet?

3

expelten t1_j7pdppo wrote

Yes, that's what caught my attention on this list. Consider also that this will be applied to any software someday, which means everything will be like open source... this would be extremely disruptive.

2

InitialCreature t1_j7sfx3x wrote

This is just the first version of the concept as far as I can tell, and it's already showing me how quickly things are going to change.

1