Over the past two and a half years, EleutherAI has grown from a group of hackers on Discord to a thriving open science research community. Today, we are excited to announce the next step in our evolution: the formation of a non-profit research institute.

This will enable us to do much more, and we look forward to building a world-class research group for the public good! This organization will be led by long-time contributors to EleutherAI: Stella Biderman (me) as Executive Director and Head of Research, Curtis Huebner as Head of Alignment, and Shiv Purohit as Head of Engineering.

The world has changed quite a lot since we first got started. When EleutherAI was founded, the largest open source GPT-3-style language model in the world had 1.5B parameters. GPT-3 itself was not available for researchers to study without special access from OpenAI, and most NLP researchers had a very minimal understanding of the engineering undertaking required to train such models or their capabilities & limitations. We started as a ragtag group nobody had heard of, and within a year had released the largest OSS GPT-3-style model in the world.

As access to LLMs has increased, our research has shifted to focus more on interpretability, alignment, ethics, and evaluation of AIs. We look forward to continuing to grow and adapt to the needs of researchers and the public.

Check out our latest work at www.eleuther.ai or come hang out in our research lab at www.discord.gg/eleutherai

Huge shout-out to the donors who have made our work possible: Stability AI, Hugging Face, CoreWeave, Nat Friedman, Lambda Labs, and Canva.

Comments

keepthepace t1_janzb1v wrote on March 2, 2023 at 8:22 PM

#2,146,916

Congratulations! The world desperately needs what you are doing! Was thinking about joining a while ago but got distracted by image-oriented research.

> As access to LLMs has increased, our research has shifted to focus more on interpretability, alignment, ethics, and evaluation of AIs.

Does this mean EleutherAI is not working anymore on big language models?

currentscurrents t1_jao0a1x wrote on March 2, 2023 at 8:28 PM

#2,146,965

Congrats! Can't wait until you get your first $10-billion investment from a major tech company.

StellaAthena OP t1_jao8e46 wrote on March 2, 2023 at 9:20 PM

#2,147,286

Replying to keepthepace (#2,146,916)

No, it does not. In the past, though, we felt that the best way to achieve our goals was to focus almost exclusively on training large models, and we no longer feel that’s the case.

starlistener t1_jaojzal wrote on March 2, 2023 at 10:36 PM

#2,147,689

Congratulations on the initiative! Is there a way for people willing to help with the research to get involved as entry-level collaborators/volunteers? I am just taking my first steps with ML, and I certainly don't have much to add, but I'd love to get involved in an open-research initiative and help somehow!

StellaAthena OP t1_jaom9of wrote on March 2, 2023 at 10:52 PM

#2,147,753

Replying to starlistener (#2,147,689)

Definitely! Come check out our discord server and introduce yourself.

starlistener t1_jaoqzu8 wrote on March 2, 2023 at 11:26 PM

#2,147,918

Replying to StellaAthena (#2,147,753)

Will do! Thank you kindly!

EricHallahan t1_jap0tic wrote on March 3, 2023 at 12:38 AM

#2,148,311

Replying to keepthepace (#2,146,916)

To clarify: EleutherAI will continue to work with large language models and train its own when there is a clear research case as it always has—there just happens to be a much larger saturation of suitable models today for the research we would like to conduct than what existed even twelve months ago, and there is no reason to reinvent something when something suitable already exists. Expect new models to be designed and trained to specifically meet certain research requirements, rather than more versatile usage.

Fuehnix t1_japumtw wrote on March 3, 2023 at 4:35 AM

#2,149,512

Replying to StellaAthena (#2,147,753)

The discord seems intimidatingly huge with 20k+ members, and 3000 online...

Is it really feasible to collaborate and communicate with the group?

I have a B.S. in CS+Linguistics from UIUC, but I had some life and financial complications that blocked me from grad school. I sorted those things out recently, but now I'm trying to find people to do NLP research with so I can be competitive when I apply for Fall 2024 in December.

I'm somewhere in between a senior CS student and first year grad student right now probably.

Fuehnix t1_japuycf wrote on March 3, 2023 at 4:38 AM

#2,149,529

Replying to currentscurrents (#2,146,965)

Here's hoping they never become ClosedAI 🥂

xEdwin23x t1_jaq2p2q wrote on March 3, 2023 at 5:55 AM

#2,149,812

Replying to Fuehnix (#2,149,512)

They have a list of projects and/or ideas pinned to some of their channels. If you want something to happen then you're expected to be proactive and lead (or follow someone else who is leading); it's the only way this kind of collaboration can work. Tbf it's very hard to collaborate among people in different time zones with their own schedules, but they somehow make it work.

[deleted] t1_jaqcd0a wrote on March 3, 2023 at 7:52 AM

#2,150,149

Replying to StellaAthena (#2,147,286)

[deleted]

WarAndGeese t1_jaqz31a wrote on March 3, 2023 at 12:44 PM

#2,150,808

Replying to currentscurrents (#2,146,965)

Why? That would be the end of it. If your comment was sarcastic then pardon my overreaction.

WarAndGeese t1_jaqz7b1 wrote on March 3, 2023 at 12:45 PM

#2,150,811

Replying to WarAndGeese (#2,150,808)

These things need to be free and open source, not have some profit motive to them. Once that day comes, interest in the project will be lost and people will look for some other 'free' or 'eleuther' project.

badabummbadabing t1_jar3uab wrote on March 3, 2023 at 1:27 PM

#2,151,024

Going forward, under which licences are you going to release your code/weights/data?

EricHallahan t1_jar9qm1 wrote on March 3, 2023 at 2:16 PM

#2,151,359

Replying to Fuehnix (#2,149,512)

Yes, come on in! We do not expect contributors to devote massive amounts of time or go out of their way to contribute—they all have lives of their own, and we respect that.

As for collaboration, we make it work. Most communication is asynchronous text, which is quite versatile and hides varied schedules well.

EricHallahan t1_jarcwg0 wrote on March 3, 2023 at 2:41 PM

#2,151,545

Replying to badabummbadabing (#2,151,024)

I do not expect anything other than Apache 2.0 or MIT (our go-to licenses) for code and checkpoints. Data is a bit harder, but we will always try to release everything under the most permissive license we can.

currentscurrents t1_jat9lvg wrote on March 3, 2023 at 10:16 PM

#2,154,726

Replying to WarAndGeese (#2,150,808)

It's a joke. OpenAI was supposed to be a nonprofit too, now they look more like a Microsoft subsidiary.

WarAndGeese t1_jb0rsum wrote on March 5, 2023 at 3:43 PM

#2,165,560

Replying to currentscurrents (#2,154,726)

My mistake, it is a funny and good joke; I just overreacted. I see too many non-ironic statements like that and it clouded my vision.