visarga t1_jegkwr6 wrote
Reply to comment by drekmonger in ChatGB: Tony Blair backs push for taxpayer-funded ‘sovereign AI’ to rival ChatGPT by signed7
If you stop regular people from using AI, then only criminals and governments will use it. How is that better? And you can't stop it anyway, because a good-enough AI will run on cheap edge hardware.
To be practical about disinformation, it would be better to work on human+AI solutions: for example, a network of journalists flagging stories, with AI extending those flags to the rest of the media.
You should see disinformation as a biology problem: the constant war between organisms and viruses, and the ever-evolving immune system. Constant war is the normal state; we should have the AI tools to withstand the disinformation attack. Virus and anti-virus.
visarga OP t1_jegd2cs wrote
Reply to comment by spriggankin in HuggingGPT - Solving AI Tasks with ChatGPT and its Friends in HuggingFace by visarga
HuggingFace is the GitHub of AI. It hosts 166,392 AI models and 26,787 datasets, provides implementations for the models in its own framework, and is usually the starting codebase for research papers. You can also interact with many models right on the website, in the "Spaces" section.
You can also see it as an App Store for AI: you shop for models and then include them in your project with five lines of code.
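For a concrete sketch of those five lines, using the Transformers library (the checkpoint named here is just one of the many hosted models):

```python
from transformers import pipeline

# downloads the model from the HuggingFace Hub on first use, then runs locally
classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")
print(classifier("HuggingFace makes shipping models easy."))
# [{'label': 'POSITIVE', 'score': 0.99...}]
```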
visarga OP t1_jega0z1 wrote
Reply to comment by spriggankin in HuggingGPT - Solving AI Tasks with ChatGPT and its Friends in HuggingFace by visarga
No, it's a serious paper. They orchestrate hundreds of models from HuggingFace through chatGPT. It's like AI plugins for AI chat.
Submitted by visarga t3_127xafh in singularity
visarga t1_jefgdew wrote
Reply to comment by dr_doug_exeter in Google CEO Sundar Pichai promises Bard AI chatbot upgrades soon: ‘We clearly have more capable models’ - The Verge by Wavesignal
People affectionately call it "Brad"
visarga t1_jeehxgo wrote
Reply to comment by NonDescriptfAIth in The only race that matters by Sure_Cicada_4459
You don't understand: even a model carefully tuned by OpenAI to be safe, once it gets into the hands of the public, will be fine-tuned to do anything they want. It doesn't matter what politicians do to regulate the big players.
The only solution to AGI danger is to release it everywhere at once, balancing AGI against AGI. For example, the solution to AI-generated spam and disinformation is AI-based detection; humans can't keep up with the bots.
visarga t1_jedl81q wrote
Reply to comment by mihaicl1981 in Goddamn it's really happening by BreadManToast
I think the social component of AI is picking up steam. What I mean is the culture around AI: how to train, fine-tune, test and integrate AIs into applications, how to mix and match AI modules. This used to be the domain of experts; now everyone is assimilating this culture, and we're seeing an explosion of creativity.
The rapid pace of AI advancement is overlapping with the rapid pace of social adoption, which makes the field seem to advance even faster.
12h later edit: this paper just came out: "HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace". AI orchestrating AI, by itself. What can you say?
visarga t1_jedjvrn wrote
Reply to comment by sideways in Goddamn it's really happening by BreadManToast
> The next phase shift happens when artificial systems start doing science and research more or less autonomously. That's the goal. And when that happens, what we're currently experiencing will seem like a lazy Sunday morning.
At CERN in Geneva they have 17,500 PhDs working on physics research, each of them at GPT-5 level or higher, and yet it takes years and huge investment to get one discovery out. Science requires testing in the real world, and that is slow and expensive. Even an AGI has to use the same scientific method as people; it can't theorize without experimental validation. Putting the world inside your experimental loop caps the speed of progress.
I keep reminding people about this because we see lots of magical thinking along the lines of "AGI to ASI in one day" that ignores the experimental validation steps necessary for that transition. Not even OpenAI researchers can guess what will happen before they start training; scaling laws are our best attempt, but they are very vague. They can't tell us which content is most useful or how to improve a specific task. Experimental validation is needed at all levels of science.
Another good example of what I said: the COVID vaccine was ready in one week but took six months to validate. With doctors everywhere focused on this single question, it still took half a year, while people were dying left and right. We can't predict complex systems in general; we really need experimental validation in the loop.
visarga t1_je6kqvw wrote
Reply to comment by EverythingGoodWas in [D] The best way to train an LLM on company data by jaxolingo
I'd rather fine-tune the LLM on company documentation than feed it in through retrieval. Does anyone have experience fine-tuning GPT-3 on a new text? Can it then answer questions about, or freely use information from, that text?
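For anyone who wants to try, the OpenAI fine-tuning flow looks roughly like this; `company_docs.jsonl` is a hypothetical file of prompt/completion pairs derived from the documentation:

```python
import openai

openai.api_key = "sk-..."  # your API key

# upload a JSONL file of {"prompt": ..., "completion": ...} pairs
file = openai.File.create(file=open("company_docs.jsonl", "rb"),
                          purpose="fine-tune")

# kick off a fine-tune on a GPT-3 base model
job = openai.FineTune.create(training_file=file.id, model="davinci")
print(job.id)
```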
visarga t1_je6k74d wrote
Reply to comment by Im_Unlucky in [D] The best way to train an LLM on company data by jaxolingo
Often the model can't properly synthesise information from a bunch of snippets: it lacks the surrounding context of those snippets, so it combines the information incorrectly or hallucinates an explanation.
Retrieval + loading data in the context is far from solved.
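For context, a minimal sketch of the retrieve-then-stuff-the-context pattern being discussed, using sentence-transformers for the embeddings (the model name, snippets, and query are illustrative):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

snippets = [
    "Refunds are processed within 14 days.",
    "Enterprise plans include SSO and audit logs.",
    "Support is available 9am-5pm CET.",
]
query = "How long do refunds take?"

# embed everything and rank snippets by cosine similarity to the query
doc_vecs = embedder.encode(snippets, normalize_embeddings=True)
q_vec = embedder.encode([query], normalize_embeddings=True)[0]
top = np.argsort(doc_vecs @ q_vec)[::-1][:2]

# stuff the top snippets into the prompt; the LLM only ever sees these
# fragments, stripped of their surrounding context, hence the failures above
prompt = "Answer using only these snippets:\n"
prompt += "\n".join(snippets[i] for i in top)
prompt += f"\n\nQuestion: {query}"
print(prompt)
```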
visarga t1_je6a6w8 wrote
Reply to comment by harharveryfunny in [Discussion] IsItBS: asking GPT to reflect x times will create a feedback loop that causes it to scrutinize itself x times? by RedditPolluter
> its own output is its only working memory
All the fantastic feats LLMs pull off come from context conditioning, including conditioning on their own previous output.
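A minimal sketch of the reflection loop the thread asks about: each round, the model's previous answer is fed back into the context as something to critique (model name and prompts are illustrative):

```python
import openai

openai.api_key = "sk-..."  # your API key

question = "How many primes are there below 30?"
answer = ""
for step in range(3):  # "reflect x times"
    messages = [{"role": "user", "content": question}]
    if answer:
        # the only "working memory" is the context: condition on prior output
        messages.append({"role": "assistant", "content": answer})
        messages.append({"role": "user",
                         "content": "Check your answer for mistakes and improve it."})
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    answer = resp.choices[0].message.content
print(answer)
```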
visarga t1_je0zqxm wrote
Reply to comment by truchisoft in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
ML people spend all day thinking about model limitations and errors, so it's only natural that we aren't easily swayed by a non-peer-reviewed paper declaring first contact with AGI, especially one from Microsoft, which holds a major stake in OpenAI.
visarga t1_jdzu6az wrote
Reply to comment by spiritus_dei in [D] FOMO on the rapid pace of LLMs by 00001746
Let the critics critique; it's better to have an adversarial take on everything. When you take a survey, you get better calibration that way.
He's angry about the forced Galactica retraction, followed by chatGPT's success. Both models had hallucination issues, but his model was the one the public wouldn't tolerate.
visarga t1_jdztq3o wrote
Reply to comment by CriticalTemperature1 in [D] FOMO on the rapid pace of LLMs by 00001746
In short, build around LLMs and with LLMs, but don't compete directly with them.
visarga t1_jdzt9gd wrote
Reply to comment by Craksy in [D] FOMO on the rapid pace of LLMs by 00001746
> Generalized 😓
visarga t1_jdzr4tp wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
This paper scared me more than any other ML paper. I'd hoped we had 2-3 more years before what they show in there.
visarga t1_jdu1fgf wrote
Reply to comment by LifeScientist123 in [D] GPT4 and coding problems by enryu42
> Does this mean developers/humans don't have AGI?
The intellect of our species isn't universal, we're merely experts at self-preservation and propagation. Take, for instance, chess – it isn't our forte, and even a small calculator could outperform us. Our minds are incapable of 5-D visualization, and we struggle to maintain over 10 unrelated items in our immediate memory. Generally, we falter when addressing problems where the initial move relies on the final steps, or situations that don't allow for linear progression, such as chess or mathematical quandaries. It took us centuries to decipher many of these enigmas. Our specialization lies in tackling human-centric challenges, rather than all-encompassing ones. Evolution simply hasn't had sufficient time to adapt our cerebral cortex for mathematical prowess.
visarga t1_jdtz3wx wrote
Reply to comment by boaking69 in [D] GPT4 and coding problems by enryu42
The original title of the "Sparks of AGI" paper was "First Contact With an AGI System" (line 8 of its LaTeX source). Read the paper carefully and it suggests GPT-4 is stronger than the current consensus holds.
visarga t1_jdtypz6 wrote
Reply to comment by Haycart in [D] GPT4 and coding problems by enryu42
Doesn't autoregressive decoding cache the attention states (keys and values) of the previous tokens when decoding a new token?
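For reference, a sketch of that key/value caching pattern with the Transformers API, using GPT-2 as a stand-in:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

input_ids = tok("Autoregressive decoding caches", return_tensors="pt").input_ids
past = None
generated = input_ids
with torch.no_grad():
    for _ in range(10):
        out = model(input_ids=input_ids, past_key_values=past, use_cache=True)
        past = out.past_key_values      # cached keys/values for all prior tokens
        next_id = out.logits[:, -1:].argmax(dim=-1)
        generated = torch.cat([generated, next_id], dim=-1)
        input_ids = next_id             # only the new token is fed next step
print(tok.decode(generated[0]))
```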
visarga t1_jdtyd0c wrote
Reply to comment by trajo123 in [D] GPT4 and coding problems by enryu42
> Perhaps get augmented with some sort of LSTM architecture where state can be built up from a theoretically infinite amount of input
That would be sweet: infinite input. Does RWKV do that?
visarga t1_jdtxxfd wrote
Reply to comment by blose1 in [D] GPT4 and coding problems by enryu42
You're mistaken: Olympiad problems require bespoke tricks that don't generalise from problem to problem. It's not a question of breadth of knowledge, and they don't test memorisation.
visarga t1_jdtwr3g wrote
Reply to comment by yaosio in [D] GPT4 and coding problems by enryu42
> I am saying we don't know what consciousness is because we're missing information and we don't know what information we're missing
I take a practical definition: without it we couldn't even find our mouth with our hand in order to eat.
visarga t1_jdlpf0h wrote
Reply to comment by mxby7e in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
What about data generated by Alpaca? Is that unrestricted?
visarga t1_jdlpae7 wrote
Reply to comment by WarAndGeese in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
OpenAI has first-hand RLHF data; Alpaca has second-hand. I wonder if third-hand is good enough, and free of any restrictions.
visarga OP t1_jegmcux wrote
Reply to comment by mjk1093 in HuggingGPT - Solving AI Tasks with ChatGPT and its Friends in HuggingFace by visarga
I think they spin up a container if there isn't one already running. Usually there isn't, so you have to wait a minute or two. Then it runs slowly, but it's still simpler than downloading the model.
In this paper the HuggingGPT system runs a bunch of models locally and calls the HuggingFace Inference API for the rest. They host at least a few of the tool models themselves because HF is so flaky.
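A sketch of what calling the hosted Inference API looks like; the model name is illustrative, and the 503-while-loading handling matches the container spin-up described above:

```python
import time
import requests

API_URL = "https://api-inference.huggingface.co/models/facebook/bart-large-cnn"
HEADERS = {"Authorization": "Bearer hf_..."}  # your HF token

def query(text):
    while True:
        resp = requests.post(API_URL, headers=HEADERS, json={"inputs": text})
        if resp.status_code == 503:  # container still spinning up
            time.sleep(resp.json().get("estimated_time", 10))
            continue
        return resp.json()

print(query("HuggingGPT routes sub-tasks to specialised models."))
```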
I think this paper is pretty significant. It extends the OpenAI plugin concept to AI plugins: a bunch of specialised models combined in countless ways, with chatGPT as the orchestrator. It's automated AI pipelines. If nothing else, it could be used to generate training data for a multi-modal model like GPT-4. It could be a good business opportunity for HuggingFace too; their model zoo is impressive.