buzzbuzzimafuzz t1_j7lz92s wrote
Reply to [N] Microsoft announces new "next-generation" LLM, will be integrated with Bing and Edge by currentscurrents
A quote from the Verge liveblog:
>This is an important part of the presentation, but I just want to note that Microsoft is having to carefully explain how its new search engine will be prevented from helping to plan school shootings.
>
>"Early red teaming showed that the model could help plan attacks" on things like schools. "We don't want to aid in illegal activity." So the model is used to act as a bad actor to test the model itself.
The proposed safety system sounds interesting, but given that simple prompt-engineering attacks still work on ChatGPT, I'm not optimistic about how well this will hold up in the real world.
buzzbuzzimafuzz t1_j4u5jrz wrote
Reply to [D] RLHF - What type of rewards to use? by JClub
I think what OpenAI and Anthropic typically do is to provide evaluators with two possible responses and have them select which one is better. If you have numerical ratings, it might be hard to calibrate them. From the original paper "Deep reinforcement learning from human preferences" (2017):
>We ask the human to compare short video clips of the agent’s behavior, rather than to supply an absolute numerical score. We found comparisons to be easier for humans to provide in some domains, while being equally useful for learning human preferences. Comparing short video clips is nearly as fast as comparing individual states, but we show that the resulting comparisons are significantly more helpful.
ChatGPT seems to be trained from a combination of expert-written examples and upvotes and downvotes on individual messages.
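The pairwise-comparison setup above is usually turned into a training signal with a Bradley–Terry style loss: the reward model should assign a higher scalar reward to whichever response the human preferred. A minimal sketch (function name is illustrative, not from any specific library):

```python
import math

def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss shrinks as the reward model's margin in favor of the
    human-preferred response grows, so minimizing it trains the model
    to agree with the human comparisons.
    """
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the model already prefers the chosen response, the loss is small;
# when it prefers the rejected one, the loss is large.
low_loss = pairwise_preference_loss(2.0, -1.0)
high_loss = pairwise_preference_loss(-1.0, 2.0)
```

Note that no absolute scale is needed: only the difference between the two rewards matters, which is exactly why comparisons sidestep the calibration problem of numerical ratings.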
buzzbuzzimafuzz t1_j1nn350 wrote
Reply to comment by _underlines_ in [D] When chatGPT stops being free: Run SOTA LLM in cloud by _underlines_
Video unavailable :(
How is Open Assistant trained and how good is it so far?
buzzbuzzimafuzz t1_j1nmqw2 wrote
Reply to comment by CriticalTemperature1 in [D] Has anyone integrated ChatGPT with scientific papers? by justrandomtourist
You can't copy a whole paper into ChatGPT, at least not in one message. Is there a specific part that you copy in?
buzzbuzzimafuzz t1_j8zafoo wrote
Reply to [D] What are the worst ethical considerations of large language models? by BronzeArcher
The mess that has been Bing Chat/Sydney, except instead of just verbally threatening users, it's connected to APIs that let it take arbitrary actions on the internet to carry them out.
I really don't want to see what happens if you connect a deranged language model like Sydney with a competent version of Adept AI's action transformer to let it use a web browser.