rafgro

rafgro t1_j8cc6ne wrote

Agreed. The quality of discussions under posts is also pretty bad.

IMO it's the result of outdated rules and lax moderation. On the rules, there's definitely a need to address low-effort ChatGPT posts and comments - some of them are straight-up scam posts! On the moderation, it's not about quality but about quantity: realistically this sub has just a few active moderators (some/most of the 9 listed are very busy engineers), with no new moderators added in the last two years, while the sub has seen enormous growth in members.

11

rafgro t1_j7eq9ek wrote

Nah, it's not engineering vs science or open source vs closed source. It's much simpler:

>FAIR's Galactica. People crucified it because it could generate nonsense. ChatGPT does the same thing.

YLC threw a fit over the whole Galactica debacle. He posted lovely aggressive tweets such as "Galactica demo is off line for now. It’s no longer possible to have some fun by casually misusing it. Happy?" and described people who disliked Galactica as "easily scared of new technology". Seeing the success of ChatGPT just a few weeks later must have been really painful.

42

rafgro t1_j47678z wrote

Reply to comment by nohat in [D] Bitter lesson 2.0? by Tea_Pearce

See, it's not bitter lesson 1.0 when you replace "leverage computation" with "leverage large models that require hundreds of GPUs and the entire internet". Sutton definitely did not write in his original essay that every bitter-lesson cycle ends with:

>breakthrough progress eventually arrives by an approach based on scaling computation

17

rafgro t1_j0fpg80 wrote

Do you embed some special clauses or verification to limit hallucination? In my experience, splicing primary sources into the input can sometimes even induce more hallucinations (which become more believable with sources attached but are still false!). To test it out here, I deliberately asked a few questions with no established answers - such as "What genes cause brain cancer?" - and got a nice response along the lines of "there's no answer yet".

2

rafgro t1_j0amjj9 wrote

With articles like this, I cannot escape the feeling that the authors do not interact with these models at length and mainly argue against their imagined form of interaction. Here, that is the premise of a significant part of the paper:

>a fictional question-answering system based on a large language model

...with imagined conversations and discussion of its imagined flaws, e.g. the author criticizes it for lack of communicative intent, no awareness of the situation, no ability to "know anything", or because it "cannot participate fully in the human language game of truth" (a self-citation from 2010, where "embodiment" is presented as, roughly, everyday use of words and adjusting that use to context). Thanks, I guess? How about interacting with actual models that beat you in the game of truth and are sometimes too nosy in their communicative intent?

1

rafgro t1_j05enj4 wrote

Example tokenizer: https://github.com/josephrocca/gpt-2-3-tokenizer. In the most vanilla version you could count occurrences of the question/task tokens in the document and jump to that place, e.g. if the task is about lung cancer, jump to the book chapter with the most occurrences of "lung" and "cancer". It works well enough, but you can make it more robust by building a modest scoring system (e.g. a higher weight assigned to "lung" than to "cancer"), finding words related to the task words with a word2vec-style embedding and looking for them with appropriate weights as well, or even splicing a few different high-scoring places into one prompt.
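
To make that concrete, here's a minimal sketch of the vanilla version (the whitespace tokenizer stand-in, chunk size, and example weights are all illustrative assumptions, not a specific implementation):

```python
from collections import Counter

def tokenize(text):
    # Stand-in for a real GPT tokenizer such as the linked gpt-2-3-tokenizer;
    # lowercased word splitting is enough to show the idea.
    return text.lower().split()

def score_chunk(chunk_tokens, task_weights):
    # Weighted count of task tokens appearing in this chunk.
    counts = Counter(chunk_tokens)
    return sum(weight * counts[token] for token, weight in task_weights.items())

def best_chunk(document, task, chunk_size=800, weights=None):
    # Split the document into fixed-size token chunks and return the chunk
    # with the highest (optionally weighted) density of task tokens.
    task_weights = weights or {t: 1.0 for t in tokenize(task)}
    doc_tokens = tokenize(document)
    chunks = [doc_tokens[i:i + chunk_size]
              for i in range(0, len(doc_tokens), chunk_size)]
    return max(chunks, key=lambda c: score_chunk(c, task_weights))

# Usage with hand-tuned weights, e.g. favoring "lung" over "cancer":
# window = best_chunk(book_text, "what causes lung cancer",
#                     weights={"lung": 2.0, "cancer": 1.0})
```

The same scoring hook is where you could plug in word2vec-style neighbors of the task words, each with its own weight.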

2

rafgro t1_j00tamz wrote

I've found another, much cheaper approach - tokenize the long text and the task (client-side, without costly API calls), find the region with the highest density of task-token matches in the long text, slide the content window there while retaining a general summary of the document, and answer from that prompt.
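
The final prompt then ends up looking roughly like this (a hypothetical sketch of the assembly step only; the summary and the selected window come from the earlier steps):

```python
def build_prompt(general_summary, relevant_window, task):
    # Combine a short running summary of the whole document with the window
    # that had the densest task-token matches, then ask the question.
    return (
        "General summary of the document:\n"
        f"{general_summary}\n\n"
        "Most relevant excerpt:\n"
        f"{relevant_window}\n\n"
        f"Task: {task}\n"
        "Answer using only the summary and the excerpt above."
    )
```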

2

rafgro t1_izyke4t wrote

I've been sliding a content window, summarizing chunks, chaining summaries, and summarizing the chained summaries, all while guiding attention ("focus on X, ignore Y"). I've also had limited success with storing all summaries separately, choosing the most relevant summary based on the task/question, and then answering with the relevant context window opened in addition to the summaries, but it was too much pain (also financial) for very small gain in my case (though I imagine that in a legal environment it may be much more important to get every detail right).
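
The chunk-summarize-chain loop, roughly (a sketch that assumes a generic `complete(prompt)` helper wrapping whatever completion API you use; prompts and chunk size are illustrative):

```python
def summarize_long_text(text, complete, chunk_chars=6000, guidance=""):
    # Split into chunks, summarize each one while passing the attention
    # guidance ("focus on X, ignore Y") into every prompt.
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    summaries = []
    for chunk in chunks:
        prompt = f"Summarize the following text. {guidance}\n\n{chunk}\n\nSummary:"
        summaries.append(complete(prompt))

    # Chain the per-chunk summaries and summarize them once more.
    chained = "\n".join(summaries)
    final_prompt = (f"Combine these partial summaries into one. {guidance}\n\n"
                    f"{chained}\n\nCombined summary:")
    return summaries, complete(final_prompt)
```

Keeping the per-chunk summaries around is what lets you later pick the most relevant one for a question and reopen the matching context window.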

6