
SoylentRox t1_j4nckq8 wrote

So here's the feature I think you need to make the tool work:

right now, the machine works by:

<current symbol buffer> + neural network -> <current symbol buffer> + 1 symbol

It needs to become

f(all previous sessions' symbols) = salient context

<salient context> + <current symbol buffer> + neural network -> <current symbol buffer> + 1 symbol + <updated salient context>

"Salient context" is whatever the machine needs to continue generating text to match to something like a detective story. So it needs to remember the instructions, the main character's names, and so on. It does not need to remember every last word previously in the story.

To make it really good, it needs to be aware of quality metrics: Amazon/RoyalRoad review counts and ratings, number of copies of the novel sold on the market, etc. That way it can weight what it learns from a text by how much humans liked that particular structure of text.

After that you'll need the AI to generate many stories, get user feedback, and iterate. I think eventually they will be good, and at some point past that it may discover ways to make them REALLY good that humans have not.

9

graham_fyffe t1_j4ndc3j wrote

You can ask ChatGPT to write a summary of the story first, then the chapter names and chapter summaries, then each chapter one at a time. Try it! This hierarchical method can already achieve some of what you're talking about.
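That hierarchical prompting could be scripted, something like the sketch below. `ask()` is a placeholder for a chat-model call, not a real API:

```python
def ask(prompt):
    """Placeholder for a chat-model request."""
    return f"<model output for: {prompt[:30]}...>"

def write_story(premise, n_chapters=3):
    # 1. Top level: one-paragraph story summary.
    summary = ask(f"Write a one-paragraph summary of a story about {premise}.")
    # 2. Middle level: chapter titles and summaries, conditioned on the summary.
    outline = [
        ask(f"Given this summary:\n{summary}\nWrite the title and summary of chapter {i}.")
        for i in range(1, n_chapters + 1)
    ]
    # 3. Bottom level: full chapters, conditioned on summary + chapter plan.
    chapters = [
        ask(f"Summary: {summary}\nChapter plan: {plan}\nWrite this chapter in full.")
        for plan in outline
    ]
    return summary, outline, chapters
```

Each level only needs the level above it in context, which is what keeps the whole story coherent without fitting the entire text in one window.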

6

SoylentRox t1_j4ndg65 wrote

So essentially it would just do something similar automatically.

3

FoxOwliegirl t1_j4vnsly wrote

Human writers use that too; it's called the snowflake method.

1

graham_fyffe t1_j4ne8yf wrote

Oh, and by the way, using human ratings of the model output is exactly how ChatGPT is trained: reinforcement learning with a human in the loop.

4

SoylentRox t1_j4neivp wrote

Correct, but that was done at a small scale by raters OpenAI hired. I am saying we look at every novel that has sales data, every story on a site with view counts or other measurements of quality and popularity, etc.

This might give the machine more information on which elements people like. Maybe enough to construct good stories.
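As a sketch, those public signals could become per-example training weights. The field names and the combining formula here are pure guesses, just to show the shape of it:

```python
import math

def quality_weight(rating, n_reviews, copies_sold):
    """Combine popularity signals into a loss weight.
    The exact formula is illustrative, not from any real system."""
    confidence = math.log1p(n_reviews)    # more reviews -> more trust in the rating
    popularity = math.log1p(copies_sold)  # log scale: bestsellers don't dominate
    return (rating / 5.0) * confidence + 0.1 * popularity

# Hypothetical corpus entries with their public metrics.
corpus = [
    {"text": "...", "rating": 4.6, "n_reviews": 1200, "copies_sold": 50000},
    {"text": "...", "rating": 2.1, "n_reviews": 3,    "copies_sold": 40},
]
weights = [quality_weight(d["rating"], d["n_reviews"], d["copies_sold"])
           for d in corpus]
# Well-liked, widely-read texts contribute more to the training loss.
```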

3

red75prime t1_j4pogqn wrote

> <salient context> + <current symbol buffer> + neural network

That's an RNN (recurrent neural network). As far as I know, LSTM is still the state of the art there, and it struggles with long-term dependencies.

[He checks papers]

It looks like combining a transformer with an LSTM does provide some benefit, but nothing groundbreaking yet.
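The long-term dependency problem is easy to see in a toy recurrent step: the fixed-size hidden state is the only channel to the past, and early inputs decay away with each update. (This is a bare one-weight recurrence, not an LSTM; gating is exactly what LSTMs add to slow this decay.)

```python
import math

def rnn_step(h, x, w_h=0.5, w_x=1.0):
    """One recurrent update: new state mixes old state and new input."""
    return math.tanh(w_h * h + w_x * x)

h = 0.0
# A signal on the first step, then silence.
for x in [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]:
    h = rnn_step(h, x)
# After a few steps the first input's influence on h has mostly vanished.
```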

1