Submitted by lambolifeofficial t3_zzn35o in MachineLearning
Glycerine t1_j2ddrg8 wrote
This went viral pretty quickly. I'm pretty sure it was posted on reddit only a few days ago, announcing that the project had gone open source: https://github.com/lucidrains/PaLM-rlhf-pytorch
https://old.reddit.com/r/artificial/comments/zy6swx/palm_with_rlhf_is_now_opensource/
I starred it this week at ~50 stars; now it's at 3.3k.
It looks really exciting, but yes, it's not easy to run. Knowing I'm underpowered for most ML work, I still gave it a shot on my 4.0GHz AMD CPU - 32GB RAM - GTX 1080.
The moment I knew processing Wikipedia was out of reach:
training: 0%| 36/100000 [1:01:05<2581:58:40, 92.98s/it] training loss: 2.976300001144409
That shows it took about an hour to reach iteration 36 (of 100K), which works out to roughly 3.5 months (24/7) of training...
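(Back-of-envelope, using the numbers straight from the tqdm line above:)

    # ETA arithmetic from the tqdm readout: 92.98 s/iteration,
    # 100,000 iterations total, 36 already done.
    secs_per_it = 92.98
    eta_days = secs_per_it * (100_000 - 36) / 86_400  # 86,400 seconds per day
    print(round(eta_days))  # ~108 days, i.e. about 3.5 months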
Secondly, it's not built for daily driving yet; the source is still in dev mode and needs an intermediate Python dev to execute it - mostly because of the work you have to implement yourself after the training step.
It would be fun to have a gentler on-ramp, or some documentation on how to load super-thin datasets as an example. A finished model I can run immediately would be awesome - but I guess that's what the other team is doing.
The future of talky speaky machines is getting very exciting; I can't wait to see what happens two more papers down the line... I'm 101% looking forward to my speaky toaster!!!
comefromspace t1_j2dgqhz wrote
> The moment I knew it was out of reach to process wikipedia:
training: 0%| 274/100000 [10:06<55:51:29, 2.02s/it] training loss: 1.4352326393127441
on a GTX 1650
Disastrous_Elk_6375 t1_j2e6d6d wrote
> 92.98s/it
Are your CPUs fully used when training? You might want to check whether this is actually running on the GPU - numbers like that are typical of CPU training.
Glycerine t1_j2frh3o wrote
You're right, it's poor. All 8 CPUs hit 100%.
As an update though:
I made a bunch of changes: reduced the dataset to 5 lines from Wikipedia, reduced the PaLM size to about 25% of the original, and cut the run down to 8 epochs.
It's phenomenal. Within <30 minutes and a bunch of poking, it can easily generate sensible sentences.
I dropped it onto a Lambda GPU A100 instance - it's silly fast.
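(Roughly the kind of shrinking I mean - a sketch against the PaLM class from the repo; the dim/depth numbers here are illustrative, not my exact config, and the tiny dataset class is just one way to feed it a few lines as bytes:)

    import torch
    from torch.utils.data import Dataset
    from palm_rlhf_pytorch import PaLM

    # A shrunken config -- illustrative values, much smaller than the
    # defaults in the repo's training script.
    model = PaLM(
        num_tokens = 256,  # byte-level vocabulary
        dim = 256,         # reduced width
        depth = 4,         # reduced depth
    ).cuda()

    # Tiny byte-level dataset: just a handful of Wikipedia sentences.
    class TinyTextDataset(Dataset):
        def __init__(self, lines, seq_len=128):
            data = ' '.join(lines).encode('utf-8')
            self.data = torch.tensor(list(data), dtype=torch.long)
            self.seq_len = seq_len

        def __len__(self):
            return max(1, len(self.data) - self.seq_len)

        def __getitem__(self, i):
            # one chunk of seq_len+1 bytes: input plus shifted target
            return self.data[i : i + self.seq_len + 1]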
Edit:
As an example: I trained the model on 5 sentences, with an optimal length of ~128 chars. I ask for a word and see what it constructs (qu is a little helper I wrote; there's a sketch of it at the end of this comment).
The goal here is to see if it produces sensible sentences from real words:
With a known word the response is fairly stable:
>>> qu('example')
'example, wrote of violence as a necessary and some'
>>> qu('example')
'example, wrote of violence as a necessary and some'
>>> qu('example', 20)
'example, wrote of vi'
>>> qu('example', 10)
'example, w'
>>> qu('example', 50)
'example, wrote of violence as a necessary and some'
Untrained words produce some interesting results. Prior to the <100 epochs of training it was saying nonsense:
tensor(0.0431, grad_fn=<NllLoss2DBackward0>)
>>> qu('when')
'whent he wher a arevo-pociaty on indiviolent resis'
>>> qu('when')
'whent he refuted Nechaev). Other anarchists, some'
>>> qu('but')
'but. how a free society might be brought about. H'
>>> qu('but')
'but. The there is also ofowerat; there is no [[co'
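(For reference, the qu helper above is nothing clever - something along these lines, reconstructed from memory; generate()'s exact signature in palm-rlhf-pytorch may differ between versions:)

    import torch

    # Hypothetical reconstruction of my 'qu' helper: encode the prompt as
    # bytes, sample a continuation from the trained model, decode bytes
    # back to printable characters. 'length' is the total output length.
    def qu(prompt, length=50):
        tokens = torch.tensor([ord(c) for c in prompt], dtype=torch.long).cuda()
        out = model.generate(length - len(prompt), tokens[None, :])  # assumed API
        decoded = ''.join(chr(max(32, t)) for t in out[0].tolist())
        return prompt + decoded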
Disastrous_Elk_6375 t1_j2ft5zo wrote
> You're right, it's poor. All 8 CPUs hit 100%.
Yeah, you're probably not using the GPU. Make sure your PyTorch and CUDA versions are compatible and properly installed. To test, go into a Python session and do:
import torch
torch.cuda.is_available()
If the output is False, it will train on the CPU.
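(And if it returns True but the CPUs are still pinned, the usual PyTorch pattern is to move both the model and each batch explicitly - generic sketch, not specific to this repo; 'model' and 'batch' are placeholders:)

    import torch

    # Pick the GPU when CUDA is available, otherwise fall back to CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)  # move the parameters to the device
    batch = batch.to(device)  # inputs must live on the same device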