Submitted by lambolifeofficial t3_zzn35o in MachineLearning
Glycerine t1_j2ddrg8 wrote
This went viral pretty quickly. I'm pretty sure it was posted on reddit only a few days ago, announcing that the project had gone open source: https://github.com/lucidrains/PaLM-rlhf-pytorch
https://old.reddit.com/r/artificial/comments/zy6swx/palm_with_rlhf_is_now_opensource/
I starred it this week at ~50 stars; now it's at 3.3k.
It looks really exciting, but yes, it's not easy to run. Knowing I'm underpowered for most ML work, I still gave it a shot on my 4.0GHz AMD CPU - 32GB RAM - GTX 1080.
The moment I knew processing Wikipedia was out of reach:
training: 0%| 36/100000 [1:01:05<2581:58:40, 92.98s/it] training loss: 2.976300001144409
That shows it took about an hour to reach iteration 36 (of 100K), which works out to roughly 3.5 months (24/7) of training...
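(Back-of-envelope, using the numbers straight from the tqdm line above:)

    # ETA arithmetic from the tqdm readout: 92.98 s/iteration,
    # 100,000 iterations total, 36 already done.
    secs_per_it = 92.98
    eta_days = secs_per_it * (100_000 - 36) / 86_400  # 86,400 seconds per day
    print(round(eta_days))  # ~108 days, i.e. about 3.5 months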
Secondly, it's not built for daily driving yet; the source is still in dev mode and needs an intermediate Python dev to execute it - mostly because of the work you have to implement yourself after the training step.
It would be fun to have a gentler on-ramp, or some documentation on how to load super-thin datasets as an example. A finished model I can run immediately would be awesome - but I guess that's what the other team is doing.
The future of talky speaky machines is getting very exciting; I can't wait to see what happens two more papers down the line... I'm 101% looking forward to my speaky toaster!!!
comefromspace t1_j2dgqhz wrote
> The moment I knew it was out of reach to process wikipedia:
training: 0%| 274/100000 [10:06<55:51:29, 2.02s/it] training loss: 1.4352326393127441
on a GTX 1650
Disastrous_Elk_6375 t1_j2e6d6d wrote
> 92.98s/it
Are your CPUs fully used when training? You might want to check whether this is actually running on the GPU - numbers like that are typical of CPU training.
Glycerine t1_j2frh3o wrote
You're right, it's poor. All 8 CPUs hit 100%.
As an update though:
I made a bunch of changes: reduced the dataset to 5 lines from Wikipedia, reduced the PaLM size to about 25% of the original, and cut the run down to 8 epochs.
It's phenomenal. Within <30 minutes and a bunch of poking, it can easily generate sensible sentences.
I dropped it onto a Lambda GPU A100 instance - it's silly fast.
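(Roughly the kind of shrinking I mean - a sketch against the PaLM class from the repo; the dim/depth numbers here are illustrative, not my exact config, and the tiny dataset class is just one way to feed it a few lines as bytes:)

    import torch
    from torch.utils.data import Dataset
    from palm_rlhf_pytorch import PaLM

    # A shrunken config -- illustrative values, much smaller than the
    # defaults in the repo's training script.
    model = PaLM(
        num_tokens = 256,  # byte-level vocabulary
        dim = 256,         # reduced width
        depth = 4,         # reduced depth
    ).cuda()

    # Tiny byte-level dataset: just a handful of Wikipedia sentences.
    class TinyTextDataset(Dataset):
        def __init__(self, lines, seq_len=128):
            data = ' '.join(lines).encode('utf-8')
            self.data = torch.tensor(list(data), dtype=torch.long)
            self.seq_len = seq_len

        def __len__(self):
            return max(1, len(self.data) - self.seq_len)

        def __getitem__(self, i):
            # one chunk of seq_len+1 bytes: input plus shifted target
            return self.data[i : i + self.seq_len + 1]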
Edit:
As an example: I trained the model on 5 sentences, with an optimal length of ~128 chars. I ask for a word and see what it constructs (qu is a little helper I wrote; there's a sketch of it at the end of this comment).
The goal here is to see if it produces sensible sentences from real words:
With a known word the response is fairly stable:
>>> qu('example')
'example, wrote of violence as a necessary and some'
>>> qu('example')
'example, wrote of violence as a necessary and some'
>>> qu('example', 20)
'example, wrote of vi'
>>> qu('example', 10)
'example, w'
>>> qu('example', 50)
'example, wrote of violence as a necessary and some'
Untrained words produce some interesting results. Prior to the <100 epochs of training it was saying nonsense:
tensor(0.0431, grad_fn=<NllLoss2DBackward0>)
>>> qu('when')
'whent he wher a arevo-pociaty on indiviolent resis'
>>> qu('when')
'whent he refuted Nechaev). Other anarchists, some'
>>> qu('but')
'but. how a free society might be brought about. H'
>>> qu('but')
'but. The there is also ofowerat; there is no [[co'
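(For reference, the qu helper above is nothing clever - something along these lines, reconstructed from memory; generate()'s exact signature in palm-rlhf-pytorch may differ between versions:)

    import torch

    # Hypothetical reconstruction of my 'qu' helper: encode the prompt as
    # bytes, sample a continuation from the trained model, decode bytes
    # back to printable characters. 'length' is the total output length.
    def qu(prompt, length=50):
        tokens = torch.tensor([ord(c) for c in prompt], dtype=torch.long).cuda()
        out = model.generate(length - len(prompt), tokens[None, :])  # assumed API
        decoded = ''.join(chr(max(32, t)) for t in out[0].tolist())
        return prompt + decoded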
Disastrous_Elk_6375 t1_j2ft5zo wrote
> You're right, it's poor. All 8 CPUs hit 100%.
Yeah, you're probably not using the GPU. Make sure your PyTorch and CUDA versions are compatible and properly installed. To test, go into a Python session and do:
import torch
torch.cuda.is_available()
If the output is False, it will train on the CPU.
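(And if it returns True but the CPUs are still pinned, the usual PyTorch pattern is to move both the model and each batch explicitly - generic sketch, not specific to this repo; 'model' and 'batch' are placeholders:)

    import torch

    # Pick the GPU when CUDA is available, otherwise fall back to CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)  # move the parameters to the device
    batch = batch.to(device)  # inputs must live on the same device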