arhetorical t1_j6nhean wrote

Hiya, great work again! Maybe I'm outing myself a little here, but the code doesn't work on Windows machines, apparently because the processes are spawned instead of forked. I'm not sure it's an easy fix and maybe not worth the time (it works fine on WSL), but I thought I'd mention it in case you weren't aware!
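For what it's worth, the usual workaround on Windows is to guard the script's entry point, since the spawn start method re-imports the main module in every worker process. Here's a minimal sketch, assuming the workers come from something like a PyTorch DataLoader with `num_workers > 0` (a guess on my part, I haven't dug into how your code actually launches its processes):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def main():
    # Dummy dataset just to illustrate the pattern.
    data = TensorDataset(torch.randn(100, 3, 32, 32), torch.randint(0, 10, (100,)))
    # num_workers > 0 starts worker processes; on Windows these are
    # spawned, which re-imports this module in each worker.
    loader = DataLoader(data, batch_size=16, num_workers=2)
    for x, y in loader:
        pass  # training step would go here

if __name__ == "__main__":
    # Without this guard, each spawned worker re-executes the top-level
    # code and crashes; forked processes on Linux/WSL don't re-import.
    main()
```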

On the ML side, should this scale up pretty straightforwardly to CIFAR100 or are there things to be aware of?

2

arhetorical t1_iz3pwak wrote

This has terrifying implications for the future of humanity.

> Once upon a time in the Mushroom Kingdom, Sonic the Hedgehog and Mario had a wild night of partying. Sonic, in a drunken stupor, stumbled into Princess Peach's castle and stumbled upon her sleeping in her bed. Sonic, being the mischievous hedgehog that he is, decided to have some fun and began tickling her feet.

> To his surprise, Princess Peach woke up and instead of being angry, she found Sonic's antics amusing. The two of them laughed and played together for hours, eventually forming a close bond.

> Sonic, feeling bold, decided to confess his love for Princess Peach. But to his dismay, she rejected him and explained that she was already in a committed relationship with Mario.

> Heartbroken, Sonic retreated to his home in Green Hill Zone. He spent his days moping and feeling sorry for himself, until one day a portal opened up and whisked him away to another dimension.

> In this new world, Sonic discovered that he had the ability to transform into a werehog. He used his new powers to fight against evil and protect the innocent, but he could never shake the feeling of unrequited love for Princess Peach.

> Sonic continued to live in this new world, never truly happy but always doing what he could to help others. The end.

5

arhetorical t1_iw16x4q wrote

It looks like a lot but there's nothing especially weird in there. If you spend some time tuning your model you'll probably end up with something like that too.

Adam - standard.

Linear warmup and decay - warming up and then decaying the learning rate is very common. The exact shape varies; cosine decay is another popular choice.

Decreasing the update frequency - probably something you'd come up with after inspecting the training curve and trying to get a little more performance out of it.

Clipping the gradients - pretty common solution for "why isn't my model training properly". Maybe a bit hacky but if it works, it works.

The numbers themselves are usually just a matter of hand tuning and/or hyperparameter search. Putting the pieces together, it looks something like the sketch below.
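Here's roughly what that recipe looks like in PyTorch, just as a sketch of the pattern (the model, step counts, and numbers are made up, not anyone's actual settings):

```python
import torch

model = torch.nn.Linear(10, 2)  # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

warmup_steps, total_steps = 1_000, 10_000

def lr_lambda(step):
    # Linear warmup to the base LR, then linear decay to zero.
    if step < warmup_steps:
        return step / warmup_steps
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(total_steps):
    loss = model(torch.randn(32, 10)).sum()  # dummy loss
    optimizer.zero_grad()
    loss.backward()
    # Clip the global gradient norm before the update.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()
```

Swap the decay branch of `lr_lambda` for a cosine curve (or just use `torch.optim.lr_scheduler.CosineAnnealingLR`) if you want the cosine variant.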

5

arhetorical t1_ivb2sl1 wrote

I haven't tried doing that, but if it's a similar resource requirement to prototyping (i.e. you'll be working with a pretrained model, not training one), then it should be fine. Again though, the biggest factor is whether you like it and whether it works for you. Since you bought a laptop instead of a workstation, you must have had a good reason for needing one, and none of us can answer that question for you. If you're not training, the specs don't matter much as long as your stuff fits in memory.

1

arhetorical t1_iv8nb2t wrote

The only thing that matters is whether you like it. The specs really don't matter that much. Either you'll be prototyping your model, in which case you'll just be training for an epoch or two and better specs will only save you a little time, or you'll be training it for real, in which case a laptop is not going to cut it. An external GPU will just make your setup less portable without actually giving you the performance of a workstation.

1

arhetorical t1_iv8fays wrote

You already got the advice not to buy a laptop for deep learning. But if you're determined and understand that it's not a great idea to begin with, then any laptop with a compatible GPU is fine. You're prototyping on it, not actually training. If you like the one you got, just stick with it.

3