Nerveregenerator

Nerveregenerator t1_j16yfnw wrote

Write all the equations out on paper, then do one forward and one backward pass on paper as well with a simple MLP. I believe bias can be easily incorporated by appending an extra 1 to the input and treating an extra weight as the bias, so it's updated the same way as any other weight. Also learn the basics of matrix multiplication.
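
For concreteness, here's a minimal numpy sketch of that exercise (the toy data, sizes, and learning rate are just illustrative): one forward and backward pass through a one-hidden-layer MLP, with the bias handled as an extra constant-1 input and an extra weight row.

```python
import numpy as np

np.random.seed(0)

X = np.random.randn(4, 2)                   # 4 samples, 2 features
X = np.hstack([X, np.ones((4, 1))])         # append a 1 per sample: the bias trick, shape (4, 3)
y = np.random.randn(4, 1)                   # regression targets

W1 = np.random.randn(3, 5) * 0.1            # last row acts as the hidden-layer bias
W2 = np.random.randn(6, 1) * 0.1            # last row acts as the output bias

# Forward pass
a1 = np.tanh(X @ W1)                        # hidden activations, shape (4, 5)
h = np.hstack([a1, np.ones((4, 1))])        # append 1 again for the output bias
y_hat = h @ W2                              # predictions, shape (4, 1)
loss = np.mean((y_hat - y) ** 2)            # mean squared error

# Backward pass (the chain rule, written out step by step)
d_yhat = 2 * (y_hat - y) / len(y)           # dL/dy_hat
dW2 = h.T @ d_yhat                          # gradient for W2; bias row included
da1 = (d_yhat @ W2.T)[:, :-1]               # drop the column for the constant 1
dpre = da1 * (1 - a1 ** 2)                  # through the tanh derivative
dW1 = X.T @ dpre                            # gradient for W1; bias row included

# One SGD step: the bias weights update exactly like any other weight.
lr = 0.1
W1 -= lr * dW1
W2 -= lr * dW2
print(loss)
```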

3

Nerveregenerator t1_j1572fl wrote

Deep learning differs from typical programming topics in that it is built on a large body of mathematical and theoretical concepts that a library can't abstract away. Getting the code to run is relatively easy, and the choice of library mostly comes down to deployment goals and reusing existing implementations. But when things aren't working, there's no compiler error telling you what's wrong with the model or data pipeline, and that's where deep theoretical knowledge comes into play.
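
As a sketch of what such a "silent" failure can look like (this is my own illustration, not a specific library bug): a shape mismatch that broadcasts instead of erroring, so the loss runs fine but is numerically wrong.

```python
import numpy as np

np.random.seed(0)
preds = np.random.randn(8, 1)    # model output, shape (8, 1)
targets = np.random.randn(8)     # labels, shape (8,)

# (8, 1) - (8,) silently broadcasts to (8, 8): no error, wrong loss.
bad_loss = np.mean((preds - targets) ** 2)

# The intended elementwise loss.
good_loss = np.mean((preds.squeeze() - targets) ** 2)

print(bad_loss, good_loss)  # different values; the broken version still "runs"
```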

1

Nerveregenerator OP t1_ix92czu wrote

Reply to comment by chatterbox272 in GPU QUESTION by Nerveregenerator

Ok, thanks, I think that clears up the drawbacks. I'd have to check which motherboard I'm using now, but generally, would you expect a 3090 to be compatible with a motherboard that works with a 1080 Ti? Thanks

1

Nerveregenerator OP t1_ix75olf wrote

Reply to comment by scraper01 in GPU QUESTION by Nerveregenerator

So I did some research. According to the Lambda Labs website, four 1080s combined will get me 1.5x the throughput of a 3090 with FP32 training, and FP16 seems to yield about a 1.5x speedup for the 3090 in training. So even with mixed precision, it comes out to be the same. The actual configuration of 4 cards is not something I'm very familiar with, but I wanted to point this out because it seems like NVIDIA has really bullshitted a lot with their marketing. A lot of the numbers they throw around just don't translate to ML.
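
Putting those ratios together (taking the two 1.5x figures above at face value, not as measured benchmarks):

```python
gtx1080_fp32 = 1.0                        # one 1080 at FP32 as the baseline unit
rtx3090_fp32 = 4 * gtx1080_fp32 / 1.5     # four 1080s ~= 1.5x a 3090 at FP32
rtx3090_fp16 = rtx3090_fp32 * 1.5         # FP16 gives the 3090 ~1.5x over its FP32

print(4 * gtx1080_fp32)  # 4.0 -> four 1080s at FP32
print(rtx3090_fp16)      # 4.0 -> one 3090 with mixed precision: a wash
```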

2