
SatoshiNotMe t1_j88fz8v wrote

I agree: some of the things that make ML code inscrutable are that (a) every tensor has a shape that you have to guess, and keep track of as it goes through the various layers, and (b) there are layers and operations whose effect on tensor shapes you constantly have to look up.

I’ve settled on two best practices to mitigate these:

  1. Always include the tensor dimensions in the variable name: e.g. x_b_t_e is a tensor of shape (b,t,e), a trick I learned at a Berkeley DRL workshop many years ago.
  2. Einops all the things! https://einops.rocks/

With einops you can express ops and layers transparently, in terms of how the tensor dims change. Suddenly your code is refreshingly clear.

The einops page gives many nice examples, but here's a quick preview contrasting the two approaches:

```python
import torch
from einops import rearrange

x = torch.randn(32, 256, 19, 19)   # x: (batch, 256, 19, 19)

# plain view(): the reader has to already know x's shape
y = x.view(x.shape[0], -1)

# einops + shape-suffixed name: the change is spelled out in the pattern
x_b_c_h_w = x
y_b_chw = rearrange(x_b_c_h_w, 'b c h w -> b (c h w)')
```

Yes, it's a little verbose, but I find it helps hugely with the two issues mentioned above. YMMV :)
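einops also ships these patterns as framework layers (e.g. `einops.layers.torch.Rearrange`), so the reshape documents itself inside the model. A minimal sketch, with a made-up toy network for illustration:

```python
import torch.nn as nn
from einops.layers.torch import Rearrange

# Toy model, purely illustrative: Rearrange replaces an opaque
# Flatten/view and states the shape change right in the pattern.
model = nn.Sequential(
    nn.Conv2d(3, 256, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d((19, 19)),          # -> (b, 256, 19, 19)
    Rearrange('b c h w -> b (c h w)'),       # -> (b, 256*19*19)
    nn.Linear(256 * 19 * 19, 10),
)
```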


harharveryfunny t1_j88kg27 wrote

>some of the things that make ML code inscrutable are that (a) every tensor has a shape that you have to guess, and keep track of as it goes through the various layers

That's not inherent to ML, though; it's a library design choice to define tensor shapes at runtime rather than at compile time. A while back I wrote my own framework in C++ and went with compile-time shapes, which, besides preventing shape errors, is more in keeping with C++'s typing. For a dynamically typed language like Python, runtime-defined types/shapes may seem the more natural choice, but it's still a choice nonetheless.
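The framework itself isn't shown, but here's a minimal sketch of the general idea in C++17 (all names invented for illustration): shapes live in the type as template parameters, so a mismatched matmul fails to compile rather than erroring at runtime.

```cpp
#include <array>
#include <cstddef>

// Illustrative sketch only: the tensor's shape is part of its type.
template <std::size_t... Dims>
struct Tensor {
    static constexpr std::size_t size = (Dims * ... * std::size_t{1});
    std::array<float, size> data{};
};

// matmul is only defined when the inner dimensions agree.
template <std::size_t M, std::size_t K, std::size_t N>
Tensor<M, N> matmul(const Tensor<M, K>& a, const Tensor<K, N>& b) {
    Tensor<M, N> out{};  // zero-initialized
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < K; ++k)
                out.data[i * N + j] += a.data[i * K + k] * b.data[k * N + j];
    return out;
}

int main() {
    Tensor<2, 3> a;
    Tensor<3, 4> b;
    auto c = matmul(a, b);   // fine: c is Tensor<2, 4>
    // matmul(b, a);         // compile error: inner dims 4 vs 2 don't match
    (void)c;
}
```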
