
Background_Thanks604 t1_ixu3170 wrote

How do you mean - translate model weights by hand?

4

Deep-Station-1746 t1_ixu6ych wrote

Load the .pt model with torch and get all the weights with state_dict. Permute them all into channels-last format. Convert the arrays to numpy, then load them into TensorFlow. Write the network's forward logic in TensorFlow, plug in the weights, and run.
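
A minimal sketch of the export half of those steps, assuming the checkpoint "model.pt" contains a full pickled PyTorch module (all file and variable names here are illustrative, not from the thread):

```python
import torch

# Assumes the checkpoint is the whole pickled module, not just a state_dict.
model = torch.load("model.pt", map_location="cpu")
model.eval()

numpy_weights = {}
for name, tensor in model.state_dict().items():
    array = tensor.detach().cpu().numpy()
    if array.ndim == 4:
        # PyTorch Conv2d kernels are (out_ch, in_ch, kH, kW);
        # TensorFlow/Keras expects channels-last (kH, kW, in_ch, out_ch).
        array = array.transpose(2, 3, 1, 0)
    elif array.ndim == 2:
        # PyTorch Linear weights are (out_features, in_features);
        # Keras Dense expects (in_features, out_features).
        array = array.T
    numpy_weights[name] = array
```

The ndim-based permutes are a simplification for a plain conv/dense network; anything with embeddings, attention, or grouped convolutions needs layer-by-layer handling.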

31

Background_Thanks604 t1_ixuccby wrote

Thanks for the clarification! Do you know if there is a tutorial/blog post for this approach?

4

Deep-Station-1746 t1_ixugp6t wrote

Nope, I don't think so. If you need help and are willing to wait a bit (a bit busy right now), DM me and I'll take a look at your problem.

6

Background_Thanks604 t1_ixukvcg wrote

Thx - appreciate it! I don't have a problem, I just want to learn/try this approach because I've never heard of it.

7

jobeta t1_ixum5yx wrote

How complex is the model you want to translate?

1

Background_Thanks604 t1_ixun76x wrote

I don't have a model to translate - I read about this approach in the comments and I want to learn about it.

2

ApeForHire t1_ixwveuz wrote

I was actually able to do this once by relying heavily on GitHub's Copilot, i.e. just giving functions very specific names and writing comments like "this function converts a pytorch model into a vector of weights" etc. It worked pretty well and was simple for basic network architectures, though I imagine it could get more complicated.

1

CodaholicCorgi OP t1_iy1s4ws wrote

I did it once in one of my projects. It's basically hand-picking weights from the PyTorch model's layers and copying them into the TensorFlow model's layers, but it feels more reliable than relying on ONNX and its pile of warnings.

There aren't many tutorials or blog posts about this, so I will try creating a GitHub repo for it later (just examples with simple layers) so more people know that this technique exists.
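
To illustrate the "hand-picking" half, here is a hypothetical continuation of the earlier sketch: `build_tf_model`, the layer names `conv1`/`fc1`, and the `model`/`numpy_weights` variables are all assumed, not taken from anyone's actual project.

```python
import numpy as np
import tensorflow as tf
import torch

# Hypothetical hand-written Keras model whose layers mirror the PyTorch ones.
tf_model = build_tf_model()

# Keras layers take [kernel, bias]; the arrays are assumed to already be
# permuted/transposed as in the export sketch above.
tf_model.get_layer("conv1").set_weights(
    [numpy_weights["conv1.weight"], numpy_weights["conv1.bias"]]
)
tf_model.get_layer("fc1").set_weights(
    [numpy_weights["fc1.weight"], numpy_weights["fc1.bias"]]
)

# Sanity check: the same random input (NCHW for PyTorch, NHWC for TensorFlow)
# should give nearly identical outputs if the port is correct.
x = np.random.rand(1, 3, 32, 32).astype("float32")
torch_out = model(torch.from_numpy(x)).detach().numpy()
tf_out = tf_model(x.transpose(0, 2, 3, 1)).numpy()
print(np.allclose(torch_out, tf_out, atol=1e-5))
```

One real gotcha: if the network flattens conv feature maps before a fully connected layer, the flatten order differs between NCHW and NHWC, so a plain transpose of that weight is not enough and its rows have to be reordered to match.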

2