Why is it that I can create a CNN with 4 layers (output channels: 64, 32, 16, 16) in PyTorch, but in TensorFlow I get resource errors saying I don't have enough resources?

For reference I am using a stock NVIDIA RTX 3080.

Also, now that I am experimenting with larger models, would I benefit from renting a TPU? Would it make the models train faster, and would it help with larger batches?

Comments

schludy t1_j09ukmr wrote on December 15, 2022 at 2:34 AM

Do you handle the data the same way? Maybe you're loading more data in the TensorFlow implementation. It's really hard to tell without seeing the code.

Oceanboi OP t1_j0cpolp wrote on December 15, 2022 at 6:16 PM

Data is handled the same way in both. I think it has to do with what u/MrFlufypants said, because when I restart my kernel and run it after freeing up some resources, it runs. I think the number of filters I was setting is right at the threshold at which my GPU runs out of VRAM, so small memory management differences between TF and PyTorch are causing TF to hit the limit sooner than PyTorch.

mofawzy89 t1_j09kc7w wrote on December 15, 2022 at 1:18 AM

I'm not sure how memory management differs between the two, but I faced the same issue with a BiLSTM. For large models, yes, it's better to use GCP with TPUs or NVIDIA A100s.

MrFlufypants t1_j0ao1ei wrote on December 15, 2022 at 7:07 AM

I’ve had issues where TensorFlow automatically grabs the whole GPU while PyTorch only uses what the model asks for. Could totally not be your problem, but if you’re running multiple models it could be your problem.

veb101 t1_j0atf1m wrote on December 15, 2022 at 8:16 AM

I think this can be solved using:

tf.config.experimental.set_memory_growth(gpu_device, True)
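For context, here is a minimal sketch of how that call is usually wired up, assuming TensorFlow 2.x. The helper name `enable_memory_growth` is made up for illustration, and `gpu_device` above corresponds to each entry returned by `tf.config.list_physical_devices("GPU")`. The setting has to be applied before TensorFlow initializes the GPU, otherwise a `RuntimeError` is raised.

```python
def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand.

    Hypothetical helper: returns how many GPUs were configured.
    """
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed; nothing to configure")
        return 0

    configured = 0
    for gpu in tf.config.list_physical_devices("GPU"):
        try:
            # Grow the allocation as needed instead of reserving
            # (nearly) all VRAM up front.
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # Raised when the GPU has already been initialized,
            # e.g. a model or tensor was created before this call.
            print(f"Could not set memory growth on {gpu.name}: {err}")
    return configured


num_configured = enable_memory_growth()
print("GPUs set to grow on demand:", num_configured)
```

Run this at the very top of the script or notebook, before building any model. In a notebook that has already touched the GPU, a kernel restart is likely needed first.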

MrFlufypants t1_j0atiwu wrote on December 15, 2022 at 8:18 AM

There are a couple ways to do it. That’s the one I use normally. Sometimes that doesn’t work though. Can’t quite remember the use case where it wasn’t working
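One of the other ways, as far as I know, is the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable described in TensorFlow's GPU guide. A rough sketch, assuming it is set before TensorFlow first uses the GPU (the `try`/`except` is only there so the snippet also runs where TensorFlow is missing):

```python
import os

# Set this before TensorFlow initializes the GPU; doing it before the
# import is the safest place.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

try:
    import tensorflow as tf

    # With the variable set, TensorFlow should allocate VRAM on demand
    # rather than reserving most of it at startup.
    print("GPUs visible:", tf.config.list_physical_devices("GPU"))
except ImportError:
    print("TensorFlow is not installed in this environment")
```

The same variable can also be exported in the shell before launching Python, which avoids touching the code at all.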

MOSFETBJT t1_j0dbib4 wrote on December 15, 2022 at 8:36 PM

This is what helped when I had a similar issue

VirtualHat t1_j0ds33h wrote on December 15, 2022 at 10:24 PM

They mean they have 4 Conv layers, with 64, 32, 16, and 16 channel outputs. The filter size is not given.
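To give a sense of scale, here is a back-of-the-envelope parameter count for those four layers. The 3x3 kernel and the single input channel are assumptions, since neither is given in the post, and `conv2d_params` is just an illustrative helper.

```python
def conv2d_params(in_ch, out_ch, kernel=3):
    """Weights plus biases for one 2D conv layer with a square kernel."""
    return in_ch * out_ch * kernel * kernel + out_ch


in_channels = 1                  # assumed: single-channel input
out_channels = [64, 32, 16, 16]  # from the original post

total_params = 0
prev = in_channels
for out_ch in out_channels:
    total_params += conv2d_params(prev, out_ch)  # kernel=3 is an assumption
    prev = out_ch

weight_bytes = total_params * 4  # float32 = 4 bytes per parameter
print(f"parameters: {total_params}")
print(f"weights:    {weight_bytes / 1024:.1f} KiB")
```

Under these assumptions the weights come to roughly 26k parameters, about 100 KiB, so the weights alone are unlikely to fill a 3080's VRAM. Activation memory scales with input size and batch size, which the post also doesn't give, so that (or the framework's allocation strategy) is the more likely place for the memory to go.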