vade t1_j0gpows wrote
Reply to comment by Outrageous_Room_3167 in I have 6x3090 looking to build a rig by Outrageous_Room_3167
It’s fine - we haven’t had a huge issue with it, but we work on video-related projects, so memory is always a boon. That’s all!
vade t1_j0gp6ha wrote
Reply to comment by Outrageous_Room_3167 in I have 6x3090 looking to build a rig by Outrageous_Room_3167
We have a Ryzen 3950X with maxed-out RAM (128 GB), but I’d like to get a system that supports more RAM and x16 on all PCIe slots. But alas - $$$
vade t1_j0dl85z wrote
I run 3x 3090s in a single case, without water cooling, using one PCIe riser and keeping the case open to allow for airflow. This is on a single 1600 W PSU, no NVLink.
Anything more would be tough without a custom loop and dual PSUs.
Works great!
edit: I use a Fractal Design Define XL and mount one 3090 FE vertically with a riser. It’s janky but works.
vade t1_iumm84n wrote
So there are a few ways to think about this, and some things to know:
A) The Apple Neural Engine is designed for inference workloads, not backprop or training, as far as I’m aware.
B) This means GPU or CPU only for DL training.
C) You can get partial GPU acceleration using PyTorch or TensorFlow, but neither is fully optimized or really competitive (see the sketch after this list).
D) You can accept the training wheels (pun intended) and train simple models using the Create ML GUI, which has about as good M-series GPU support as you’ll get, but it’s woefully out of date for many classes of problems and doesn’t support arbitrary layers, losses, optimizers, etc. It’s a black box.
E) You can use the Create ML API to get a tad more control, but not much more.
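For context on (C), here’s a minimal sketch of what partial GPU acceleration via PyTorch’s MPS backend looks like; the model and tensor shapes are just placeholders:

```python
import torch

# Use Apple's Metal Performance Shaders (MPS) backend if this build of
# PyTorch supports it and the hardware has it; otherwise fall back to CPU.
device = torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu")

model = torch.nn.Linear(128, 10).to(device)
x = torch.randn(32, 128, device=device)

# Forward and backward both run on the GPU, but plenty of ops are still
# unimplemented or fall back to CPU, which is where the "not fully
# optimized" caveat bites.
loss = model(x).sum()
loss.backward()
```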
If you’re interested in Core ML for inference, I’ll say from experience that model conversion is non-trivial if you want performance: some layers don’t always convert appropriately, and shapes can’t always be deduced, depending on the model’s source code.
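As a rough illustration, the usual coremltools path looks like the sketch below; pinning an explicit input shape (rather than relying on shape deduction) is often what makes or breaks the conversion. The model, input name, and shape here are assumptions for the example:

```python
import torch
import coremltools as ct

# Trace the PyTorch model first; conversion operates on the traced graph.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
example = torch.randn(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

# Give coremltools an explicit, fixed input shape so it doesn't have to
# deduce one -- ambiguous shapes and unsupported layers are where
# conversions tend to fall over.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="input", shape=example.shape)],
)
mlmodel.save("model.mlpackage")
```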
Also, Core ML inference in Python doesn’t properly support batching. I’m not joking.
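Concretely, the Python bindings make you feed samples one at a time, so the “batch” ends up being a loop, something like this (the input name and model file are assumptions carried over from the conversion sketch above):

```python
import numpy as np
import coremltools as ct

mlmodel = ct.models.MLModel("model.mlpackage")
batch = np.random.rand(32, 3, 224, 224).astype(np.float32)

# No real batched predict in the Python API: run each sample
# individually and collect the results yourself.
outputs = [
    mlmodel.predict({"input": sample[None, ...]})  # keep a batch dim of 1
    for sample in batch
]
```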
All in all, if you get simple shit working it’s fast, but if you want anything remotely nuanced or not out of the box, you’re fucked unless you want to write custom Metal re-implementations of things like NMS so you can get access to outputs Apple’s layers don’t supply.
Source: banging my head against the fucking wall.
vade t1_iqohymz wrote
Running training will throttle the GPU quickly due to heat buildup. These systems simply have limited airflow and can’t sustain very high clock speeds under sustained load, especially ML training loads, which really put systems under duress.
I don’t know enough about either build, but it’s something to be aware of.
vade t1_j6xeylg wrote
Reply to [D] Apple's ane-transformers - experiences? by alkibijad
FWIW, a colleague of mine is working on this and is also hitting some hiccups. I’ve pointed them to this thread :)