lqstuart t1_iuouqng wrote

I'm a little confused by some of the answers here, not sure if it's because this sub skews towards academia/early career or maybe I'm just out of touch.

Pretty much anywhere in the industry your $3000 M1 Mac is going to be used for either opening up a tab in Chrome or, at most, the rigorous task of SSHing into a Linux VM. There are basically two reasons:

  • Large, public companies typically don't allow their user/production data to exist on any machine that has internet access--you can get in very deep shit or in some cases even fired for having it stored locally--so that's game over for an MBP automatically.
  • Most companies of any size will have some need for distributed training. That means you need a platform to schedule training jobs, or else your GPUs will sit idle and you'll go bankrupt. That means maintaining compatible versions of TF/PT, CUDA, and GCC, which eventually means building your own distribution of TF/PT. Nobody's going to bother building a separate distro for the two different flavors of MBP floating around in addition to production. Often, your model code doesn't work in the absence of the platform SDKs because, for example, it needs to be able to authenticate with S3 or HDFS or wherever the data is stored.
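
To make the SDK-coupling point concrete, here's a minimal sketch (all names hypothetical -- `DATA_ACCESS_TOKEN` stands in for whatever credentials a real platform injects onto cluster nodes):

```python
import os

def open_training_data(path: str):
    """Resolve a dataset path the way a platform SDK typically would.

    On the cluster, the scheduler injects credentials (here a hypothetical
    DATA_ACCESS_TOKEN env var). On a laptop that variable doesn't exist,
    so the exact same model code fails before training even starts.
    """
    if path.startswith(("s3://", "hdfs://")):
        token = os.environ.get("DATA_ACCESS_TOKEN")
        if token is None:
            raise RuntimeError(
                f"no credentials to read {path}; "
                "this code only runs inside the training platform"
            )
        return f"remote-handle({path})"  # stand-in for a real storage client
    return open(path, "rb")             # plain local files still work
```

Nothing M1-specific is even wrong here -- the code just assumes an environment only the platform provides.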

I'm not sure that any company besides Apple will ever invest in Apple's M1 shit specifically, but nobody uses TPUs and that doesn't stop Google from pushing them every chance they get. However, a lot of the industry is getting more and more pissed at NVIDIA, which may in turn open things up to local development in the future.

2

laprika0 OP t1_iup4bqt wrote

These are interesting points. I think it depends on where in the stack I'll be. At my last place I spent most of my time building and testing abstract ML functionality that I never deployed to production myself (other teams did that) and could be tested on a CPU in a reasonable amount of time. I can imagine the "other team" worked with the restrictions you mention. In my next role, I may well wear both hats.

2

lqstuart t1_iuq94uz wrote

The roles where you do a little of both are the most fun! I used to do the algo work, now I work entirely on the infra side of things at one of the larger corps. We support some massive teams that have their own platform team between the AI devs and us, and also some smaller teams where the AI devs do it all themselves and just talk to us directly.

In all cases, where I am now and in my previous infra-only role, the AI teams were kinda stuck on our Linux shit for the reasons I described--specifically, you need to write stuff differently (or use an SDK that's tightly coupled to the underlying compute) for distributed training so there's no real point running it locally.
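
The "write stuff differently" part usually looks something like this sketch (RANK/WORLD_SIZE/MASTER_ADDR are the standard torch.distributed env-var conventions; that the scheduler sets them is an assumption here, and a real job would pass them to torch.distributed.init_process_group):

```python
import os

def training_context():
    """Decide how this process should train, based on env vars the
    cluster scheduler injects. On a laptop none of them exist, so you
    silently fall back to a single-process run -- which is why a local
    test tells you little about the sharded, multi-node version.
    """
    if int(os.environ.get("WORLD_SIZE", "1")) > 1:
        return {
            "mode": "distributed",
            "rank": int(os.environ["RANK"]),
            "world_size": int(os.environ["WORLD_SIZE"]),
            "master": os.environ["MASTER_ADDR"],
        }
    return {"mode": "local", "rank": 0, "world_size": 1}
```

The gradient sync, data sharding, and checkpointing logic all branch on that context, so the distributed code path literally never executes on a MacBook.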

I personally REALLY miss the ability to develop and test locally with a real IDE, so I hope something changes--however, the trend is heading towards better remote development, not making stuff work on Mac.

1