
phb07jm t1_isbst7m wrote

I'm going through a similar process, but with an established team. I'm also working at a large company with a big technology arm, so I have the support of a decent data architecture team, and this is still a tricky question to navigate. Here's where my thinking is currently at; I'd love to hear alternative views.

  1. Don't reinvent the wheel. There are many ways to be right here, but building your own in-house MLOps platform is nuts at this stage (unless perhaps you plan to sell access to it - i.e. you're an ML consultancy).

  2. Use industry-standard tools. I'd need to hear a pretty good argument to adopt a niche platform. Standard tools make it easier for new recruits to hit the ground running, and help with retention.

  3. The big players are all viable options, but each may be a stronger or weaker candidate for you depending on what matters most, e.g. model cataloguing and governance, AutoML, data wrangling, monitoring of deployed solutions, experiment tracking and model lineage...

  4. It matters what kind of ML you want to do. For example, will you be doing scalable, low-latency live inference, or mostly batch processes and descriptive modelling? Are you going to be building mostly bespoke/novel algorithms, or do you want access to a lot of pre-trained models and plug-and-play algorithms...

  5. Based on the above, what skills do you need in the team? Is no-code/low-code relevant for you? Do you need data migration tools built in...

  6. There are good open-source solutions for many parts of the MLOps cycle (feature store, labeling, experiment tracking, etc.).

TLDR: start by thinking about what products you'll build, then think about the skills you'll need in the team, then review a bunch of options with someone who knows data architecture, and then pick the one(s) that make the most sense for you.
