TheTwigMaster t1_j5097m6 wrote
Using open source models might be good for quickly experimenting and getting a feel for the value of an approach on a particular problem. But at a company (especially at big tech companies), there are many more things to consider:
- How do I scale this to my particular dataset? It’s a bigger pain to change my data to fit a given model than to change the model to fit my data.
- How do I integrate my company’s infrastructure/tooling/monitoring with this? Often it ends up being simpler to redo the implementation from scratch.
- How easy is it to experiment with adjustments to this? Often we don’t want to pick a single architecture forever, so we want to be able to adjust and modify it easily, and open source models may not always accommodate this (see the sketch after this list).
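To make that last point concrete, here’s a rough sketch of what I mean by keeping the architecture easy to adjust. The config keys and layer choices here are made up purely for illustration; the point is that swapping the architecture becomes a config change rather than a rewrite:

```python
import torch.nn as nn

# Hypothetical config-driven model builder: changing the architecture
# means editing a dict, not touching the training loop.
def build_model(config):
    layers = []
    in_dim = config["input_dim"]
    for out_dim in config["hidden_dims"]:
        layers.append(nn.Linear(in_dim, out_dim))
        layers.append(nn.ReLU())
        in_dim = out_dim
    layers.append(nn.Linear(in_dim, config["num_classes"]))
    return nn.Sequential(*layers)

# Trying a wider or deeper variant is a one-line change.
model = build_model({"input_dim": 784, "hidden_dims": [256, 128], "num_classes": 10})
```

An open source implementation often hard-codes these choices, which is exactly what makes it awkward to iterate on.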
At the risk of being flippant/dismissive: coding up a model/architecture is one of the easiest and fastest parts of the problem. So if you can make everything else easier by implementing the model from scratch, it makes sense to just do that.
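For a sense of scale, a serviceable from-scratch model really is only a dozen or so lines in something like PyTorch. This is just a generic sketch, not any particular production model:

```python
import torch.nn as nn

# A small image classifier written from scratch in about a dozen lines.
class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)
```

The data pipeline, infra integration, and monitoring around a model like this are where the real time goes.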
tennismlandguitar OP t1_j51r1ao wrote
Wow, thanks for the response, that was really enlightening. I never thought about the monitoring needed to support these models.
Which of these problems have you found to be the biggest issue in industry?
With regard to your last point, that definitely makes sense in the case of a simple CNN or deep network, but some of the more complicated RL algorithms or transformers become difficult and time-intensive to implement. In those cases, I would suspect it would be easier to use something open source?
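For example, pulling a pretrained transformer from the Hugging Face transformers library is only a few lines (using bert-base-uncased purely as an illustration), versus what I'd guess is weeks to reimplement and validate from scratch:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Two lines to load a pretrained transformer for a binary classification task.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("Is open source easier here?", return_tensors="pt")
outputs = model(**inputs)
```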