chengstark t1_j1gqy3z wrote

Look up some model compression techniques, use smaller batch sizes, etc. Sorry about your situation; it is very hard to do proper work without the proper tools.
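For what it's worth, here is a minimal sketch of two of those ideas in PyTorch, not a drop-in recipe: post-training dynamic quantization to shrink the model, and gradient accumulation to keep per-step batches small. The model is a made-up placeholder and you'd pass in your own data loader.

```python
import torch
import torch.nn as nn

# Hypothetical model for illustration; swap in your own network.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# 1) Post-training dynamic quantization: Linear weights stored as int8,
#    which cuts memory at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# 2) Gradient accumulation during training: small per-step batches,
#    larger effective batch size (loader batch size * accum_steps).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
accum_steps = 4

def train_epoch(loader):
    optimizer.zero_grad()
    for step, (x, y) in enumerate(loader):
        loss = criterion(model(x), y) / accum_steps
        loss.backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```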

2

chengstark t1_izygwqn wrote

In academia we usually have the data already labeled, but I did one unfortunate project where the annotations were absolute garbage (too many mistakes). Ensuring the correctness of the labeling should be one of the top priorities. From my limited experience, you will want collaborators with domain knowledge of the data to make sure the processing is absolutely correct.
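One cheap sanity check along those lines is inter-annotator agreement. A minimal sketch with scikit-learn, using made-up labels and assuming two annotators labeled the same samples:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical labels from two annotators on the same samples.
annotator_a = [0, 1, 1, 0, 2, 1, 0]
annotator_b = [0, 1, 0, 0, 2, 1, 1]

# Low kappa means the annotators disagree a lot, which is a red flag
# for label quality before any model training starts.
kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
```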

Recent developments in self-supervised learning and large generalized pretrained models may lower the number of labeled samples needed. Not sure how that would affect your product, but it seems related.
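As a rough illustration of that label-efficiency point, here is a sketch of fine-tuning only the head of a pretrained backbone so a small labeled set can go further. The torchvision >= 0.13 weights API is assumed, and the 5-class output is hypothetical.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a backbone pretrained on ImageNet.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained features; only the new head gets trained,
# so far fewer labeled examples are needed.
for param in backbone.parameters():
    param.requires_grad = False
backbone.fc = nn.Linear(backbone.fc.in_features, 5)  # hypothetical 5-class task

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
```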

1

chengstark t1_iztu0do wrote

Sorry for being blunt, but wtf is "productization" in this context? What does this word include? This is way too broad a question; there are many nuances in ML/DL development, and too many variables change depending on the specific use case.

Simple models can be served with just the trained model and some API calls; that part is the same between DL and classical ML. Tasks that aren't computationally intensive don't even need GPUs/TPUs; many can run on embedded hardware. However, the two differ in the amount of data required for training, and data formats/types also matter: typical ML algorithms work better on tabular data, but you wouldn't use them for images. I mean, what kind of garbage question is this lol. You could write a whole book on this.
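To make that first point concrete, a minimal sketch of the "trained model plus some API calls" pattern using scikit-learn and FastAPI. The toy data and the /predict route are placeholders, not any particular production setup.

```python
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel
from sklearn.linear_model import LogisticRegression

# Train a toy tabular model (stand-in for a real training pipeline).
X = np.random.rand(100, 4)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
clf = LogisticRegression().fit(X, y)

app = FastAPI()

class Features(BaseModel):
    values: list[float]  # expects 4 tabular features

@app.post("/predict")
def predict(features: Features):
    # Serve the trained model behind a single API call.
    pred = clf.predict([features.values])[0]
    return {"prediction": int(pred)}
```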

If I were asked this question I'd ask back for a more concrete example; throwing out such a generalized question only indicates the interviewer doesn't have the know-how in ML/DL operations.

2