
SkinnyJoshPeck t1_je5ue3b wrote

I'm not 100% sure what your infrastructure or background is, but generally you can just transform the data into whatever format works best for the model.

So, you would build a pipeline that goes

 Snowflake -> Some ETL process -> Transformed Data Storage -> Model Training -> Model Saving -> Model Loading for API to ask questions

where "Some ETL process" is whatever transforms your data into the shape the model needs, and your model trains from that output.
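To make that concrete, here's a minimal ETL sketch in Python, assuming the snowflake-connector-python package (with the pandas extra). The table, columns, and transform are made up; swap in whatever your model actually needs:

```python
# Minimal ETL sketch -- assumes snowflake-connector-python with the
# pandas extra installed; the table/column names are hypothetical.
import snowflake.connector

# Extract: connect to Snowflake (placeholder credentials).
conn = snowflake.connector.connect(
    user="YOUR_USER",
    password="YOUR_PASSWORD",
    account="YOUR_ACCOUNT",
    warehouse="YOUR_WAREHOUSE",
    database="YOUR_DB",
    schema="YOUR_SCHEMA",
)

# Pull the raw rows you want to train on.
df = conn.cursor().execute(
    "SELECT question, answer FROM SUPPORT_TICKETS"
).fetch_pandas_all()

# Transform: reshape into whatever format the model expects,
# e.g. one text column per training example.
df["text"] = "Q: " + df["QUESTION"] + "\nA: " + df["ANSWER"]

# Load: write to the transformed-data storage your trainer reads from.
df[["text"]].to_parquet("training_data.parquet", index=False)
```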

For example, on AWS you might have something like

Redshift/RDS/whatever -> SageMaker for training -> model output to S3 -> an API serving your model
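Roughly, with the SageMaker Python SDK, kicking off that training step might look like this; the image URI, role ARN, and bucket paths are all placeholders:

```python
# Rough SageMaker training sketch -- assumes the sagemaker Python SDK;
# every identifier below is a placeholder for your own setup.
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="<your-training-image>",  # container with your training code
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder IAM role
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://your-bucket/models/",  # trained model artifact lands here
)

# Train against the transformed data sitting in S3.
estimator.fit({"train": "s3://your-bucket/training-data/"})
```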

or if you're going to run the training and serving yourself rather than on managed cloud ML services, you'd do something like

 Snowflake/Azure/any data source -> Airflow for running training -> model upload to some folder -> API in a Docker container on Kubernetes for users to hit
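A hypothetical Airflow (2.x) DAG wiring those steps together; the task bodies are stubs for your real extract/train/upload logic:

```python
# Hypothetical Airflow 2.x DAG -- task bodies are stubs, plug in
# your actual extract/train/upload code.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_and_transform():
    ...  # pull from your data source, write training files


def train_model():
    ...  # run your training script against those files


def upload_model():
    ...  # copy the trained model to the folder your API serves from


with DAG(
    dag_id="model_training",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@weekly",  # retrain on whatever cadence fits
    catchup=False,
) as dag:
    etl = PythonOperator(task_id="etl", python_callable=extract_and_transform)
    train = PythonOperator(task_id="train", python_callable=train_model)
    upload = PythonOperator(task_id="upload", python_callable=upload_model)

    etl >> train >> upload
```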

or they can just download the model locally and use some script to ask it questions; the details depend on the model/language/etc. that you use.
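For instance, if the model happened to be a Hugging Face-style text model, that script could be as small as this (a sketch; the loading call depends entirely on the framework you trained with, and the path and prompt are placeholders):

```python
# Hypothetical local-inference script for a Hugging Face-style model.
from transformers import pipeline

# Load the trained model from a local folder (placeholder path).
generator = pipeline("text-generation", model="./my-trained-model")

# Ask it a question.
print(generator("Q: How do I reset my password?\nA:")[0]["generated_text"])
```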

This is a fairly complicated task; if your company is getting serious about this, y'all should hire an ML engineer for it. :)


phb07jm t1_je676x4 wrote

Also you might want more than just one ML engineer! 🤣
