
SkinnyJoshPeck t1_je5ue3b wrote

I'm not 100% sure what your infrastructure or background is, but generally you can just transform the data into whatever format works best for the model.

So, you would build a pipeline that goes

 Snowflake -> Some ETL process -> Transformed Data Storage -> Model Training -> Model Saving -> Model Loading for API to ask questions

where "Some ETL process" is whatever transforms your data into the shape the model needs, and your model trains from that output.
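To make that concrete, here's a minimal ETL sketch in Python, assuming the snowflake-connector-python package (with the pandas extra). The table, columns, and transform are made up; swap in whatever your model actually needs:

```python
# Minimal ETL sketch -- assumes snowflake-connector-python with the
# pandas extra installed; the table/column names are hypothetical.
import snowflake.connector

# Extract: connect to Snowflake (placeholder credentials).
conn = snowflake.connector.connect(
    user="YOUR_USER",
    password="YOUR_PASSWORD",
    account="YOUR_ACCOUNT",
    warehouse="YOUR_WAREHOUSE",
    database="YOUR_DB",
    schema="YOUR_SCHEMA",
)

# Pull the raw rows you want to train on.
df = conn.cursor().execute(
    "SELECT question, answer FROM SUPPORT_TICKETS"
).fetch_pandas_all()

# Transform: reshape into whatever format the model expects,
# e.g. one text column per training example.
df["text"] = "Q: " + df["QUESTION"] + "\nA: " + df["ANSWER"]

# Load: write to the transformed-data storage your trainer reads from.
df[["text"]].to_parquet("training_data.parquet", index=False)
```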

For example, on AWS you might have something like

Redshift/RDS/whatever -> SageMaker for training -> model output to S3 -> an API serving your model
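Roughly, with the SageMaker Python SDK, kicking off that training step might look like this; the image URI, role ARN, and bucket paths are all placeholders:

```python
# Rough SageMaker training sketch -- assumes the sagemaker Python SDK;
# every identifier below is a placeholder for your own setup.
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="<your-training-image>",  # container with your training code
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder IAM role
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://your-bucket/models/",  # trained model artifact lands here
)

# Train against the transformed data sitting in S3.
estimator.fit({"train": "s3://your-bucket/training-data/"})
```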

or if you're going to run the training and serving yourself rather than on managed cloud ML services, you'd do something like

 Snowflake/Azure/any data source -> Airflow for running training -> model upload to some folder -> API in a Docker container on Kubernetes for users to hit
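A hypothetical Airflow (2.x) DAG wiring those steps together; the task bodies are stubs for your real extract/train/upload logic:

```python
# Hypothetical Airflow 2.x DAG -- task bodies are stubs, plug in
# your actual extract/train/upload code.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_and_transform():
    ...  # pull from your data source, write training files


def train_model():
    ...  # run your training script against those files


def upload_model():
    ...  # copy the trained model to the folder your API serves from


with DAG(
    dag_id="model_training",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@weekly",  # retrain on whatever cadence fits
    catchup=False,
) as dag:
    etl = PythonOperator(task_id="etl", python_callable=extract_and_transform)
    train = PythonOperator(task_id="train", python_callable=train_model)
    upload = PythonOperator(task_id="upload", python_callable=upload_model)

    etl >> train >> upload
```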

or they can just download the model locally and use some script to ask it questions; the details depend on the model/language/etc. that you use.
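For instance, if the model happened to be a Hugging Face-style text model, that script could be as small as this (a sketch; the loading call depends entirely on the framework you trained with, and the path and prompt are placeholders):

```python
# Hypothetical local-inference script for a Hugging Face-style model.
from transformers import pipeline

# Load the trained model from a local folder (placeholder path).
generator = pipeline("text-generation", model="./my-trained-model")

# Ask it a question.
print(generator("Q: How do I reset my password?\nA:")[0]["generated_text"])
```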

This is a fairly complicated task; if your company is getting serious about this, y'all should hire an ML engineer for it. :)


phb07jm t1_je676x4 wrote

Also you might want more than just one ML engineer! 🤣
