Submitted by Vegetable-Skill-9700 t3_10nfquy in MachineLearning

I am building an open-source ML observability and refinement toolkit.

The tool helps ML practitioners to:

  1. Understand how their models are performing in production
  2. Catch edge-cases and outliers to help them refine their models
  3. Customise the tool according to their needs (hence, open-source)
  4. Keep data security at the forefront (hence, self-hosted)

You can check out the project at https://github.com/uptrain-ai/uptrain. I would love to hear feedback from the community.

59

Comments

Acceptable-Cress-374 t1_j68evw5 wrote

Ok, I'll bite. What's uptrain?

22

Vegetable-Skill-9700 OP t1_j68g80z wrote

So, you know how it's almost impossible to build 100% accurate and super-generalised ML models. On top of that, the performance of these models degrades over time. Furthermore, because ML models are black boxes, identifying problems with them and fixing those problems is super hard.

UpTrain solves exactly these issues. It identifies cases where the model is going wrong, collects those problematic data points, and retrains the model on them to improve its accuracy!

You can check out the repo here: https://github.com/uptrain-ai/uptrain

14

Acceptable-Cress-374 t1_j68pjdo wrote

Ah, you're so sweet! I was actually setting up the updog joke :)

I checked & bookmarked the repo. Looks promising!

24

Vegetable-Skill-9700 OP t1_j68ejbm wrote

We currently support LLMs, vision models, recommendation systems, etc., and are working to integrate UpTrain seamlessly with the major MLOps frameworks and cloud providers.

6

StoicBatman t1_j6bsdgk wrote

I am new here. How is it helpful in making ChatGPT's answers better?

3

Vegetable-Skill-9700 OP t1_j6e8o99 wrote

Firstly, by measuring data drift and analyzing user behavior, UpTrain identifies which prompts/questions were unseen by the model or the cases where the user was unsatisfied with the model output. It automatically collects those cases for the model to retrain upon.
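
To make that first point concrete, here is a toy sketch of flagging "unseen" prompts. It is not how UpTrain measures drift. It assumes a hypothetical `is_unseen` helper that compares a prompt's words against a vocabulary built from training prompts, and the 0.5 threshold and sample prompts are made up for illustration.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words, dropping basic punctuation."""
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def is_unseen(prompt: str, train_vocab: set[str], max_unknown_ratio: float = 0.5) -> bool:
    """Flag a prompt when more than max_unknown_ratio of its words are not in train_vocab."""
    words = tokenize(prompt)
    if not words:
        return False
    unknown = words - train_vocab
    return len(unknown) / len(words) > max_unknown_ratio


# Hypothetical training prompts and production traffic (assumptions for illustration).
train_prompts = ["How do I return my shoes?", "What size shoes should I order?"]
train_vocab = set().union(*(tokenize(p) for p in train_prompts))

production_prompts = ["How do I return my order?", "Explain quantum entanglement simply"]

# Prompts that look unfamiliar are set aside as retraining candidates.
to_retrain_on = [p for p in production_prompts if is_unseen(p, train_vocab)]
```

A real drift check would more likely compare embedding distributions than raw vocabulary, but the collect-what-looks-unfamiliar idea is the same.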

Secondly, you can use the package to define a custom rule and filter out relevant data sets to retrain ChatGPT for your use case.

Say you want to use an LLM to write product descriptions for Nike shoes and have a database of Nike customer chats:
a) Rachel - I don't like these shoes. I want to return them. How do I do that?
b) Ross - These shoes are great! I love them. I wear them every day while practicing unagi.
c) Chandler - Are there any better shoes than Nike? 👟
You probably want to pick out the cases with positive sentiment or the cases with lots of emojis. With UpTrain, you can easily define such rules as a Python function and collect those cases.
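
A rule like that could be sketched as a plain Python function. This is only an illustration, not UpTrain's actual API. The names `POSITIVE_WORDS`, `count_emojis`, and `should_collect`, the keyword list, and the emoji threshold are all assumptions, and a keyword match is a crude stand-in for a real sentiment model.

```python
import unicodedata

# Hypothetical keyword list; a real setup would likely use a sentiment model.
POSITIVE_WORDS = {"great", "love", "awesome", "amazing"}


def count_emojis(text: str) -> int:
    """Count characters in the Unicode 'Symbol, other' category, which covers most emojis."""
    return sum(1 for ch in text if unicodedata.category(ch) == "So")


def should_collect(message: str, min_emojis: int = 2) -> bool:
    """Return True if the message contains a positive keyword or at least min_emojis emojis."""
    words = {w.strip(".,!?").lower() for w in message.split()}
    has_positive_word = bool(words & POSITIVE_WORDS)
    return has_positive_word or count_emojis(message) >= min_emojis


chats = [
    "I don't like these shoes. I want to return them. How do I do that?",
    "These shoes are great! I love them. I wear them every day while practicing unagi.",
    # The shoe emoji is repeated here only to exercise the emoji rule.
    "Are there any better shoes than Nike? \U0001F45F \U0001F45F",
]

# Keep only the chats that match the rule.
collected = [c for c in chats if should_collect(c)]
```

With these made-up thresholds, Rachel's message is skipped, Ross's is kept for its positive words, and the emoji-heavy one is kept for its emoji count.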

I am working on an example highlighting how all the above can be done. It should be done in a week. Stay tuned!

3