13ass13ass t1_je5nc8b wrote on March 29, 2023 at 4:16 PM

You could look at the natural language -> sql query tools that are all the rage right now. I’d recommend checking out langchains sqlchainagent since it’s open source.

jaxolingo OP t1_je7b3lf wrote on March 29, 2023 at 10:46 PM

Cool i'll check it out thanks!

I've been learning langchain but haven't gotten to sqlchain just yet

LetGoAndBeReal t1_je65ffo wrote on March 29, 2023 at 6:09 PM

The comments here so far have addressed three possible approaches to this. Two of those approaches - ie training your own model and fine-tuning an existing model - are not currently viable. Training your model would require a ridiculous amount of human and compute power and not result in something where data could be easily added. Fine-tuning a model does not result in the model absorbing new data - it only conditions the output patterns from the model using data/knowledge the model gained during initial training.

The only viable approach is to use retrieval augmented generation, where data relating to user questions are retrieved from outside the model and fed to model as part of the prompt. Tools like LangChain can help you build a RAG solution on your own. There are also many services coming out that provide this sort of capability, such as humata.ai.

Im_Unlucky t1_je6hn68 wrote on March 29, 2023 at 7:27 PM

This is the way ^

shove data into vector db
embed incoming query
return topk matches from vector db
inject those into prompt
make api call to openai
display response
???
you are now a ML engineer

Hands0L0 t1_je6o5t5 wrote on March 29, 2023 at 8:09 PM

Off topic but I love how underpants gnomes memes are still relevant 25 years later

[deleted] t1_je8ajeo wrote on March 30, 2023 at 3:18 AM

[removed]

visarga t1_je6k74d wrote on March 29, 2023 at 7:44 PM

Often it seems that the model can't properly synthesise information from a bunch of snippets, it lacks the context of those snippets, so it will combine the information incorrectly or hallucinate an explanation.

Retrieval + loading data in the context is far from solved.

Im_Unlucky t1_je6vjqd wrote on March 29, 2023 at 8:56 PM

This is true. Good prompt engineering has definitely gone a long way for me in improving rag performance though. I don’t think shoving snippets into the prompt is the long term solution, but at this point in time it’s the best option in my experience.

kromem t1_je6uv46 wrote on March 29, 2023 at 8:52 PM

"Moar layers" doesn't only need to apply to the NN.

CoT prompting works by breaking analysis down into smaller steps that each generate their own additional context.

Doing something similar with DB analysis is absolutely possible, such as preemptively summarizing schema and using that summary as part of the retrieval to contextualize the specific fragments.

Additionally, having static analysis examples on hand for related tables that's fed in to go from zero shot to few shot would go a long way at reducing some of the issues you highlight.

Tostino t1_je847jg wrote on March 30, 2023 at 2:25 AM

Literally just worked through this today manually as a proof of concept, using the LLM to augment the DB schema with comments describing any relevant info or corner cases. I'm essentially just manually feeding it as context to my prompts when I need to know something related to that set of tables, but it seems pretty powerful. Automating this is going to be nuts.

kromem t1_je84zam wrote on March 30, 2023 at 2:32 AM

> Automating this is going to be nuts.

Yes, yes it is.

Cantareus t1_je6vn79 wrote on March 29, 2023 at 8:57 PM

>shove data into vector db
>
>embed incoming query
>
>return topk matches from vector db

Can these steps be done through the openai API?

You only reply using JSON. Write 5 queries that will return useful data to aid you in answering the users questions "What was Apple’s return compared to it’s sector last month" return in JSON array with no explanations. Each query must run independently. Use the return format [{"Reason":"Why the query is useful","query":"The sql query"}] The schema is:

(Imagine a valid schema is here)

GPT-4 Response:

[

{

"Reason": "Retrieve Apple's stock return last month",

"query": "SELECT stock_return FROM stocks WHERE company = 'Apple' AND date >= DATEADD(month, -1, GETDATE()) AND date < GETDATE();"

},

{

"Reason": "Retrieve sector for Apple",

"query": "SELECT sector FROM companies WHERE company_name = 'Apple';"

},

{

"Reason": "Retrieve companies in Apple's sector",

"query": "SELECT company_name FROM companies WHERE sector = (SELECT sector FROM companies WHERE company_name = 'Apple');"

},

{

"Reason": "Calculate average return for Apple's sector last month",

"query": "SELECT AVG(stock_return) as sector_return FROM stocks WHERE company IN (SELECT company_name FROM companies WHERE sector = (SELECT sector FROM companies WHERE company_name = 'Apple')) AND date >= DATEADD(month, -1, GETDATE()) AND date < GETDATE();"

},

{

"Reason": "Compare Apple's return to the average return of its sector last month",

"query": "SELECT (SELECT stock_return FROM stocks WHERE company = 'Apple' AND date >= DATEADD(month, -1, GETDATE()) AND date < GETDATE()) - (SELECT AVG(stock_return) as sector_return FROM stocks WHERE company IN (SELECT company_name FROM companies WHERE sector = (SELECT sector FROM companies WHERE company_name = 'Apple')) AND date >= DATEADD(month, -1, GETDATE()) AND date < GETDATE()) as comparison;"

}

]

jaxolingo OP t1_je7c63z wrote on March 29, 2023 at 10:54 PM

Cool, this is exactly the approach I had in mind that the model would take

gstatton t1_je7mwmv wrote on March 30, 2023 at 12:15 AM

I’ve done something similar to this, but I created a view in snowflake, dumped the ‘describe table <table-name>’ to CSV, used that CSV descriptor in the head of the prompt, and asked the LLM to generate a SQL query from the user’s question using the provided table schema…I’ll take the model’s return and pass it directly to snowflake, return it as a data frame, and then send that back to the LLM to get a text summary of the query results

You can also use LangChain to set up the chain to do this automatically..

qncapper t1_jebjy9s wrote on March 30, 2023 at 8:34 PM

Cool, how can I be confident about my model not spewing sh*t or not making up things on the fly, cause what it gives out has impact on my stakeholders.

[deleted] t1_jeblsuz wrote on March 30, 2023 at 8:45 PM

[deleted]

machineko t1_je70llx wrote on March 29, 2023 at 9:30 PM

Why would you say that fine-tuning is not viable? There are many production use cases of fine-tuning a model using in-house proprietary data.
If fact, if you have the resources you can do both fine-tuning of an existing model (whether is just supervised or unsupervised) and also use that for retrieval augmented generation.

LetGoAndBeReal t1_je71r0g wrote on March 29, 2023 at 9:38 PM

Fine-tuning can be great for getting better output from the model based on the knowledge that model already contains. I only meant fine-tuning is not viable for getting new data/knowledge into a model. Fine-tuning does not accomplish knowledge absorption.

WokeAssBaller t1_je7y09s wrote on March 30, 2023 at 1:37 AM

Huh? I think that depends on the fine tuning you are talking about. Fine tuning can absolutely add knowledge to a model

lgastako t1_je8i6dw wrote on March 30, 2023 at 4:30 AM

Not generally very well.

WokeAssBaller t1_jea0ubd wrote on March 30, 2023 at 2:35 PM

Fine tuning is additional training, there are lots of ways of doing that and sometimes it’s absolutely ideal, there are tradeoffs

lgastako t1_jea7kb3 wrote on March 30, 2023 at 3:24 PM

Would love to see an example of it adding knowledge effectively. I haven't been able to find any at all.

WokeAssBaller t1_jealxm2 wrote on March 30, 2023 at 4:58 PM

Train one from scratch

lgastako t1_jeayn8v wrote on March 30, 2023 at 6:18 PM

I know training a model from scratch will work, but the context of the conversation is fine tuning an existing model and I'm saying I would love to see examples of the claims people are making actually working, because I have only been able to find and create examples of it not working very well at all.

WokeAssBaller t1_jebpjog wrote on March 30, 2023 at 9:09 PM

fine tuning is just additional training, so if it works from scratch it works with fine tuning. And no it may not be as effective as other methods but the poster was claiming it was impossible

lgastako t1_jecb96v wrote on March 30, 2023 at 11:39 PM

Ok, so can you point me to an example of it working well?

WokeAssBaller t1_jecc92g wrote on March 30, 2023 at 11:47 PM

What a waste of time

[deleted] t1_jece89n wrote on March 31, 2023 at 12:01 AM

[removed]

machineko t1_je83m8x wrote on March 30, 2023 at 2:21 AM

Unsupervised fine-tuning (or extending the pre-training) with additional data will work. Of course, how to get it to learn new information effectively is a challenge but not impossible.

Goldenier t1_je9uruu wrote on March 30, 2023 at 1:51 PM

This is false, and actually most of the time the opposite is the problem: the model learns too much of the new data it's finetuned on (overfitting on it), but forgets the "knowledge" in the original model. The simplest and most popularly used example right now is when you use the dreambooth, Lora or other finetuning methods to finetune parts of the big image diffusion models and if you overtrain it will place the newly trained face or object in almost all of it's output, so it easily learns new data but also easily forgets old one. ( One mitigation for this is to use preservation loss to make sure it also keeps the old knowledge. ) And there is no reason why the same methods wouldn't work on LLMs too, for example there is already Lora for LLMs too.

LetGoAndBeReal t1_je9zfyb wrote on March 30, 2023 at 2:25 PM

>And there is no reason why the same methods wouldn't work on LLMs too, for example there is already Lora for LLMs too.

It's really not helpful to make strong assertions like this without referring to specific, verifiable sources. Fine-tuning very typically is done in a way where certain layers/parameters of the model are frozen. This is done to avoid the sort of loss we are discussing. The LoRA paper itself states that LoRA "freezes the pre-trained model weights".

light24bulbs t1_je7ilvq wrote on March 29, 2023 at 11:42 PM

I disagree that it's not viable to train a model. The problem is that the best public model (llama) is non-commercial.

That said, it's extremely possible to train things into it. There's a ton of new scripts floating around online. The Lora training is especially good.

The trouble with vectors is they are so limited. They're fine if you need to look up one distinct thing (and the vector gets the match right) but they're utterly useless if you'd like the model to learn about something in general.

LetGoAndBeReal t1_je7m1tq wrote on March 30, 2023 at 12:08 AM

Take a closer look at every script/blog/video related to fine-tuning a model and you will see it doesn’t involve adding new knowledge to the model. If you find an exception I’d be delighted to see it.

light24bulbs t1_je7mr9p wrote on March 30, 2023 at 12:14 AM

False, they all do. The process of fine-tuning is identical to the initial pre-training, though perhaps with different settings. They're mostly setup to take q&a data for getting llama to take instructions better, but actually that's just text wrapped in some context and passed in straight up.

I was very confused by this as well but no, you can train new stuff.

LetGoAndBeReal t1_je7n1gc wrote on March 30, 2023 at 12:16 AM

Instead of seeing who can talk more loudly about who’s right, why don’t you post a link of a script that does this.

light24bulbs t1_je7ob17 wrote on March 30, 2023 at 12:25 AM

Okay, here's my friend turning the alpaca instructions into training data

https://github.com/lxe/llama-peft-tuner/blob/main/convert_alpaca_to_text.py

See how it's just turning it into a fat string?

LetGoAndBeReal t1_je7p0l8 wrote on March 30, 2023 at 12:30 AM

In what way does this show that new knowledge was added to a large language model?

light24bulbs t1_je7pnxa wrote on March 30, 2023 at 12:35 AM

This IS training. That's what it is. This is how "knowledge" got into the model in the first place

LetGoAndBeReal t1_je7re7y wrote on March 30, 2023 at 12:47 AM

Of course the fine-tuning data itself can have knowledge not in the model - that doesn’t prove anything.

What you need to show is that knowledge presumably added during fine-tuning was then retrieved from the model after fine-tuning.

light24bulbs t1_je7sn5e wrote on March 30, 2023 at 12:57 AM

The fact that the fine tuning can improve instruction following is EXACTLY that. There's no distinction between predicting the next word, following instructions, or deep knowledge. They are all the same thing as far as an LLM.

WokeAssBaller t1_je7y7ij wrote on March 30, 2023 at 1:39 AM

Lol this guy doesn’t understand ML, you are absolutely adding knowledge to the model

light24bulbs t1_je863pu wrote on March 30, 2023 at 2:40 AM

Yeah, he doesn't get it. That's ok though, but to be wrong and be sure about it is a bummer

LetGoAndBeReal t1_je8akb1 wrote on March 30, 2023 at 3:18 AM

I would agree with that last statement. You think you understand this, but you don’t seem to understand what does and doesn’t happen during fine-tuning or to realize that the problem of adding knowledge to LLMs is a notoriously difficult problem that ongoing research is trying to solve.

Try looking at some of the research: https://openreview.net/forum?id=vfsRB5MImo9

Or read what OpenAI says fine-tuning accomplishes: https://platform.openai.com/docs/guides/fine-tuning

Or, better yet, try actually getting a LLM to learn new facts by fine-tuning it. Then you will understand.

elbiot t1_je8i0i2 wrote on March 30, 2023 at 4:29 AM

The second link says fine tuning is a substitute for lengthy prompts, including putting more into it than can fit in the longest prompt. Prompts are a way to give the model new information. What is your definition of knowledge that isn't something you can put into a prompt?

LetGoAndBeReal t1_je8j7hw wrote on March 30, 2023 at 4:41 AM

The key word in that OpenAI link is “examples”. It says “more examples” and not “more knowledge”, because it’s referring to few shot training, which is about conditioning rather than providing new data.

In other words, if you want to get the model to classify sentiment of user comments as positive or negative, you can provide several examples in the prompt of both positive and negative comments. Fine-tuning allows you to provide many more such examples to the model than can fit in a prompt.

The key point is that through fine-tuning these examples can condition the model to classify sentiment but do not cause new facts to be absorbed by the model. You cannot get new facts to be readily absorbed through fine-tuning, which is why the OP should not look to fine-tuning to endow the model with the external dataset they want to use for question answering.

elbiot t1_je8ngu2 wrote on March 30, 2023 at 5:28 AM

Huh? Have you never included text in a prompt and asked it to answer questions about the text? Seems like that counts as "new knowledge" by your definition

LetGoAndBeReal t1_je9a3hb wrote on March 30, 2023 at 10:41 AM

Of course, that’s what allows RAG to work in the first place. I didn’t say you couldn’t provide new knowledge through the prompt. I only said you cannot provide new knowledge through the fine-tuning data. These are two completely separate things. This distinction is the reason RAG works for this use case and fine-tuning does not.

elbiot t1_je9s53t wrote on March 30, 2023 at 1:31 PM

Your claim that prompting can achieve what fine tuning can't contradicts the documentation for openai that you posted that said fine tuning can do whatever prompting can without the length limit

LetGoAndBeReal t1_jea1id9 wrote on March 30, 2023 at 2:40 PM

I believe you are referring to this statement from the link: "Ability to train on more examples than can fit in a prompt." Correct?

If so, as I explained, the key word here is "examples." And if you understand why, you will see that there is no contradiction. I will try to clarify why.

There are two methods that we are discussing for extending the capability of an LLM:

Prompt engineering
Fine-tuning

There are also different types of capability that might be extended. We are discussing the following two:

Adding new knowledge/facts to the model
Improving downstream processing tasks, such as classification, sentiment analysis, etc.

Both of these capabilities are readily done through prompt engineering. Adding new knowledge with prompt engineering involves including that knowledge as context in the prompt. Improving tasks such as classification is done by include examples of the processing you want done in the prompt.

What the article says is that for the case where you want to provide examples in the prompt to make the model perform better, you can alternatively use fine-tuning. The article does not say "Ability to add more knowledge than can fit in a prompt." Examples = downstream processing tasks. Examples != new knowledge.

WokeAssBaller t1_jea0o2f wrote on March 30, 2023 at 2:34 PM

Again you are using an incredibly limited definition of fine tuning based on what the open ai api allows, which once again tells me you don’t know ML.

Fine tuning is ANY additional training on a foundational model, this can be MLM training on the model base or selectively training the subsequent layers.

OF COURSE this can add knowledge as you are doing the same training that got it knowledge in the first place. Glad to see you jumped on the chatgpt band wagon last week, build a transformer from scratch and come talk to me

light24bulbs t1_je8d6bh wrote on March 30, 2023 at 3:42 AM

Continuous retraining is something else.

I'll be training llama soon, I'll get back to you with how it goes.

LetGoAndBeReal t1_je8m6y9 wrote on March 30, 2023 at 5:13 AM

Include new factual statements in your training data like “Joe Biden’s cat is named Fluffy.” Ask the model the name of Joe Biden’s cat before and after training and let us know the answers you get back. See if you get reliable answers across a set of data/questions.

WokeAssBaller t1_je7yeux wrote on March 30, 2023 at 1:41 AM

This is a fine approach but fine tuning can and does add knowledge to models, please quit saying that

LetGoAndBeReal t1_je9c66v wrote on March 30, 2023 at 11:05 AM

Instead of insisting that fine-tuning reliably adds new knowledge to an LLM, why not instead show some evidence of this claim. Per my links above, this is a notoriously challenging problem in ML.

Apart from these resources, let's think critically for a second. If the approach were viable at this point, then there would be tons of commercial solutions using fine-tuning instead of RAG for incorporating external knowledge in an LLM application. Can you find even one?

WokeAssBaller t1_jea17d0 wrote on March 30, 2023 at 2:38 PM

Why don’t you actually implement a transformer from scratch and then speak more confidently, this is like talking to a virgin about sex.

jaxolingo OP t1_je7clzh wrote on March 29, 2023 at 10:57 PM

Nice!! i'll look into it

valjestir t1_je8w22c wrote on March 30, 2023 at 7:19 AM

RAG is exactly what OP needs here. I don’t think Langchain has any way to connect to Azure of Snowflake though so they still need some way to extract that data.

A project I’m working on helps with ETL for retrieval augmented generation: https://github.com/ai-sidekick/sidekick

master-leaf t1_je5dtrm wrote on March 29, 2023 at 3:15 PM

There was a paper I read a few months ago (I think it was called tapas). In this paper they show how to ingest tabular data to a transformer model.

bttoddx t1_je5m3g6 wrote on March 29, 2023 at 4:08 PM

Adding to this, it looks like a survey paper was released earlier this month that details a number of methods to do the task op is looking for. The bibliography is a great resource.

jaxolingo OP t1_je5eunu wrote on March 29, 2023 at 3:21 PM

From Hugging Face?

master-leaf t1_je5hhu6 wrote on March 29, 2023 at 3:38 PM

I would check the paper, but I think they fine tune a pre trained local LM. They also created their own encodings to account for the structure of tabular data, such as the column headers, entity rows etc.

I will note though, from what I remember the table sizes were pretty small.

AdamEgrate t1_je5yxd9 wrote on March 29, 2023 at 5:29 PM

Tapas is from 2020 which In this field makes it ancient.

[deleted] t1_je6ra9u wrote on March 29, 2023 at 8:29 PM

[removed]

SkinnyJoshPeck t1_je5ue3b wrote on March 29, 2023 at 5:00 PM

I'm not 100% sure what your infrastructure or background is, but generally you can just transform data to whatever data format works best for the model.

So, you would build a pipeline that goes

 Snowflake -&gt; Some ETL process -&gt; Transformed Data Storage -&gt; Model Training -&gt; Model Saving -&gt; Model Loading for API to ask questions

where that Some ETL process is a process that transforms your data to whatever the model needs, and your model trains from that.

For example, on AWS you might have something like

Redshift/RDS/Whatever -&gt; SageMaker -&gt; Output Model to S3 -&gt; API for your model or something idk

or if it's all going to be on-prem and you won't have Cloud tech, you'd do something like

Snowflake/Azure/Any Data Source -&gt; Airflow for running training -&gt; Model Upload to Some Folder -&gt; API in a docker container in Kubernetes or something for users to hit

or they can just download the model locally and use some script to ask it questions, I'm not 100% sure it all depends on the model/language/etc that you use.

This is a fairly complicated task; if your company is getting serious about this, y'all should hire someone who is an ML engineer to do this task. :)

phb07jm t1_je676x4 wrote on March 29, 2023 at 6:21 PM

Also you might want more than just one ML engineer! 🤣

EverythingGoodWas t1_je612lg wrote on March 29, 2023 at 5:42 PM

You aren’t going to train an LLM on company data. You could fine tune an existing one with company data, but creating an LLM from scratch is an absolutely massive compute task. If you are trying to make a closed domain question answering system, that uses your company’s data, you basically need to create a full pipeline from parsing, searching, and finally pushing the context and question to a language model.

visarga t1_je6kqvw wrote on March 29, 2023 at 7:47 PM

I'd rather fine-tune the LLM on company documentations than feeding it through retrieval. Does anyone have experience with fine-tuning GPT-3 on a new text? Can it answer questions or freely use information from this text?

AlmightySnoo t1_jea6dra wrote on March 30, 2023 at 3:16 PM

I'm just curious, how are you supposed to fine-tune a model on company data if the current licences (either explicitly, or implicitly through the licence of the training data) on model weights prohibit commercial use?

EverythingGoodWas t1_jea7gbo wrote on March 30, 2023 at 3:23 PM

You wouldn’t, that would be a direct violation of that license. I would imagine they have a commercial use license as well though.

patniemeyer t1_je5wc4u wrote on March 29, 2023 at 5:13 PM

This may not be what you want, but I was not aware until recently that OpenAI offers an API to fine tune GPT-3/4 on your own data: https://platform.openai.com/docs/guides/fine-tuning

They charge your for training and for usage of your custom model, so it may or may not be economical for your use case.

evergreensphere t1_je613n6 wrote on March 29, 2023 at 5:42 PM

Fine tuning is only available for the old GPT3. It is not available for GPT3.5 or GPT4.

Also, most people I've talked to found that fine tuning did not work as well as using things like vectorized search, or vectorized search combined with graph index.

t_minus_1 t1_je7gsc7 wrote on March 29, 2023 at 11:28 PM

Please look at sketch and langchain pandas/SQL plugins. I have seen excellent results with both of these approaches. Both of these approaches will require you to send metadata to openAI.

If you would like to do it yourself may be start with GPT-J / LORA and use the same instruction approach/fine tuning like databricks did .

ianitic t1_je5j1jo wrote on March 29, 2023 at 3:48 PM

You might like something like this as you use azure: https://azure.microsoft.com/en-us/products/bot-services/

jaxolingo OP t1_je5m4b1 wrote on March 29, 2023 at 4:08 PM

>3

Oh that looks amazing thanks!

DonutListen2Me t1_je7rr2p wrote on March 30, 2023 at 12:50 AM

You don't want to input tabular data to a LLM. You should use a model specifically for tabular data, like XGBoost.

What you want to do is embed your company's documents and use a vector search on the embeddings to feed as context to an LLM.

2blazen t1_je69rl1 wrote on March 29, 2023 at 6:37 PM

I read about this tool on this sub and looks like what you're looking for https://lm-code-binder.github.io/

MadDanWithABox t1_je6vh9x wrote on March 29, 2023 at 8:56 PM

Just a caveat. Any training (fine-tuning really) that you do on an LLM is NOT guaranteed to be able to give you correct answers. The answers to your questions will seem plausible, and could be correct, but you'll need some system to verify if it's something you want to take action on.

OscarYouDotCom t1_je5xu0s wrote on March 29, 2023 at 5:22 PM

Id look into langchain

memberjan6 t1_je65xrg wrote on March 29, 2023 at 6:13 PM

Go wstch YouTube for pinecone.ai and milvus. Also, go watch office365 copilot video.

TheDeviousPanda t1_je5ddm3 wrote on March 29, 2023 at 3:12 PM

It’s going to be a lot easier to just take something like GPT-4 and feed in your data directly and ask questions.

jaxolingo OP t1_je5eovq wrote on March 29, 2023 at 3:20 PM

The end goal would be to add it into the products current chat in the web app, so I can't be doing that :)

only_short t1_je68peh wrote on March 29, 2023 at 6:30 PM

You can just call the GPT-4 API?

abnormal_human t1_je60s31 wrote on March 29, 2023 at 5:40 PM

Yes, it's totally possible to train an LLM to understand tabular data. It's a very general purpose architecture. With enough resources it is well suited to a wide range of problems, and yes, Azure/Snowflake can do everything you need (at some price, assuming you know what to do with them).

You need to make a decision about whether you want to bake the info into the LLM, or whether you want to teach the LLM to find the answers and then format them for humans.

This will depend on your use case, budget, team size, competencies, data set size, and time to market requirements. Baking the info into the LLM is a lot harder than doing the other thing, like potentially 100x-1000x harder and more expensive, and without people with experience doing it, you will waste a lot of time/energy getting there.

athos45678 t1_je7ercw wrote on March 29, 2023 at 11:13 PM

Train a Llama LoRa model. The 30 b model isn’t too expensive to tune (40 bucks ish), and is ridiculously capable.

You just need to format the data in a long text doc with each prompt separated by two line breaks. I found it worked best in the alpaca style, where you have a single line break after the prompt, like “write a function that sorts this table in python def sort():” followed by the written out code, and then the double line break to signal the start of the next input.

Then use the simple-llama trainer app to make it all easy.

netham91 t1_je7jpai wrote on March 29, 2023 at 11:50 PM

Can you share more steps on this and also share some relevant links? Thanks.

athos45678 t1_je82thk wrote on March 30, 2023 at 2:15 AM

So as far as set up goes, you just need to: “”” Git clone https://github.com/lxe/simple-llama-finetuner Cd simple-llama-finetuner Pip install -r requirements.txt Python app.py ## if you’re on a remote machine (Paperspace is my go to) then you may need to edit the last line of this script to set ‘share=True’ in the launch args “””

Then you should get a link for the gradio web app. Copy and paste the code samples in the format described before in the input text box. It will look something like this:

“”” Write a code snippet that sorts a function Def sort(arr):

  Return arr.sorted()

Some other code snippet input

Some answer

Etc. “””

Edit: I’m drinking with friends sorry i can’t format better. Single line break between prompt and observed correct response, double line break between prompt instances.

netham91 t1_je8dxrt wrote on March 30, 2023 at 3:49 AM

Thanks

jaxolingo OP t1_je98vdm wrote on March 30, 2023 at 10:25 AM

nice thanks i'll take a look at LoRa models

VelvetyPenus t1_je8gktg wrote on March 30, 2023 at 4:14 AM

First person to use AI to embezzle majority of company profits arrested. Convict used Reddit to ask how to turn company data into AI dataset.

sandys1 t1_je9thds wrote on March 30, 2023 at 1:41 PM

Hi. We are actually building a framework for this. P.S. 100% open-source so there is no paid component to it right now.

Would you be willing to talk about what you want ? We may be able to help get you what you want.

jaxolingo OP t1_je9w197 wrote on March 30, 2023 at 2:00 PM

Would be amazing yes!

sandys1 t1_jeav3l4 wrote on March 30, 2023 at 5:56 PM

im not able to dm you. can u please dm.

zeoNoeN t1_je6f3b5 wrote on March 29, 2023 at 7:11 PM

I had a lot of success with implementing huggingface models in the last week, so that could be a starting point

netham91 t1_je7jejn wrote on March 29, 2023 at 11:48 PM

Can you share more details or point in the right direction?

Rawvik t1_je6f9w8 wrote on March 29, 2023 at 7:12 PM

I am also currently looking to do something like this for my own company data. Please let me know if you find something useful.

detached-admin t1_je6gbz7 wrote on March 29, 2023 at 7:19 PM

Don't do that if you're dealing with other people's money in any way. If it's yours, try langchain.

shitasspetfuckers t1_je6p0z9 wrote on March 29, 2023 at 8:15 PM

Why not other people's money?

EarthquakeBass t1_je6wa0g wrote on March 29, 2023 at 9:01 PM

I think Azure might actually have support for private OpenAI stuff

bubudumbdumb t1_je7yrmq wrote on March 30, 2023 at 1:43 AM

In the last few days someone posted on hacker news about a system allowing the integration of a gpt with a postgress database

SatisfactionFine6298 t1_je85d51 wrote on March 30, 2023 at 2:35 AM

A personal language model (PLM) would be way better for that than an LLM. There are models you can train like www.personal.ai

Appropriate_Ant_4629 t1_je89aho wrote on March 30, 2023 at 3:07 AM

Databricks announced this week that they're trying to make it easy:

https://www.reuters.com/technology/databricks-pushes-open-source-chatbot-cheaper-chatgpt-alternative-2023-03-24/

Haven't tried it yet, but we will soon.

huyouare t1_je8fby1 wrote on March 30, 2023 at 4:02 AM

I was wondering how this relates to retrieval or SQL queries but it sounds like you’re suggesting that OP finetunes on their dataset regularly. Might be good to try in combination with retrieval, but how would you represent the tabular data as training examples?

disbeam t1_je920cv wrote on March 30, 2023 at 8:48 AM

What some people have done is to use Azure Cognitive Search as a pre-cursor to the LLM.

You use Cognitive Search to extract information from your organisation's own documentation and ask the LLM to only provide the correct answer from the details found in the search, otherwise responding with saying it doesn't know. It then answers complete with references. Having seen it in action with one of our customer's, I've been quite impressed.

Tiny_Arugula_5648 t1_je9oyfo wrote on March 30, 2023 at 1:06 PM

Not sure why no one is calling this out but there is no indication a LLM is going to be useful here.. you have tabular data, unless it's unstructured text held in there it's not goung to be useful, pick the right model for the job..

[deleted] t1_je7kipl wrote on March 29, 2023 at 11:57 PM

[deleted]

VinceD6 t1_je8kaza wrote on March 30, 2023 at 4:52 AM

I am currently trying to do the same thing. Take a look at LlamaIndex, build a POC yesterday and it seemed to work really good.

tedmobsky t1_je9o66u wrote on March 30, 2023 at 1:00 PM

!Remindme 3 days

RemindMeBot t1_je9o9gp wrote on March 30, 2023 at 1:00 PM

I will be messaging you in 3 days on 2023-04-02 13:00:00 UTC to remind you of this link

CLICK THIS LINK to send a PM to also be reminded and to reduce spam.

^(Parent commenter can ) ^(delete this message to hide from others.)

^(Info)	^(Custom)	^(Your Reminders)	^(Feedback)

[deleted] t1_jea4n7e wrote on March 30, 2023 at 3:03 PM

[removed]

Builder992 t1_jebamry wrote on March 30, 2023 at 7:35 PM

!Remindme 3 days

icm76 t1_jebgew9 wrote on March 30, 2023 at 8:11 PM

!remind me in 3 days

j_lyf t1_jebvmnp wrote on March 30, 2023 at 9:49 PM

Where do embeddings come into all of this?

Comments