Icko_ t1_jdecnjx wrote
Reply to comment by edthewellendowed in [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho
Sure:
- Suppose you had 1 million sentence embeddings and one query vector, and you want the closest sentence. If the vectors were single numbers, you could just sort them and do a binary search, and you'd be done. At higher dimensionality, it's a lot more involved. Pinecone is a paid product that does this; Faiss is a library by Facebook that is also very good, and free.
- Recently, Facebook released the LLaMA models. They are large language models. ChatGPT is also an LLM, but after pretraining on a text corpus, it is further trained on human instructions, which is costly and time-consuming. Stanford took the LLaMA models and fine-tuned them on instruction data generated with ChatGPT. The result is pretty good: not AS good, but pretty good. They called it "Alpaca".
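The nearest-neighbor problem in the first bullet can be sketched as a brute-force search in NumPy; this is exactly the computation that Pinecone and Faiss speed up at scale with specialized indexes (the corpus size and dimension below are made up for illustration):

```python
import numpy as np

# Toy corpus: 10,000 sentence embeddings of dimension 384 (made-up sizes).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10_000, 384)).astype("float32")
query = rng.normal(size=(1, 384)).astype("float32")

# Brute-force exact search: L2 distance from the query to every vector.
# This is O(n * d) per query, which is why libraries like Faiss build
# approximate indexes for million-scale corpora instead.
distances = np.linalg.norm(embeddings - query, axis=1)
closest = int(np.argmin(distances))
print(closest, float(distances[closest]))
```

With 1 million vectors you'd pay that full linear scan on every query, which is the point where an approximate index becomes worth it.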
Icko_ t1_jdc09e5 wrote
Reply to comment by _Arsenie_Boca_ in [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho
You could use Faiss instead of Pinecone and Alpaca instead of GPT-4.
Icko_ t1_itk6k37 wrote
Reply to comment by blwom in [D] Simple Questions Thread by AutoModerator
There's a bunch.
Neural MMO - runs every few months at different conferences
https://www.aicrowd.com/challenges/neurips-2022-minerl-basalt-competition
Icko_ t1_ir56fup wrote
Reply to comment by avialex in [R] Self-Programming Artificial Intelligence Using Code-Generating Language Models by Ash3nBlue
Jesus, at least label the axes...
Icko_ t1_jdh2pja wrote
Reply to comment by saintshing in [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho
Idk, I've never heard of it.