michaelthwan_ai

michaelthwan_ai OP t1_jdlf8g8 wrote

Because LLMs have been released at such a rapid pace lately, I organized the recent notable models from the news into a diagram. Some may find it useful, so please allow me to share it.

Please let me know if there is anything I should change or add so that I can learn. Thank you very much.

If you want to edit or create an issue, please use this repo.

---------EDIT 20230326

Thank you for your responses, I've learnt a lot. I have updated the chart:

Changes 20230326:

  • Added: OpenChatKit, Dolly and their predecessors
  • Higher resolution

To learn:

  • RWKV/ChatRWKV related, PaLM-rlhf-pytorch

Models not considered (yet):

  • Models released in 2022 or earlier (e.g. T5 (2022May); this post was created to help people quickly gather information about new models)
  • Models that are not fully released yet (e.g. Bard, under limited review)
37

michaelthwan_ai OP t1_jcxsd0x wrote

Thank you for your comprehensive input.
- I have mixed feelings about opening or closing the technology; there are pros and cons to each. For example, we, especially people in this field, are strongly curious about how the tech giants solve their problems (like ChatGPT). Open-sourcing such models would therefore bring rapid development in related fields (like the current AI development). However, I also understand that malicious usage becomes highly possible when doing so. For example, flipping the reward function of a ChatGPT-like model from positive to negative may turn a safe AI into the worst AI ever.
- Humans do not seem able to stop technological advancement. Those technologies will come sooner or later.
- Yes, I agree that we should preserve our rights today, and society should think carefully about how to deal with this unavoidable (AI-powered) future.

2

michaelthwan_ai OP t1_jcxrilh wrote

haha. Nice point.

I'm not sure whether it fulfils the definition of a search engine, but this work essentially mimics your experience when googling: search Google -> get n websites -> browse them and find the info one by one.

SearchGPT (or e.g. the new Bing) attempts to automate this process. (Thus Google is unhappy.)

1

michaelthwan_ai OP t1_jcxrcfu wrote

I agree with you. Three thoughts from me:

- I think one direction for making so-called safe AI give genuine answers is to give it factual/external info. I mean 1) a retrieval-based model like SearchGPT, or 2) API calling like Toolformer (e.g. checking a weather API).

- An LLM is essentially a compression problem (I got the idea from Lambda Labs), but it cannot remember everything. Therefore an efficient solution is to use retrieval methods to search a very large space (like PageRank/Google search), obtain a smaller result set, and let the LLM organize and filter the related content from it.

- Humans basically work like that, right? When we get a query, we may need to read books (external retrieval), which is pretty slow. However, humans have a cool feature, long-term memory, to store things permanently. Imagine if an LLM could select appropriate things from your queries/chats and store them as text or a knowledge base inside it. That would be a knowledge profile that permanently remembers the context shared between you and the AI, unlike the current situation where ChatGPT forgets everything after a restart.
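
To make the second point concrete, here is a minimal sketch of the retrieve-then-read idea: search a larger collection, keep only a small result set, and hand that set to the LLM as context. Everything in it is an assumption for illustration: the names `tokenize`, `cosine`, `retrieve`, and `build_prompt` are hypothetical, the three documents are made up, and the bag-of-words cosine score is a toy stand-in for a real search engine. It is not how SearchGPT is implemented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric tokens (toy tokenizer).
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    # Score every document against the query and keep at most k
    # documents that share at least one token with it.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    # Assemble the text that would be sent to an LLM (no LLM call here).
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Made-up corpus standing in for "a very large space".
documents = [
    "PageRank ranks web pages by the structure of links between them.",
    "Toolformer teaches a language model to call external APIs.",
    "Bananas are rich in potassium.",
]

query = "How does PageRank rank web pages?"
top_passages = retrieve(query, documents, k=2)
prompt = build_prompt(query, top_passages)
print(prompt)
```

Under the same assumptions, the long-term memory idea from the third point could be modeled by appending selected chat snippets to `documents`, so that later queries can retrieve them.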

3

michaelthwan_ai OP t1_jcws6h8 wrote

ChatGPT said what I wanted to say.

>I apologize for any confusion or misinformation in my previous response. You are correct that SQL databases do support various text search and similarity matching features, including the use of keywords like LIKE and CTE (Common Table Expressions) to enable more flexible and efficient querying.
>
>While it's true that specialized tools like Elasticsearch, Solr, or Algolia may offer additional features and performance benefits for certain natural language processing tasks, SQL databases can still be a powerful and effective tool for storing and querying structured and unstructured data, including text data.
>
>Thank you for bringing this to my attention and allowing me to clarify my previous response.
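
As a small illustration of the keyword matching mentioned in the quote, here is a sketch using Python's built-in `sqlite3` module with a `LIKE` pattern. The `notes` table, its rows, and the `search_notes` helper are hypothetical and exist only for this example; SQLite's `LIKE` is case-insensitive for ASCII characters by default.

```python
import sqlite3

# In-memory database with a made-up table of text notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [
        ("Retrieval-based language models",),
        ("Weather API example",),
        ("Notes on text retrieval in SQL",),
    ],
)


def search_notes(conn, keyword):
    # Substring match via LIKE; the pattern is passed as a bound parameter.
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT body FROM notes WHERE body LIKE ? ORDER BY id",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]


matches = search_notes(conn, "retrieval")
print(matches)
```

This is plain substring matching, not relevance ranking, which is where the specialized tools named in the quote may still help.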

3

michaelthwan_ai OP t1_jctvmqe wrote

Cool! Thanks for sharing.

During my development, I also found 5+ projects, some open-source and some closed, that are doing similar things.

To be exact, this is called a retrieval-based language model.

Some discussion on that:

https://ai.stanford.edu/blog/retrieval-based-NLP/

11