Submitted by spiritus_dei t3_11qgxs8 in MachineLearning
One of the biggest limitations of large language models is the limit on how much text you can feed them in a prompt (the context window). This constrains their use cases and rules out more ambitious prompts.
This was recently addressed by researchers at Google Brain in Alberta, Canada. In their recent paper they describe how augmenting a large language model with an external associative memory removes the text limit, and they also prove that such a memory-augmented model can simulate a universal Turing machine.
This will pave the way for sharing entire novels, personal genomes, etc. with large language models.
The paper talks about the use of "associative memory," which is also known as content-addressable memory (CAM). This type of memory allows the system to retrieve data based on its content rather than its location. Unlike traditional memory systems that access data at a specific address, associative memory retrieves data by matching a pattern or keyword against the stored content itself.
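To make that concrete, here is a rough Python sketch of content-addressable retrieval: you look items up by how well a query matches their stored content, not by where they live in memory. The class, vectors, and similarity measure are just illustrative, not anything from the paper.

```python
# Minimal sketch of content-addressable (associative) retrieval:
# fetch the stored item whose key vector best matches the query's content,
# rather than fetching a value at a fixed address.
import numpy as np

class AssociativeMemory:
    def __init__(self):
        self.keys = []    # content vectors (e.g., embeddings of stored text)
        self.values = []  # payloads associated with each key

    def write(self, key_vec, value):
        self.keys.append(np.asarray(key_vec, dtype=float))
        self.values.append(value)

    def read(self, query_vec):
        # Retrieve by content: return the value whose key has the highest
        # cosine similarity to the query.
        q = np.asarray(query_vec, dtype=float)
        sims = [
            k @ q / (np.linalg.norm(k) * np.linalg.norm(q) + 1e-9)
            for k in self.keys
        ]
        return self.values[int(np.argmax(sims))]

# Usage: store two "documents" under content vectors and query by content.
mem = AssociativeMemory()
mem.write([1.0, 0.0], "notes on chapter one")
mem.write([0.0, 1.0], "notes on chapter two")
print(mem.read([0.9, 0.1]))  # -> "notes on chapter one"
```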
Presumably, this will open up a new market for associative memory; I would happily pay extra for content to be stored permanently in associative memory and for the prompt limit to be removed. If millions of people are willing to pay a monthly fee for storage and the removal of prompt text limits, that should also drive down the price of associative memory.
The paper does point out that conditional statements still confuse the large language models. However, I believe this can be resolved with semantic graphs. This would involve collecting data from various sources and using natural language processing techniques to extract entities and relationships from the text. Once the graph is constructed, it could be integrated into the language model in a variety of ways. One approach is to use the graph as an external memory, similar to the approach taken in the paper. The graph can be encoded as a set of key-value pairs and used to augment the model's attention mechanism during inference, so that attention can focus on relevant nodes in the graph when generating outputs (see the sketch below).
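Here is a hedged sketch of what I mean by encoding a graph as key-value pairs and letting an attention mechanism read from it at inference time. The `embed` function, the triples, and the dimensions are placeholders I made up; this is not the paper's implementation.

```python
# Sketch: treat a semantic graph as an external key-value memory that
# attention can read from. Each (head, relation, tail) triple becomes a
# key vector (head + relation) and a value vector (tail).
import numpy as np

def embed(text, dim=8):
    # Hypothetical stand-in for a real text encoder; deterministic toy vectors.
    rng = np.random.default_rng(sum(ord(c) for c in text))
    return rng.normal(size=dim)

triples = [("Paris", "capital_of", "France"),
           ("Berlin", "capital_of", "Germany")]

keys = np.stack([embed(h + " " + r) for h, r, _ in triples])
values = np.stack([embed(t) for _, _, t in triples])

def read_graph_memory(query_vec, keys, values):
    # Scaled dot-product attention over the graph memory: the query attends
    # to triple keys and returns a weighted mix of values, which could then
    # be injected into the model's context.
    scores = keys @ query_vec / np.sqrt(keys.shape[1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values

query = embed("What is the capital of France?")
context_vec = read_graph_memory(query, keys, values)
```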
Another potential approach is to incorporate the graph into the model's architecture itself. For example, the graph can be used to inform the initialization of the model's parameters or to guide the attention mechanism during training. This could help the model learn to reason about complex concepts and relationships more effectively, potentially leading to better performance on tasks that require this kind of reasoning.
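Something like the toy sketch below is what I have in mind for guiding attention with a graph during training: bias the attention logits so that token positions mapped to connected graph nodes attend to each other more strongly. The bias term, shapes, and adjacency mapping are purely my own assumptions, not anything from the paper.

```python
# Sketch: add a graph-derived bias to attention logits.
import numpy as np

def graph_biased_attention(Q, K, V, adjacency, bias_strength=1.0):
    # Q, K, V: (seq_len, dim) arrays; adjacency: (seq_len, seq_len) 0/1 matrix
    # linking token positions whose underlying graph nodes are connected.
    d = Q.shape[1]
    logits = Q @ K.T / np.sqrt(d) + bias_strength * adjacency
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

# Toy usage with random projections and an identity adjacency.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 16)) for _ in range(3))
out = graph_biased_attention(Q, K, V, np.eye(4))
```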
The use of knowledge graphs can also help ground large language models in verified facts and reduce hallucinations.
I'm curious to read your thoughts.
big_ol_tender t1_jc4lrqf wrote
This makes me depressed because I’ve been working with the llama-index project and I feel like these huge companies are going to take my ball away 😢. They just have too many resources to build stuff.