
dancingnightly t1_j7s355b wrote

In a sense, you can communicate between semantic text embeddings and LMs through this method (it would operate differently from multi-modal embeddings): https://www.lesswrong.com/posts/mkbGjzxD8d8XqKHzA/the-singular-value-decompositions-of-transformer-weight

This method, which right now is really only practical for toy problems, would let you use semantic embeddings to find what to look for when doing SVD on an (autoregressive) LM. You could condition this on the input, for example by transforming your embedding into the keys used in that process, thereby influencing the generated logits. I'm not sure this would behave much differently from altering the logit_bias of tokens, but it would be interesting to hear if it does.
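A minimal sketch of the idea, with stand-in random matrices rather than real LM weights (all shapes and names here are hypothetical): SVD an unembedding-style weight matrix, project a semantic embedding onto its singular directions, and see which token logits those directions would push on.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 64, 100

# Stand-in for an LM's unembedding matrix (residual stream -> logits).
W_U = rng.standard_normal((d_model, vocab))
U, S, Vt = np.linalg.svd(W_U, full_matrices=False)

# Stand-in for a semantic text embedding, assumed to live in the
# same d_model-dimensional space (a real pipeline would need a
# learned map into that space).
emb = rng.standard_normal(d_model)
emb /= np.linalg.norm(emb)

# Coefficient of the embedding along each left-singular direction,
# scaled by its singular value: which weight directions it excites.
coeffs = (U.T @ emb) * S

# Keep the few most-excited directions and map them back to token
# space via the right-singular vectors: an estimate of which logits
# this embedding would most affect (cf. adjusting logit_bias).
top = np.argsort(-np.abs(coeffs))[:5]
logit_shift = coeffs[top] @ Vt[top]
print(logit_shift.shape)  # (100,)
```

The difference from a plain logit_bias tweak is that the shift here is derived from the model's own weight directions rather than chosen per-token by hand.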
