CertainMiddle2382 t1_j6vwxpj wrote

We have absolutely no clue what exactly the latent space of those models represents.

Their own programmers have been trying to figure that out, even with pre-Transformer models, without much success.

There is a huge incentive to do so, especially for time-critical and vital systems like those in medicine or machine control.

Above a few layers, we really don’t have a clue what the activation patterns represent…
