dancingnightly t1_j58anv8 wrote

That's a great resource, thanks. I have studied how this kind of autoregressive model works and found attention fascinating, but here it's the graph embeddings of entities you brought up that sound exciting. I have only skimmed your paper so far, so I may have misunderstood, but what I mean is:

With graph embeddings, could you dynamically capture entities/tokens over a much broader context than common-sense reasoning statements and questions? I.e., do entailment over a whole chapter (or a knowledge base entry with 50 triplets), where the graph embeddings meaningfully represent many entities (perhaps with sinusoidal positional embeddings for each additional mention of an entity in the text, in addition to the graph, just as for attention)?

[Why I'm interested: I presume it's impractical to scale this approach up in context, as with autoregressive models, because the number of edges in a fully connected graph grows quadratically with the number of nodes. I'd love to know your thoughts, though: can a graph be strategically connected, etc.?]

1

axm92 t1_j5b2ug8 wrote

I’m not sure if I understand you, but you can generate these graphs over long documents, and then run a GNN.

For creating graphs over long documents, one trick I’ve used in my past papers is to create a graph per 3 paragraphs, and then merge these graphs (by fusing similar nodes).
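A minimal sketch of that chunk-then-merge idea, under loud assumptions: the per-chunk graphs are taken as already extracted lists of (head, relation, tail) triples, and "similar nodes" is approximated with a simple string-similarity threshold from Python's `difflib`. The helper names (`chunk_paragraphs`, `merge_graphs`), the 0.85 threshold, and the toy triples are all hypothetical; the actual papers may fuse nodes quite differently (e.g. with embeddings).

```python
from difflib import SequenceMatcher


def chunk_paragraphs(paragraphs, size=3):
    """Group paragraphs into consecutive chunks of `size` (3 per the comment above)."""
    return [paragraphs[i:i + size] for i in range(0, len(paragraphs), size)]


def normalize(name):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def similar(a, b, threshold=0.85):
    """Assumed similarity test: character-level ratio on normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def merge_graphs(chunk_graphs, threshold=0.85):
    """Merge per-chunk triple lists, fusing each node into the first similar node seen."""
    canonical = []  # representative node names, in order of first appearance

    def resolve(name):
        for rep in canonical:
            if similar(name, rep, threshold):
                return rep
        canonical.append(name)
        return name

    merged = set()
    for graph in chunk_graphs:
        for head, rel, tail in graph:
            merged.add((resolve(head), rel, resolve(tail)))
    return sorted(merged)


# Toy per-chunk graphs (hypothetical data, one list per 3-paragraph chunk).
chunk_graphs = [
    [("Alice", "works_at", "Acme Corp")],
    [("alice", "lives_in", "Paris"), ("Acme Corp.", "based_in", "Berlin")],
]
merged = merge_graphs(chunk_graphs)
for triple in merged:
    print(triple)
```

Here "alice" and "Acme Corp." are fused into the nodes first seen in the earlier chunk, so the merged graph connects facts from both chunks through shared nodes. The merged edge list could then be handed to whatever GNN library is in use.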

1

dancingnightly t1_j5c31u6 wrote

Oh ok. Thank you for taking the time to explain. I see that this graph approach isn't for extending beyond the existing context of RoBERTa/similar transformer models, but rather for enhancing performance.

I was hoping graphs could capture relational information (in a way compatible with transformer embeddings) between distant parts of the document (like: take all of doc.ents and connect them in a fully connected graph). It sounds like this dynamic graph size/structure per input document wouldn't work with the transformer embeddings for now, though.
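To make the "fully connected graph over doc.ents" idea concrete, here is a rough sketch. It assumes the entities have already been extracted (for instance from something like spaCy's `doc.ents`) and uses plain strings as stand-ins so it runs without any NLP library; `fully_connected_edges` is a hypothetical helper name. It also shows how the edge count grows as n(n-1)/2 with the number of distinct entities.

```python
from itertools import combinations


def fully_connected_edges(entities):
    """Return one undirected edge per pair of distinct entities."""
    unique = list(dict.fromkeys(entities))  # drop repeat mentions, keep order
    return list(combinations(unique, 2))


# Stand-in for entity mentions pulled from one document (hypothetical data).
entities = ["Alice", "Acme Corp", "Paris", "Berlin", "Alice"]
edges = fully_connected_edges(entities)
print(len(edges))  # 4 distinct entities -> 4 * 3 / 2 = 6 edges

# Edge count for larger documents: n * (n - 1) / 2
for n in (10, 50, 200):
    print(n, len(fully_connected_edges([f"ent{i}" for i in range(n)])))
```

Because the graph is rebuilt per document, its size varies with the number of entities found, which is the dynamic-structure issue raised above.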

1