dancingnightly t1_j583tfa wrote

>we first generate a graph that can capture relationship between entities in the question

This is really impressive. What are your thoughts on the state of this kind of approach? Could it be extended from sentences to whole context paragraphs at some stage, with the entities dynamically being different graph items?

1

axm92 t1_j58761h wrote

>Could it be extended from sentences to whole context paragraphs at some stage, with the entities dynamically being different graph items?

Absolutely. Highly recommend that you try playing around with some examples here: https://beta.openai.com/playground.

3

dancingnightly t1_j58anv8 wrote

That's a great resource, thanks. I have studied how this kind of autoregressive model works and found attention fascinating, but here it's the graph embeddings of entities you brought up that sound exciting. I have only skimmed your paper so far, so I may have misunderstood, but what I mean is:

For graph embeddings, could you dynamically capture different entities/tokens over a much broader context than common-sense reasoning statements and questions? That is, could you do entailment over a whole chapter (or a knowledge base entry with 50 triplets), where the graph embeddings meaningfully represent many entities (perhaps with sinusoidal positional embeddings for each additional mention of an entity in the text, in addition to the graph, just like for attention)?

[Why I'm interested: I presume it's impractical to scale this approach to longer contexts, as with autoregressive models, because the number of edges in a fully connected graph grows quadratically with the number of entities. I'd love to know your thoughts: can a graph be connected strategically instead?]

1

axm92 t1_j5b2ug8 wrote

I’m not sure if I understand you, but you can generate these graphs over long documents, and then run a GNN.

For creating graphs over long documents, one trick I’ve used in my past papers is to create a graph per 3 paragraphs, and then merge these graphs (by fusing similar nodes).
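
A minimal sketch of that chunk-and-merge idea, under several assumptions: each per-chunk graph is already available as a list of `(source, relation, target)` triples (the graph generation step itself is not shown), and "similar" nodes are approximated by fuzzy string matching on labels with an arbitrary threshold of 0.85. The helper names `chunk_paragraphs`, `similar`, and `merge_graphs` are hypothetical, and the papers may well fuse nodes differently (for example, with embeddings).

```python
from difflib import SequenceMatcher


def chunk_paragraphs(paragraphs, size=3):
    """Group paragraphs into consecutive chunks of `size` (the last may be shorter)."""
    return [paragraphs[i:i + size] for i in range(0, len(paragraphs), size)]


def similar(a, b, threshold=0.85):
    """Assumed similarity test: case-insensitive fuzzy match on node labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def merge_graphs(graphs, threshold=0.85):
    """Merge per-chunk triple lists, fusing nodes whose labels are similar."""
    canonical = []  # one representative label per fused node

    def resolve(label):
        # Reuse the first representative that is similar enough, else add a new one.
        for rep in canonical:
            if similar(rep, label, threshold):
                return rep
        canonical.append(label)
        return label

    merged = set()
    for edges in graphs:
        for src, rel, dst in edges:
            merged.add((resolve(src), rel, resolve(dst)))
    return merged


# Two per-chunk graphs that both mention "rain" with different casing.
graphs = [
    [("Rain", "causes", "wet roads")],
    [("rain", "requires", "clouds")],
]
merged = merge_graphs(graphs)
```

Here the two "rain" nodes collapse into one, so the merged graph links "wet roads" and "clouds" through a shared node even though they came from different chunks.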

1

dancingnightly t1_j5c31u6 wrote

Oh ok. Thank you for taking the time to explain. I see that this graph approach isn't for extending beyond the existing context of RoBERTa/similar transformer models, but rather enhancing performance.

I was hoping graphs could capture relational information between distant parts of a document, in a way compatible with transformer embeddings (for example, by connecting every entity in doc.ents in a fully connected graph). It sounds like a graph whose size and structure change with each input document wouldn't work with the transformer embeddings for now, though.
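
For concreteness, a rough sketch of the fully connected construction, assuming the entities are given as a plain list of strings (standing in for something like doc.ents; no NLP library is used here). The function names are hypothetical. It also shows how the edge count grows: n entities give n(n-1)/2 undirected edges.

```python
from itertools import combinations


def fully_connected_edges(entities):
    """Return one undirected edge per pair of distinct entities."""
    unique = list(dict.fromkeys(entities))  # drop repeated mentions, keep order
    return list(combinations(unique, 2))


def edge_count(n):
    """Number of undirected edges in a fully connected graph on n nodes."""
    return n * (n - 1) // 2


edges = fully_connected_edges(["Alice", "Paris", "Acme", "Alice"])
small = edge_count(50)    # e.g. a knowledge base entry with 50 entities
large = edge_count(1000)  # e.g. a long chapter
```

Growth in the number of edges is quadratic in the number of entities, which is why a sparser, strategically chosen set of connections becomes attractive for long documents.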

1