elijahmeeks OP t1_jag2y5a wrote

A lot of very confident points that you've posted; you'd do well as an online AI.

  1. ChatGPT is most definitely a data source now and (along with similar tools) will be used as such more and more going forward, so it's good to examine how it approaches making data. Whether or not it "should" be able to assign meaningful numerical scores to things like this, it sure was willing to.
  2. Agreed, and it's even more concerning how it does it with data. Take a look at the end of the notebook and you'll see how it hallucinates with the data it gives me. Again, people are going to use these tools like this, so we should be aware of how it responds.
  3. I think it's revealing not just of the biases of the corpora and creators, but also of the controls put in place to avoid controversy, which make it evaluate certain topics as more "uncertain".
  4. Good point. I struggle with the way this subreddit is designed to showcase a single chart since so many of my charts are part of larger apps or documents.
0

elijahmeeks OP t1_jag0zz8 wrote

I think it reflects the inherent biases in the design, training, and implementation of the ML algorithms that drive ChatGPT. Science topics are considered "less uncertain but more complex" because its source material, creators, and developers believe that, but it also has controls in place to avoid saying controversial things, and topics like ghosts, history, art & religion are all much more likely to involve controversy and therefore be more "uncertain" to ChatGPT when it comes to giving answers.

2

elijahmeeks OP t1_jag0qj8 wrote

Not a stupid question: it's using the "nice" scales functionality in D3. Because the low end doesn't land on a "nice" value, it's hidden (which can be really frustrating sometimes). The chart is generated via a dataviz library I created called Semiotic, which uses D3 under the hood for things like this. You can see the chart interactively in the original notebook that I link to in my comment if you want to play with it.
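For anyone unfamiliar with "nice" scales: D3's `scale.nice()` extends a scale's domain outward to round values so the axis starts and ends on clean ticks. Here's a minimal Python sketch of the idea (a simplification, not D3's exact algorithm):

```python
import math

def nice_domain(lo, hi):
    """Expand [lo, hi] outward to round bounds, similar in spirit
    to D3's scale.nice(). This is a sketch, not D3's exact algorithm."""
    span = hi - lo
    # Use a step based on the span's order of magnitude.
    step = 10 ** math.floor(math.log10(span))
    return (math.floor(lo / step) * step, math.ceil(hi / step) * step)

# A domain like [0.37, 9.2] snaps outward to round bounds.
print(nice_domain(0.37, 9.2))   # -> (0, 10)
print(nice_domain(12, 87))      # -> (10, 90)
```

The tradeoff I mentioned: once the domain is rounded like this, the true minimum of your data (0.37 above) no longer sits on an axis tick, so some renderers won't label it.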

2

elijahmeeks OP t1_jafwspx wrote

Short answer: These ratings come from ChatGPT.

Long answer: You'd have to read through the full exploration to see the entire picture. Basically, I asked ChatGPT to give me 100 topics too uncertain or complex to discuss without misleading users, along with a subject area for each topic and ratings for complexity and uncertainty. So this plot shows the aggregate complexity/uncertainty values of those 100 topics by subject area. You can see it all in much more detail in the notebook I link to in my comment.
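The aggregation step is just an average of the per-topic ratings grouped by subject. A sketch of that, using made-up sample rows (the real notebook uses the 100 topics ChatGPT generated, so these subjects and numbers are purely illustrative):

```python
from collections import defaultdict

# Hypothetical (subject, complexity, uncertainty) rows standing in
# for the ChatGPT-generated topic ratings.
ratings = [
    ("Science", 8, 3), ("Science", 9, 4),
    ("Religion", 6, 9), ("Religion", 7, 8),
    ("History", 7, 6),
]

def aggregate_by_subject(rows):
    """Return {subject: (mean complexity, mean uncertainty)}."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for subject, complexity, uncertainty in rows:
        s = sums[subject]
        s[0] += complexity
        s[1] += uncertainty
        s[2] += 1
    return {subj: (c / n, u / n) for subj, (c, u, n) in sums.items()}

print(aggregate_by_subject(ratings))
# e.g. {'Science': (8.5, 3.5), 'Religion': (6.5, 8.5), 'History': (7.0, 6.0)}
```

Each subject then becomes one row of the dumbbell plot, with complexity and uncertainty as the two endpoints.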

5

elijahmeeks OP t1_jafu0ax wrote

I asked ChatGPT to give me a list of topics that were too complex or uncertain for it to discuss without being misleading, and made a bunch of charts out of the results. This is a nice overview dumbbell plot of the subject areas of the 100 topics that ChatGPT gave me, but if you want to see some treemaps and word clouds you can check out the whole thing here: https://app.noteable.io/f/8e355d65-cd94-4afe-bb81-b9aa24a457dc

I used a Noteable notebook, which is Python+SQL. The data visualization uses their built-in tool DEX, which uses JavaScript with Semiotic and D3 under the hood.

Edited to add: This is a live notebook that you can interact with and explore, and you can export the dataset that I had ChatGPT create for me.

3