ianhillmedia t1_j6db30j wrote

Got it, thanks for the reply! I know not everyone supports RSS, and it’s a challenge when folks format RSS in different ways, but as feeds come straight from the publisher, I’d encourage you to use RSS over APIs from Google.
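For what it’s worth, pulling headlines straight from a feed doesn’t take much. Here’s a minimal sketch using only Python’s standard library, assuming a plain RSS 2.0 layout (the feed content here is made up, and real publisher feeds vary — some use `content:encoded` instead of `description`, add namespaces, etc.):

```python
import xml.etree.ElementTree as ET

# A tiny stand-in for a real publisher feed (hypothetical content).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>City council approves budget</title>
      <link>https://example.com/budget</link>
      <description>The council voted 5-2 on Tuesday.</description>
    </item>
  </channel>
</rss>"""

def parse_items(rss_text):
    """Return a list of {title, link, description} dicts from RSS 2.0 text."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items

items = parse_items(SAMPLE_RSS)
print(items[0]["title"])  # City council approves budget
```

The normalization step (mapping each publisher’s quirks into those three fields) is where the real work is, but the parsing itself is cheap.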

I was curious about the signals in your algorithm as well. One of the challenges with automating taxonomies for news stories is the inexactitude of language and differences in style. A story might mention DeSantis and books in the headline and description but might actually be about GOP primaries; a story might emphasize DeSantis in the primaries in the headline and description but actually be about book banning.

Or a better example: a story that mentions Tyre Nichols may be about the actual incident, police violence or defunding the police.

Digging in even further, a local news organization might use colloquialisms for place names that can make it difficult for folks from outside that market to categorize those stories.
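To make the ambiguity concrete: a naive keyword-matching tagger can’t settle on one topic, because a single headline trips several topic buckets at once. A toy sketch (all topic names and keyword lists here are invented for illustration):

```python
# Hypothetical keyword-to-topic buckets for a naive tagger.
TOPIC_KEYWORDS = {
    "book_banning": {"books", "library", "ban"},
    "gop_primary": {"desantis", "primary", "gop"},
    "education": {"school", "teachers", "books"},
}

def candidate_topics(headline):
    """Return every topic whose keywords overlap the headline's words."""
    words = set(headline.lower().replace(",", "").split())
    return sorted(t for t, kws in TOPIC_KEYWORDS.items() if words & kws)

print(candidate_topics(
    "DeSantis pushes school library book ban ahead of GOP primary"
))
# → ['book_banning', 'education', 'gop_primary']
```

Keywords alone give you candidates, not an answer — deciding which topic the story is *about* needs something beyond surface matching, which is exactly the problem with the DeSantis and Tyre Nichols examples above.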

2

ianhillmedia t1_j6d3dqb wrote

Happy to help! And I think you’re spot on when you say you need to clarify the definition of prevalence. Just because a news org puts resources into a topic doesn’t mean it’s prevalent to the user. That said, the number of stories a news org efforts on a subject is an interesting data point.

As someone on the other side of this, I hear you on the challenges associated with getting useful data. How are you currently tracking all articles published by those news orgs? And how are you parsing that data to identify specific stories - what search terms are you using to filter the data?

2

ianhillmedia t1_j6cydkh wrote

Hey there, journalist here with 20+ years’ experience in the news industry, including 10+ years in digital news. To fairly consider the “prevalence” of news stories and how they progress in the news cycle, you’d need a much bigger dataset. Right now, as you’ve described it in other comments, a more accurate title for this chart would be: “How the top 10 articles were curated and ranked on the homepages of 64 U.S. general interest national(?) news sites when the researcher looked at them.” Correct?

That can still be interesting data, but it doesn’t truly reflect which stories are most “prevalent” in the news.

Know that news website homepages are just one of a few (many?) places people consume news online. Google Analytics, which many news organizations use to track success online, reports acquisition channels as Search, Social, Referral and Direct. The percentage of users who are delivered (“prevalence”) and consume news via each channel varies by news organization, but a news org with a legacy brand will get a healthy percentage of traffic from each. People who come to news homepages are a subset of Direct traffic, which also includes users who just type in Mylocalsite.com/weather, for example.

Direct traffic also can include visitors to news organization mobile apps, which can be curated differently from website homepages. Direct traffic does not include people who read push alerts from mobile apps but don’t click through, and what those folks see also should be considered when determining the “prevalence” of news stories online. Referrals, meanwhile, can include visitors who click through from news organization email newsletters, which are often written and curated differently from homepages.

So the stories “prevalent” to U.S. news consumers can be different based on the platform on which they’re delivered news.

That brings us to Social and Search, both of which send healthy traffic to U.S. news sites and play a noteworthy role in determining the “prevalence” of news for Americans. Pew Research Center data reported in September indicated that 50% of U.S. adults get news from social media sometimes or often. 82% of American adults use YouTube, and 25% of those users say they regularly get news on the site; 70% use Facebook, and 31% of those users regularly get news on the site; 30% use TikTok, and 10% of those users regularly get news on the site.

So to really track and report the “prevalence” of news stories, you’d also need to track and report which stories are delivered and consumed on social, and that delivery is determined in large part by social network algorithms powered by user behavior. Which is in part how we get vertical communities on social (“BookTok,” “Black Twitter”). The prevalence of news stories in those communities can be community-dependent.

For Search, the good news is that data on what news stories people seek out and are delivered is available from Google Trends. That said, I’d suggest reading the Google Trends help docs before digging into and reporting that data. You need to know what relevance means to Google when looking at those numbers.

Those are just the differences in digital formats that need to be considered when researching the “prevalence” of news stories. We haven’t even discussed that to really measure “prevalence” you’d need to consider what’s in print editions and broadcast newscasts, both of which still help determine the news agenda for the country. We haven’t discussed the role that consumers play in setting the agenda - the number of people clicking on a story on other acquisition channels helps determine if that story is ranked on a homepage, and for how long. And we didn’t discuss the fact that some news organizations are testing personalization of homepages powered by machine learning. What you see on a news homepage can be unique to you and based on how cookies tracked your habits across the web. It might be different from what other visitors see.

It’s also worth noting that 64 news sites may not constitute a useful sample. At a minimum, in the top 100 DMAs in the U.S., there are typically at least four broadcast news websites and one newspaper of record website. That’s at least 500 local sites that determine the prevalence of news in the U.S. just in the top 100 DMAs. There are 210 Nielsen DMAs in the U.S. Many of those DMAs also are home to hyperlocal startups and alts, which also should be considered when tracking and reporting the “prevalence” of news stories. What’s prevalent to people in Cleveland will be different from what’s prevalent to people in Memphis, which will be different from what’s prevalent to people in L.A., etc.

And that’s just the U.S.

That’s not to say that there aren’t worthwhile data-based stories to tell about news consumption and delivery in the U.S. It’s always interesting to learn more about how specific stories are presented by different news organizations in a specific market. You also could subscribe to a bunch of different newsletters and report on what they present, given that newsletters are static. Google Trends data from the previous day also is static.

Here’s the source of the Pew data about news consumption on social: https://www.pewresearch.org/journalism/fact-sheet/social-media-and-news-fact-sheet/

Hope that’s helpful!

2

ianhillmedia t1_j2eb20h wrote

The economic model creates additional challenges. Without corporate financing, ad revenue or other support, admins need to limit access to their servers so they can pay the bills. That means users can’t necessarily immediately join the instances that best meet their interests. Users then come away saying Mastodon is too complicated. But some of it is also branding. When you use terms like “server” and “instance,” users assume the network is only for those who are tech-savvy.

All that said, learning to navigate Mastodon really isn’t any more complicated than learning how to navigate subreddits. While it took me a minute to learn the Mastodon UX, it also takes a minute to learn Reddit’s UX and culture - the importance of replying vs. posting, how to find active subreddits that meet your interests, the rules of subreddits, etc. And I’d argue that learning how to navigate a Mastodon instance is easier than learning how to navigate Facebook right now - that UX is a trainwreck.

1

ianhillmedia t1_j2b915g wrote

One of the things I’ve actually appreciated about Mastodon instances is that they’ve kept me more engaged. That’s because on Mastodon, you’re not limited to only seeing the posts of folks you follow or that those folks want you to see. You can easily see feeds of the posts of everyone on your instance, posts from the folks they follow and folks on other instances. Twitter initially offered a feed of all users on its service - it’s similar to that. So I’m exposed to ideas I don’t get on other services. That said, Mastodon’s 8M users are still largely early adopters - technologists, scientists, artists, activists, educators and journalists - as the post notes. That’s not everybody’s community. If they’re going to keep growing, Mastodon instances will need to attract folks from other communities - and that, in part, means addressing the perception that Mastodon is difficult to join and understand.

11

ianhillmedia t1_j2auaea wrote

I honestly think this is a branding problem more than anything. I joined a Mastodon server in late October as part of the first Twitter migration, and while it took a minute to get used to, I didn’t find it all that difficult to learn. I also think a fair amount about what it would be like if I had to learn Facebook’s UX today from scratch. It’s more complicated than that of a typical Mastodon server.

14

ianhillmedia t1_j243z7p wrote

It’s definitely worth reading the article for context; the headline over-promises. It only has $100K in funding and it’s being developed by two researchers at a college in Illinois. The audience isn’t the average person - they want to create something that will send doctors an alert when medical misinformation is spreading so that those doctors can use their communication channels to spread facts. They also say it’s at least a few years away from launch.

1