Note taker: Brian Trammell
Mostafa Ansar (10 mins) (remote)
(no questions)
Elisa Luo (15 mins) (remote)
Alissa Cooper: Slide 16, What does "explicitly allowed" mean?
Elisa: There is a robots.txt rule allowing the crawler directly
Alissa: Removed restrictions is a delta?
Elisa: Yes, delta from previous monthly snapshot.
Corinne Cath: You identify some next steps that content creators should be taking. Do you have a sense of why they're not, and what it would take to get people to use the tools that we have?
Elisa: From user study, a lot of content creators don't have the technical ability to implement robots.txt. You need control over the webserver your site is hosted on. 99% of smaller content creators just use Weebly or Wix, which don't actually give you access to robots.txt. First next step is for these hosters to provide options for this and to advertise this for users.
Corinne: So we have a need to talk to the Weeblys of the world.
Elisa: Something like Squarespace, they do have this option, but still only 17% enabled it.
Suresh Krishnan: For active blocking, when you say "block AI", you're blocking based on UA or IP ranges?
Elisa: We only looked at user agent, because we can't spoof the address range.
Suresh: For those hosters with such a feature, did they default to blocking or not blocking?
Elisa: Most do not block by default.
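For readers unfamiliar with the distinction discussed above, the following is a minimal sketch of how "explicitly allowed" could be told apart from "no rule at all" when looking only at user-agent groups in robots.txt. It is an illustration and not the method used in the presented study; the function names, the sample file, and the four category labels are assumptions made for this example.

```python
def parse_groups(robots_txt):
    """Split robots.txt into (user_agents, rules) groups (simplified parser)."""
    groups, agents, rules = [], [], []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:                            # a rule line closed the previous group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups


def classify_agent(robots_txt, agent):
    """Classify how robots.txt treats a crawler that is named directly.

    The labels are hypothetical categories chosen for this sketch.
    """
    agent = agent.lower()
    rules = [rule
             for agents, group_rules in parse_groups(robots_txt)
             if agent in agents                  # only groups naming the crawler, not "*"
             for rule in group_rules]
    if not rules:
        return "no specific rule"
    if ("disallow", "/") in rules:
        return "explicitly disallowed"
    if ("allow", "/") in rules or ("disallow", "") in rules:
        return "explicitly allowed"
    return "partially restricted"


# Sample file, invented for illustration.
SAMPLE = """
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /private/
"""

for name in ("GPTBot", "Googlebot", "CCBot"):
    print(name, "->", classify_agent(SAMPLE, name))
```

Because the check matches on the user-agent token alone, it shares the limitation raised in the discussion: it says nothing about blocking by IP range.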
Robert Sparks (5 mins)
(no time for questions)
Chris Petrillo and Birgit Müller (20 mins) (remote)
(no questions)
Thibault Meunier (5 mins)
Richard Wilhelm: Any data on Cloudflare 402? Any intent to publish anything?
Thibault: Nothing public yet. That process may take some time.
Alissa: Can you go to the 24/25 shares? What is this a share of?
Thibault: Total amount of bot traffic, not blocked traffic. Same on the previous slide.
Mirja: How much of your traffic is bot traffic?
Thibault: Available at radar.cloudflare.com in detail. Headline: about 30%.
Krishna Madhavan (20 mins)
Ameya Deshpande: If a website is using IndexNow, do you still crawl it independently?
Krishna: It means that it becomes the primary signal. There is a little traffic that persists, but as IndexNow becomes more trusted, that tapers off.
Wes Hardaker: Thanks to the chairs and the presenters. The missing piece here is how do you bootstrap this? With a lot of crawlers, especially from new AI companies, it would be good if we could incorporate models that already exist, like sites that publish dumps.
Krishna: We're very early now, just starting to see rapid adoption...
Brian Trammell: Can you say something about uptake of the API?
Krishna: Bing results went from 33% to 50% IndexNow-sourced over the past year. Will look into how to publish API-side numbers.
Kiriti Kompella: Can you attribute Bing crawler traffic reduction to IndexNow?
Krishna: Big part of it, but we're also working hard on detecting change in the Internet. When you see the traffic go down by 1.8% or whatever, that's billions of crawls.
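As background to the IndexNow discussion above, here is a minimal sketch of what a push notification from a site might look like, assuming the publicly documented IndexNow JSON submission format (a POST carrying `host`, `key`, and `urlList`). The host, key, and URL values are placeholders invented for this example, and the request is only built, not sent.

```python
import json
import urllib.request

# Shared endpoint named in the public IndexNow documentation (assumed here).
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Build (but do not send) a POST telling search engines that URLs changed."""
    payload = {"host": host, "key": key, "urlList": list(urls)}
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder values; a real site would use its own host and verification key.
req = build_indexnow_request(
    host="www.example.org",
    key="0123456789abcdef",
    urls=["https://www.example.org/updated-page"],
)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
# Sending it would be: urllib.request.urlopen(req)
```

The point of the design is that the site announces changes itself, so a crawler that trusts the signal does not need to re-fetch unchanged pages to discover what is new.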
Dirk Kutcher: +1 thanks to the chairs. This is the kind of activity we should do more of in the IRTF. For the topic just presented, thanks for the presentation.
Mirja: Thanks to everyone who presented, and engaged!