The Evidence We’ve Found So Far

submitted by
Style Pass
2022-01-15 10:00:03

What follows is some of the evidence we’ve found so far that supports the idea that crawling the web is a natural monopoly and that Google controls that monopoly. We are writing up more of the evidence and will post notifications about updates to this page as we go along.

As a disclaimer, we are ascribing specific intents to the robots.txt files without having communicated with their authors. Sometimes comments left in the files tell us the website operator’s intent; other times it is not clear whether something was a mistake or whether the operator wanted to give Google as wide access as they did. When we aren’t sure about something, we have tried to note that openly. It can be unclear whether a website operator intended for a part of their website to be crawled exclusively by Google, but those grey areas often work out in Google’s favor.

What follows is a collection of websites that we have written up, each section showing a different type of privilege Google gets when it comes to web crawling. According to our data warehouse and a rough analysis script, there are about 600,000 more sites that show clear bias towards Google in some way. We’re working our way through them, trying to select the ones that are most representative of the types of advantages Google gets.
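
To make the idea of a robots.txt file "showing bias towards Google" concrete, here is a minimal sketch of one way such a check could be written. It is not our analysis script. It only uses Python's standard `urllib.robotparser` to list paths that Googlebot may fetch but another crawler may not. The user-agent name `ExampleBot`, the function name `googlebot_only_paths`, and the sample file are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent name standing in for any non-Google crawler.
OTHER_AGENT = "ExampleBot"


def googlebot_only_paths(robots_txt, paths, other_agent=OTHER_AGENT):
    """Return the paths that Googlebot may fetch but other_agent may not."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [
        path for path in paths
        if parser.can_fetch("Googlebot", path)
        and not parser.can_fetch(other_agent, path)
    ]


# Made-up robots.txt for illustration; not taken from any real site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search

User-agent: Googlebot
Allow: /search
"""

if __name__ == "__main__":
    # Lists the paths in the sample that are open to Googlebot only.
    print(googlebot_only_paths(SAMPLE_ROBOTS_TXT, ["/search", "/about"]))
```

Python's standard parser applies rules in file order and does not support wildcard patterns in paths, so its answers for real-world files may differ from how Googlebot itself interprets them. A check like this can flag candidates, but it cannot tell us whether the operator intended the difference.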
