rek-lama comments on Is there more info on the spider / bot Brave uses to crawl the web?

submitted by
Style Pass
2021-06-23 14:30:08

I don't think they have one, which raises the question of how they build their index. There's almost no info on it except vague mentions of "anonymous contributions from the community".

To sum up, Cliqz used a browser extension that collected search queries and URLs from its users. For example, if someone googled "weather" and clicked "http://asp.usatoday.com", Cliqz would add that query-URL pair to its index. It's right there in the blog post:

Yes, the bulk of query to URL associations come out of our users default search engine, which happens to be Google most of the time. Note however, that we are not crawling these search engines directly. We are learning from them by means of people using Cliqz.
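
To make the idea concrete, here is a minimal sketch of what a query-to-URL index built from reported clicks might look like. This is not Cliqz's or Brave's actual code: the `QueryClickIndex` class, its method names, and the `https://example.org/forecast` URL are all made up for illustration, and a real system would need anonymization, spam filtering, and much more.

```python
from collections import Counter, defaultdict


class QueryClickIndex:
    """Toy index mapping normalized queries to the URLs users clicked for them.

    Hypothetical illustration only; not based on any published Cliqz or Brave code.
    """

    def __init__(self):
        # normalized query -> Counter of clicked URLs
        self._clicks = defaultdict(Counter)

    @staticmethod
    def _normalize(query):
        # Lowercase and collapse whitespace so "Weather " and "weather" match.
        return " ".join(query.lower().split())

    def record(self, query, clicked_url):
        """Store one (query, clicked URL) observation reported by a client."""
        self._clicks[self._normalize(query)][clicked_url] += 1

    def lookup(self, query, limit=10):
        """Return clicked URLs for a query, most frequently clicked first."""
        counts = self._clicks.get(self._normalize(query))
        if counts is None:
            return []
        return [url for url, _ in counts.most_common(limit)]


index = QueryClickIndex()
index.record("weather", "http://asp.usatoday.com")
index.record("Weather ", "http://asp.usatoday.com")
index.record("weather", "https://example.org/forecast")
print(index.lookup("weather"))
```

The point of the sketch is that nothing here requires crawling: the ranking signal comes entirely from what users searched for and clicked elsewhere.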

I'm curious as to what Brave Search is doing to keep their index up to date, but there's no information on that, perhaps because it would make them sound a lot less impressive ¯\_(ツ)_/¯

In addition to the search queries and the clicked URLs, they also need the metadata and result snippets to show on their SERPs. I'd like to know how they collect that data.
