Making Free Proxies with Tor and Ansible

submitted by
Style Pass
2024-09-20 10:30:04

Was reading through Hacker News when I saw “We accidentally burned through 200GB of proxy bandwidth in 6 hours”. Brutal! 😅

I remember getting into Skyvern. Really interesting tech! Too bad the open-source models aren’t quite there yet. I’m not VC enough to spend AI credits on web scraping either.

The post was honest, and I certainly could’ve made the very same mistake! But recently I’ve been feeling more like the Chad pictured below:

My favorite trick at the moment for getting free proxies is to just use Tor. It doesn’t work with every website, but for the ones it does work with, that’s ~2k+ proxies free of charge!
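To make the idea concrete, here is a minimal sketch of pointing an HTTP client at a local Tor daemon. It is not the setup from this post, just an illustration under a few assumptions: Tor is running locally on its default SOCKS port (9050), and the helper name `tor_proxies` and its `session_id` parameter are made up for this example. It leans on the fact that Tor, by default, isolates streams that use different SOCKS credentials onto different circuits, so varying the username will usually (not always) get you a different exit IP.

```python
from urllib.parse import quote

# Assumptions: a Tor daemon on this machine with the default SocksPort.
# (Tor Browser's bundled Tor listens on 9150 instead.)
TOR_SOCKS_HOST = "127.0.0.1"
TOR_SOCKS_PORT = 9050


def tor_proxies(session_id=None, host=TOR_SOCKS_HOST, port=TOR_SOCKS_PORT):
    """Build a requests-style proxies mapping that routes traffic through Tor.

    `tor_proxies` and `session_id` are hypothetical names for this sketch.
    The `socks5h` scheme asks the proxy to resolve hostnames, so DNS lookups
    go through Tor as well instead of leaking from the local machine.
    """
    auth = ""
    if session_id is not None:
        # Tor's SocksPort isolates streams by SOCKS username/password by
        # default (IsolateSOCKSAuth), so each distinct session_id gets its
        # own circuit. The password is an arbitrary placeholder.
        auth = f"{quote(str(session_id), safe='')}:x@"
    url = f"socks5h://{auth}{host}:{port}"
    return {"http": url, "https": url}


if __name__ == "__main__":
    # Print a few proxy configs, one per hypothetical scraping worker.
    for worker in range(3):
        print(tor_proxies(session_id=f"worker-{worker}"))
```

The returned mapping can be passed as the `proxies=` argument to `requests.get` or `requests.Session`, provided `requests` is installed with SOCKS support (`pip install "requests[socks]"`).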

For the PoC (proof of concept), I started with webfp/tor-browser-selenium, which seemed like the natural place to begin. Pretty quickly, though, it became apparent that sites were somehow detecting Selenium and rejecting my requests as bot-like.

I dove deep into the Firefox about:* pages looking for what could be the issue. I spent quite a while on it, trying things like privacy.resistFingerprinting = False, excluding domains, and so on. In the end, I believe it was a combination of Selenium and the browser telling the website that it was being automated: “Marionette”, as they call it.
