When a website adds JavaScript rendering, you don't have to rewrite everything, only switch to one of the browser crawlers. When you later find a

Crawlee is a web scraping and browser automation library

submited by
Style Pass
2024-07-09 09:00:40

When a website adds JavaScript rendering, you don't have to rewrite everything, only switch to one of the browser crawlers. When you later find a great API to speed up your crawls, flip the switch back.

Crawlee is built by people who scrape for a living and use it every day to scrape millions of pages. Meet our community on Discord.

Crawlee for Python is written in a modern way using type hints, providing code completion in your IDE and helping you catch bugs early on build time.

Switch your crawlers from HTTP to headless browsers in 3 lines of code. Crawlee builds on top of Playwright and adds its own anti-blocking features and human-like fingerprints. Chrome, Firefox and more.

Crawlee automatically manages concurrency based on available system resources and smartly rotates proxies. Proxies that often time-out, return network errors or bad HTTP codes like 401 or 403 are discarded.

Leave a Comment
Related Posts