submitted by
Style Pass
2024-09-30 10:30:02

Pipet is a command-line web scraper. It supports three modes of operation: HTML parsing, JSON parsing, and client-side JavaScript evaluation. It relies heavily on existing tools like curl, and it uses Unix pipes to extend its built-in capabilities.

You can use Pipet to track a shipment, get notified when concert tickets become available, watch for stock price changes, or monitor any other kind of information that appears online.

Use the --separator (or -s) flag to specify custom separators for text output. For example, run pipet -s "\n" -s "->" hackernews.pipet to see each item on a new line, with -> between the title and the domain.

Use the --json flag to make Pipet collect the results into JSON. For example, run pipet --json hackernews.pipet to get a JSON representation of the same results.

Use Unix pipes after your queries, as if they were running in your shell. For example, you can count the characters in each title (with wc) and extract the full article URL (with htmlq).
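To get a feel for what piping a selected value through wc produces, here is a minimal shell sketch that runs outside Pipet. The helper name count_chars and the sample title are made up for illustration, and the sketch assumes the selected text reaches the command on standard input without a trailing newline; it does not show Pipet's own query syntax.

```shell
# Hypothetical helper (not part of Pipet): mimics sending one selected
# value through wc, the way a piped query would.
count_chars() {
  # printf '%s' writes the text without a trailing newline, so the
  # newline is not counted. wc -c counts bytes, which equals the
  # character count for ASCII text. tr strips the padding that some
  # wc implementations print around the number.
  printf '%s' "$1" | wc -c | tr -d '[:space:]'
}

# Made-up title, standing in for text a query would select.
count_chars 'Show HN: Pipet'
```

For titles containing non-ASCII text, wc -m counts characters instead of bytes, subject to the current locale.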
