Submitted by
Style Pass
2024-04-02 19:00:08

The pipe is a multimodal-first tool for flattening unstructured files, directories, and websites into a prompt-ready format for use with large language models. It is built on dozens of carefully crafted heuristics that create sensible text and image prompts from files, directories, web pages, papers, GitHub repos, and similar sources.

You can either use the hosted API at thepi.pe or run the pipe locally. The simplest option is the hosted API; follow the instructions on the API documentation page.

When run on a directory, the pipe processes all supported files within it, compresses any information over the token limit if necessary, and outputs the resulting prompt and images to a folder.
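To make the directory-flattening idea concrete, here is a minimal Python sketch. It is not the pipe's implementation or API: the function names, the list of supported suffixes, the four-characters-per-token estimate, and the equal-share truncation rule are all assumptions chosen for illustration. It walks a directory, keeps only files it treats as text, and trims each file so the combined contents stay roughly within a token budget.

```python
from pathlib import Path

# Hypothetical settings for this sketch; not the pipe's real configuration.
TEXT_SUFFIXES = {".txt", ".md", ".py", ".json", ".csv"}
CHARS_PER_TOKEN = 4  # rough rule of thumb, not a real tokenizer


def flatten_directory(root: str, token_limit: int) -> str:
    """Concatenate supported text files under `root` into one prompt string.

    Each file gets an equal share of the character budget; a file that
    exceeds its share is cut off and marked. The budget covers file
    contents only, so headers and markers add a small overhead.
    """
    files = sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in TEXT_SUFFIXES
    )
    if not files:
        return ""

    per_file_chars = (token_limit * CHARS_PER_TOKEN) // len(files)
    parts = []
    for path in files:
        text = path.read_text(encoding="utf-8", errors="replace")
        if len(text) > per_file_chars:
            text = text[:per_file_chars] + "\n[truncated]"
        # Label each file so the model can tell where one ends and the next begins.
        name = path.relative_to(root).as_posix()
        parts.append(f"--- {name} ---\n{text}")
    return "\n\n".join(parts)
```

Plain truncation is the crudest possible form of compression; it stands in here for whatever the pipe actually does with content over the limit, which this page does not describe.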

The pipe is accessible from the command line or from Python. The input source is a file path, a URL, or a directory (or zip file) path. The pipe extracts information from the source and processes it for downstream use with language models, vision transformers, or vision-language models. The output is a sensible text-based (or multimodal) representation of the extracted information, crafted to fit within the context windows of models from gemma-7b to GPT-4. It uses a variety of heuristics for optimal performance with vision-language models, including AI filetype detection, AI PDF extraction, efficient token compression, automatic image encoding, reranking to counter lost-in-the-middle effects, and more, all pre-built to work out of the box.
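The lost-in-the-middle effect is the tendency of long-context models to attend less to material in the middle of a prompt. One common countermeasure, sketched below in Python, is to reorder chunks so the most relevant ones sit at the start and end. This page does not describe how the pipe itself reranks, so treat the function name and the alternating strategy as assumptions, and note that the relevance ordering is taken as given.

```python
def reorder_for_long_context(chunks_by_relevance):
    """Place the most relevant chunks at the edges of the prompt.

    `chunks_by_relevance` must be sorted most-relevant-first. Chunks are
    dealt alternately to the front and the back, so the least relevant
    ones end up in the middle.
    """
    front, back = [], []
    for i, chunk in enumerate(chunks_by_relevance):
        if i % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the second-best chunk lands at the very end.
    return front + back[::-1]


ranked = ["a", "b", "c", "d", "e"]  # "a" is the most relevant
print(reorder_for_long_context(ranked))  # ['a', 'c', 'e', 'd', 'b']
```

The best chunk opens the prompt, the second best closes it, and the weakest chunk ("e") sits in the middle where it matters least.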
