wdoc is a powerful RAG (Retrieval-Augmented Generation) system designed to summarize, search, and query documents across various file types. It's part

Search code, repositories, users, issues, pull requests...

submited by
Style Pass
2024-10-31 22:00:07

wdoc is a powerful RAG (Retrieval-Augmented Generation) system designed to summarize, search, and query documents across various file types. It's particularly useful for handling large volumes of diverse document types, making it ideal for researchers, students, and professionals dealing with extensive information sources. I was frustrated with all other RAG solutions for querying or summarizing, so I made my perfect solution in a single package.

Goal and project specifications: wdoc uses LangChain to process and analyze documents. It's capable of querying tens of thousands of documents across various file types at the same time. The project also includes a tailored summary feature to help users efficiently keep up with large amounts of information.

For extra large documents like books for example, this summary can be recusively fed to wdoc using argument --summary_n_recursion=2 for example.

Leave a Comment