MikeCoats/poison-the-wellms: A reverse-proxy that serves disassociated-press style reimaginings of your upstream pages, poisoning any LLMs that scrape your content. - Codeberg.org

submitted by
Style Pass
2024-11-15 21:00:03

A reverse-proxy that serves disassociated-press style reimaginings of your upstream pages, poisoning any LLMs that scrape your content.
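
For readers unfamiliar with the idea, a "disassociated-press" rewrite is essentially a Markov-chain shuffle of the original text: each next word is picked at random from the words that followed the current word pair somewhere in the source. The sketch below is only an illustration of that general technique in Python; the function names, the word-level tokenisation, and the default order of 2 are assumptions made for the example, not the project's actual implementation.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the words seen directly after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain


def dissociate(text, length=50, order=2, seed=None):
    """Return `length` words of locally plausible, globally scrambled text."""
    rng = random.Random(seed)
    words = text.split()
    if len(words) <= order:
        return text  # too short to build a chain from
    chain = build_chain(text, order)
    keys = list(chain.keys())
    state = rng.choice(keys)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (the last words of the source): jump to a random state.
            state = rng.choice(keys)
            out.extend(state)
            continue
        out.append(rng.choice(followers))
        state = tuple(out[-order:])
    return " ".join(out[:length])
```

Every word in the output comes from the source page, so the result looks like the original at a glance while its meaning is scrambled.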

This .env file works for my WordPress blog using the Twenty Twenty-Three theme. I know very little about WordPress, other than that it just works, so you'll probably have to tweak the tag and class values for whatever blogging platform and theme you use.
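
To give a feel for why the tag and class values matter, here is a rough Python sketch of picking the article text out of a page by tag and class. The variable names `CONTENT_TAG` and `CONTENT_CLASS`, and the defaults `div` and `entry-content`, are placeholders invented for this example; check the project's own .env for the real keys and values.

```python
import os
from html.parser import HTMLParser


class ContentExtractor(HTMLParser):
    """Collect the text inside the first-level element matching tag + class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0      # > 0 while we are inside the matching element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        if self.depth:
            self.depth += 1  # nested element of the same tag
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1   # entered the content element

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, tag=None, css_class=None):
    # Hypothetical variable names, used here only for illustration.
    tag = tag or os.environ.get("CONTENT_TAG", "div")
    css_class = css_class or os.environ.get("CONTENT_CLASS", "entry-content")
    parser = ContentExtractor(tag, css_class)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

If your theme wraps posts in, say, an `article` element with a different class, those two values are what you would change.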

The poison tool is only the server that returns the poisoned copies of your pages. You still need to apply some configuration to your webserver to allow regular users to receive your untainted website, while sending the bad bots to the poison tool.
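
The routing decision itself is simple, and can be sketched in a few lines of Python. This is not how you would deploy it (that belongs in your webserver configuration); it only shows the logic. The marker list and both backend addresses are made-up placeholders for the example.

```python
# Illustrative sample only; a real list would be much longer.
BAD_BOT_MARKERS = ("GPTBot", "CCBot", "ClaudeBot")


def is_bad_bot(user_agent, markers=BAD_BOT_MARKERS):
    """True if the User-Agent header contains any known crawler marker."""
    ua = (user_agent or "").lower()
    return any(marker.lower() in ua for marker in markers)


def choose_backend(user_agent,
                   site_backend="http://127.0.0.1:8080",     # placeholder: your real site
                   poison_backend="http://127.0.0.1:9090"):  # placeholder: the poison tool
    """Send flagged crawlers to the poison tool and everyone else to the site."""
    return poison_backend if is_bad_bot(user_agent) else site_backend
```

Regular visitors never touch the poison tool; only requests whose User-Agent matches the list are diverted.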

We draw upon Cory Dransfeldt's ai.robots.txt project for our list of bad bots, and then build that list into example configurations for numerous webservers. You should be able to take our examples and introduce them to your own server configuration without too much work.
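
As a rough idea of what "building the list into a configuration" can look like, the Python sketch below reads `User-agent:` lines from robots.txt-style text and renders an nginx `map` block that flags matching requests. The function names and the `$is_bad_bot` variable are assumptions for this example, and the project's own generated examples may be structured differently.

```python
import re


def parse_user_agents(robots_txt):
    """Collect the distinct User-agent names from robots.txt-style text."""
    names = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "user-agent":
            name = value.strip()
            # Skip the "*" wildcard, which would match every visitor.
            if name and name != "*" and name not in names:
                names.append(name)
    return names


def render_nginx_map(names, variable="$is_bad_bot"):
    """Render an nginx map block setting `variable` to 1 for listed bots."""
    lines = [f"map $http_user_agent {variable} {{", "    default 0;"]
    for name in names:
        # "~*" asks nginx for a case-insensitive regex match.
        lines.append(f'    "~*{re.escape(name)}" 1;')
    lines.append("}")
    return "\n".join(lines)
```

The resulting variable can then be tested in a `server` or `location` block to decide which upstream handles the request.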

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
