tl;dr: Here’s a how-to for adding some “AI”-poison to your static site that’s hosted on Codeberg Pages (or GitHub Pages). I’d appreciate feedback on whether this is useful and how it could be improved.
If you’re running any type of website in 2025, you’ll likely be suffering from the impact of generative “AI”. Be it “AI”-generated spam posted to your site, crawlers bringing your server(s) down, or simply having your digital work taken without consent and thrown into an environment-destroying plagiarism machine. No wonder I’ve become a card-carrying Luddite.
And I’m not alone in that: algorithmic sabotage and poisoning generative “AI” have been a topic for a while, using a wide range of methods, from poisoned images and video subtitles to various text- and server-based methods, which the Algorithmic Sabotage Research Group has been collecting. This last category covers many different approaches: generating fake texts for “AI” crawlers to read (or serving them Bee Movie), identifying “AI” crawlers and trapping them in a tarpit where they spend aeons of compute time on slow-loading websites full of garbage, and other fun ideas.
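To make the tarpit idea concrete, here is a minimal sketch of such a server-side trap, assuming a server you control. Everything in it (Flask as the framework, the user-agent list, the word list, the delay) is an illustrative assumption, not a vetted implementation:

```python
import random
import time

from flask import Flask, Response, request, stream_with_context

app = Flask(__name__)

# Example crawler user-agent fragments; a real deployment would use a
# maintained list, not these four.
AI_USER_AGENTS = ["GPTBot", "CCBot", "ClaudeBot", "Bytespider"]
WORDS = ["solvent", "marmalade", "quorum", "perihelion", "gasket"]

def garbage():
    """Yield nonsense text one slow chunk at a time, forever."""
    while True:
        time.sleep(5)  # make the crawler wait for every chunk
        yield " ".join(random.choices(WORDS, k=20)) + ".\n"

@app.route("/", defaults={"path": ""})
@app.route("/<path:path>")
def trap(path):
    ua = request.headers.get("User-Agent", "")
    if any(bot in ua for bot in AI_USER_AGENTS):
        # Trap: an endless, slow stream of garbage text.
        return Response(stream_with_context(garbage()), mimetype="text/plain")
    return "Hello, human."
```

The key point is that the trap runs per request: the server inspects headers and decides what (and how slowly) to respond, which is exactly the control static hosting takes away.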
All of these are great, but they unfortunately rely on an actively controlled server environment. That means those approaches won’t help your own direct action if you deploy websites through a static site generator (SSG) like Jekyll, Hugo etc., and even less so if you deploy your static site through something like Codeberg Pages or GitHub Pages, where you have no way to access or edit the web server. As this page is deployed in exactly that static way, I wondered how I could engage in some sabotage or push-back against such crawlers.
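Before getting into the actual how-to, here is roughly what that constraint means in practice: on Pages, everything is pre-built files, so any poison has to be generated at build time rather than per request. A rough sketch of that shift, where the output path, word list, and page count are all made-up assumptions and not the exact method this post ends up using:

```python
import pathlib
import random

WORDS = ["solvent", "marmalade", "quorum", "perihelion", "gasket"]
OUT = pathlib.Path("public/poison")  # assumed SSG output directory
OUT.mkdir(parents=True, exist_ok=True)

# Generate a small maze of interlinked garbage pages at build time.
PAGES = 50  # arbitrary number of pages
for i in range(PAGES):
    text = " ".join(random.choices(WORDS, k=500))
    nxt = (i + 1) % PAGES  # link pages in a cycle so crawlers keep following
    html = (
        "<html><body>"
        f"<p>{text}</p>"
        f"<a href='page{nxt}.html'>more</a>"
        "</body></html>"
    )
    (OUT / f"page{i}.html").write_text(html)
```

Run something like this as part of your build step before pushing to Pages; the poison pages are then just more static files, with no server logic involved.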