I currently put a lot of effort into researching the question how to distinguish web based bots from real human beings. Researchers are publishing intriguing new papers constantly that try to answer this question for good. On this page however, I try to take a slightly more practical approach to find solutions.
— Personal Stance Compared to other actors active in this niche, I am convinced that advanced bots are not always bad. On the contrary, they often drive innovation and nurture creativity. After all, the Internet becomes a greater Oligopoly each passing day. Giving information back to smaller actors is a necessary and endorsed correction.
With Web Based Bots, I refer to bots that communicate mostly over the HTTP and HTTPS protocols and thus interact with the Internet that is visible and accessible to normal human users. Furthermore, I will only consider bots that are somewhat advanced. Put differently, advanced bots leverage real browsers by using some form of browser automation/testing framework such as Playwright, Puppeteer, Selenium, PhantomJS and many others.
In order to not get detected and labeled as a bad bot, there are several areas where an advanced bot programmer attempts to behave as closely as possible as real human being. In the following five chapters, I try to elaborate on those different realms.