Truly portable C applications

submited by
Style Pass
2024-11-22 03:00:03

Programming language polyglots are files that are valid programs in multiple languages, and do different things in each. While polyglots are normally nothing more than a curiosity, the Cosmopolitan Libc project has been trying to put them to a novel use: producing native, multi-platform binaries that run directly on several operating systems and architectures. There are still some rough edges with the project's approach, but it is generally possible to build C programs into a polyglot format with with minimal tweaking.

Justine Tunney, the creator of the project, calls the polyglot format she put together "αcτµαlly pδrταblε εxεcµταblεs" (APEs). Every program compiled to this format starts with a header that can simultaneously be interpreted as a shell script, a BIOS boot sector, and macOS or Windows executables. This lets APEs run across Linux, macOS, FreeBSD, OpenBSD, NetBSD, Windows, and bare metal on both x86_64 and Arm64 chips. When interpreted as a shell script, the program detects which architecture and operating system it is running on, and (by default) overwrites the first few bytes of the program on disk with an ELF header pointing to the code for that architecture and system, before re-executing itself. This does mean that the binary is no longer portable, but it can be restored by overwriting the ELF header again.

By building separate versions of the program for different architectures and then combining them with this polyglot trick, the project promises portability between different architectures without the overhead of an emulator or bytecode virtual machine. This approach does have downsides — APEs are larger than normal C binaries, although still smaller than many produced by other languages — but for projects where the portability is a benefit, the tradeoff may be worth it.

Leave a Comment