Write (module) signatures, not (protocol) schemas

submitted by
Style Pass
2021-07-23 13:30:05

For example, you can pass a function pointer as an argument to a function, even in C. But almost no protocol schemas support first-class functions. In more advanced languages, you can pass objects, abstract data types, references to other libraries, and other things. You can return any of those things, too. You can't express any of that in protocol schemas; this makes many things far more complicated, by denying the use of simplifying abstractions.
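To make that concrete, here is a minimal sketch in Python of a signature that both accepts and returns a function. The names `retrying` and `make_flaky` are hypothetical, invented for this illustration; they are not from any real library. Mainstream schema languages generally have no direct way to declare a field that holds a function.

```python
from typing import Callable


def retrying(action: Callable[[], int], attempts: int) -> Callable[[], int]:
    """Take a function as an argument and return a new function.

    The returned function calls `action`, retrying on ValueError until
    `attempts` calls have been made; the last error is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def wrapped() -> int:
        for remaining in range(attempts - 1, -1, -1):
            try:
                return action()
            except ValueError:
                if remaining == 0:
                    raise  # out of attempts: propagate the last failure
        raise AssertionError("unreachable")

    return wrapped


def make_flaky(failures: int) -> Callable[[], int]:
    """Build a function that fails `failures` times, then returns 42."""
    state = {"left": failures}

    def flaky() -> int:
        if state["left"] > 0:
            state["left"] -= 1
            raise ValueError("transient failure")
        return 42

    return flaky
```

A caller writes `retrying(make_flaky(2), attempts=3)()` and gets a plain integer back. Neither side had to invent an encoding for "a thing that can be called".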

Part of the issue is that (as far as I know) there's no good language-independent type system; the best we have is C. That in turn means that the highest level of abstraction reachable by language-independent libraries is that of C, which is only a little better than protocol schemas. A language-independent type system would be really useful...
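As a rough illustration of C acting as the common denominator, the sketch below has Python call the C standard library's `qsort` through `ctypes`, passing a Python function as the C comparison callback. It assumes a Unix-like system where the C library can be loaded this way. The helper name `sort_ints` is made up for the example.

```python
import ctypes
import ctypes.util

# Load the C standard library. Assumption: a Unix-like system; if the lookup
# returns None, CDLL(None) falls back to the symbols of the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C type of the callback: int (*)(const int *, const int *)
COMPARATOR = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)

# void qsort(void *base, size_t nmemb, size_t size, comparator)
libc.qsort.argtypes = [
    ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, COMPARATOR
]
libc.qsort.restype = None


def sort_ints(values):
    """Sort a list of ints with C's qsort and a Python comparison function."""
    array = (ctypes.c_int * len(values))(*values)

    def compare(a, b):
        # a and b are pointers to int; return negative, zero, or positive.
        return (a[0] > b[0]) - (a[0] < b[0])

    libc.qsort(array, len(array), ctypes.sizeof(ctypes.c_int), COMPARATOR(compare))
    return list(array)
```

The function pointer crosses the language boundary, but only as a raw C function type. Anything richer, such as an object or an abstract data type, has to be flattened to what C can describe.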

Of course, many people believe that there are some features they can only get by using out-of-language servers running as a separate process, possibly on a different host, necessitating the use of protocol schemas. In fact, anything an out-of-language server can do, a regular in-language library can do too, so you don't need to give up on the more powerful option of type signatures.

- Libraries can maintain stable, backwards-compatible interfaces, both API and ABI.
  - Most people know how to do this at the API level. For example, they know not to add new mandatory arguments to an existing function, and instead to add only optional arguments with defaults.
  - Backwards compatibility can be a bit trickier at the ABI level, but it is still possible, especially in more advanced languages than C, and in the many situations where ABI stability doesn't actually matter.
  - In fact, a library is strictly more powerful than a protocol schema in this regard: if you really wanted to, you could use a protocol schema directly to define the format of the data passed in and out of the library.
- Libraries can be updated without restarting the process using them.
  - This family of techniques is called dynamic software updating.
  - These techniques are actually quite easy if the library only does things that one could do over a protocol schema.
  - If the library uses more than that bare minimum of features, dynamic software updating becomes harder, but it is still possible.
  - One interesting use of this ability is to implement extremely high-performance, but backwards-incompatible, wire protocols.
- Libraries can have access to resources that the rest of the program cannot access.
  - For example:
    - JavaScript running in a browser can access functionality provided by the browser as a library; the browser and the JavaScript are running at different privilege levels.
    - An object in a type-safe language can contain capabilities for resources which it uses to implement its methods, without those capabilities being available to the code calling those methods.
    - Java-style stack inspection can deny user or library code access to unauthorized methods at runtime.
    - Capability-safe architectures such as CHERI prevent code from accessing memory that it doesn't have an explicit capability for.
    - Software fault isolation can allow Multics-style "call gates", where a library has a different privilege level from other code.
  - You might be concerned about using such fancy techniques. Sometimes they aren't necessary, because the whole program can safely be given access to the supposedly privileged resource, perhaps because the resource is one or more of:
    - fast to create (so we can just create them on the fly in the library)
    - cheap to create (so we can give every user their own)
    - already multiplexed and safe to share (it's amazing how easy it is to miss this if you don't explicitly think about it)
    - easily tweaked to be shareable (for example, a resource could be leased out to a user, then reset when they're done)
    - or otherwise safe, or able to be made safe, to give to users directly
  - Consider this carefully; this point is often non-obvious.
- Libraries can run in parallel, using multiple cores.
  - Obviously: you can run library functions on multiple threads.
- Libraries can access resources on other hosts and in other environments.
  - In general, there's no reason a library can't spin up a stub in some environment, and use that to access resources in that environment, including on remote hosts.
  - I have a constructive proof of this: rsyscall.
  - Distributed languages (including libraries that make existing languages into distributed languages) let you write libraries which access remote resources.
- Process supervisors can be replaced with a library.
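The API-level compatibility rule mentioned above (add optional arguments with defaults, never new mandatory ones) can be sketched in a few lines of Python. The functions `render_v1` and `render` are hypothetical stand-ins for two releases of the same library function. In a real library both would have the same name; they are kept side by side here only so the comparison can run.

```python
def render_v1(name: str) -> str:
    """First release of a hypothetical library function."""
    return f"Hello, {name}!"


def render(name: str, *, greeting: str = "Hello") -> str:
    """Later release: a new feature arrives as an optional keyword argument.

    Callers written against the first release pass only `name` and still
    get exactly the behaviour they had before.
    """
    return f"{greeting}, {name}!"


# An old call site keeps working unchanged against the new release.
old_style_result = render("Ada")

# A new call site opts in to the new feature explicitly.
new_style_result = render("Ada", greeting="Welcome")
```

Making the new parameter keyword-only is a design choice rather than a requirement. It keeps later additions from colliding with positional arguments that existing callers may already be passing.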
