Logging on Nomad and log aggregation with Loki

submitted by

Style Pass

2021-07-10 09:00:06

When running a task orchestrator like Nomad or Kubernetes, there’s usually a bunch of different instances (containers, micro-VMs, jails, etc.) running, more or less ephemerally, across a fleet of servers. By default, all logs stay local to the nodes that actually run those workloads, which makes it burdensome to debug, correlate events, or alert on them, especially if a node crashes. That’s why it’s a well-established practice to ship all the logs they emit to a central location, where debugging, correlation, and alerting can happen.

One of the most popular log management stacks is the so-called ELK (Elasticsearch for log indexing, Logstash for parsing and transforming logs, and Kibana for visualisation, with either Filebeat or Fluentd / Fluent Bit for log collection). It has a few drawbacks, most notably heavy resource consumption and licence changes; the latter have led to numerous forks, which will probably result in some chaos and incompatibilities in the future.

A recent-ish contender in that space is Grafana Labs' Loki. It’s a lightweight, Prometheus-inspired tool that can run either as a bunch of microservices, for better and easier scale-out, or in monolithic all-in-one mode. In contrast to Elasticsearch, it only indexes labels (which are user-defined); the logs themselves (chunks) are stored as-is, separately. That makes it more flexible and much cheaper with regard to storage and compute.
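To make the label/chunk split a bit more concrete, here is a minimal sketch of the JSON body that Loki's push endpoint (POST /loki/api/v1/push) accepts. The helper name build_push_payload and the label names (job, task, node) are made up for illustration; nothing here is taken from an actual Nomad setup, and the body is only built, not sent. The point is that the "stream" object holds the labels, which are what gets indexed, while each log line travels as an opaque string next to a nanosecond timestamp.

```python
import json
import time


def build_push_payload(labels, lines, now_ns=None):
    """Build a JSON-serialisable body for Loki's push endpoint.

    labels: dict of label name -> value (hypothetical names in the demo below);
            this is the only part Loki indexes.
    lines:  iterable of raw log lines; Loki stores them without parsing them.
    now_ns: optional base timestamp in nanoseconds, mainly useful for testing.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [timestamp in nanoseconds as a string, log line].
    # The per-line offset just keeps timestamps distinct within this sketch.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


if __name__ == "__main__":
    payload = build_push_payload(
        {"job": "example-job", "task": "web", "node": "node-1"},  # assumed labels
        ["GET /health 200", "GET /missing 404"],
    )
    print(json.dumps(payload, indent=2))
```

In a real deployment an agent would typically build and send these requests for you; the sketch only shows why a small, well-chosen label set keeps the index cheap while the log bodies stay untouched.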