Go's Cross-Platform Promise: The Reality of Building a Monitoring Agent
Explore the unexpected complexities of achieving true cross-platform portability with Go, focusing on challenges encountered while building `simob`, a server monitoring agent. Learn how dynamic linking with C libraries like `systemd` and `glibc` introduced intricate build issues, necessitating sophisticated solutions for different Linux environments.
We embarked on developing simob, an open-source server monitoring agent for the Simple Observability platform, with the belief that Go would provide a single, portable binary for all Linux distributions. This post is the first in a series detailing the complexities we encountered while building a truly cross-platform agent.
simob is designed as a lightweight, passive sensor, not a long-running daemon. Our goal was to create a small, self-contained binary with no prerequisites or external dependencies, mirroring the simplicity of a physical sensor. We envisioned a project that could be compiled from source on a development machine and deployed across diverse infrastructure without complex pipelines or third-party build services.
Why Go Was Our Choice
In the observability sector, Go is a prevalent choice for building agents that collect metrics and logs, as seen in projects like Promtail, Telegraf, and Grafana Alloy. Our decision was based on several key advantages:
- Compiled Language: Go's compiler catches a significant class of errors at build time that would otherwise surface only at runtime.
- Garbage Collector: For an application constantly ingesting and forwarding data, automated memory management offers a substantial benefit.
- Goroutines: This excellent abstraction simplifies concurrent task management. Our agent needed to handle numerous parallel operations—tailing log files, processing input plugins, and sending data upstream—and Goroutines allowed us to write clear, sequential-looking code while the runtime managed concurrency.
- Cross-Platform Compilation: We initially believed Go would effortlessly compile for any platform by simply setting GOOS and GOARCH at compile time.
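To illustrate the goroutine point in the list above, here is a minimal sketch of the fan-in pattern it describes: several collectors run concurrently and their results are funneled into one place. The `sample` type, the `collect` function, and the collector names are hypothetical and are not taken from the simob codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// sample is a hypothetical unit of collected data (not simob's real type).
type sample struct {
	Source string
	Value  string
}

// collect runs each collector in its own goroutine and gathers the results
// into one slice, standing in for "sending data upstream".
func collect(collectors map[string]func() string) []sample {
	out := make(chan sample)
	var wg sync.WaitGroup

	for name, fn := range collectors {
		wg.Add(1)
		go func(name string, fn func() string) {
			defer wg.Done()
			out <- sample{Source: name, Value: fn()}
		}(name, fn)
	}

	// Close the channel once every collector has reported.
	go func() {
		wg.Wait()
		close(out)
	}()

	var results []sample
	for s := range out {
		results = append(results, s)
	}
	return results
}

func main() {
	results := collect(map[string]func() string{
		"cpu":  func() string { return "12%" },
		"logs": func() string { return "3 new lines" },
	})
	// The order of results depends on goroutine scheduling.
	for _, s := range results {
		fmt.Printf("%s: %s\n", s.Source, s.Value)
	}
}
```

Each collector body reads as ordinary sequential code, while the runtime schedules the goroutines and the channel serializes the hand-off.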
The Initial Simplicity
Early development proved straightforward. Go's mature ecosystem provided robust solutions; for core metrics collection, we leveraged gopsutil, a Go port of Python's psutil. This library offered a clean API for CPU, memory, network, and disk metrics, supporting a wide range of operating systems and CPU architectures, thus eliminating the need for custom, system-specific code.
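To give a sense of the system-specific code such a library spares us, here is a minimal standard-library sketch that parses text in the format of Linux's /proc/meminfo. It is not how gopsutil is implemented, and `parseMeminfo` is a hypothetical helper written for illustration only. Every other operating system would need an entirely different code path for the same metric.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMeminfo extracts "Key: value [unit]" pairs from text in the format of
// Linux's /proc/meminfo. Most values are reported in kB; lines that do not
// match the expected shape are skipped.
func parseMeminfo(text string) map[string]uint64 {
	fields := make(map[string]uint64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		key, rest, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		parts := strings.Fields(rest)
		if len(parts) == 0 {
			continue
		}
		v, err := strconv.ParseUint(parts[0], 10, 64)
		if err != nil {
			continue
		}
		fields[strings.TrimSpace(key)] = v
	}
	return fields
}

func main() {
	// Sample input; on a real Linux host this would be read from /proc/meminfo.
	sample := "MemTotal:       16384000 kB\nMemAvailable:    8192000 kB\n"
	m := parseMeminfo(sample)
	fmt.Println(m["MemTotal"], m["MemAvailable"])
}
```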
The Challenge of Journal Collection
Complexity arose when users requested support for systemd journal logs. Unlike plain text files, journal logs are stored in a binary format, typically located in /var/log/journal or /run/log/journal. This format is structured, indexed, and can include inline compression.
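As a small illustration of the locations mentioned above, the following sketch checks which journal directory exists on a host. `firstExistingDir` is a hypothetical helper, and the lookup order (persistent storage under /var before volatile storage under /run) is an assumption made for this example. It only locates the directory and does not read the binary format.

```go
package main

import (
	"fmt"
	"os"
)

// firstExistingDir returns the first candidate that exists and is a directory.
func firstExistingDir(candidates []string) (string, bool) {
	for _, dir := range candidates {
		info, err := os.Stat(dir)
		if err == nil && info.IsDir() {
			return dir, true
		}
	}
	return "", false
}

func main() {
	dir, ok := firstExistingDir([]string{"/var/log/journal", "/run/log/journal"})
	if !ok {
		fmt.Println("no journal directory found")
		return
	}
	fmt.Println("journal directory:", dir)
}
```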
We considered two primary approaches:
- Custom Parser: The file format is documented and the systemd source code is available. Tools like Kaitai Struct could assist in generating parser code. While feasible, this option demanded significant time for careful review of both the specification and the actual implementation. Our major concern was maintaining compatibility across past, current, and future versions of the journal format, so we discarded this option because of the burden of backward compatibility and code archaeology. On a humorous (and concerning) note, the systemd journal documentation states: "Note that the actual implementation in the systemd codebase is the only ultimately authoritative description of the format, so if this document and the code disagree, the code is right". It is a stark reminder of the complexities involved.
- C API Wrapper: The systemd project provides a C API for reading journal entries, and a Go wrapper exposing this C API directly already existed. On paper, this seemed like the optimal solution, and we proceeded with it.
Leveraging the C API introduced new constraints. Because the wrapper directly invokes the C API, the systemd library becomes dynamically linked, requiring its presence on the target machine at runtime. While this is acceptable—machines without systemd have no journal logs to collect—it generated new build challenges.
Firstly, builds failed on non-systemd systems like macOS. Without libsystemd available, cross-compilation to Linux was impossible from such environments; builds had to originate from a Linux system. This impacted both release and development builds, preventing even go run locally on non-systemd machines. Thankfully, Go's build tags offered a solution:
//go:build linux
This directive instructs the Go compiler to include the file only when building for Linux systems. While it necessitated stub files for other systems to ensure the package still compiled, it effectively isolated the platform-specific code.
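// myfunc_linux.go
//go:build linux

package mypkg

// MyFunc is the real Linux implementation.
func MyFunc() string {
	return "real Linux implementation"
}

// myfunc_stub.go
//go:build !linux

package mypkg

// MyFunc is a stub so the package still compiles on other systems.
func MyFunc() string {
	return "stub for other systems"
}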
Separate files with build tags allow providing a complete implementation for Linux while retaining a compilation-friendly stub for other environments.
Secondly, libsystemd binaries vary across architectures. An amd64 version is required to build an amd64 binary, and an arm64 version for arm64. This meant that simply setting GOARCH was insufficient to produce all targets from a single build worker; each architecture build required a worker with the matching libsystemd installed.
The glibc Problem
A more subtle and challenging issue emerged with glibc. When Go's CGO_ENABLED environment variable is set to 1, C dependencies are linked dynamically by default. This applies to explicit C wrappers, such as the sdjournal package, and to code paths within the Go standard library that can call into libc, like DNS resolution on Linux. With CGO_ENABLED set to 1, the final binary links to libc at runtime.
The default value of CGO_ENABLED varies; it's usually enabled when building natively on a system supporting cgo but disabled during cross-compilation or when a C compiler isn't found in the PATH. These defaults are generally sensible, as cgo is typically undesirable for cross-compilation or targets without glibc (e.g., Windows).
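Because the effective value depends on the build environment, it can help to check what a given binary was actually built with. The sketch below reads the build settings that the Go toolchain (1.18 and later) embeds in binaries and prints the target and cgo state. The `setting` helper is hypothetical, and the "unknown" fallback is a choice made for this example.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setting returns the value recorded for key in the embedded build settings,
// or "unknown" when the toolchain did not record it.
func setting(settings []debug.BuildSetting, key string) string {
	for _, s := range settings {
		if s.Key == key {
			return s.Value
		}
	}
	return "unknown"
}

func main() {
	info, ok := debug.ReadBuildInfo()
	if !ok {
		fmt.Println("no build info embedded in this binary")
		return
	}
	for _, key := range []string{"GOOS", "GOARCH", "CGO_ENABLED"} {
		fmt.Printf("%s=%s\n", key, setting(info.Settings, key))
	}
}
```

The same information can be read from an already built binary with `go version -m <binary>`.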
The critical problem is that a dynamically linked libc isn't universally compatible across all Linux systems. Distributions like Alpine Linux use musl instead of glibc. Consequently, a binary built with CGO_ENABLED=1 on a glibc-based Linux system will function on Ubuntu or Debian but will fail on Alpine Linux with an error like:
/bin/sh: ./simob: Permission denied
Despite appearances, "Permission denied" in this context on musl-based systems, especially when permissions are correctly set, almost invariably indicates the kernel's inability to locate the required glibc dynamic linker. This necessitates building a separate agent version specifically for non-glibc systems.
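One way to see which dynamic linker a binary asks for is to read the PT_INTERP program header from the ELF file. The following is a minimal sketch using Go's standard debug/elf package; the `interpreter` function is a hypothetical helper and is not part of simob. On x86-64, glibc-linked binaries typically request /lib64/ld-linux-x86-64.so.2, while musl-linked binaries request /lib/ld-musl-x86_64.so.1.

```go
package main

import (
	"bytes"
	"debug/elf"
	"fmt"
	"io"
	"os"
)

// interpreter returns the dynamic linker path recorded in the PT_INTERP
// program header of the ELF file at path. An empty string means the binary
// requests no interpreter, as is the case for statically linked binaries.
func interpreter(path string) (string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	for _, prog := range f.Progs {
		if prog.Type != elf.PT_INTERP {
			continue
		}
		raw, err := io.ReadAll(prog.Open())
		if err != nil {
			return "", err
		}
		// The segment holds a NUL-terminated path.
		return string(bytes.TrimRight(raw, "\x00")), nil
	}
	return "", nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: interp <elf-binary>")
		return
	}
	interp, err := interpreter(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if interp == "" {
		fmt.Println("no dynamic linker requested")
		return
	}
	_, statErr := os.Stat(interp)
	fmt.Printf("requires %s (present on this host: %v)\n", interp, statErr == nil)
}
```

Running this against the agent binary on the target host shows whether the requested linker exists there, which is a quicker diagnosis than interpreting the shell's error message.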
Was Go the Problem?
Ultimately, Go performed precisely as documented. Our challenge stemmed from assuming "portability" implied "effortless deployment." Once we incorporated low-level C libraries and targeted a diverse mix of glibc and non-glibc systems, the straightforward narrative dissolved. These were not dramatic failures but rather a series of constraints that became apparent only upon encountering them.
Our initial vision of building everything on a single laptop and shipping identical binaries everywhere quickly proved untenable. We now rely on GitHub Actions, utilizing appropriate runners for each architecture. While this introduces more moving parts than originally desired, it effectively manages the complexity and keeps it out of the critical development path. Local builds remain feasible through containers or emulation, though they are somewhat less streamlined than initially hoped.
In conclusion, while our build pipeline is more intricate than first imagined, the binaries we ship remain small and self-contained, successfully preserving our original objective.
Simple Observability is a platform that offers comprehensive visibility into your servers. It gathers logs and metrics via a lightweight agent, supports job and cron monitoring, and presents all data through a unified web interface with centrally managed configuration.
To learn more, visit simpleobservability.com.
The simob agent is open source and available on GitHub.