Filed: A Simple, File-Based Job Queue Written in Go

Filed is a concurrent, file-based job queue written in Go and designed for single-server applications. Jobs are managed through the file system, with sandboxing, automatic retries, and administration using ordinary shell commands.

Introducing Filed: A File-Based Job Queue in Go

Filed is a concurrent, file-based job queue written in Go. Jobs are manipulated with ordinary file system operations on files and directories. For instance, you can create a job by running printf cmd > /pending/$id, view active jobs with ls /active, and restart a failed job with mv /failed/$id /pending (all paths are relative to the directory where Filed is mounted). Designed for single-server workloads, Filed acts as a companion queue for other applications. It includes built-in features such as sandboxing, automatic retries, timeouts, and exponential backoff.

Installation

Filed is built with Go and requires both SQLite and FUSE (ensure fusermount is accessible in your system's PATH).
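
Before building, it can help to confirm that the required tools are on PATH. The check below is a minimal sketch: check_prereqs is a hypothetical helper, not part of Filed, and it only tests for executables, so it cannot confirm that SQLite itself is available.

```shell
#!/bin/sh
# check_prereqs is a hypothetical helper, not part of Filed.
# It reports every named command that is missing from PATH and
# returns non-zero if at least one is missing.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf 'missing: %s\n' "$cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_prereqs go git fusermount
```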

git clone https://sr.ht/~marcc/filed/
cd filed
go install
go install cmd/filed-launch.go

To build the documentation, you will need scdoc:

for f in filed*.scd ; do scdoc < "$f" > "${f%.scd}" ; done
# Optionally, move man pages to their respective directories (requires root):
# mv filed.5 /usr/local/man/man5
# mv filed.config.5 /usr/local/man/man5
# mv filed-launch.1 /usr/local/man/man1

Getting Started

For complete documentation and security best practices, see the man pages listed under Further Documentation. The short example below covers the basics.

Filed requires a designated job directory and a state file location (which defaults to XDG_DATA_HOME). Once these are configured, you can start the daemon:

mkdir /tmp/filed-jobs
filed -rof "/usr/bin/echo" -ro "/lib" /tmp/filed-jobs

In this example, filed mounts a file system on /tmp/filed-jobs, exposing several special files and directories. The daemon launches each job in a sandbox whose access is restricted to /usr/bin/echo and /lib.

To add a job, simply create a file within the newly available /pending directory:

printf "echo 'hello world'" > /tmp/filed-jobs/pending/1
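
When jobs are submitted from scripts, each one needs a file name that does not collide with an existing job. The helper below is a sketch under assumptions: enqueue and FILED_DIR are hypothetical names, and it assumes Filed accepts any file name as a job ID. The ID combines a timestamp with random bytes from /dev/urandom.

```shell
#!/bin/sh
# FILED_DIR is an assumed variable naming the directory Filed is mounted on.
FILED_DIR="${FILED_DIR:-/tmp/filed-jobs}"

# enqueue COMMAND
# Hypothetical helper: writes COMMAND to a uniquely named file in pending/
# and prints the generated job ID. Returns 1 if the file cannot be written.
enqueue() {
    # Epoch seconds plus 4 random bytes rendered as 8 hex digits.
    id="$(date +%s)-$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')"
    printf '%s' "$1" > "$FILED_DIR/pending/$id" || return 1
    printf '%s\n' "$id"
}

# Example: id=$(enqueue "echo 'hello world'")
```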

Upon successful completion, the job's output can be viewed in the /complete directory:

cat /tmp/filed-jobs/complete/1
>>> echo 'hello world'
hello world
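
Because a finished job ends up in either /complete or /failed, a caller can poll both directories to learn the outcome. The function below is a rough sketch: wait_for_job and FILED_DIR are hypothetical names, and the one-second polling interval is an arbitrary choice.

```shell
#!/bin/sh
# FILED_DIR is an assumed variable naming the directory Filed is mounted on.
FILED_DIR="${FILED_DIR:-/tmp/filed-jobs}"

# wait_for_job ID [TIMEOUT_SECONDS]
# Hypothetical helper: polls once per second until the job file shows up
# in complete/ or failed/. Prints the outcome and returns
# 0 (complete), 1 (failed), or 2 (timed out; default limit 60 seconds).
wait_for_job() {
    id="$1"
    remaining="${2:-60}"
    while [ "$remaining" -gt 0 ]; do
        if [ -e "$FILED_DIR/complete/$id" ]; then
            echo complete
            return 0
        fi
        if [ -e "$FILED_DIR/failed/$id" ]; then
            echo failed
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    echo timeout
    return 2
}

# Example: wait_for_job 1 30 && cat "$FILED_DIR/complete/1"
```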

By default, a failing job is retried 3 times. If it still fails, it is moved to the /failed directory, where its log shows what went wrong:

printf "ech this-will-fail" > /tmp/filed-jobs/pending/2
# Wait for a bit until it finishes retrying
cat /tmp/filed-jobs/failed/2
>>> ech this-will-fail
sh: 1: ech: not found
[System Error]: exit status 127
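
The wait before a job lands in /failed depends on the retry delays. As an illustration of how exponential backoff grows, the sketch below prints a schedule in which each delay doubles. The 2-second base delay and the doubling factor are assumptions chosen for the example, not Filed's documented defaults, and backoff_schedule is a hypothetical name; consult the man pages for the values Filed actually uses.

```shell
#!/bin/sh
# backoff_schedule BASE_SECONDS RETRIES
# Hypothetical illustration: prints one line per retry, doubling the
# delay each time (delay for retry n = BASE_SECONDS * 2^(n-1)).
backoff_schedule() {
    delay="$1"
    retries="$2"
    attempt=1
    while [ "$attempt" -le "$retries" ]; do
        printf 'retry %s after %ss\n' "$attempt" "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}

# Example: backoff_schedule 2 3
# retry 1 after 2s
# retry 2 after 4s
# retry 3 after 8s
```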

To restart a failed job, simply move it back into the /pending directory:

mv /tmp/filed-jobs/failed/2 /tmp/filed-jobs/pending
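
The same mv works in a loop when several jobs failed for a shared reason, such as a dependency that was briefly unavailable. A minimal sketch, assuming the mount point is held in a hypothetical FILED_DIR variable; requeue_failed is likewise a hypothetical name:

```shell
#!/bin/sh
# FILED_DIR is an assumed variable naming the directory Filed is mounted on.
FILED_DIR="${FILED_DIR:-/tmp/filed-jobs}"

# requeue_failed
# Hypothetical helper: moves every job in failed/ back to pending/
# and prints how many jobs were moved.
requeue_failed() {
    count=0
    for job in "$FILED_DIR"/failed/*; do
        # When failed/ is empty the glob stays unexpanded, so skip it.
        [ -e "$job" ] || continue
        if mv "$job" "$FILED_DIR/pending/"; then
            count=$((count + 1))
        fi
    done
    printf '%s\n' "$count"
}
```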

Finally, to remove a completed or failed job, delete its file:

rm /tmp/filed-jobs/failed/2
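
For routine cleanup, old job files can be pruned by age instead of one at a time. The sketch below assumes that job files appear as regular files with meaningful modification times; prune_jobs and FILED_DIR are hypothetical names.

```shell
#!/bin/sh
# FILED_DIR is an assumed variable naming the directory Filed is mounted on.
FILED_DIR="${FILED_DIR:-/tmp/filed-jobs}"

# prune_jobs STATE DAYS
# Hypothetical helper: deletes job files in complete/ or failed/ whose
# modification time is more than DAYS days old. Returns 2 for any other state.
prune_jobs() {
    state="$1"
    days="$2"
    case "$state" in
        complete|failed) ;;
        *)
            echo "prune_jobs: state must be complete or failed" >&2
            return 2
            ;;
    esac
    find "$FILED_DIR/$state" -type f -mtime +"$days" -exec rm -- {} +
}

# Example: prune_jobs complete 7
```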

Further Documentation

Detailed documentation, including security considerations and maintenance guidelines, is available in the following man pages:

  • filed.5
  • filed.config.5
  • filed-launch.1

Design Philosophy & Motivation

The primary motivation behind Filed was to develop an easy-to-use job queue for self-hosted web applications, compatible with any programming language. A key goal was to simplify troubleshooting for administrators, enabling them to easily understand why a job failed and rerun it when necessary.

Inspired by the 9P protocol, Filed uses files as its core abstraction, with directories modeling a job's state transitions. The queue's state can therefore be inspected without a dedicated administrative portal or separate login: every administrative operation can be performed over SSH on the server, using standard Unix tools and permissions for manipulation, security, and automation. This approach keeps the codebase small while still delivering a rich set of features.
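
Since each state is a directory, a queue overview reduces to counting files. The function below is a small sketch of that idea using the four directories named in this document (pending, active, complete, failed); queue_summary and FILED_DIR are hypothetical names.

```shell
#!/bin/sh
# FILED_DIR is an assumed variable naming the directory Filed is mounted on.
FILED_DIR="${FILED_DIR:-/tmp/filed-jobs}"

# queue_summary
# Hypothetical helper: prints "<state> <job count>" for each state directory.
queue_summary() {
    for state in pending active complete failed; do
        count=0
        for job in "$FILED_DIR/$state"/*; do
            # An empty or missing directory leaves the glob unexpanded.
            if [ -e "$job" ]; then
                count=$((count + 1))
            fi
        done
        printf '%s %s\n' "$state" "$count"
    done
}
```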

Development Roadmap (TODO)

Completed Features:

  • Support for chmod and chown
  • State configuration via environment variables
  • Customizable backoff and timeout for retries
  • Accurate rendering of "last modified" and "created at" timestamps for jobs
  • Landlock mode for enhanced sandboxing
  • Introduction of filed-launch, a script to restrict command access
  • Command-line arguments for filed to lock down access while keeping access to the state file, with that remaining access removed in filed-launch
  • Immediate job termination upon removal

Planned Features:

  • Support for Landlock CLI with -ro or -rw options, using stat to determine file type
  • Network restriction support
  • A reusable systemd unit file
  • Failure notifications (Note: inotify currently doesn't work with FUSE, which would have been an elegant solution)
  • Notification for "forget" and other updates
  • Packaging for Alpine Linux (including a reusable OpenRC script)
  • Support for removing/moving active jobs
  • Immediate job termination when moved to the failed state
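
Until failure notifications are available, polling the /failed directory is one possible stopgap, given the roadmap's note that inotify does not work with FUSE. The sketch below is an assumption-laden workaround, not a Filed feature: report_new_failures and FILED_DIR are hypothetical names, and the list of already reported IDs is kept in a plain text file chosen by the caller.

```shell
#!/bin/sh
# FILED_DIR is an assumed variable naming the directory Filed is mounted on.
FILED_DIR="${FILED_DIR:-/tmp/filed-jobs}"

# report_new_failures SEEN_FILE
# Hypothetical helper: prints the ID of each job in failed/ that is not yet
# listed in SEEN_FILE, then records it there so it is reported only once.
report_new_failures() {
    seen="$1"
    touch "$seen" || return 1
    for job in "$FILED_DIR"/failed/*; do
        [ -e "$job" ] || continue
        id="${job##*/}"
        # -x matches whole lines, -F treats the ID as a literal string.
        if ! grep -qxF -- "$id" "$seen"; then
            printf '%s\n' "$id"
            printf '%s\n' "$id" >> "$seen"
        fi
    done
}
```

Run periodically (for example from cron), its output could be piped to a mailer or chat webhook. One limitation of this design: a job that is requeued and fails again is not reported a second time unless its ID is removed from the seen file.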

Contributing

Bug reports and patches are welcome and can be submitted via email to ~marcc/public-inbox@lists.sr.ht.

Project Status

Filed has undergone testing, but it is not yet "battle-tested" in production environments. There may still be minor inefficiencies and areas for improvement.

Alternatives to Filed

Here's a comparison of Filed with other job queuing solutions:

  • nq: nq is a simpler, non-persistent process that lacks retry functionality. It's suitable for ad-hoc command-line queuing, whereas Filed is designed as a server-side job manager, enabling administrators to monitor and rerun jobs.
  • task-spooler (ts): ts offers more granular control over task execution (e.g., GPU/CPU allocation) and a wider range of features. However, it typically does not support automatic retries, a key feature in Filed.
  • Bull: Primarily designed for Node.js and JavaScript environments, Bull provides a graphical user interface and other features not present in Filed. Filed, by contrast, prioritizes simplicity and file-based operations over a GUI, enhancing interoperability with other systems and leveraging standard Unix permissions for access control.
  • AWS SQS: Amazon SQS (Simple Queue Service) is a significantly more complex and flexible message-passing service. It requires users to implement much of the retry infrastructure themselves and is less straightforward to inspect. While SQS scales exceptionally well and supports a broader range of workloads, Filed offers a simpler, more inspectable local solution.