Streamlined Web Analytics: A Custom Clojure Solution for Self-Hosted Sites
A Clojure-based, self-hosted web analytics solution for developers seeking privacy, performance, and simplicity. Discover how to gain insights into site traffic without the bloat of traditional platforms.
Streamlined Web Analytics for Self-Hosted Projects
Understanding website traffic is crucial for creators, yet traditional analytics solutions often present a dilemma. Many platforms are overly complex, raise privacy concerns, or lack the flexibility required for self-hosted web projects. The challenge lies in finding a balance: gaining meaningful insights without sacrificing simplicity, performance, or control.
While some prefer minimal engagement with statistics, recognizing visitor engagement provides valuable feedback, confirming that content reaches an audience. The need is for straightforward metrics: daily readers, article views, and basic interaction data. For those self-hosting web projects on conventional servers, leveraging the full power of the underlying programming environment opens doors beyond static page limitations, allowing for custom functionalities—including bespoke analytics.
Addressing Existing Analytics Shortcomings
The quest for a suitable analytics solution often reveals the limitations of popular options:
- Google Analytics: Frequently criticized for its bloat, privacy implications, and complex user experience. These platforms can introduce significant overhead and raise questions about data ownership and long-term viability.
- JavaScript-based SaaS Solutions: While offering convenience, they often come with subscription costs, depend on third-party availability, and require careful consideration for GDPR compliance, cookie policies, and the ability to track non-browser requests like RSS feeds.
- Server-Side Log Processors (e.g., GoatCounter via Nginx logs): Though seemingly a direct approach, feeding off server access logs can quickly become a management burden. Setting up domains, managing accounts, and monitoring the service can detract from core development, and processing performance can be an issue even with moderate traffic.
Introducing a Custom-Built Analytics Solution
Driven by these challenges, a custom-built solution was developed, designed to be minimalist, performant, and perfectly tailored for self-hosted Clojure web applications.

This basic yet effective system focuses on providing essential functionalities with zero-friction integration.
Effortless Setup
Integration is a core design principle. The solution leverages existing web server infrastructure, minimizing setup and configuration. It integrates seamlessly into a Ring stack as a middleware:
(def app
  (-> routes
      ...
      (ring.middleware.params/wrap-params)
      (ring.middleware.cookies/wrap-cookies)
      ...
      (clj-simple-stats.core/wrap-stats))) ;; <-- Just add this
This approach requires no additional configuration or monitoring. It begins collecting and reporting data immediately, becoming an integral part of your existing web server environment.
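For readers unfamiliar with Ring, the general shape of a request-recording middleware can be sketched as follows. This is not the source of clj-simple-stats: the names `hits` and `wrap-count-hits` are hypothetical, the store is an in-memory atom rather than a database, and only the synchronous one-argument handler form is covered.

```clojure
;; Hypothetical in-memory store; a real implementation would persist data.
(defonce hits (atom []))

(defn wrap-count-hits
  "Ring middleware sketch: calls the wrapped handler, records one entry
  per request, and returns the response unchanged."
  [handler]
  (fn [request]
    (let [response (handler request)]
      (swap! hits conj {:uri        (:uri request)
                        :status     (:status response)
                        :user-agent (get-in request [:headers "user-agent"])})
      response)))

;; Usage with a stub handler, so no web server is needed:
(def demo-app
  (wrap-count-hits (fn [_request] {:status 200 :headers {} :body "ok"})))

(demo-app {:request-method :get
           :uri            "/blog/hello"
           :headers        {"user-agent" "Mozilla/5.0"}})

(count @hits) ; one recorded request
```

Because `->` threads each form around the previous one, the middleware listed last is the outermost wrapper: it sees the request first and the response last.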
Intelligent Request Classification
The system intelligently distinguishes between various types of requests, focusing on human visitors while accounting for other interactions:
- Live Visitors: Separately counted to provide a clear picture of active engagement.
- RSS Feed Requests: Special care is taken to count unique RSS readers accurately, even when a reader makes multiple requests within a day. Some feed services (e.g., Feedly, BazQux, Feedbin) report subscriber counts in their User-Agent strings.
- Bots, Redirects, Favicon Requests, and Malformed URLs: These are identified and processed distinctly to prevent skewing user statistics, a crucial feature given the prevalence of bots scraping data.
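A classification along these lines could be sketched as below. The rules are illustrative assumptions, not the project's actual logic: the function names, the `N subscribers` pattern, the bot keywords, and the feed file extensions are all hypothetical, and malformed-URL detection is left out.

```clojure
(require '[clojure.string :as str])

(defn subscriber-count
  "Returns the subscriber count advertised in a User-Agent string, or nil.
  Assumes the count appears as \"<number> subscribers\"."
  [user-agent]
  (when-let [[_ n] (re-find #"(\d+) subscribers?"
                            (str/lower-case (or user-agent "")))]
    (Long/parseLong n)))

(defn classify
  "Assigns a request to one coarse category. Order matters: earlier
  clauses win."
  [{:keys [uri status user-agent]}]
  (let [uri (or uri "")
        ua  (str/lower-case (or user-agent ""))]
    (cond
      (= "/favicon.ico" uri)              :favicon
      (and status (<= 300 status 399))    :redirect
      (or (str/ends-with? uri ".xml")
          (str/ends-with? uri ".rss")
          (subscriber-count ua))          :feed
      (re-find #"bot|crawler|spider" ua)  :bot
      :else                               :browser)))

;; Example inputs (the User-Agent values are made up for illustration):
(classify {:uri "/atom.xml" :status 200
           :user-agent "ExampleReader/1.0 (12 subscribers)"}) ; a feed request
(classify {:uri "/" :status 200
           :user-agent "ExampleBot/2.1"})                     ; a bot request
```

Keeping the categories in a single `cond` makes the precedence explicit, which matters when a request matches more than one rule.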
Accurate Data Visualization
Data presentation is critical for clear understanding. The solution prioritizes semantically correct graph types and intuitive labeling:
- Discrete vs. Continuous Data: Unlike some analytics platforms that present discrete daily visit counts as continuous lines, implying interpolation between data points, this system correctly uses step graphs.
  Incorrect: A continuous line can misleadingly suggest intermediate values.
  Correct: A step graph accurately represents discrete counts for each time interval.
- Intuitive Axis Labels: Axes are labeled with round, appropriate numbers (e.g., 100, 200, 500, 1K), enhancing readability. All graphs maintain a consistent vertical scale and synchronized horizontal scrolling for comparative analysis.
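Both ideas can be illustrated with a small sketch. It uses the common "1, 2, 5 times a power of ten" rule for choosing tick spacing, which is an assumption rather than the project's confirmed method; all function names here are hypothetical.

```clojure
(defn nice-step
  "Smallest step of the form 1, 2, 5, or 10 times a power of ten that
  splits max-value into at most max-ticks intervals."
  [max-value max-ticks]
  {:pre [(pos? max-value) (pos? max-ticks)]}
  (let [raw   (/ (double max-value) max-ticks)
        mag   (Math/pow 10 (Math/floor (Math/log10 raw)))
        ratio (/ raw mag)]
    (* mag (cond (<= ratio 1) 1
                 (<= ratio 2) 2
                 (<= ratio 5) 5
                 :else        10))))

(defn ticks
  "Integer tick values from 0 up to the first multiple of the step that
  covers max-value."
  [max-value max-ticks]
  (let [step (max 1 (long (nice-step max-value max-ticks)))
        top  (* step (long (Math/ceil (/ (double max-value) step))))]
    (range 0 (inc top) step)))

(defn format-tick
  "Formats a tick value, abbreviating thousands with K."
  [n]
  (cond
    (and (>= n 1000) (zero? (mod n 1000))) (str (quot n 1000) "K")
    (>= n 1000)                            (str (/ n 1000.0) "K")
    :else                                  (str n)))

(defn step-points
  "Turns per-day counts into step-graph vertices: each count becomes a
  horizontal segment spanning its whole day."
  [counts]
  (mapcat (fn [day c] [[day c] [(inc day) c]]) (range) counts))

(map format-tick (ticks 870 5)) ; => ("0" "200" "400" "600" "800" "1K")
(step-points [3 5])             ; => ([0 3] [1 3] [1 5] [2 5])
```

Drawing straight lines between the `step-points` vertices yields flat segments joined by vertical jumps, so no intermediate values are implied.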
Core Insights
While designed for minimalism, the solution offers fundamental reporting capabilities. Users can narrow down reports by specific pages, query parameters, referrers, user agents, and custom date ranges.
Future Enhancements
Ongoing development aims to introduce further valuable features:
- Spike Causation Analysis: Identifying the reasons behind sudden surges in traffic.
- Geographic Breakdown: Basic country-level insights, requiring efficient GeoIP data integration (ideally under 1MB).
- Referrer Signal/Noise Separation: Distinguishing genuine mentions from general web traffic among referrers.
- Performance Optimizations: Exploring pre-calculated aggregates for faster dashboard queries, especially as the database grows (currently ~600 MiB over three years, powered by DuckDB for columnar queries and data compression).
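To make the pre-calculated aggregates idea concrete, here is a minimal in-memory sketch that rolls raw hits up into one row per day and page. It is plain Clojure rather than a DuckDB query, and the field names (`:date`, `:uri`, `:visitor-id`) and the function name are assumptions made for illustration only.

```clojure
(defn daily-aggregates
  "Rolls raw hit maps up into one summary row per [date, uri] pair."
  [hits]
  (->> hits
       (group-by (juxt :date :uri))
       (map (fn [[[date uri] rows]]
              {:date     date
               :uri      uri
               :views    (count rows)
               :visitors (count (distinct (map :visitor-id rows)))}))
       (sort-by (juxt :date :uri))
       vec))

;; Made-up sample data: ISO date strings sort chronologically.
(def sample-hits
  [{:date "2024-01-01" :uri "/a" :visitor-id 1}
   {:date "2024-01-01" :uri "/a" :visitor-id 1}
   {:date "2024-01-01" :uri "/a" :visitor-id 2}
   {:date "2024-01-02" :uri "/b" :visitor-id 1}])

(daily-aggregates sample-hits)
;; => [{:date "2024-01-01", :uri "/a", :views 3, :visitors 2}
;;     {:date "2024-01-02", :uri "/b", :views 1, :visitors 1}]
```

A dashboard could then read the small summary rows instead of scanning every raw request, at the cost of recomputing the summary when new data arrives.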
Accessing the Solution
The solution is open-source and available on GitHub.

For installation instructions and to explore the codebase, visit: github.com/tonsky/clj-simple-stats.
A live demonstration is also available at tonsky.me/stats. Its historical data is spotty because Nginx logs were collected only intermittently, but it gives a good general idea of the tool's capabilities.