buzl.uk


analytics with duckdb

Published on

#duckdb #docker #vue

Recently, I’ve become curious about the visitor statistics of my websites. After a quick search, I decided to just build my own analytics solution to avoid any big brother-like companies (I’m looking at you big 5). So, here we are, building a simple analytics solution with DuckDB.

For those who don’t know, DuckDB is mostly used for data analysis while offering a SQL dialect. It can be used to import data from CSV, JSON, and Parquet files. Installing it to be used with Python is as simple as running pip install duckdb. In this project, I will be using DuckDB to store the visitor statistics of my websites.

The whole application is just a simple Flask web server that listens for incoming beacon requests and adds them to the database. The beacon requests are just POST requests with a JSON body that contains the visitor’s origin of the request and path. The server then adds them to the database along with the hashed IP address of the visitor for counting unique visitors.

For viewing the statistics, I’ve built a simple dashboard with Vue from CDN, so that I can just serve the page as a static file from the Flask server.

Here’s how it looks like:

Dashboard

To easily add it as a service to my server, I’ve also created a Docker image on the hub. Let me show you how to add it to your docker-compose.yml:

services:
    analytics:
        image: kaangiray26/analytics
        restart: on-failure
        ports:
            - "5000:5000"
        environment:
            SITES: "null, https://example.com"
            ADDRESS: "192.168.1.1:5000"

    silly-project1:
        build: .
        restart: on-failure

    silly-project2:
        build: .
        restart: on-failure
    ...

See the analytics service? You can specify which sites you want to set as valid origins for the beacon requests with the SITES environment variable, which is a comma-separated list of URLs. You should also specify the local ip address of the server with the ADDRESS environment variable, so that you can access the dashboard from the same network.

How to use it? We have the include the following script in the body of the pages we want to track like this:

<script defer>
    (function () {
        window.addEventListener("DOMContentLoaded", () => {
            fetch("https://home.buzl.uk:8443/analytics/beacon", {
                method: "POST",
                headers: { "Content-Type": "application/json" },
                body: JSON.stringify({
                    origin: window.location.origin,
                    path: window.location.pathname,
                }),
            });
        });
    })();
</script>

This script sends a POST request to the server after the page has loaded.

That’s it! To view the statistics, just visit the address you’ve specified in the ADDRESS environment variable.

I hope you’ve found this post not so boring and maybe even useful.

kaangiray26

Webmentions