I always wanted to get hands-on experience with eBPF as it seems to be an exciting technology. A friend of mine recommended me the BCC Toolchain when I investigated some TCP socket issues and I was astonished how rich the possibilities are and how easy the code is compared to writing kernel patches.

When I was working on a test setup for a storage solution, we discovered that the disks images for multiple virtual machines seem to have I/O operations taking longer than 10 seconds. We suspected the underlying NetApp Appliance to hang once in a while so I searched for a way to log those long I/Os.

Within the BCC Toolchain there are multiple tools for searching for long I/O operations but I missed something to set a threshold and actual write them to a log file. So I started to write a bcc style tool myself to do that.
To get started with eBPF in Python I can recommend this very good (but slightly outdated) bcc Python Developer Tutorial. And also the corresponding reference guide.

In the end I actually created a tool call bioslow which is capable of logging all I/Os over a certain threshold in various formats:

[user@bcc-bioslow]$
sudo ./bioslow -t 15 -u ms
Logging I/O operations longer than 15 msecs

Tracing ... Ctrl-C to end.
2024-04-30 12:53:27.398519: 27 msecs
2024-04-30 12:53:35.941692: 55 msecs
2024-04-30 12:53:35.953540: 61 msecs
2024-04-30 12:53:35.953967: 61 msecs
2024-04-30 12:53:35.966807: 67 msecs
...

The tool is of course available on github.com/benibr/bcc-bioslow.

The eBPF code for this is rather small and just diffs the time between the start of a block I/O request and the end of it. Also the threshold is broken down to a FACTOR which is passed directly to the eBPF program to filter what is reported up to the python tooling.

BPF_HASH(start, struct request *);

void trace_start(struct pt_regs *ctx, struct request *req) {
    // stash start timestamp by request ptr
    u64 ts = bpf_ktime_get_ns();
    start.update(&req, &ts);
}

void trace_stop(struct pt_regs *ctx, struct request *req) {
    u64 *tsp, delta, factor = 0;

    tsp = start.lookup(&req);
    if (tsp != 0) {
        delta = bpf_ktime_get_ns() - *tsp;
        // the FACTOR will be later replaced by "delta /= 1000000000;" or similar
        FACTOR
        if (delta >= factor) {
            delta /= factor;
            bpf_trace_printk("%d %d %d\\n",
                req->__data_len, req->cmd_flags, delta);
        }
        start.delete(&req);
    }
}

The main difficulties I had were: 1) Figuring out which kernel functions I actually need to attach to and 2) synchronizing the time between the monotonic time since boot that the kernel reports and datetime.now() from Python.
The next thing would be to rewrite the tool so that it uses BPF_PERF_OUTPUT instead of bpf_trace_printk.

Anyway, this was easier than I expected and real fun. I’m already looking forward to another opportunity to use eBPF.