Input image (64 × 64)
The encoding starts with a 4 096-pixel image: small enough to fit in a single DNA-strand experiment, large enough to make the trade-offs meaningful.
Current and recent work.
Each pixel is converted to a binary representation, producing a stream of 0s and 1s, the canonical form before any DNA-specific mapping.
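A minimal sketch of this step in Go, assuming 8-bit grayscale pixels written most significant bit first. The bit depth, the bit order, and the name pixelsToBits are illustrative assumptions, not taken from the project.

```go
package main

import "fmt"

// pixelsToBits flattens 8-bit grayscale pixels into a stream of 0s and 1s,
// most significant bit first. The 8-bit depth is an assumption.
func pixelsToBits(pixels []uint8) []uint8 {
	bits := make([]uint8, 0, len(pixels)*8)
	for _, p := range pixels {
		for shift := 7; shift >= 0; shift-- {
			bits = append(bits, (p>>uint(shift))&1)
		}
	}
	return bits
}

func main() {
	img := make([]uint8, 64*64) // placeholder image, all black
	img[0] = 0xA5               // binary 10100101
	bits := pixelsToBits(img)
	fmt.Println(len(bits), bits[:8]) // 32768 [1 0 1 0 0 1 0 1]
}
```

Under these assumptions a 64 × 64 image becomes 32 768 bits before any redundancy is added.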
Redundancy is added with an error-correcting code (e.g. Reed-Solomon). DNA synthesis and sequencing introduce substitutions, insertions, and deletions; the ECC layer lets the original bits be recovered after readout.
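To make the idea of redundancy concrete, here is a deliberately simple stand-in: a three-fold repetition code with majority voting. It is not Reed-Solomon, and it only repairs substitutions (a flipped bit), not insertions or deletions. The function names are hypothetical.

```go
package main

import "fmt"

// encodeRepeat3 writes every bit three times.
func encodeRepeat3(bits []uint8) []uint8 {
	out := make([]uint8, 0, len(bits)*3)
	for _, b := range bits {
		out = append(out, b, b, b)
	}
	return out
}

// decodeRepeat3 takes a majority vote over each group of three bits,
// so one flipped bit per group is corrected.
func decodeRepeat3(coded []uint8) []uint8 {
	out := make([]uint8, 0, len(coded)/3)
	for i := 0; i+2 < len(coded); i += 3 {
		sum := coded[i] + coded[i+1] + coded[i+2]
		if sum >= 2 {
			out = append(out, 1)
		} else {
			out = append(out, 0)
		}
	}
	return out
}

func main() {
	data := []uint8{1, 0, 1, 1}
	coded := encodeRepeat3(data)
	coded[4] ^= 1 // simulate a single substitution error
	fmt.Println(decodeRepeat3(coded)) // [1 0 1 1]
}
```

The cost is visible immediately: the stream triples in length. A code such as Reed-Solomon buys comparable protection for far less overhead.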
Bits are translated into nucleotides, typically two bits per base (00 → A, 01 → C, 10 → G, 11 → T). Constraints like avoiding long homopolymer runs are applied here.
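The two-bits-per-base mapping can be sketched as follows. The helper longestRun only measures the longest homopolymer run; how a real pipeline rewrites a sequence that violates the limit is not covered here, and both function names are assumptions.

```go
package main

import "fmt"

// bases is indexed by the two-bit value: 00 → A, 01 → C, 10 → G, 11 → T.
var bases = [4]byte{'A', 'C', 'G', 'T'}

// bitsToBases maps each pair of bits to one nucleotide.
func bitsToBases(bits []uint8) (string, error) {
	if len(bits)%2 != 0 {
		return "", fmt.Errorf("bit count %d is odd", len(bits))
	}
	out := make([]byte, 0, len(bits)/2)
	for i := 0; i < len(bits); i += 2 {
		out = append(out, bases[(bits[i]&1)<<1|bits[i+1]&1])
	}
	return string(out), nil
}

// longestRun returns the length of the longest run of identical bases.
func longestRun(seq string) int {
	best, run := 0, 0
	for i := 0; i < len(seq); i++ {
		if i > 0 && seq[i] == seq[i-1] {
			run++
		} else {
			run = 1
		}
		if run > best {
			best = run
		}
	}
	return best
}

func main() {
	seq, err := bitsToBases([]uint8{0, 0, 0, 1, 1, 0, 1, 1})
	if err != nil {
		panic(err)
	}
	fmt.Println(seq, longestRun("ACGGGT")) // ACGT 3
}
```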
The sequence is synthesised as physical DNA and stored cold and dry. To read it back, the DNA is sequenced and the pipeline runs in reverse: base → bits, ECC decoding, then image reconstruction.
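The reverse direction, minus the ECC decoding, might look like this. It assumes the same 00 → A, 01 → C, 10 → G, 11 → T mapping and 8-bit pixels packed most significant bit first; the function names are hypothetical.

```go
package main

import "fmt"

// basesToBits inverts the two-bits-per-base mapping.
func basesToBits(seq string) ([]uint8, error) {
	bits := make([]uint8, 0, len(seq)*2)
	for i := 0; i < len(seq); i++ {
		switch seq[i] {
		case 'A':
			bits = append(bits, 0, 0)
		case 'C':
			bits = append(bits, 0, 1)
		case 'G':
			bits = append(bits, 1, 0)
		case 'T':
			bits = append(bits, 1, 1)
		default:
			return nil, fmt.Errorf("invalid base %q at position %d", seq[i], i)
		}
	}
	return bits, nil
}

// bitsToPixels packs groups of eight bits back into 8-bit pixels.
func bitsToPixels(bits []uint8) ([]uint8, error) {
	if len(bits)%8 != 0 {
		return nil, fmt.Errorf("bit count %d is not a multiple of 8", len(bits))
	}
	pixels := make([]uint8, 0, len(bits)/8)
	for i := 0; i < len(bits); i += 8 {
		var p uint8
		for _, b := range bits[i : i+8] {
			p = p<<1 | b&1
		}
		pixels = append(pixels, p)
	}
	return pixels, nil
}

func main() {
	bits, err := basesToBits("GGCC")
	if err != nil {
		panic(err)
	}
	pixels, err := bitsToPixels(bits)
	if err != nil {
		panic(err)
	}
	fmt.Println(pixels) // [165]
}
```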
In Shannon's information-theory model, the stored DNA is the channel: a noisy transmission line between encoder and decoder. Environmental factors (UV radiation, temperature, humidity, time itself) cause substitutions, insertions, and deletions in the bases. The error-correcting code added during encoding is what lets the original bits be recovered after the channel has done its damage.
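A toy version of such a channel can be simulated in a few lines. The per-base error probabilities, the channel type, and the newRNG helper are all illustrative assumptions; real error rates depend on the synthesis and sequencing technology and are not modelled here.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// channel holds per-base probabilities for each error type (assumed values).
type channel struct {
	sub, ins, del float64
}

// newRNG returns a seeded generator so runs are repeatable.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// transmit passes a sequence through the noisy channel, base by base.
func (c channel) transmit(seq string, rng *rand.Rand) string {
	const alphabet = "ACGT"
	out := make([]byte, 0, len(seq))
	for i := 0; i < len(seq); i++ {
		r := rng.Float64()
		switch {
		case r < c.del:
			// deletion: the base is lost
		case r < c.del+c.ins:
			// insertion: the base survives and a random base follows it
			out = append(out, seq[i], alphabet[rng.Intn(4)])
		case r < c.del+c.ins+c.sub:
			// substitution: replace with one of the three other bases
			idx := strings.IndexByte(alphabet, seq[i])
			out = append(out, alphabet[(idx+1+rng.Intn(3))%4])
		default:
			out = append(out, seq[i])
		}
	}
	return string(out)
}

func main() {
	noisy := channel{sub: 0.05, ins: 0.01, del: 0.01}
	fmt.Println(noisy.transmit("ACGTACGTACGTACGT", newRNG(1)))
}
```

Insertions and deletions shift every base that follows them, which is why they are harder to correct than substitutions.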
Researching DNA as a long-term archival medium. Reproducing existing encoding and decoding workflows in practice, with a focus on error handling, efficiency, and robustness. The goal is to make the trade-offs of DNA storage tangible, and to identify where the technology actually makes sense.
Research question: What is the trade-off / efficient frontier between error correction and DNA sequence length for encoding a 64×64 image?
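One axis of that frontier can be estimated with back-of-the-envelope arithmetic. The sketch below assumes 8-bit grayscale pixels (4 096 bytes), two bits per base, and a byte-oriented Reed-Solomon code RS(n, k) with n = 255, which corrects up to (n − k) / 2 byte errors per block. It ignores primers, addressing, and homopolymer constraints, and the function names are hypothetical.

```go
package main

import "fmt"

// basesPerByte follows from two bits per base.
const basesPerByte = 4

// correctableSymbols is the number of byte errors RS(n, k) can fix per block.
func correctableSymbols(n, k int) int {
	return (n - k) / 2
}

// basesNeeded is the sequence length when dataBytes are split into blocks of
// k data bytes and each block is expanded to n bytes (last block padded).
func basesNeeded(dataBytes, n, k int) int {
	blocks := (dataBytes + k - 1) / k
	return blocks * n * basesPerByte
}

func main() {
	const dataBytes = 64 * 64 // one byte per pixel (assumption)
	for _, k := range []int{251, 239, 223, 191} {
		fmt.Printf("RS(255,%d): up to %d byte errors per block, %d bases\n",
			k, correctableSymbols(255, k), basesNeeded(dataBytes, 255, k))
	}
}
```

Lowering k buys more correctable errors per block at the price of a longer sequence, which is the trade-off the research question asks about.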
Minimum spanning trees
Finds the cheapest way to wire up a network so every point is connected, adding one connection at a time and always taking the least expensive option available.
Lazy infinite data structures
Builds trees that are endless in principle but only compute each branch the moment you actually look at it, so you can search and transform them without ever running out of memory.
Interpreters & closures
A small programming language and the interpreter that runs it, with variables, functions that remember the values around them, and if conditionals. A bonus version even handles recursion without the language building it in.
Dynamic programming + I/O
Solves the classic packing puzzle: which items maximise value without going over a weight limit. It reuses earlier results to avoid repeating work and reads the items from a simple text file.
The module system
Writes matrix code once and reuses it for very different kinds of numbers, generating both regular (dense) and memory-saving (sparse) matrices from the same blueprint.
Search & game AI
A simple game opponent: it works out which board squares it can reach, checks whether a move is legal, and scores the options to pick a good next move.
A collection of self-contained OCaml projects written for a functional programming course. Each is a Dune project built around one assignment and named after the most demanding concept it exercises. Together they work through greedy graph algorithms, lazy infinite data structures, interpreters and closures, dynamic programming, the module system, and game-search AI.
Study solutions: the goal was to learn the ideas, so the code favours clarity over micro-optimisation.
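As an illustration of the "reuses earlier results" idea behind the packing puzzle, here is a compact 0/1 knapsack. The projects themselves are written in OCaml; this sketch is in Go only for illustration, is not the course solution, and leaves out the file reading. The item type and knapsack function are hypothetical names.

```go
package main

import "fmt"

// item is one object that can be packed.
type item struct {
	weight, value int
}

// knapsack returns the best total value that fits within capacity.
// best[w] holds the best value achievable with total weight at most w.
func knapsack(items []item, capacity int) int {
	if capacity < 0 {
		return 0
	}
	best := make([]int, capacity+1)
	for _, it := range items {
		if it.weight < 0 {
			continue // ignore malformed items
		}
		// Walk downwards so each item is used at most once.
		for w := capacity; w >= it.weight; w-- {
			if v := best[w-it.weight] + it.value; v > best[w] {
				best[w] = v
			}
		}
	}
	return best[capacity]
}

func main() {
	items := []item{{1, 1}, {3, 4}, {4, 5}, {5, 7}}
	fmt.Println(knapsack(items, 7)) // 9
}
```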
A command-line tool in Go that reports how delayed a Deutsche Bahn trip (origin → destination) was on average over a past time window. It downloads an open dataset of historical train stops, matches every train that leaves your origin and later reaches your destination, and computes the average, typical, and worst-case lateness on arrival.
The official DB API only exposes the live situation, so it relies on the open piebro/deutsche-bahn-data dataset (every German station since July 2024). Station spelling varies in the source data, so a stations subcommand helps find the exact name.
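The three summary figures could be computed along these lines. Reading "typical" as the median is an assumption, as are the delayStats type and the summarize function; delays are taken as arrival lateness in minutes.

```go
package main

import (
	"fmt"
	"sort"
)

// delayStats summarises arrival delays in minutes.
type delayStats struct {
	Mean, Median, Max float64
}

// summarize returns mean, median, and worst-case delay.
// The second result is false when there are no matched trains.
func summarize(delays []float64) (delayStats, bool) {
	if len(delays) == 0 {
		return delayStats{}, false
	}
	sorted := append([]float64(nil), delays...) // leave the input untouched
	sort.Float64s(sorted)

	sum := 0.0
	for _, d := range sorted {
		sum += d
	}
	mid := len(sorted) / 2
	median := sorted[mid]
	if len(sorted)%2 == 0 {
		median = (sorted[mid-1] + sorted[mid]) / 2
	}
	return delayStats{
		Mean:   sum / float64(len(sorted)),
		Median: median,
		Max:    sorted[len(sorted)-1],
	}, true
}

func main() {
	stats, ok := summarize([]float64{0, 2, 3, 5, 40})
	if ok {
		fmt.Printf("%+v\n", stats) // {Mean:10 Median:3 Max:40}
	}
}
```

The example shows why the mean alone can mislead: a single 40-minute delay pulls the average to 10 minutes while the median stays at 3.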
A command-line tool in Go that drives a headless Chrome browser to scrape a GitHub repository. It visits the repo's main page, issues tab, and commits page, then gathers everything into a single JSON report, with an optional full-page screenshot.
Built on chromedp for browser automation. Point it at any public repo URL and it returns structured data without touching the GitHub API.