Watching Wikipedia edit itself

I’ve been building a web app that watches Wikipedia edits in real time, and uses AI to narrate what it sees.

Wikipedia is one of those parts of the internet that feels both enormous and strangely intimate. Every second, people are adding citations, correcting facts, reverting vandalism, arguing on talk pages, updating sports drafts, tweaking article categories, and occasionally leaving edit comments that read like tiny pieces of found poetry.

I built this because I wanted to test a hunch I already had from working with real-time data: that AI could be a force multiplier for turning torrents of activity into something human-readable.

In my last role, I spent a lot of time around real-time systems. That work made a few things very clear to me: data can arrive incredibly quickly; raw data is often almost impossible to read unless you know exactly what you are looking at; AND there is a huge amount of craft in the work data engineers and analysts do to turn that raw material into something a business can actually use.

Dashboards, reports, and analytics tools are all part of that translation layer. They are useful and necessary. But I’m wondering what else might be possible, especially for business users who need a quick read on what is happening right now and do not always want to go digging through charts. What if a live system could tell you a story about itself? Not a replacement for dashboards or analysis, more like another interface. A narrative layer that can sit on top of a stream of events and say, in plain language, “here is what seems to be happening.”

So I went looking for a live data source. I needed something real-time, accessible, and interesting enough to carry an experiment, and that is how I found the Wikimedia recent changes stream: a live feed of edits happening across Wikimedia projects, around the clock. This is the “Wikipedia edits firehose”. The raw stream is technically fascinating and experientially useless. Page titles, usernames, timestamps, edit comments, bot activity, categorisation changes, vandalism reverts. Hundreds of thousands of edits a day, most of them meaningful only if you already understand the system around them.

Wikipedia was a perfect test case because everyone knows the polished surface, but almost nobody sees the machinery underneath. And there is something inherently editorial about every edit. Someone, somewhere, is making a decision about what is true, what belongs, what needs fixing, or what should be removed.

Product decisions

The first product decision was subtraction. The raw stream was too much, so I filtered it down to English Wikipedia, human editors, and substantive edits. The bot activity and broader event volume still matter, but mostly as context.

The second product decision was to define the narrator’s job, and this required a fair bit of back-and-forth with Claude to get the prompt design right. What should the narrator notice and what should it ignore? How should it know when a quiet window is actually quiet, versus when it has missed the interesting thing? How do you make something feel live without making it noisy? The prompt to the LLM became less like an instruction and more like a product spec for a narrator: what to notice, what to ignore, when to be concise, when to connect dots, and how not to sound like a dashboard wearing a human mask. Here’s how it eventually worked: the app collects edits in rolling windows, gives the model summary context about the window, and asks it to write a short narrative about what is happening. It also keeps a little memory: what it noticed recently, which articles keep reappearing, whether this window is unusual compared with the session so far.

The third thing was to make it “feel” live. A single 45-second window of can be interesting, but continuity is what makes it feel alive. If the same article keeps reappearing, or the same kind of edit pattern keeps surfacing, the narrator should be able to notice that. Otherwise it is not really observing, it is just repeatedly glancing.

The filtering and AI prompting was, surprisingly, the easier part to build. The harder part was making the app experience feel live without being annoying. This was a UX challenge. I worked through cached stories on first load, a quiet collecting state, timestamps that age in place, and a progress bar that gives the page a pulse between updates.

The fourth and final layer was tradeoffs around cost, trust, security, and attention. Even for a toy, I had to make grown-up tradeoffs: cap the events passed into the model, cache the prompt, skip empty windows, and stop generating when nobody has the tab open (or else drain my Claude API credits).

For the technically curious: it’s a small Python app connected to Wikimedia’s recent changes stream, with rolling event windows passed to Claude for narration. The unglamorous parts turned out to matter a lot: filtering, context, latency, cost controls, and treating edit comments as untrusted input.

Screenshot of Wikipedia Live, an AI-powered live data storyteller showing Wikipedia edits arriving in real time

The experience of the thing itself

After it was built, I started using it. I expected to spend most of my time thinking about data formats, Python code, prompt engineering, tokens, optimisation, model choice, output quality. Technical stuff.

What happened is that I started watching the stream and could not stop clicking around. One of the first moments that got me was an editor on The Bear’s Wikipedia page leaving the comment “meh sorry fam” while reverting their own edit. The AI described it as “either self-correction mid-thought or something hastily committed to and immediately regretted.”

Then there was a user making edits to a page called The Twin Miracle, getting warned by another editor, warned again, flagged by a bot with a very high vandalism score, warned a third time, and finally signing off with “wikipedia is freaking dumb.” This was a whole drama that lasted 6 minutes start to finish.

There is quieter material too, which I found just as interesting.

“a lot of bots running categorisation work (749 events) while humans scatter across obscure corners — someone fixing a stray bracket in a plant taxonomy article, someone else adjusting Brazilian beer festival sort keys — the kind of invisible maintenance that keeps Wikipedia’s edges from fraying.”

That is the version of Wikipedia most of us never see, the constant maintenance layer underneath it. The work of keeping the edges from fraying.

At first, I had no idea what the stream might show me. Then I started to wonder if it might surface geopolitical signals, and I still think it might once I run it for longer and build out better geographic context. There are already hints: talk pages about the Iran war, the NFL draft showing up as it happened, 2026 edits “flowing in with the casual confidence of people documenting the present, not the speculative caution you’d expect for a year that hasn’t happened yet.”

But the thing it surfaces most clearly right now is the hum of human attention. People editing what they care about, like Aaliyah’s discography. People correcting small errors, people fighting over phrasing, people adding citations. People disputing whether haggis is uniquely Scottish and reconciling conflicting death dates for cast members of The Brady Bunch.

The point was not that AI can summarise Wikipedia edits (it can). More interesting to me is that AI can become an interpretive layer over messy, live systems; it can sit between raw activity and human understanding, not to replace the human, but to help the human notice what might otherwise be invisible, right now.

The shape of what’s happening

Real-time data is alive. Dashboards can measure it, but they don’t always help you feel what is changing, why it matters, or what might be emerging before you know what question to ask. The systems we already have are very useful… charts, data analysts, alerts, etc. But a lot of systems do not just need measurement, they also need sense-making, in ways that humans instinctively undertand. That’s where I like the idea of stories as an interface for streams of data.

An ecommerce team does not always need another chart of traffic and conversion. Mabe they would benefit from a plain-language read on what is happening in the store right now, and maybe some suggestions as to why (a sudden influx of visitors, a pattern starting to emerge based on what else is going on in the world). A social product does not always need another feed, sometimes it needs to know which conversations are picking up momentum, and which communities are suddenly paying attention. An operations team does not always need more logs. Sometimes it needs a plain-language read on what pattern is emerging before anyone names it.

Wikipedia Live is my first public pass at this model. It is small and slightly odd, but it gave me a way to play with the underlying idea: AI as a narrative interface for live data.

And this has been (and continues to be) a fun side project, one that started with a hunch and a skill I wanted to stretch, and enough curiousity to follow the weird stuff when it showed up. I started with the raw ingredients of the Wikipedia edits firehose and an Anthropic API key. And I made something that brought me more joy than I was expecting, a window into people caring about things in public, in real time, on one of the strangest and most important collaborative systems we have.

The site is live at datastories.live

It is still early, it probably has bugs, it definitely has opportunities for improvement. I welcome feedback!

And it has opened up the next set of experiments I want to build: ecommerce stories from synthetic store data, social stories from AT Protocol, and tech-news stories from signals across Hacker News, GitHub events, and other sources.

If you find value in Wikipedia, consider supporting the Wikimedia Foundation. The project only exists because Wikimedia makes this kind of public infrastructure available.