We deserve better boorus in 2026.

Boorus haven't evolved since 2005. Tags are flat strings, notes are unsearchable, and the query language is little more than a string split on spaces. A modern booru should use a DAG for tags, treat notes as first-class content, and have a real query language. None of this is novel; it just hasn't been built yet.

Most people who have been online for a while are likely familiar with boorus in concept. If you are not, they are websites serving as imageboards and image hosting, often focused on anime illustrations, where each entry is labeled with user-defined tags and assigned metadata like NSFW ratings.

Danbooru, launched in 2005, was one of the original inventors of the concept and is responsible for shaping the general idea of how they should function. It brought many great ideas to the table and still runs today. However, the requirements of 2005 are different from the requirements in 2026, and the data model has not meaningfully evolved since it was founded.

Most boorus are clones of Danbooru or forks of Gelbooru. They all share the same underlying design. I want to talk a little about how I imagine a “modern” booru to work and the features I would like to implement in the booru services I work on. My main opinions, up front, are that a modern booru should treat tags as structured data, notes as first-class content, and search as a core architectural concern. Danbooru clones do none of these things.

Tag Namespaces

Classic boorus have no concept of namespacing. All their tags are “flat,” meaning that a tag is simply a direct match with no added metadata of any sort. This causes ambiguity and is prone to mistakes.

Let's take a simple example, like black_cat. In a flat tag system, this could refer to:

  • a literal black cat
  • the character Black Cat (Marvel)
  • the anime Black Cat

This is confusing. The solution that flat-tagged boorus have come up with involves adding more info to the tag, like the full series title or other qualifiers. This makes tags clunky to work with and unintuitive for new users, especially viewers who are unfamiliar with the tagging system.

Namespaced tags are the fix to this problem. Tags like character:reimu, artist:zuizi remove all ambiguity about what type of tag you are referring to. Danbooru has pseudo-categories by simply prefixing tags with a namespace, but the tags do not actually get special treatment internally. It still is simply a flat tag system, just with a different color on the frontend. This is a label, not a namespace.

Namespaces need to be a part of the schema from day 1. They can only be added later with great effort and would require significant re-tagging of existing posts, which nobody really wants to do.

Rather than tags serving as identifiers based on their name, they need to have their namespace be part of the key. The tag identity becomes (namespace, name), making character:sakura and general:sakura two distinct tags that can coexist while not cluttering the sidebar with long prefixes.

This does require more design considerations. Do you make namespaces mandatory? I don't think you should. The default should land in a “general” namespace, while anything more specific should go into a dedicated namespace. The ones we know from existing boorus are character, artist, copyright, general, meta and some more, but this is a product of them not supporting namespaces nicely.
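To make the (namespace, name) identity concrete, here is a minimal Python sketch. The `Tag` class, the `parse_tag` helper, and the choice to split on the first colon are my own assumptions for illustration, not how any existing booru does it.

```python
from dataclasses import dataclass

DEFAULT_NAMESPACE = "general"  # assumption: un-prefixed tags land here


@dataclass(frozen=True)
class Tag:
    """A tag identified by the pair (namespace, name), not by name alone."""
    namespace: str
    name: str

    def __str__(self) -> str:
        return f"{self.namespace}:{self.name}"


def parse_tag(text: str) -> Tag:
    """Split 'namespace:name' on the first colon; default the namespace."""
    namespace, sep, name = text.partition(":")
    if not sep:
        return Tag(DEFAULT_NAMESPACE, text)
    return Tag(namespace, name)


# "sakura" and "general:sakura" are the same tag; "character:sakura" is not.
tags = {parse_tag("character:sakura"), parse_tag("sakura"), parse_tag("general:sakura")}
print(sorted(str(t) for t in tags))  # ['character:sakura', 'general:sakura']
```

Because the dataclass is frozen, the pair is hashable and can be used directly as a dictionary key or set member, which is all "namespace is part of the key" really means.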

If we look to Hydrus, a booru-like file manager, we can see a different namespacing concept. Hydrus hosts the PTR, a massive file-to-tag repository, which supports namespaces as I envision them. Namespaces like species:, weapon:, location:, and clothing: are perfectly expected in it and lead to much greater clarity.

To an average user, namespaced tags make little difference. It would mostly help the system actually work as they already think it does. Searching for a tag name would still show you the tags from different namespaces, and if you already know the tag to be in a specific namespace, you'd start your search with the fitting prefix instead.

For power users and uploaders, however, this makes life considerably easier. No longer do you need to make sure you aren't using the wrong tag because someone decided ages ago that a generic term actually refers to something else. There would be no more need to add multiple different spellings of the same term and the like.

This would, however, still be a mostly flat tag system, just with one more key added to the tag identity. I'm not quite done yet.

Tagging is hard

On the surface, tagging is the simplest possible way to organize files. However, past a certain point, there are more tags than any one human can possibly remember, and that's when issues begin.

Contributor A tags a post reimu_hakurei. Contributor B tags a different post hakurei_reimu. Contributor C, who doesn't know the character's full name, just writes reimu. A fourth person uses hakurei_miko. All four mean the same thing. Multiply this across tens of thousands of tags and millions of posts, and you have a dataset that's slowly rotting from the inside. Nobody is even in the wrong here! This happens simply because nobody can possibly hold the entire vocabulary in their head.

Boorus generally have two tools to address this problem: aliases and implications. Most systems implement both, and almost none implement them well.

An alias means “this tag is another name for that tag.” hakurei_reimu -> reimu_hakurei. It's a simple redirect. When someone types the alias, the system silently resolves it to the canonical form. The post is stored with the canonical tag, and search works regardless of which name the user remembers.

In contrast, an implication says “this tag's existence guarantees that tag's existence.” character:reimu_hakurei → copyright:touhou_project. When a contributor adds character:reimu_hakurei to a post, copyright:touhou_project is also added automatically. The contributor doesn't need to remember every ancestor tag; the system can fill in the gaps and save a lot of manual typing work.
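As a rough sketch of the two mechanisms side by side, assuming both live in plain lookup tables: the `ALIASES` and `IMPLICATIONS` dictionaries and the `tags_to_store` function are hypothetical names, and the alias entry is written with a namespace prefix purely for consistency.

```python
# Hypothetical lookup tables; a real system would load these from storage.
ALIASES = {"character:hakurei_reimu": "character:reimu_hakurei"}
IMPLICATIONS = {"character:reimu_hakurei": ["copyright:touhou_project"]}


def tags_to_store(typed):
    """Resolve aliases to canonical tags, then add every implied tag."""
    stored = set()
    pending = [ALIASES.get(tag, tag) for tag in typed]
    while pending:
        tag = pending.pop()
        if tag in stored:
            continue
        stored.add(tag)
        # Implied tags may imply further tags, so keep expanding.
        pending.extend(IMPLICATIONS.get(tag, []))
    return stored


print(sorted(tags_to_store(["character:hakurei_reimu"])))
# ['character:reimu_hakurei', 'copyright:touhou_project']
```

The contributor typed one non-canonical name and the post ends up with the canonical character tag plus its copyright tag.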

In a classic booru, aliases and implications are bolted-on features that exist as side tables that the system consults sometimes, during some operations, in some set order. The tag itself is still the fundamental unit, and implications are an afterthought.

In a DAG-based system, the graph is the data model. There is no separate “implication expansion” phase because the act of connecting two tags in the graph is the implication. Aliases are just a special edge type.

The entire resolution process, from expanding a search term, to auto-tagging a post or validating a new tag, is simply a graph traversal. There is only one algorithm, one data structure, and one set of invariants to maintain. Classic boorus fail in very specific ways that a graph makes structurally impossible or at least trivially detectable.

Allow me to elaborate.

The tag graph

The real step to sensible tagging is changing the booru from working on simple tag matching to a directed acyclic graph. Despite sounding complex, this makes the system much easier to build and maintain and allows you to easily visualize how tags are truly assembled and related.

Figure: a directed acyclic graph showing a booru tag containment hierarchy, flowing top-down from most general to most specific.

At the top sits a single copyright tag, `team_shanghai_alice`, with one arrow down to `touhou_project`, also a copyright tag. In this second tier, `brown_hair` and `blonde_hair` (both general tags) flank `touhou_project` on either side.

The third tier holds two character tags. `touhou_project` has arrows down to both `reimu_hakurei` and `marisa_kirisame`. `brown_hair` has an arrow down to `reimu_hakurei`; `blonde_hair` has one down to `marisa_kirisame`. A small alias node labeled `miko` connects to `reimu_hakurei` with a dashed arrow, indicating it resolves to the canonical tag.

The bottom tier holds three posts. `reimu_hakurei` tags posts #4821 and #7340 (blue arrows). `marisa_kirisame` tags posts #7340 and #9112 (blue arrows). Post #7340 receives both character tags, making it the most connected node at this level.

Below post #7340, two note nodes ("translation" and "source") are attached with dashed amber lines, indicating first-class note entities that belong to the post but sit outside the tag hierarchy entirely.

All solid gray arrows mean "encompasses." All solid blue arrows mean "tags." Dashed gray means "alias of." Dashed amber means "has note." Nodes are color-coded by namespace: teal for copyright, gray for general, purple for character, blue for post, coral for alias, amber for note.

Rather than organizing your tags and posts separately, you build a single graph in which both are contained. Tags can point to other tags to encompass them. Tags point to posts that have said tag assigned. Post tags can be found by finding what points to them.

Let's say you want to search for any images with “touhou_project.” This is a copyright-namespaced tag, so the search starts from the matching node in the graph.

The search engine finds touhou_project, walks down to reimu_hakurei and marisa_kirisame, then collects every post those character tags point to. By walking down the graph, you have also resolved every tag alias and followed every indirection along the way, with no separate step for either.

touhou_project -blonde_hair is the more interesting one. The positive and negative expansions happen independently: touhou_project expands down to both characters, collecting the 3 candidate posts. Then blonde_hair expands down to marisa_kirisame, and every post reachable from marisa_kirisame gets subtracted. Post #7340 is in both sets (it has both reimu_hakurei and marisa_kirisame), but the negation wins. It gets excluded because it's reachable from the negative subtree.
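The two searches above can be reproduced with a small in-memory sketch. The dictionaries mirror the example graph from the figure description; the names `ENCOMPASSES`, `ALIASES`, `TAGGED`, `expand`, and `search` are mine, and a real system would obviously not keep all of this in Python dicts.

```python
# Hypothetical in-memory copy of the example graph.
ENCOMPASSES = {
    "team_shanghai_alice": ["touhou_project"],
    "touhou_project": ["reimu_hakurei", "marisa_kirisame"],
    "brown_hair": ["reimu_hakurei"],
    "blonde_hair": ["marisa_kirisame"],
}
ALIASES = {"miko": "reimu_hakurei"}
TAGGED = {
    "reimu_hakurei": {4821, 7340},
    "marisa_kirisame": {7340, 9112},
}


def expand(tag):
    """Return the tag (after alias resolution) plus every tag it encompasses."""
    start = ALIASES.get(tag, tag)
    found, stack = {start}, [start]
    while stack:
        for child in ENCOMPASSES.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def posts_for(tag):
    """Collect every post reachable from the tag's expansion."""
    posts = set()
    for member in expand(tag):
        posts |= TAGGED.get(member, set())
    return posts


def search(include, exclude=()):
    """AND together the positive terms, then subtract each negative subtree."""
    if not include:
        return set()
    result = set.intersection(*(posts_for(tag) for tag in include))
    for tag in exclude:
        result -= posts_for(tag)
    return result


print(sorted(search(["touhou_project"])))                   # [4821, 7340, 9112]
print(sorted(search(["touhou_project"], ["blonde_hair"])))  # [4821]
```

Note that searching for miko goes through the same `expand` call and lands on the posts tagged reimu_hakurei, without any alias-specific code path in `search`.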

This has performance implications, of course. They are solvable without too much issue, but I will not get into deep implementation details here.

Pitfalls in old Boorus

Old boorus can fail in many ways due to the incomplete way they emulate a graph-based system. A DAG fixes almost all of them.

Cycles? If someone adds A → B and later someone else adds B → A, a naive implication resolver would loop forever. In a DAG, “no cycles” is the defining structural invariant. Before inserting any new edge, walk the graph from the target, following existing edges. If you reach the source, the new edge would close a cycle, so reject it. This is a plain reachability check (a depth-first search), not a full topological sort. It runs on every insert, and on a tag graph of realistic size it's cheap.
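A minimal sketch of that insert-time check, assuming edges are kept as a dictionary from a tag to the set of tags it points at. The `reaches` and `add_edge` names are hypothetical.

```python
def reaches(edges, start, goal):
    """Depth-first search: can `goal` be reached from `start`?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return False


def add_edge(edges, source, target):
    """Insert source -> target unless the target already leads back to the source."""
    if reaches(edges, target, source):
        raise ValueError(f"{source} -> {target} would create a cycle")
    edges.setdefault(source, set()).add(target)


edges = {}
add_edge(edges, "A", "B")
add_edge(edges, "B", "C")
try:
    add_edge(edges, "C", "A")
except ValueError as error:
    print(error)  # C -> A would create a cycle
```

A self-loop such as A → A is rejected by the same check, because the search starts at the target and immediately finds the source.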

Transitive chains? reimu_hakurei → touhou_project → team_shanghai_alice → japanese_media. Each link is reasonable on its own. But now tagging a single character produces four tag entries, and if the high-level tags are prolific (japanese_media would likely encompass half the database), your join table balloons: every post gains one extra row per ancestor of each of its tags.

Alias chains? miko → hakurei_miko → reimu_hakurei. If aliases can point to other aliases, you need chain resolution. In a classic system, multi-hop aliases require their own chain resolution logic, often just a loop with a recursion limit and orphan detection when a middle link is deleted. In a DAG, this just works. The traversal algorithm already walks arbitrary paths; an alias chain is nothing special.

However! Just because you can doesn't mean you should. You should never encourage multi-hop aliases. Every intermediate node is something a moderator has to understand. If hakurei_miko exists only as a waypoint between miko and reimu_hakurei, it should be flattened into two direct aliases pointing at the canonical tag. The graph makes multi-hop aliases safe, but they are unnecessary in the vast majority of cases.
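Flattening is a small maintenance job. A sketch, assuming aliases are stored as a plain name-to-target mapping; `flatten_aliases` is a hypothetical helper, not an existing tool.

```python
def flatten_aliases(aliases):
    """Rewrite every alias so it points straight at its canonical tag."""
    flat = {}
    for name in aliases:
        target, seen = aliases[name], {name}
        # Follow the chain until we land on a tag that is not itself an alias.
        while target in aliases:
            if target in seen:
                raise ValueError(f"alias cycle through {target!r}")
            seen.add(target)
            target = aliases[target]
        flat[name] = target
    return flat


chained = {"miko": "hakurei_miko", "hakurei_miko": "reimu_hakurei"}
print(flatten_aliases(chained))
# {'miko': 'reimu_hakurei', 'hakurei_miko': 'reimu_hakurei'}
```

After flattening, a moderator looking at miko sees one hop to the canonical tag and never has to reason about the waypoint.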

To remain performant, classic boorus typically materialize all implied tags into a post_tags table at write time. This makes reads fast but writes expensive, and retroactive changes to the implication graph require retagging potentially millions of posts. A graph-based system gives you the option to resolve implications lazily at query time instead. The search index caches the expansion, and when an implication changes, you only need to invalidate the relevant cache entries rather than rewriting half the database.
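Here is one way the lazy expansion with targeted invalidation could look. This is a sketch under my own assumptions: the `TagGraph` class is hypothetical, the cache is a plain dictionary, and staleness is detected by checking which cached expansions contain the edited parent tag.

```python
class TagGraph:
    """Containment graph that caches expansions and invalidates selectively."""

    def __init__(self):
        self.children = {}
        self._cache = {}

    def descendants(self, tag):
        """Return the tag plus everything it encompasses, computed lazily."""
        if tag not in self._cache:
            found, stack = {tag}, [tag]
            while stack:
                for child in self.children.get(stack.pop(), ()):
                    if child not in found:
                        found.add(child)
                        stack.append(child)
            self._cache[tag] = frozenset(found)
        return self._cache[tag]

    def add_edge(self, parent, child):
        """Add parent -> child and drop only the expansions it can affect."""
        self.children.setdefault(parent, set()).add(child)
        # An expansion is stale only if it already contained the parent.
        stale = [tag for tag, members in self._cache.items() if parent in members]
        for tag in stale:
            del self._cache[tag]


graph = TagGraph()
graph.add_edge("touhou_project", "reimu_hakurei")
graph.add_edge("blonde_hair", "marisa_kirisame")
graph.descendants("touhou_project")
graph.descendants("blonde_hair")
graph.add_edge("touhou_project", "marisa_kirisame")  # only touhou_project goes stale
print(sorted(graph.descendants("touhou_project")))
# ['marisa_kirisame', 'reimu_hakurei', 'touhou_project']
```

No post rows are touched when the edge is added; the cost of the change is paid the next time someone searches for an affected tag.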

None of these problems are unsolvable in a classic system. But in a classic system, each one is a special case that requires a fix in the first place, all to badly emulate a graph. In a DAG, they are all consequences of one invariant: the graph must remain acyclic, and every edge must have a well-defined type. If you maintain that, the rest follows by itself.

Notes

Moving away from the nerd stuff for a second, let's talk about notes.

In most boorus, they are simply text attached to a given post. In some boorus, there is support for spatial notes/annotations that are pinned to regions of an image. They are used for translations, commentary, artist notes, or other things.

The Japanese dialogue text is overlaid with an English translation, stored as notes.

The typical implementation of them in boorus is also very basic: a JSON column on the post table or a notes table that's clearly an afterthought. They tend to have no revision history, no search, and no API parity with tags. Notes are treated as a nice-to-have.

I do not understand why this is still the case. Notes contain incredibly important information; often they are more useful for finding posts than the tags are! They deserve the same attention as tags do.

Multiple notes per image should be the baseline, not a special case. A single panel from a manga translation might have 8-12 notes. Most posts will have source information attached, and artists tend to title or describe their works in some way. All this data should be saved as attached notes.

Simple tasks like “find me every image where a note contains 「お前はもう死んでいる」” or “search all English translation notes mentioning contract” are currently impossible on almost every booru. Notes are write-only data. Sure, you can create them, but you cannot actually use them to find anything. That's bad.

Notes will contain mixed languages, so the search engine used for them needs to speak many languages. A post could contain notes with translations into ten different languages, and the search engine needs to be able to tokenize and stem all of them. Simple search syntax like note_text:"bruh" or note_lang:ja would already be enough for users to understand and find posts more easily.

Mixing note and tag search within a DAG is also trivial; a note search would return note IDs, which are leaf nodes in the graph. From there, the normal tag-based filtering, sorting, and pagination apply. You just start from the bottom instead and don't have to maintain a separate system.
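A toy version of "start from the bottom" might look like the following. Everything here is made up for illustration: the `Note` fields, the sample notes, and especially the case-insensitive substring match, which is only a stand-in for a real multilingual full-text index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    note_id: int
    post_id: int
    lang: str
    body: str


# Hypothetical sample data.
NOTES = [
    Note(1, 7340, "ja", "お前はもう死んでいる"),
    Note(2, 7340, "en", "You are already dead."),
    Note(3, 9112, "en", "Sign the contract first."),
]
POST_TAGS = {
    4821: {"reimu_hakurei"},
    7340: {"reimu_hakurei", "marisa_kirisame"},
    9112: {"marisa_kirisame"},
}


def search_notes(text=None, lang=None):
    """Return the IDs of notes matching the text and language filters."""
    hits = set()
    for note in NOTES:
        if lang is not None and note.lang != lang:
            continue
        if text is not None and text.casefold() not in note.body.casefold():
            continue
        hits.add(note.note_id)
    return hits


def posts_from_notes(note_ids, required_tags=()):
    """Walk up from note leaves to their posts, then apply the tag filter."""
    posts = {note.post_id for note in NOTES if note.note_id in note_ids}
    return {post for post in posts if set(required_tags) <= POST_TAGS.get(post, set())}


print(posts_from_notes(search_notes(text="contract", lang="en")))    # {9112}
print(posts_from_notes(search_notes(lang="en"), ["reimu_hakurei"]))  # {7340}
```

The second call combines a note filter with a tag filter in one pass, which is the point: the note search only decides where in the graph you start.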

The query language

Every booru uses essentially the same query syntax: tag1 tag2 -tag3. Space-separated tags, implicitly ANDed, with a minus prefix for negation. That's the entire language. It has been the standard since 2005, and if you've used any booru, you already know it by heart.

It's also incredibly inadequate for what a modern tagging system can actually do.

The obvious missing feature in many boorus is OR. If you want posts with reimu or marisa, you run two searches. If you want posts with any of five characters from a specific series, you run five searches, or you give up and search the series tag and scroll. There is no way to express “either of these” in a single query. The reason it's missing from so many of them isn't that it's hard to implement; it's that the original query parser was never designed to be a parser at all. It's a string split on spaces with a couple of special cases bolted on.

Modern Danbooru's query language is actually more sophisticated than most people give it credit for! The same is not true for Gelbooru and the many forks.

Once you add OR, you also likely immediately need grouping. (character:reimu OR character:marisa) touhou_project is different from character:reimu OR (character:marisa touhou_project). Without parentheses, the user can't distinguish between these, and the system has to guess. Nobody is happy there.

Range queries on metadata are next. Some boorus support score:>100 or width:>=1920 as hardcoded special syntax, but the set of supported fields is arbitrary and incomplete. A general-purpose approach is better: any numeric or date field on the post, like score, dimensions, file size, upload date, favorite count, or more, should be queryable with standard comparison operators. uploaded:>2025-01-01 width:>=1920 score:>50 should just work. These aren't tags and shouldn't pretend to be! They're filters on post metadata, and the query language should acknowledge them as such rather than encoding them into the tag system through stupid hacks like absurdres or highres tags as is standard.
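A generic comparison filter does not take much. In this sketch the `FIELDS` table, the set of operators, and the dictionary-shaped posts are all my assumptions; the point is only that one small table of typed fields replaces a pile of hardcoded special cases.

```python
import operator
import re
from datetime import date

OPS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "=": operator.eq, "": operator.eq,
}
# Hypothetical field table: field name -> function that parses the value.
FIELDS = {"score": int, "width": int, "uploaded": date.fromisoformat}
TERM = re.compile(r"^(\w+):(>=|<=|>|<|=)?(.+)$")


def parse_filter(term):
    """Turn 'field:<op>value' into a predicate over a post dictionary."""
    match = TERM.match(term)
    if not match or match.group(1) not in FIELDS:
        raise ValueError(f"not a metadata filter: {term!r}")
    field, op, raw = match.group(1), match.group(2) or "", match.group(3)
    value = FIELDS[field](raw)
    compare = OPS[op]
    return lambda post: compare(post[field], value)


def filter_posts(posts, query):
    """Keep the posts that satisfy every filter term in the query."""
    predicates = [parse_filter(term) for term in query.split()]
    return [post for post in posts if all(check(post) for check in predicates)]


POSTS = [
    {"id": 1, "score": 120, "width": 1920, "uploaded": date(2025, 3, 1)},
    {"id": 2, "score": 10, "width": 3840, "uploaded": date(2025, 6, 1)},
    {"id": 3, "score": 300, "width": 1280, "uploaded": date(2024, 12, 31)},
]
matches = filter_posts(POSTS, "uploaded:>2025-01-01 width:>=1920 score:>50")
print([post["id"] for post in matches])  # [1]
```

Adding a new queryable field is one more entry in `FIELDS`, not a new branch in the parser.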

Wildcard scoping within namespaces is in my opinion also essential. character:re* should match reimu_hakurei, remilia_scarlet, reisen_udongein_inaba. artist:* should match any post with an artist tag at all. The namespace prefix constrains the wildcard so it doesn't explode across the entire tag vocabulary. Without scoping, a wildcard like re* would match red_eyes, reimu_hakurei, restaurant, and retro all at once. Completely useless.
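Scoped wildcards are easy to sketch with the standard library's `fnmatch` module. The `TAGS` set and `match_wildcard` function are hypothetical, and a real implementation would push the pattern into an index rather than scan every tag. Note that `fnmatchcase` also treats `?` and `[...]` as pattern characters, which a real parser may or may not want.

```python
from fnmatch import fnmatchcase

# Hypothetical tag vocabulary as (namespace, name) pairs.
TAGS = {
    ("character", "reimu_hakurei"),
    ("character", "remilia_scarlet"),
    ("character", "reisen_udongein_inaba"),
    ("character", "marisa_kirisame"),
    ("general", "red_eyes"),
    ("general", "restaurant"),
    ("general", "retro"),
    ("artist", "zuizi"),
}


def match_wildcard(term):
    """Expand 'namespace:pattern' against the names in that namespace only."""
    namespace, sep, pattern = term.partition(":")
    if not sep:
        raise ValueError("wildcards must be scoped to a namespace")
    return sorted(
        name for ns, name in TAGS
        if ns == namespace and fnmatchcase(name, pattern)
    )


print(match_wildcard("character:re*"))
# ['reimu_hakurei', 'reisen_udongein_inaba', 'remilia_scarlet']
print(match_wildcard("artist:*"))  # ['zuizi']
```

Rejecting the unscoped form outright is a design choice on my part; a softer option would be to default it to the general namespace.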

Note content search, as in the previous section, also belongs in the query language. note_text:"contract" or note_lang:ja are pseudo-tags that route to the FTS index under the hood. The user shouldn't need to learn a second query interface or visit a separate page. Notes are content. Content should be searchable from the search bar.

All of this does mean you can no longer get away with splitting a string on spaces and assembling some SQL query.

A proper implementation needs a formal grammar. Not a complicated one! A PEG or recursive descent parser that produces an AST covering tag terms, negation, OR groups, parenthesized sub-expressions, range comparisons, quoted phrases, and sort modifiers is enough. This is maybe 200 lines of code for the parser itself, if even that. It sounds like overengineering until you've actually tried to build one without a parser, which ends up as a growing pile of regexes and special cases that becomes a massive pain to work on. Do it properly once, and it'll work every time.
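To show how small such a parser can be, here is a recursive descent sketch covering terms, negation, OR, parentheses, and quoted phrases. The tuple-based AST shape, the `QueryError` class, and the uppercase `OR` keyword are my assumptions. Range comparisons simply pass through as opaque terms here, sort modifiers are left out, and tag names that themselves contain parentheses would need extra tokenizer work.

```python
import re

# Tokens: "(", ")", a leading "-", or a term that may contain a quoted phrase.
TOKEN = re.compile(r'\(|\)|-|(?:[^\s()"]|"[^"]*")+')
STOP = (None, ")", "OR")


class QueryError(ValueError):
    pass


def parse(query):
    """Parse a query string into nested ('and' | 'or' | 'not' | 'term') tuples."""
    tokens = TOKEN.findall(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        branches = [parse_and()]
        while peek() == "OR":
            advance()
            branches.append(parse_and())
        return branches[0] if len(branches) == 1 else ("or", branches)

    def parse_and():
        parts = []
        while peek() not in STOP:
            parts.append(parse_unary())
        if not parts:
            raise QueryError("expected a term")
        return parts[0] if len(parts) == 1 else ("and", parts)

    def parse_unary():
        token = advance()
        if token == "-":
            if peek() in STOP:
                raise QueryError("'-' must be followed by a term")
            return ("not", parse_unary())
        if token == "(":
            node = parse_or()
            if peek() != ")":
                raise QueryError("missing ')'")
            advance()
            return node
        return ("term", token)

    tree = parse_or()
    if peek() is not None:
        raise QueryError(f"unexpected {peek()!r}")
    return tree


print(parse("(character:reimu OR character:marisa) touhou_project"))
print(parse("character:reimu OR (character:marisa touhou_project)"))
```

The two example queries produce differently shaped trees, an AND at the root for the first and an OR at the root for the second, which is exactly the distinction a space-split parser cannot express.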

The AST would then go through a DAG expansion pass before it touches any database. Each tag term in the AST gets resolved against the containment graph: touhou_project expands to {touhou_project, reimu_hakurei, marisa_kirisame, ...} by walking every descendant edge. Negated tags expand the same way: -blonde_hair becomes the full set of tags encompassed by blonde_hair. Aliases resolve here too; if the user typed miko, the DAG lookup follows the alias edge to reimu_hakurei before expansion even begins.
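The expansion pass itself is a short tree rewrite. In this sketch the AST is the same kind of nested tuple a simple parser might emit (`("term", name)`, `("not", node)`, `("and", [...])`, `("or", [...])`), and the `any_of` node type is a name I made up for "a post carrying any tag in this set".

```python
# Hypothetical slice of the containment graph.
ENCOMPASSES = {
    "touhou_project": ["reimu_hakurei", "marisa_kirisame"],
    "blonde_hair": ["marisa_kirisame"],
}
ALIASES = {"miko": "reimu_hakurei"}


def expand_tag(name):
    """Resolve the alias first, then collect the tag and all its descendants."""
    start = ALIASES.get(name, name)
    found, stack = {start}, [start]
    while stack:
        for child in ENCOMPASSES.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return frozenset(found)


def expand_ast(node):
    """Replace every term node with the set of tags it stands for."""
    kind = node[0]
    if kind == "term":
        return ("any_of", expand_tag(node[1]))
    if kind == "not":
        return ("not", expand_ast(node[1]))
    return (kind, [expand_ast(child) for child in node[1]])


query = ("and", [("term", "touhou_project"), ("not", ("term", "blonde_hair"))])
expanded = expand_ast(query)
print(expanded)
```

Positive and negated terms go through the same `expand_tag` call; the negation stays in the tree as structure and is only interpreted by the backend.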

Only after expansion does the compiler actually lower the AST into whatever the backend speaks. On Postgres, the expanded tag sets become WHERE post_id IN (SELECT post_id FROM post_tags WHERE tag_id = ANY(...)) with subqueries for negation. On a search engine like Meilisearch or Typesense, they become structured filter objects.
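A lowering step along those lines could look like this. To keep the sketch runnable without a database server it targets SQLite from Python's standard library, so it uses `IN (?, ...)` placeholders instead of Postgres's `= ANY(...)`. The `any_of` node type, the table layout, and the numeric tag IDs are all assumptions of mine.

```python
import sqlite3


def lower(node, params):
    """Turn an expanded AST node into a SQL condition, appending bind values."""
    kind = node[0]
    if kind == "any_of":
        tag_ids = sorted(node[1])
        params.extend(tag_ids)
        marks = ", ".join("?" for _ in tag_ids)
        return f"p.id IN (SELECT post_id FROM post_tags WHERE tag_id IN ({marks}))"
    if kind == "not":
        return f"NOT ({lower(node[1], params)})"
    joiner = " AND " if kind == "and" else " OR "
    return "(" + joiner.join(lower(child, params) for child in node[1]) + ")"


# touhou_project -blonde_hair after expansion, using made-up tag IDs:
# 1 = touhou_project, 2 = reimu_hakurei, 3 = marisa_kirisame, 4 = blonde_hair
ast = ("and", [("any_of", {1, 2, 3}), ("not", ("any_of", {3, 4}))])

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY);
CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
INSERT INTO posts VALUES (4821), (7340), (9112);
INSERT INTO post_tags VALUES (4821, 2), (7340, 2), (7340, 3), (9112, 3);
""")

params = []
where = lower(ast, params)
rows = db.execute(f"SELECT p.id FROM posts p WHERE {where} ORDER BY p.id", params).fetchall()
print(rows)  # [(4821,)]
```

Tag IDs travel as bind parameters rather than being spliced into the SQL string, so nothing the user typed ever reaches the query text.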

There are performance concerns here that are worth acknowledging, but they are not all that different from the ones you'd face in a classic booru. Caching helps even more than you'd expect. Parsed ASTs for popular queries can be cached. The DAG expansion for a given tag (the set of all tags it encompasses) can be cached and invalidated only when the graph changes, which is rare compared to how often searches run. Post ID sets for very common tags can be kept in memory. None of this is too exciting.

The query language is the user's interface to everything else I talked about in this post. Structured tags, a containment DAG, searchable notes, and metadata filters. None of this matters if the only way to access it is tag1 tag2 -tag3. The query language needs to be as expressive as the system behind it, or you've built something that can't actually be used. Come on.

Conclusion

None of what I've described is novel, at all. These are all solved problems with decades of literature behind them. The reason boorus don't have these things isn't that they're hard; it's that Danbooru made a set of reasonable decisions in 2005, and everyone who came after continued using them.

The hardest part of all of this is the migration. Every existing booru has years of accumulated tag data that was entered under the assumptions of a flat system. Retrofitting namespaces onto a million tags, building a DAG from a table of ad hoc implications, and backfilling note metadata onto posts that never had it is a massive pain in the ass. Danbooru alone has on the order of a million distinct tags and millions of posts. The scale of work that would be required is difficult to comprehend.

If you're starting from scratch, though… no reason not to modernize. The schema is not that complicated. A tags table with a composite (namespace, name) key. An edges table with a type column and a foreign key at each end. A notes table with geometry, language, body text, and a post reference. A small query parser, implemented in your favorite language. A simple DAG walker, within the DB itself or outside of it. Wire it up to Postgres and a search engine, and you have something that avoids so many ancient issues.
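For a sense of scale, here is roughly what that schema and an in-database DAG walker could look like. It is a sketch, not a finished design: it runs on SQLite through Python's standard library rather than Postgres, it uses a surrogate `id` with a UNIQUE (namespace, name) constraint in place of a true composite primary key, and the `post_tags` table, column names, and edge type labels are my assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE tags (
    id        INTEGER PRIMARY KEY,
    namespace TEXT NOT NULL DEFAULT 'general',
    name      TEXT NOT NULL,
    UNIQUE (namespace, name)
);
CREATE TABLE posts (id INTEGER PRIMARY KEY);
CREATE TABLE edges (
    parent_id INTEGER NOT NULL REFERENCES tags(id),
    child_id  INTEGER NOT NULL REFERENCES tags(id),
    type      TEXT NOT NULL CHECK (type IN ('encompasses', 'alias')),
    PRIMARY KEY (parent_id, child_id)
);
CREATE TABLE post_tags (
    post_id INTEGER NOT NULL REFERENCES posts(id),
    tag_id  INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (post_id, tag_id)
);
CREATE TABLE notes (
    id      INTEGER PRIMARY KEY,
    post_id INTEGER NOT NULL REFERENCES posts(id),
    x INTEGER, y INTEGER, width INTEGER, height INTEGER,
    lang    TEXT NOT NULL,
    body    TEXT NOT NULL
);
"""

# Recursive CTE: start at one tag, follow 'encompasses' edges, collect posts.
WALK = """
WITH RECURSIVE reach(id) AS (
    SELECT id FROM tags WHERE namespace = ? AND name = ?
    UNION
    SELECT e.child_id FROM edges e JOIN reach r ON e.parent_id = r.id
    WHERE e.type = 'encompasses'
)
SELECT DISTINCT pt.post_id
FROM post_tags pt JOIN reach r ON pt.tag_id = r.id
ORDER BY pt.post_id
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany("INSERT INTO tags (id, namespace, name) VALUES (?, ?, ?)", [
    (1, "copyright", "touhou_project"),
    (2, "character", "reimu_hakurei"),
    (3, "character", "marisa_kirisame"),
])
db.executemany("INSERT INTO posts (id) VALUES (?)", [(4821,), (7340,), (9112,)])
db.executemany("INSERT INTO edges VALUES (?, ?, 'encompasses')", [(1, 2), (1, 3)])
db.executemany("INSERT INTO post_tags VALUES (?, ?)",
               [(4821, 2), (7340, 2), (7340, 3), (9112, 3)])


def posts_under(namespace, name):
    """Return the IDs of all posts reachable from the given tag."""
    rows = db.execute(WALK, (namespace, name)).fetchall()
    return [post_id for (post_id,) in rows]


print(posts_under("copyright", "touhou_project"))  # [4821, 7340, 9112]
```

Postgres supports the same `WITH RECURSIVE` construct, so the walker should carry over with little more than placeholder syntax changes.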

Or get fancy with it and build a booru fully on a real graph database! I'd love to see it and certainly don't see why it would not work. Build the one that should have existed ten years ago.