Product docs

Generate a durable map of your site for language models.

This product crawls a public website, turns the discovered inventory into a spec-compliant llms.txt, hosts it at a stable URL, and can keep it in sync as the site changes.

01

Using the product

From URL to hosted llms.txt

01

Paste a public URL

Start from the homepage or a canonical docs URL. The crawler normalizes the origin, discovers crawl candidates, and starts a bounded run.
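
As a rough illustration of the origin-normalization step, the sketch below reduces a pasted URL to a canonical origin. The function name `normalizeOrigin` and the choice to assume `https` when no scheme is given are assumptions made for this example, not documented behavior of the crawler.

```typescript
// Hypothetical helper: reduce any pasted URL to a canonical origin string.
function normalizeOrigin(input: string): string {
  const trimmed = input.trim();
  // Assumption: default to https when the scheme is omitted ("example.com/docs").
  const withScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed)
    ? trimmed
    : `https://${trimmed}`;
  const url = new URL(withScheme); // throws TypeError on malformed input
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  // URL.origin lowercases scheme and host and drops default ports,
  // along with the path, query string, and fragment.
  return url.origin;
}
```

With this sketch, a homepage and a deep docs link on the same site resolve to the same origin, so both would start a run against the same site record.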

02

Watch crawl progress

Progress moves through four phases (discovery, crawling, generation, and validation), so you can tell whether the system is still gathering pages or already publishing the file.
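
One way to model these phases on the client is an ordered list with a small helper that tells the two halves of a run apart. This is a sketch only: the type and function names are illustrative and are not taken from the product's API.

```typescript
// The four phases named above, in the order a run moves through them.
const PHASES = ["discovery", "crawling", "generation", "validation"] as const;
type Phase = (typeof PHASES)[number];

// True while the system is still gathering pages rather than producing the file.
function isGathering(phase: Phase): boolean {
  return phase === "discovery" || phase === "crawling";
}

// Returns the phase that follows, or null once validation is reached.
function nextPhase(phase: Phase): Phase | null {
  const index = PHASES.indexOf(phase);
  return PHASES[index + 1] ?? null;
}
```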

03

Review the result

The result page shows the rendered llms.txt, copy and download actions, the stable hosted URL, and the page inventory used to produce the file.

04

Keep it current

Monitoring can regenerate the file when site structure or page metadata changes. Version history records what changed and supports inline diffs.
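
The decision to regenerate can be pictured as comparing two snapshots of the page inventory. The sketch below assumes each page is reduced to a URL, title, and description; those field names and the fingerprint approach are illustrative assumptions, and a real implementation would more likely store a hash than the full string.

```typescript
// Hypothetical shape of the metadata tracked per page.
interface PageMeta {
  url: string;
  title: string;
  description: string;
}

// Build an order-independent fingerprint of an inventory snapshot.
// JSON.stringify on a tuple avoids delimiter collisions between fields.
function inventoryFingerprint(pages: PageMeta[]): string {
  return pages
    .map((p) => JSON.stringify([p.url, p.title, p.description]))
    .sort()
    .join("\n");
}

// Regenerate only when structure (the URL set) or metadata has changed.
function needsRegeneration(previous: PageMeta[], current: PageMeta[]): boolean {
  return inventoryFingerprint(previous) !== inventoryFingerprint(current);
}
```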

02

Under the hood

The publishing pipeline

01

Web app

The Next.js interface owns presentation and sends typed requests to the Worker API.

02

Worker API

The Cloudflare Worker registers sites, starts crawl runs, serves hosted files, and exposes progress and history endpoints.

03

Coordinator

A Durable Object owns live crawl state: frontier, active progress, and drain behavior.
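
The state the coordinator holds can be sketched as a plain class, independent of the Durable Object runtime. Everything below is an assumption for illustration: the class and method names, the reading of "drain" as "no queued URLs and no fetches in flight", and the treatment of the depth cap as inclusive with the seed at depth 0.

```typescript
interface FrontierEntry {
  url: string;
  depth: number;
}

// In-memory frontier with a page cap and a depth cap (illustrative only).
class Frontier {
  private queue: FrontierEntry[] = [];
  private seen = new Set<string>();
  private inFlight = 0;
  private maxPages: number;
  private maxDepth: number;

  constructor(maxPages: number, maxDepth: number) {
    this.maxPages = maxPages;
    this.maxDepth = maxDepth;
  }

  // Returns true if the URL was accepted into the frontier.
  enqueue(url: string, depth: number): boolean {
    if (depth > this.maxDepth) return false;           // too deep
    if (this.seen.has(url)) return false;              // already known
    if (this.seen.size >= this.maxPages) return false; // page cap reached
    this.seen.add(url);
    this.queue.push({ url, depth });
    return true;
  }

  // Hands out the next URL in FIFO order and counts it as in flight.
  next(): FrontierEntry | undefined {
    const entry = this.queue.shift();
    if (entry !== undefined) this.inFlight += 1;
    return entry;
  }

  // Called when a fetch finishes, whether it succeeded or failed.
  complete(): void {
    if (this.inFlight > 0) this.inFlight -= 1;
  }

  // Assumed meaning of "drained": nothing queued and nothing in flight.
  isDrained(): boolean {
    return this.queue.length === 0 && this.inFlight === 0;
  }
}
```

Under these assumptions, `new Frontier(1000, 3)` would match the default caps listed under Trust boundaries.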

04

Queues

Queue consumers fetch pages, classify inventory, retire unseen pages, and trigger generation when a run completes.
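
Retiring unseen pages amounts to comparing the stored inventory against the set of URLs reached in the current run. A minimal sketch, assuming a hypothetical `PageRecord` shape with a `retired` flag:

```typescript
// Hypothetical stored page record.
interface PageRecord {
  url: string;
  retired: boolean;
}

// Marks pages that the current run did not reach as retired,
// and revives pages that reappeared.
function retireUnseen(
  known: PageRecord[],
  seenThisRun: ReadonlySet<string>
): PageRecord[] {
  return known.map((page) => ({ ...page, retired: !seenThisRun.has(page.url) }));
}
```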

05

Storage

D1 stores the durable records for sites, runs, pages, and version history. R2 stores the published llms.txt file.

06

Generator

The LLM pass summarizes only URLs that were actually crawled, maps its output back to the inventory, and validates the file before publishing.
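
Mapping output back to the inventory can be sketched as a filter that accepts an entry only when its URL is in the crawled set. The `GeneratedEntry` shape and function name are assumptions for illustration; how the real generator handles rejected entries is not described here.

```typescript
// Hypothetical shape of one entry proposed by the LLM pass.
interface GeneratedEntry {
  url: string;
  title: string;
  description: string;
}

// Splits model output into entries backed by a crawled URL and entries that are not.
function mapToInventory(
  entries: GeneratedEntry[],
  inventory: ReadonlySet<string>
): { accepted: GeneratedEntry[]; rejected: GeneratedEntry[] } {
  const accepted: GeneratedEntry[] = [];
  const rejected: GeneratedEntry[] = [];
  for (const entry of entries) {
    // Exact-match lookup: a URL the crawler never saw cannot reach the file.
    (inventory.has(entry.url) ? accepted : rejected).push(entry);
  }
  return { accepted, rejected };
}
```

Because only the `accepted` list would feed the published file, a hallucinated link is dropped rather than published.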

03

Trust boundaries

What the system does and does not promise

  • Crawls are bounded by design: the default page cap is 1,000 and the default depth cap is 3.
  • The file is metadata, not a full-content dump. It links and describes canonical pages.
  • The generator cannot invent URLs; output is mapped back to the crawled inventory.
  • For the MVP, static fetch is the crawler's baseline: pages are fetched as served, without executing JavaScript, so JavaScript-only sites may produce thinner inventories.
  • The spec validator gates every generated file before it is written to public storage.
  • The product is observer-first: generating a file requires neither an account nor proof of site ownership.
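
To make the validation gate concrete, here is a deliberately small sketch that checks two structural rules from the llms.txt format: the file opens with an H1 title, and list items under H2 sections are Markdown links with an optional `: notes` suffix. The function name and error messages are illustrative, and the product's actual validator may check more or differently.

```typescript
// Returns a list of problems; an empty list means the checked rules passed.
function validateLlmsTxt(text: string): string[] {
  const errors: string[] = [];
  const lines = text.split(/\r?\n/);

  // Rule 1: the first non-blank line must be an H1 title.
  const firstContent = lines.find((line) => line.trim() !== "");
  if (firstContent === undefined || !/^# \S/.test(firstContent)) {
    errors.push("File must start with an H1 title");
  }

  // Rule 2: list items inside H2 sections must be "[title](url)" links,
  // optionally followed by ": notes".
  const linkItem = /^\s*[-*] \[[^\]]+\]\(https?:\/\/[^)\s]+\)(: .+)?$/;
  let inSection = false;
  lines.forEach((line, index) => {
    if (/^## /.test(line)) {
      inSection = true;
      return;
    }
    if (inSection && /^\s*[-*] /.test(line) && !linkItem.test(line)) {
      errors.push(`Line ${index + 1}: list item is not a [title](url) link`);
    }
  });

  return errors;
}
```

A gate built on this idea would write the file to public storage only when the returned list is empty.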