Product docs
Generate a durable map of your site for language models.
This product crawls a public website, turns the discovered inventory into a spec-compliant llms.txt, hosts it at a stable URL, and can keep it in sync as the site changes.
Using the product
From URL to hosted llms.txt
Paste a public URL
Start from the homepage or a canonical docs URL. The crawler normalizes the origin, discovers crawl candidates, and starts a bounded run.
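Origin normalization can be sketched as follows. This is a minimal illustration, not the product's actual code: the function name normalizeOrigin and the choice to assume https when no scheme is given are assumptions made for the example. It relies only on the standard URL class.

```typescript
// Hypothetical helper: reduce a pasted URL to a normalized origin.
function normalizeOrigin(input: string): string {
  const trimmed = input.trim();
  // Assumption for this sketch: default to https when the scheme is omitted.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const url = new URL(hasScheme ? trimmed : `https://${trimmed}`);
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  // URL.origin lowercases the host and drops default ports, paths, and queries.
  return url.origin;
}
```

With this sketch, a pasted docs URL and the bare hostname resolve to the same origin, so both would start a crawl of the same site.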
Watch crawl progress
Progress moves through four phases (discovery, crawling, generation, and validation), so you can tell whether the system is still gathering pages or already publishing the file.
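One way to model these phases on the client is an ordered list with two small helpers. The names PHASES, Phase, isGathering, and nextPhase are hypothetical, and the grouping of discovery and crawling as "gathering" is an interpretation of the description above rather than a documented API.

```typescript
// The four phases named in the docs, in order.
const PHASES = ["discovery", "crawling", "generation", "validation"] as const;
type Phase = (typeof PHASES)[number];

// Assumption: discovery and crawling count as "still gathering pages".
function isGathering(phase: Phase): boolean {
  return phase === "discovery" || phase === "crawling";
}

// Returns the phase that follows, or null after the last one.
function nextPhase(phase: Phase): Phase | null {
  return PHASES[PHASES.indexOf(phase) + 1] ?? null;
}
```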
Review the result
The result page shows the rendered llms.txt, copy and download actions, the stable hosted URL, and the page inventory used to produce the file.
Keep it current
Monitoring can regenerate the file when site structure or page metadata changes. Version history records what changed and supports inline diffs.
Under the hood
The publishing pipeline
Web app
The Next.js interface owns presentation and sends typed requests to the Worker API.
Worker API
The Cloudflare Worker registers sites, starts crawl runs, serves hosted files, and exposes progress and history endpoints.
Coordinator
A Durable Object owns live crawl state: frontier, active progress, and drain behavior.
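The frontier and drain behavior can be pictured with a plain in-memory class. This sketch deliberately avoids the Durable Objects API and is not the product's implementation: the Frontier class, its method names, and the convention that the start page has depth 0 are all assumptions. The default limits mirror the caps stated under "Trust boundaries".

```typescript
interface FrontierEntry {
  url: string;
  depth: number;
}

// Hypothetical bounded frontier: deduplicates URLs and enforces page and depth caps.
class Frontier {
  private queue: FrontierEntry[] = [];
  private seen = new Set<string>();
  private maxPages: number;
  private maxDepth: number;

  constructor(maxPages = 1000, maxDepth = 3) {
    this.maxPages = maxPages;
    this.maxDepth = maxDepth;
  }

  // Returns true when the URL is accepted into the frontier.
  // Assumption: the start page has depth 0 and depths up to maxDepth are allowed.
  enqueue(url: string, depth: number): boolean {
    if (depth > this.maxDepth) return false;
    if (this.seen.has(url)) return false;
    if (this.seen.size >= this.maxPages) return false;
    this.seen.add(url);
    this.queue.push({ url, depth });
    return true;
  }

  // Hands out the next URL to fetch, in first-in, first-out order.
  next(): FrontierEntry | undefined {
    return this.queue.shift();
  }

  // True once no URLs are waiting to be handed out.
  get drained(): boolean {
    return this.queue.length === 0;
  }
}
```

Keeping the seen set separate from the queue means a URL counts against the page cap once, even after it has been handed to a consumer.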
Queues
Queue consumers fetch pages, classify inventory, retire unseen pages, and trigger generation when a run completes.
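Retiring unseen pages could look like the sketch below. The PageRecord shape, the lastSeenRunId field, and the rule that a page seen again becomes active are hypothetical; the docs do not describe the stored schema.

```typescript
// Hypothetical page record; field names are assumptions for illustration.
interface PageRecord {
  url: string;
  lastSeenRunId: string;
  retired: boolean;
}

// Marks every page that the current run did not see as retired.
// Returns new records and leaves the input untouched.
function retireUnseen(pages: PageRecord[], currentRunId: string): PageRecord[] {
  return pages.map((page) => ({
    ...page,
    retired: page.lastSeenRunId !== currentRunId,
  }));
}
```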
Storage
D1 stores durable site, run, and page records along with version history. R2 stores the published llms.txt file.
Generator
The LLM pass summarizes known crawled URLs only, maps output back to the inventory, and validates the file before publishing.
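The "map back to the inventory" step can be illustrated as a filter against the set of crawled URLs. This is a sketch under assumptions: the GeneratedEntry shape and the mapToInventory name are invented for the example, and the real generator may match URLs differently (for instance after canonicalization).

```typescript
// Hypothetical shape of one entry proposed by the LLM pass.
interface GeneratedEntry {
  url: string;
  title: string;
  description: string;
}

// Keeps only entries whose URL exists in the crawled inventory.
// Anything else is reported as dropped instead of being published.
function mapToInventory(
  entries: GeneratedEntry[],
  inventory: ReadonlySet<string>,
): { kept: GeneratedEntry[]; dropped: GeneratedEntry[] } {
  const kept: GeneratedEntry[] = [];
  const dropped: GeneratedEntry[] = [];
  for (const entry of entries) {
    (inventory.has(entry.url) ? kept : dropped).push(entry);
  }
  return { kept, dropped };
}
```

Returning the dropped entries, rather than discarding them silently, makes it possible to log or count URLs the model produced that were never crawled.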
Trust boundaries
What the system does and does not promise
- Crawls are bounded by design: the default page cap is 1,000 and the depth cap is 3.
- The file is metadata, not a full-content dump. It links and describes canonical pages.
- The generator cannot invent URLs; output is mapped back to the crawled inventory.
- Static fetching is the baseline for the MVP crawler; sites that render content only through JavaScript may produce thinner inventories.
- The spec validator gates every generated file before it is written to public storage.
- The product is observer-first: generating a file requires neither an account nor proof of site ownership.
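To make the validation gate concrete, here is a partial check of the structure described by the public llms.txt proposal: an H1 title as the first content line, and list items under H2 sections written as [name](url) links with optional notes after a colon. The validateLlmsTxt function is a hypothetical sketch and covers only these two rules; the product's validator is not shown in these docs and may check more.

```typescript
// Hypothetical, partial validator: returns a list of problems (empty means OK).
function validateLlmsTxt(text: string): string[] {
  const errors: string[] = [];
  const lines = text.split(/\r?\n/).map((line) => line.trimEnd());

  // Rule 1: the first non-empty line must be an H1 title.
  const firstContent = lines.find((line) => line !== "");
  if (firstContent === undefined || !/^# \S/.test(firstContent)) {
    errors.push("File must start with an H1 title.");
  }

  // Rule 2: list items inside H2 sections must be [name](url) links,
  // optionally followed by ": notes".
  const linkItem = /^[-*] \[[^\]]+\]\([^)\s]+\)(: .*)?$/;
  let inSection = false;
  lines.forEach((line, index) => {
    if (/^## \S/.test(line)) {
      inSection = true;
      return;
    }
    if (inSection && /^[-*] /.test(line) && !linkItem.test(line)) {
      errors.push(`Line ${index + 1}: list item is not a [name](url) link.`);
    }
  });

  return errors;
}
```

In a pipeline like the one described above, a non-empty error list would stop the file from being written to public storage.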