AHTML vs Firecrawl
site-emitted vs externally-crawled
Firecrawl reads your site from the outside. AHTML emits structured data from the inside.
Firecrawl and AHTML solve adjacent problems: Firecrawl turns any URL into clean markdown via an external crawl, while AHTML turns your own app into a typed agent surface. The two compose: Firecrawl can crawl AHTML endpoints and get even cleaner output.
Feature-by-feature
The honest table.
| Feature | Firecrawl | AHTML |
|---|---|---|
| Architecture | Hosted crawler (external) | Framework plugin (in-app) |
| Works on sites you don’t own | ✓ | — |
| Zero per-request cost | — | ✓ |
| Latency | Crawl + parse round-trip | Single in-process call |
| Typed actions (cost / reversible / side-effects) | — | ✓ |
| MCP server emitted | — | ✓ |
| OpenAPI 3.1 emitted | — | ✓ |
| JSON-LD emitted | — | ✓ |
| llms.txt emitted | — | ✓ |
| Auth-walled data support | limited | ✓ (in your auth context) |
| Signed provenance | — | ✓ (v0.2 roadmap) |
| Pricing | Per-request SaaS | Free (MIT) |
Pick Firecrawl when
- You need to scrape sites you don’t own.
- You can’t deploy a plugin to the target site.
- You’re fine paying per-request for a hosted service.
Pick AHTML when
- You own the site and want zero per-request cost.
- Agents need to take typed actions, not just read.
- You want MCP and OpenAPI emitted alongside the snapshot.
- You want sub-100ms response times (no remote crawl in the loop).
- You want cryptographic provenance (signed snapshots are on the v0.2 roadmap).
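To make "typed actions" concrete, here is a minimal TypeScript sketch of how an agent might use cost, reversibility, and side-effect metadata before acting. Everything in it is an assumption for illustration: the `TypedAction` interface, the `SideEffect` values, the `requiresConfirmation` helper, and the sample actions are hypothetical and are not taken from AHTML's actual schema or API.

```typescript
// Hypothetical shape for illustration only; AHTML's real schema may differ.
type SideEffect = "none" | "writes-data" | "sends-email" | "charges-payment";

interface TypedAction {
  name: string;
  costUsd: number;           // estimated cost of invoking the action
  reversible: boolean;       // whether the action can be undone afterwards
  sideEffects: SideEffect[]; // declared effects beyond returning data
}

// Agent-side guard: ask a human before running anything that is
// irreversible, over budget, or that charges a payment method.
function requiresConfirmation(action: TypedAction, budgetUsd: number): boolean {
  const irreversible = !action.reversible;
  const overBudget = action.costUsd > budgetUsd;
  const chargesPayment = action.sideEffects.some((s) => s === "charges-payment");
  return irreversible || overBudget || chargesPayment;
}

// Two made-up actions an app might declare.
const addToCart: TypedAction = {
  name: "add_to_cart",
  costUsd: 0,
  reversible: true,
  sideEffects: ["writes-data"],
};

const checkout: TypedAction = {
  name: "checkout",
  costUsd: 49.99,
  reversible: false,
  sideEffects: ["charges-payment", "sends-email"],
};

console.log(requiresConfirmation(addToCart, 10)); // cheap and reversible
console.log(requiresConfirmation(checkout, 100)); // irreversible and charges a payment
```

The point of the sketch is the design choice, not the field names: when an action carries this metadata, an agent can decide locally whether to proceed, which a markdown rendering of the same page cannot express.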
What they have in common
- Both make web content easier for LLMs to consume.
- Both reduce token cost vs raw HTML.
- Both expose JSON-shaped output.
Three minutes to install. Decide for yourself.
AHTML is MIT-licensed and runs entirely inside your app. No SaaS, no per-request cost, no lock-in.