AHTML vs Firecrawl

site-emitted vs externally crawled

Firecrawl reads your site from the outside. AHTML emits structured data from the inside.

Firecrawl and AHTML solve adjacent problems: Firecrawl turns any URL into clean markdown by crawling it from the outside, while AHTML turns your own app into a typed agent surface. The two compose: Firecrawl can crawl AHTML endpoints and get even cleaner output.

Feature-by-feature

The honest table.

Feature                                          | Firecrawl                 | AHTML
Architecture                                     | Hosted crawler (external) | Framework plugin (in-app)
Works on sites you don’t own                     | ✓                         |
Zero per-request cost                            |                           | ✓
Latency                                          | Crawl + parse round-trip  | Single in-process call
Typed actions (cost / reversible / side-effects) |                           | ✓
MCP server emitted                               |                           | ✓
OpenAPI 3.1 emitted                              |                           | ✓
JSON-LD emitted                                  |                           | ✓
llms.txt emitted                                 |                           | ✓
Auth-walled data support                         | limited                   | ✓ (in your auth context)
Signed provenance                                |                           | ✓ (v0.2 roadmap)
Pricing                                          | Per-request SaaS          | Free (MIT)

Pick Firecrawl when

  • You need to scrape sites you don’t own.
  • You can’t deploy a plugin to the target site.
  • You’re fine paying per-request for a hosted service.

Pick AHTML when

  • You own the site and want zero per-request inference cost.
  • Agents need to take typed actions, not just read.
  • You want MCP and OpenAPI emitted alongside the snapshot.
  • You want sub-100 ms response times (no remote crawl in the loop).
  • You want cryptographic provenance (v0.2 signed snapshots).
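
To make "typed actions" concrete, here is a minimal sketch of how an action annotated with cost, reversibility, and side effects might let an agent decide when to ask a human first. Every name in it (`TypedAction`, `SideEffect`, `needs_confirmation`, the field names) is a hypothetical illustration, not AHTML's actual schema or API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class SideEffect(Enum):
    """Illustrative side-effect labels (assumed, not taken from AHTML)."""
    WRITES_DATA = "writes_data"
    SENDS_MESSAGE = "sends_message"
    CHARGES_MONEY = "charges_money"


@dataclass(frozen=True)
class TypedAction:
    """Hypothetical action descriptor an agent could inspect before calling."""
    name: str
    cost_usd: float                      # estimated cost of one invocation
    reversible: bool                     # can the effect be undone afterwards?
    side_effects: Tuple[SideEffect, ...] = ()


def needs_confirmation(action: TypedAction, budget_usd: float = 0.0) -> bool:
    """Return True when the agent should pause and ask a human."""
    if action.cost_usd > budget_usd:
        return True                      # costs more than the agent may spend
    if not action.reversible and action.side_effects:
        return True                      # irreversible and changes something
    return False


add_to_cart = TypedAction("add_to_cart", 0.0, True, (SideEffect.WRITES_DATA,))
place_order = TypedAction("place_order", 49.0, False, (SideEffect.CHARGES_MONEY,))

print(needs_confirmation(add_to_cart))   # False
print(needs_confirmation(place_order))   # True
```

The point of the sketch is that the decision uses declared metadata only; an agent reading crawled markdown would have to infer the same facts from prose.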

What they have in common

  • Both make web content easier for LLMs to consume.
  • Both reduce token cost vs raw HTML.
  • Both expose JSON-shaped output.
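
The token-cost point can be illustrated with a rough comparison. The sketch below assumes the common rule of thumb of about four characters per token, which is an approximation and not any particular tokenizer; the HTML snippet and the JSON fields are made up for the example and are not output from either tool.

```python
import json


def approx_tokens(text: str) -> int:
    """Rough estimate assuming ~4 characters per token (a heuristic only)."""
    return max(1, len(text) // 4)


# Made-up product card as it might appear in raw page markup.
raw_html = (
    '<div class="product-card" data-id="42">'
    '<div class="product-card__header">'
    '<h2 class="product-card__title">Blue Mug</h2></div>'
    '<span class="price price--current">$12.00</span>'
    '<button class="btn btn-primary" onclick="addToCart(42)">Add to cart</button>'
    '</div>'
)

# The same facts in a JSON-shaped form (field names are illustrative).
structured = json.dumps(
    {"id": 42, "title": "Blue Mug", "price": "12.00", "currency": "USD"}
)

html_tokens = approx_tokens(raw_html)
json_tokens = approx_tokens(structured)
print(f"raw HTML: ~{html_tokens} tokens, structured: ~{json_tokens} tokens")
```

Real savings depend on the page and the tokenizer, so treat the numbers this prints as an order-of-magnitude illustration.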

Three minutes to install. Decide for yourself.

AHTML is MIT-licensed and runs entirely inside your app. No SaaS, no per-request cost, no lock-in.
