WARNING: DO NOT USE DEEPCRAWL IN PRODUCTION RIGHT NOW AS IT IS SUBJECT TO CHANGE AND STILL UNDER RAPID DEVELOPMENT. USE AT YOUR OWN RISK!
100% free and open-source Firecrawl alternative with better performance and flexibility.
NOTE: DeepCrawl is not designed to bypass anti-scraping or anti-bot measures. It is optimized for high-frequency agent workloads that scrape public pages and extract cleaned Markdown and a hierarchical links tree.
Deepcrawl is an agent-oriented platform for extracting context data from websites. It extracts the cleaned Markdown of a page's content, a hierarchical links tree suited to agents, and page metadata, all in a form that LLMs can digest at minimal token cost, to reduce context switching and hallucination.
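To illustrate what a hierarchical links tree can look like, here is a minimal sketch in TypeScript. It is not Deepcrawl's implementation or response schema: the `LinkNode` shape and the `buildLinksTree` and `renderTree` helpers are hypothetical names made up for this example. The sketch assumes the tree is built by grouping same-origin URLs by path segment, so shared prefixes appear once instead of being repeated on every link.

```typescript
// Hypothetical node shape for illustration only; not Deepcrawl's actual schema.
interface LinkNode {
  segment: string; // one path segment (the origin itself at the root)
  url: string; // full URL represented by this node
  children: LinkNode[];
}

// Group same-origin URLs into a tree keyed by path segment.
function buildLinksTree(origin: string, urls: string[]): LinkNode {
  const base = new URL(origin).origin;
  const root: LinkNode = { segment: base, url: base, children: [] };

  for (const raw of urls) {
    const parsed = new URL(raw, base); // resolves relative links against the origin
    if (parsed.origin !== base) continue; // skip external links

    const segments = parsed.pathname.split("/").filter(Boolean);
    let node = root;
    let path = "";
    for (const segment of segments) {
      path += "/" + segment;
      let child = node.children.find((c) => c.segment === segment);
      if (!child) {
        child = { segment, url: base + path, children: [] };
        node.children.push(child);
      }
      node = child;
    }
  }
  return root;
}

// Render the tree as indented text, one segment per line.
function renderTree(node: LinkNode, depth = 0): string {
  const line = "  ".repeat(depth) + node.segment;
  return [line, ...node.children.map((c) => renderTree(c, depth + 1))].join("\n");
}
```

Rendering the tree with `renderTree` prints each shared path prefix once, which is one plausible way such a structure could keep the link listing compact for an LLM prompt.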
The full platform (Next.js dashboard, API workers, auth workers, and database) is open and transparent.
Visit https://deepcrawl.dev/docs to view the documentation.
Please read the contributing guide.
Open Source. Open Code - built with ❤️ by @felixLu.
