web-crawling

Here are 182 public repositories matching this topic...

Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

  • Updated Feb 24, 2026
  • Python

Omnisci3nt is an open-source web reconnaissance and intelligence tool for extracting deep technical insights from domains, including subdomains, SSL certificates, exposed services, archived content, and configuration data. Omnisci3nt gives you the full picture in seconds.

  • Updated Jan 6, 2026
  • Python

🕷️ A lightweight Model Context Protocol (MCP) server that exposes Crawl4AI web scraping and crawling capabilities as tools for AI agents. Similar to Firecrawl's API but self-hosted and free. Perfect for integrating web scraping into your AI workflows with OpenAI Agents SDK, Cursor, Claude Code, and other MCP-compatible tools.

  • Updated Feb 6, 2026
  • Python
