
Scrapling

Adaptive web scraping library with anti-bot evasion and smart selectors

Open Source

Scrapling is a Python web scraping library that uses adaptive selectors and anti-bot evasion techniques to extract data from websites reliably. It generates selectors that survive website layout changes by understanding element context rather than relying on brittle CSS paths. It also features stealth browser automation, automatic retry logic, and proxy rotation. The project has 65K+ GitHub stars.

Review by the aicoolies team

Scrapling is a Python web scraping framework that uses learned element similarity to auto-relocate CSS and XPath selectors when target websites redesign their DOM structure. Rather than breaking when a website's class names change or its HTML tags shift, Scrapling's adaptive parsing compares element positions, attributes, and visual characteristics against previously successful extraction patterns, then proposes updated selectors without manual intervention. This reduces maintenance for scraping jobs that would otherwise need manual selector updates after site changes.
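The relocation idea described above can be illustrated with a small standard-library sketch. This is not Scrapling's implementation or API: the names `ElementCollector`, `collect`, `similarity`, and `relocate` are hypothetical, and the score uses only tag, parent tag, attributes, and text, with equal weights chosen for illustration.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Collect every element as a small fingerprint dict (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.elements = []  # fingerprints in document order
        self._stack = []    # elements that are currently open

    def handle_starttag(self, tag, attrs):
        parent = self._stack[-1]["tag"] if self._stack else ""
        element = {"tag": tag, "attrs": dict(attrs), "parent": parent, "text": ""}
        self.elements.append(element)
        self._stack.append(element)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerates unclosed children.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i]["tag"] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        if self._stack:
            self._stack[-1]["text"] += data.strip()


def collect(html):
    """Parse an HTML string and return the list of element fingerprints."""
    collector = ElementCollector()
    collector.feed(html)
    return collector.elements


def similarity(saved, candidate):
    """Average of four 0..1 scores: tag, parent tag, attributes, text."""
    def ratio(a, b):
        return SequenceMatcher(None, a, b).ratio()

    def attr_string(el):
        return " ".join(f"{k}={v}" for k, v in sorted(el["attrs"].items()))

    tag_score = 1.0 if saved["tag"] == candidate["tag"] else 0.0
    parent_score = 1.0 if saved["parent"] == candidate["parent"] else 0.0
    attr_score = ratio(attr_string(saved), attr_string(candidate))
    text_score = ratio(saved["text"], candidate["text"])
    return (tag_score + parent_score + attr_score + text_score) / 4


def relocate(saved, html):
    """Return the element in the new HTML most similar to the saved fingerprint."""
    return max(collect(html), key=lambda el: similarity(saved, el), default=None)


# Made-up markup before and after a redesign that renames the price class.
old_html = '<div class="product"><span class="price" id="p1">$19.99</span></div>'
new_html = (
    '<section class="item"><h2 class="title">Widget</h2>'
    '<span class="price-tag" data-id="p1">$21.49</span></section>'
)

saved = next(el for el in collect(old_html) if el["attrs"].get("class") == "price")
match = relocate(saved, new_html)
print(match["tag"], match["attrs"])
```

A production matcher would likely weight the signals differently and apply a minimum-score threshold so that a missing element is reported rather than replaced by the least-bad candidate.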

The framework bundles capabilities for single-request extraction, full-scale crawling, and everything in between: concurrent session management with proxy rotation, a built-in Cloudflare anti-bot bypass, rich DOM navigation methods for parent/sibling/child traversal, and an interactive IPython shell for developing and debugging scripts. Performance benchmarks show Scrapling's similarity matching at 2.39 ms per operation versus 12.45 ms for AutoScraper, and its JSON serialization running 10x faster than the Python standard library's. With 92% test coverage and full type hints, the library prioritizes reliability and developer experience.
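As a rough illustration of the proxy rotation mentioned above, the following sketch cycles through a pool round-robin and skips proxies that have been marked as failing. It is an assumption-laden toy, not Scrapling's API: the `ProxyRotator` class, its methods, and the proxy URLs are all hypothetical.

```python
from itertools import cycle


class ProxyRotator:
    """Round-robin proxy pool that skips banned proxies (hypothetical helper)."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        if not self._proxies:
            raise ValueError("at least one proxy is required")
        self._banned = set()
        self._cycle = cycle(self._proxies)

    def next_proxy(self):
        # Try each proxy at most once per call so a fully banned pool fails fast.
        for _ in range(len(self._proxies)):
            proxy = next(self._cycle)
            if proxy not in self._banned:
                return proxy
        raise RuntimeError("all proxies are banned")

    def ban(self, proxy):
        """Exclude a proxy from rotation, e.g. after repeated blocked responses."""
        self._banned.add(proxy)


# Placeholder proxy addresses for illustration only.
rotator = ProxyRotator([
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
])
first = rotator.next_proxy()
second = rotator.next_proxy()
rotator.ban("http://proxy-c.example:8080")
third = rotator.next_proxy()  # proxy-c is skipped, rotation wraps to proxy-a
```

Each outgoing request would take its proxy from `next_proxy()`, and the fetch layer would call `ban()` when a proxy keeps returning blocked responses.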

Teams extracting price feeds, monitoring competitor websites, aggregating news data, and performing SEO research benefit from Scrapling's robustness against site changes. The cost of maintaining scraping jobs drops significantly when selectors auto-adjust rather than requiring constant oversight. The spider framework handles distributed crawling with pause/resume semantics, which suits multi-hour scraping tasks where interruptions are common. Open-source adoption has grown steadily, establishing Scrapling as a practical alternative to heavier solutions like Scrapy for projects where adaptive parsing is valuable.
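Pause/resume semantics for a long crawl usually come down to persisting the crawl frontier. The sketch below shows one minimal way to do that with a JSON checkpoint file. It assumes nothing about how Scrapling stores its state: `CrawlCheckpoint`, its methods, and the file name are hypothetical.

```python
import json
import os
import tempfile
from collections import deque
from pathlib import Path


class CrawlCheckpoint:
    """Crawl frontier that can be saved to disk and restored (hypothetical helper)."""

    def __init__(self, start_urls=()):
        self.pending = deque()  # URLs still to fetch, in FIFO order
        self.seen = set()       # every URL ever queued, to avoid revisits
        for url in start_urls:
            self.add(url)

    def add(self, url):
        """Queue a URL unless it has been queued before."""
        if url not in self.seen:
            self.seen.add(url)
            self.pending.append(url)

    def next_url(self):
        """Return the next URL to fetch, or None when the frontier is empty."""
        return self.pending.popleft() if self.pending else None

    def save(self, path):
        state = {"pending": list(self.pending), "seen": sorted(self.seen)}
        Path(path).write_text(json.dumps(state), encoding="utf-8")

    @classmethod
    def load(cls, path):
        state = json.loads(Path(path).read_text(encoding="utf-8"))
        checkpoint = cls()
        checkpoint.pending = deque(state["pending"])
        checkpoint.seen = set(state["seen"])
        return checkpoint


# Simulate an interrupted crawl: fetch one page, discover a link, then pause.
crawl = CrawlCheckpoint(["https://example.com/a", "https://example.com/b"])
fetched = crawl.next_url()
crawl.add("https://example.com/c")  # newly discovered link
crawl.add("https://example.com/a")  # already seen, so ignored

state_path = os.path.join(tempfile.mkdtemp(), "crawl_state.json")
crawl.save(state_path)

# Later, possibly in a new process, resume from the saved state.
resumed = CrawlCheckpoint.load(state_path)
```

Saving after every batch of pages rather than only at shutdown limits how much work is repeated when the process is killed unexpectedly.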

Pricing

Free and open-source

Platforms

Python, any OS, headless browser

Related Tools

Notion MCP Server

Official Notion MCP server for AI-agent workspace access

Notion MCP Server is Notion's official MIT-licensed MCP server for connecting AI assistants to Notion workspaces. It supports the vendor-backed remote OAuth path and tools designed for page, workspace, and Markdown-style operations, making it a safer default than unofficial Notion bridges for teams already using Notion for docs, projects, or internal knowledge bases.

Open Source, Telemetry

Linear MCP Server

Official authenticated remote MCP endpoint for Linear issues, projects, comments, and coding-agent workflows.

Linear MCP Server is Linear’s official authenticated remote MCP endpoint for agent access to issues, projects, and comments. It gives Claude, Codex, Cursor, VS Code, Windsurf, Zed, and other clients a centrally hosted way to find, create, and update Linear work items through OAuth-backed MCP without maintaining a local connector or brittle API glue.

Freemium, Telemetry

Slack MCP Server

Official Slack MCP server for approved workspace search, messaging, canvas, and user-context actions.

Slack MCP Server is Slack’s official remote MCP layer for giving approved AI clients workspace context and controlled actions. It lets agents search messages, files, users, and channels, draft or send messages, read threads, manage canvases, and authenticate through Slack OAuth while workspace admins approve integrations and normal Slack rate limits still apply.

Freemium, Telemetry

Spotlight by Backplanes

Session reports for Claude Code and Codex runs

Spotlight by Backplanes turns completed Claude Code and Codex sessions into concise reports for engineering, security, and spend review. The CLI installs on macOS, Linux, or WSL 2, watches sessions after they finish, redacts PII and credentials locally before upload, then summarizes files touched, commands run, external domains reached, scope drift, risky actions, and next-session improvements.

Freemium, Telemetry

agmsg

Cross-agent messaging for CLI coding agents

agmsg is an MIT-licensed Bash and SQLite messaging layer for CLI coding agents. It lets Claude Code, Codex, Gemini CLI, GitHub Copilot CLI, Antigravity, OpenCode, Hermes, and other terminal agents exchange messages through a shared local database instead of relying on a human copy-paste relay. It is intentionally not MCP, not a broker, and not a subagent framework.

Open Source

OpenHuman

Local-first personal AI agent with memory trees, desktop integrations, and private workspace context.

OpenHuman is an open-source, local-first personal AI agent from TinyHumans. It combines a desktop app, persistent memory trees, Obsidian-compatible storage, OAuth integrations, and local model support into a private assistant harness. It is most interesting for users who want agentic workflows and long-term memory without handing every context detail to a fully cloud-hosted assistant.

Open Source, Telemetry