A robust Python solution using Botasaurus and CapSolver to automatically solve reCAPTCHA and Cloudflare Turnstile for web scraping.

DenimEvert/botasaurus-capsolver

🤖 Botasaurus Captcha Solver Integration

A robust, ready-to-use Python template for bypassing reCAPTCHA v2, reCAPTCHA v3, and Cloudflare Turnstile in web scraping projects using Botasaurus (for anti-detection) and CapSolver (for solving).


✨ Features

  • Seamless Integration: Combines the power of Botasaurus's anti-detection with CapSolver's API.
  • Multi-Captcha Support: Ready-to-use examples for reCAPTCHA v2, v3, and Cloudflare Turnstile.
  • Clean Architecture: Separated configuration, helper functions, and examples for easy maintenance.
  • Token Injection: Demonstrates how to correctly inject the solved token back into the browser context using Botasaurus.
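As a rough illustration of the token injection step, the sketch below builds a JavaScript snippet that writes a solved token into the hidden response field. The function name `build_token_injection_js` is hypothetical and not part of this repository. The field names are the standard ones (`g-recaptcha-response` for reCAPTCHA, `cf-turnstile-response` for Turnstile). Executing the snippet in the browser, for example through Botasaurus's `driver.run_js(...)`, is assumed here rather than shown, and some sites additionally require a callback to be invoked, which this sketch does not cover.

```python
import json


def build_token_injection_js(token: str, field_name: str = "g-recaptcha-response") -> str:
    """Return JavaScript that sets every form field named `field_name` to `token`.

    Hypothetical helper for illustration; not part of this repository.
    """
    # json.dumps produces valid JavaScript string literals, so quotes or
    # backslashes in the inputs cannot break out of the generated script.
    selector = json.dumps(f'[name="{field_name}"]')
    value = json.dumps(token)
    return (
        f"document.querySelectorAll({selector})"
        f".forEach(function (el) {{ el.value = {value}; }});"
    )


if __name__ == "__main__":
    # reCAPTCHA v2/v3 response field
    print(build_token_injection_js("example-token"))
    # Cloudflare Turnstile response field
    print(build_token_injection_js("example-token", "cf-turnstile-response"))
```

Building the script as a string keeps the helper independent of the browser driver, so it can be checked without launching a browser.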

🚀 Quick Start

1. Prerequisites

2. Installation

Clone the repository and install dependencies:

git clone https://github.com/DenimEvert/botasaurus-capsolver.git
cd botasaurus-capsolver
pip install -r requirements.txt

3. Configuration

Create a .env file in the project root and add your API key:

# .env
CAPSOLVER_API_KEY=CAP-YOUR_API_KEY_HERE

4. Run Examples

Execute any of the example scripts located in the examples/ directory:

# Example for reCAPTCHA v2
python examples/recaptcha_v2.py

# Example for Cloudflare Turnstile
python examples/turnstile.py

📂 Project Structure

.
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── requirements.txt
├── .env (ignored by git)
├── src/
│   ├── config.py             # Loads API key and defines endpoints
│   └── capsolver_helper.py   # Core functions for creating and polling CapSolver tasks
└── examples/
    ├── recaptcha_v2.py       # Complete example for reCAPTCHA v2
    ├── recaptcha_v3.py       # Complete example for reCAPTCHA v3
    └── turnstile.py          # Complete example for Cloudflare Turnstile

⚙️ Core Implementation

The core logic is split into configuration and the solving helper.

src/config.py

Handles environment variable loading and API endpoint definitions.

# src/config.py
import os
from pathlib import Path
from dotenv import load_dotenv

ROOT_DIR = Path(__file__).parent.parent
load_dotenv(ROOT_DIR / ".env")

class Config:
    """Configuration class for CapSolver integration."""
    CAPSOLVER_API_KEY: str = os.getenv("CAPSOLVER_API_KEY", "")
    CAPSOLVER_API_URL = "https://api.capsolver.com"
    CREATE_TASK_ENDPOINT = f"{CAPSOLVER_API_URL}/createTask"
    GET_RESULT_ENDPOINT = f"{CAPSOLVER_API_URL}/getTaskResult"

    @classmethod
    def validate(cls) -> bool:
        if not cls.CAPSOLVER_API_KEY:
            print("Error: CAPSOLVER_API_KEY not set! Check your .env file.")
            return False
        return True

src/capsolver_helper.py

Contains the reusable functions for solving different captcha types.

# src/capsolver_helper.py (Simplified for README)
import time
from typing import Optional

import requests
from src.config import Config

def _poll_task_result(payload: dict, timeout: int) -> dict:
    # ... (Polling logic as described in the article) ...
    pass

def solve_recaptcha_v2(website_url: str, website_key: str, is_invisible: bool = False) -> dict:
    """Solves reCAPTCHA v2 and returns the gRecaptchaResponse token."""
    # Fail fast if the API key is missing from the environment.
    if not Config.validate():
        raise RuntimeError("Invalid configuration")

    # Task description sent to CapSolver's createTask endpoint.
    task = {
        "type": "ReCaptchaV2TaskProxyLess",
        "websiteURL": website_url,
        "websiteKey": website_key,
        "isInvisible": is_invisible
    }
    # ... (Task creation and polling via _poll_task_result) ...
    # Returns {'gRecaptchaResponse': '...'}
    pass

def solve_recaptcha_v3(website_url: str, website_key: str, page_action: str) -> dict:
    """Solves reCAPTCHA v3 and returns the gRecaptchaResponse token."""
    # ... (Implementation similar to v2, but with pageAction) ...
    # Returns {'gRecaptchaResponse': '...'}
    pass

def solve_turnstile(website_url: str, website_key: str, action: Optional[str] = None) -> dict:
    """Solves Cloudflare Turnstile and returns the token."""
    # ... (Implementation similar to v2, but with AntiTurnstileTaskProxyLess) ...
    # Returns {'token': '...'}
    pass
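The elided create-and-poll step could look roughly like the sketch below. It is not the repository's actual implementation: the names `solve_task` and `_post_json` are hypothetical, and the request and response fields (`clientKey`, `taskId`, `errorId`, `errorDescription`, `status`, `solution`) are assumed from CapSolver's public API and should be checked against its documentation. To stay runnable without extra packages it uses `urllib` from the standard library instead of `requests`, and the HTTP and sleep functions are injectable so the loop can be exercised without network access.

```python
import json
import time
import urllib.request
from typing import Callable

API_URL = "https://api.capsolver.com"  # same base URL as in src/config.py

PostFn = Callable[[str, dict], dict]


def _post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response (standard library only)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def solve_task(
    api_key: str,
    task: dict,
    timeout: float = 120.0,
    interval: float = 2.0,
    post: PostFn = _post_json,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Create a task, then poll until it is ready or `timeout` seconds pass.

    Hypothetical helper; field names are assumptions about the CapSolver API.
    """
    created = post(f"{API_URL}/createTask", {"clientKey": api_key, "task": task})
    if created.get("errorId"):
        raise RuntimeError(f"createTask failed: {created.get('errorDescription')}")
    task_id = created["taskId"]

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = post(
            f"{API_URL}/getTaskResult",
            {"clientKey": api_key, "taskId": task_id},
        )
        if result.get("errorId"):
            raise RuntimeError(f"getTaskResult failed: {result.get('errorDescription')}")
        if result.get("status") == "ready":
            # e.g. {'gRecaptchaResponse': '...'} or {'token': '...'}
            return result["solution"]
        sleep(interval)  # still processing; wait before the next poll
    raise TimeoutError(f"task {task_id} not ready after {timeout} seconds")
```

A call such as `solve_task(Config.CAPSOLVER_API_KEY, task)` with the `task` dictionary from `solve_recaptcha_v2` would then return the solution dictionary, under the assumptions stated above.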

💡 Best Practices

| Practice | Description |
| --- | --- |
| Immediate Use | Captcha tokens expire quickly (~2 minutes). Inject and submit immediately after receiving the token. |
| Error Handling | Always wrap API calls in try...except blocks to handle network or API failures gracefully. |
| Rate Limiting | Use driver.sleep() between actions to mimic human behavior and avoid triggering anti-bot measures. |
| Configuration | Use the Config.validate() method before making any API calls. |
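The "Immediate Use" and "Error Handling" practices can be sketched together as follows. The names `SolvedToken` and `solve_with_retry` are hypothetical and not part of this repository. The 120-second lifetime simply mirrors the "~2 minutes" figure above and is an approximation, not a documented guarantee. The `solve` argument stands for any zero-argument callable that returns a token string.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional

# Approximate token lifetime, taken from the "~2 minutes" guideline (assumption).
TOKEN_TTL_SECONDS = 120.0


@dataclass
class SolvedToken:
    """A solved captcha token together with the time it was received."""

    value: str
    issued_at: float = field(default_factory=time.monotonic)

    def is_fresh(self, now: Optional[float] = None, margin: float = 10.0) -> bool:
        """True while the token is younger than the TTL minus a safety margin."""
        current = time.monotonic() if now is None else now
        return (current - self.issued_at) < (TOKEN_TTL_SECONDS - margin)


def solve_with_retry(
    solve: Callable[[], str],
    attempts: int = 3,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> SolvedToken:
    """Call `solve`, retrying on any exception with a linearly growing delay."""
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return SolvedToken(solve())
        except Exception as exc:  # network or API failure
            last_error = exc
            if attempt < attempts:
                sleep(delay * attempt)
    raise RuntimeError(f"captcha solving failed after {attempts} attempts") from last_error
```

Checking `token.is_fresh()` right before injection gives a cheap guard against submitting a token that has probably expired; if it returns False, requesting a new token is likely safer than submitting the old one.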

🎁 Special Offer

Boost your automation budget instantly! Use bonus code CAPN when topping up your CapSolver account to get an extra 5% bonus on every rechargeβ€”with no limits!

Redeem it now in your CapSolver Dashboard!


🀝 Contributing

We welcome contributions! Please see the CONTRIBUTING.md for details on how to submit pull requests, report bugs, and suggest features.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Resources
