Document version: v2.6.5

Turn any AI website you already use, such as ChatGPT, DeepSeek, Claude, or Gemini, into a standard OpenAI-compatible API for free, with full local deployment.
- Overview
- Quick Start
- Dashboard Tour
- Adding a New Site
- Connecting to the API
- Tab Pool and Presets
- Key Configuration
- Advanced Configuration
- Important Notes
- FAQ
- Project Structure
- Dependencies and Roadmap
- Disclaimer
This project is built on DrissionPage browser automation. It works like a local robot operator: it opens a real browser on your machine, interacts with AI websites like a human user, and returns the result through a standard API endpoint.
- Free usage: reuse the website quota or account access you already have
- Local and private: your login stays on your own machine and cookies do not need to be uploaded
- AI-assisted adaptation: unsupported sites can be analyzed automatically
- Multi-tab scheduling: handle requests with multiple browser tabs in parallel
- Preset system: keep separate configs for chat, vision, coding, and more on the same site
| Site | URL | Notes |
|---|---|---|
| ChatGPT | chatgpt.com | About 200k max single-send length |
| DeepSeek | chat.deepseek.com | Reply-reading issues in thinking mode |
| Gemini | gemini.google.com | About 30k on free accounts; no clear limit observed on Pro |
| Claude | claude.ai | Site-specific parsing supported |
| Kimi | www.kimi.com | Synced with the current built-in config |
| Qwen | chat.qwen.ai | Qwen page adaptation supported |
| Grok | grok.com | Supported |
| Doubao | www.doubao.com | New domain already adapted |
| AI Studio | aistudio.google.com | Supported |
| Arena AI | arena.ai | Sensitive to IP quality |
Sites outside this list can still be adapted through AI-based page analysis or manual configuration.
If you need help, you can join the QQ group: 1073037753.
- Download the release package.
- Extract it into a folder without Chinese characters in the path, for example `D:\AI_Tools\Universal-Web-API`.
- Make sure at least one Chromium-based browser is installed:
  - Google Chrome
  - Microsoft Edge
  - Brave / Vivaldi / Opera
- Double-click `start.bat` in the project folder.
- The script will:
  - check the environment
  - install required dependencies
  - apply the DrissionPage patch used for stealth-mode optimization
- When the terminal stops scrolling, you should see something like:
  Web UI started, please visit: http://127.0.0.1:8199
- Open the dashboard by:
  - holding `Ctrl` and clicking the link in the terminal
  - or manually opening `http://127.0.0.1:8199`
After startup, the program opens a browser window automatically.
- Open the AI website you want to use in that controlled browser.
- Log in manually.
- Recommended: the API will inherit your account permissions, such as membership benefits or history.
- Optional: if the site works without login, you can use it directly.
- Do not close the controlled browser. Return to the Web UI on port `8199` and continue configuration there.
The tutorial page no longer opens automatically inside the controlled browser. Keeping the docs open elsewhere is fine. What you should avoid is opening unrelated tabs inside the browser controlled by the script.
When you enter the dashboard for the first time, check these areas in order:
- Service status in the top-left corner
- Site list in the sidebar
- Site config page for selectors, workflow, and stream settings
- Tab pool to confirm the current page was recognized
- Logs page if your first test fails
The dashboard is the visual control center of the project, not just a settings page.
- Shows browser connection status, auth status, and site count
- Supports search, add, import, and export
- Lets you switch between Sites, Tabs, Extractors, Logs, and Settings
This is where you define how a site should be operated:
- Selectors: where to type, click, and read replies
- Workflow: what actions to perform and in what order
- Image extraction: how generated images are collected
- Response detection: how the system decides the answer is finished
- File attach: how oversized prompts are uploaded as files
- Presets: multiple configuration sets for the same site
- View each tab's index, state, and current URL
- Copy the dedicated API endpoint of a specific tab
- Assign different presets to different tabs
- Quickly locate busy or broken tabs
Useful when the page clearly has a reply but the API output is wrong:
- view available extractors
- set the global default extractor
- bind a site-specific extractor
- mark whether a binding is only configured or already verified
Logs help you check whether the problem happened before sending, during sending, while waiting, during extraction, or inside automation commands.
Settings are used for:
- environment values such as host, port, auth, proxy, and helper AI settings
- browser constants such as element timeout and stream thresholds
- AI element recognition
- update whitelist
There are two main ways to add a site: automatic recognition and manual configuration.
Use this when the page structure is fairly standard or you want a quick first draft.
Prerequisites
- Fill in `HELPER_API_KEY`, `HELPER_BASE_URL`, and `HELPER_MODEL` in `Settings -> Environment`.
- Open the target site in the controlled browser.
- Stay on the real chat page, not the landing page or login page.
How it works
If a domain is not in config/sites.json, the first real API request to that domain will:
- read the current page HTML
- ask the helper AI to analyze the page
- generate `selectors + workflow + default preset`
- save the result to `config/sites.json`
This means automatic recognition is triggered by the first real request to an unknown domain, not by the Add Site button.
Use this when:
- you do not want to spend helper-AI tokens
- the page structure is unusual
- you want full control over selectors and workflow
Recommended steps
- Click Add Site
- Enter the domain, for example `chat.example.com`
- Open the site's main preset
- Fill in the minimum required selectors
- Build the shortest working workflow
- Test selectors one by one
- Save and make one real API request
Start with these selectors:
| Key | Purpose |
|---|---|
| `input_box` | Chat input field |
| `send_btn` | Send button |
| `result_container` | AI reply container |
Start with a short workflow:
[
{ "action": "CLICK", "target": "new_chat_btn", "optional": true, "value": null },
{ "action": "WAIT", "target": "", "optional": false, "value": 0.5 },
{ "action": "FILL_INPUT", "target": "input_box", "optional": false, "value": null },
{ "action": "CLICK", "target": "send_btn", "optional": true, "value": null },
{ "action": "KEY_PRESS", "target": "Enter", "optional": true, "value": null },
{ "action": "STREAM_WAIT", "target": "result_container", "optional": false, "value": null }
]

Debug in this order:
- `input_box`
- `send_btn`
- `result_container`
- full workflow
- stream thresholds, extractors, image extraction, and file attach
- Do not add too many workflow steps at the beginning.
- Do not change global browser constants before the site itself works.
- Always manually review the result after auto recognition, especially `result_container`.
- If the website already replied but the API did not finish, inspect stream settings before blaming selectors.
The project exposes an OpenAI-compatible API, so it can be used by most clients that support OpenAI-style endpoints.
| Field | Value |
|---|---|
| Provider | OpenAI, OpenAI Compatible, or Custom |
| Base URL | http://127.0.0.1:8199/v1 |
| API key | Any value is fine, for example sk-any |
| Model name | Any value is fine; the actual model depends on the website tab |
| Route | Format | Description |
|---|---|---|
| Default route | `/v1/chat/completions` | Automatically uses one idle tab |
| Fixed tab | `/tab/{index}/v1/chat/completions` | Uses a specific tab index from the tab pool |
Some clients need the full path:
http://127.0.0.1:8199/v1/chat/completions
curl http://127.0.0.1:8199/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-any" \
-d '{
"model": "any",
"messages": [{"role": "user", "content": "Hello"}],
"stream": true
}'

SillyTavern's built-in API test can be unreliable. It is better to test by sending a real conversation instead of pressing the test button.
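The same request can be sent from Python. The following is a minimal sketch using only the standard library. It assumes the stream follows the usual OpenAI server-sent-events shape (`data: {...}` lines carrying `choices[0].delta.content`, ended by `data: [DONE]`), which this document implies but does not spell out. The helper names `chat_url`, `parse_sse_line`, and `stream_chat` are illustrative and not part of the project.

```python
import json
import urllib.request
from typing import Iterator, Optional

BASE_URL = "http://127.0.0.1:8199"  # default host and port from this document


def chat_url(tab_index: Optional[int] = None, base_url: str = BASE_URL) -> str:
    """Return the default route, or the fixed-tab route when an index is given."""
    prefix = f"/tab/{tab_index}" if tab_index is not None else ""
    return f"{base_url}{prefix}/v1/chat/completions"


def parse_sse_line(line: str) -> Optional[str]:
    """Extract the text delta from one 'data: {...}' line, or return None."""
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if not data or data == "[DONE]":
        return None
    try:
        chunk = json.loads(data)
    except json.JSONDecodeError:
        return None
    if not isinstance(chunk, dict):
        return None
    choices = chunk.get("choices") or [{}]
    delta = choices[0].get("delta") or {}
    return delta.get("content")


def stream_chat(prompt: str, tab_index: Optional[int] = None) -> Iterator[str]:
    """Send one user message and yield text fragments as they arrive."""
    body = json.dumps({
        "model": "any",  # any value; the real model depends on the website tab
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode("utf-8")
    request = urllib.request.Request(
        chat_url(tab_index),
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer sk-any",
        },
    )
    with urllib.request.urlopen(request) as response:
        for raw in response:  # the response object iterates line by line
            text = parse_sse_line(raw.decode("utf-8").strip())
            if text:
                yield text
```

With the service running, `"".join(stream_chat("Hello"))` collects the whole reply, and `stream_chat("Hello", tab_index=2)` targets tab 2 through the fixed-tab route.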
Since v2.5.8, the project documents compatibility with standard OpenAI tool-calling fields:
- `tools` / `tool_choice`
- legacy `functions` / `function_call`
This is still not true native tool calling from the target website. The backend converts tool definitions and constraints into prompts that the web model can understand, then parses the structured response back into tool_calls.
In practice:
- stronger models work much better
- weaker models may break JSON, miss arguments, or mix normal text into tool output
- fewer tools and simpler schemas usually improve success rate
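Because tool calls are reconstructed from model text, clients benefit from parsing arguments defensively. The sketch below builds a request payload in the standard OpenAI `tools` format and shows one way to reject broken argument JSON. The `get_weather` tool and both helper functions are made-up examples, not something the project ships.

```python
import json
from typing import Any, Dict, Optional

# OpenAI-style tool definition; "get_weather" is a hypothetical example tool.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def build_payload(prompt: str) -> Dict[str, Any]:
    """Build a chat request that offers one simple tool to the model."""
    return {
        "model": "any",
        "messages": [{"role": "user", "content": prompt}],
        "tools": TOOLS,
        "tool_choice": "auto",
    }


def safe_tool_arguments(tool_call: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return parsed arguments, or None when the JSON is missing or malformed."""
    raw = (tool_call.get("function") or {}).get("arguments", "")
    try:
        args = json.loads(raw)
    except (TypeError, json.JSONDecodeError):
        return None
    # Arguments must be a JSON object; anything else counts as a failed call.
    return args if isinstance(args, dict) else None
```

Keeping the schema this small follows the advice above: one tool with one required string argument leaves a weaker web model little room to break the output.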
The tab pool is the main scheduling mechanism. The script scans supported AI-site tabs in the browser and assigns each recognized tab a persistent index.
How it works:
- You open one or more AI-site tabs
- The script detects them and assigns indices such as `1`, `2`, and `3`
- Incoming API requests are routed to an idle tab
- After the request completes, the tab returns to the pool
You can use the dashboard to:
- view tab state in real time
- inspect current URL and request count
- copy a dedicated endpoint
- assign presets per tab
Blank pages such as chrome://newtab are not added to the pool.
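The idle/busy cycle described above can be pictured with a toy model. This is only a sketch of the idea, not the project's scheduler: the `TabPool` class, its method names, the lowest-index-first choice, and the rule that skips every `chrome://` URL are all assumptions made for illustration.

```python
import threading
from typing import Dict, Optional


class TabPool:
    """Toy model of an idle/busy tab pool (illustrative only)."""

    def __init__(self) -> None:
        self._busy: Dict[int, bool] = {}
        self._next_index = 1
        self._lock = threading.Lock()

    def register(self, url: str) -> Optional[int]:
        """Add a tab and return its index; blank pages are not pooled."""
        if url.startswith("chrome://") or url == "about:blank":
            return None
        with self._lock:
            index = self._next_index
            self._next_index += 1
            self._busy[index] = False
            return index

    def acquire(self) -> Optional[int]:
        """Mark the lowest-numbered idle tab as busy and return its index."""
        with self._lock:
            for index in sorted(self._busy):
                if not self._busy[index]:
                    self._busy[index] = True
                    return index
            return None  # every tab is busy

    def release(self, index: int) -> None:
        """Return a tab to the pool after its request completes."""
        with self._lock:
            self._busy[index] = False
```

In this model a request that finds no idle tab gets `None` back; a real scheduler would more likely queue or wait, which the sketch leaves out.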
Presets let you create multiple independent configurations for the same site and assign them to different tabs.
Typical use cases:
| Tab | Preset | Difference |
|---|---|---|
| Tab #1 | Pro chat | Longer timeout, deep extractor |
| Tab #2 | Fast vision | Simpler workflow, image extraction enabled |
| Tab #3 | Coding assistant | File attach enabled, higher thresholds |
Workflow:
- Create a preset in `Site -> Config -> + New Preset`
- Assign that preset in the `Tabs` page
- Call the tab with `/tab/{index}/v1/chat/completions`
Every site needs a few CSS selectors that tell the program where to type and where to click.
| Selector | Required | Description |
|---|---|---|
| `input_box` | Yes | Chat input field |
| `send_btn` | Yes | Send button |
| `result_container` | Yes | AI reply container |
| `new_chat_btn` | No | New conversation button |
You can define additional selectors and reference them inside the workflow. Use the Test button in the dashboard to validate them quickly.
The workflow defines the sequence of actions.
[
{ "action": "CLICK", "target": "new_chat_btn" },
{ "action": "WAIT", "value": 0.5 },
{ "action": "FILL_INPUT", "target": "input_box" },
{ "action": "CLICK", "target": "send_btn" },
{ "action": "STREAM_WAIT" }
]

Main action types:
| Action | Description |
|---|---|
| `CLICK` | Click an element identified by a selector key |
| `FILL_INPUT` | Put the user prompt into the input box |
| `WAIT` | Sleep for a fixed number of seconds |
| `KEY_PRESS` | Simulate a key press such as Enter |
| `STREAM_WAIT` | Wait for the response to finish |
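Before saving a hand-written workflow, it can help to check it against the action types listed above. The following sketch is a standalone linter, not a feature of the dashboard. It assumes that `CLICK`, `FILL_INPUT`, and `STREAM_WAIT` targets name selector keys and that steps marked `optional` may point at selectors that are not defined; both assumptions are inferred from the examples in this document.

```python
from typing import Dict, List

KNOWN_ACTIONS = {"CLICK", "FILL_INPUT", "WAIT", "KEY_PRESS", "STREAM_WAIT"}
# Actions whose "target" is assumed to name a selector key.
SELECTOR_ACTIONS = {"CLICK", "FILL_INPUT", "STREAM_WAIT"}


def validate_workflow(steps: List[dict], selectors: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    for number, step in enumerate(steps, start=1):
        action = step.get("action")
        target = step.get("target")
        if action not in KNOWN_ACTIONS:
            problems.append(f"step {number}: unknown action {action!r}")
        elif action in SELECTOR_ACTIONS:
            # Optional steps are allowed to reference missing selectors.
            missing = bool(target) and target not in selectors
            if missing and not step.get("optional"):
                problems.append(
                    f"step {number}: selector key {target!r} is not defined"
                )
        elif action == "WAIT":
            if not isinstance(step.get("value"), (int, float)):
                problems.append(f"step {number}: WAIT needs a numeric value")
    return problems
```

Running it on the short workflow shown earlier with only `input_box`, `send_btn`, and `result_container` defined returns an empty list, because the `new_chat_btn` click is marked optional there.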
Extractors control how clean Markdown is parsed from page HTML.
- Default mode extracts text directly from `result_container`
- `deep_mode` is the recommended option for complex LaTeX and code blocks
Right now, deep_mode is the most polished option. If code extraction still looks wrong, network interception mode may work better on adapted sites.
Two common modes are available:
| Mode | Streaming | Description |
|---|---|---|
| DOM mode | Yes | Polls DOM changes and streams page updates |
| Network interception mode | Yes, site-dependent | Parses incremental network responses and is often faster when adapted |
Recommended DOM tuning:
| Scenario | Silence timeout | Stable checks | Initial wait |
|---|---|---|---|
| Fast models | 3 to 5 s | 3 to 5 | 60 s |
| Slow reasoning models | 10 to 15 s | 8 to 12 | 300 s |
| Code generation | 8 to 10 s | 6 to 8 | 180 s |
| Long-form writing | 12 to 15 s | 10 | 300 s |
If a site does not enable network interception by default, do not turn it on casually. It may be unsupported or increase detection risk.
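To build intuition for the three DOM tuning values, here is one plausible reading of them as code. It is an interpretation, not the project's detection logic: "initial wait" is taken as how long to wait for any text to appear, "silence timeout" as how long the text must stay unchanged, and "stable checks" as how many consecutive identical polls are required. The function name and the `poll_interval`, `clock`, and `sleep` parameters are hypothetical.

```python
import time
from typing import Callable, Optional


def wait_for_stable_text(
    read_text: Callable[[], str],
    silence_timeout: float = 5.0,
    stable_checks: int = 3,
    initial_wait: float = 60.0,
    poll_interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the reply once it stops changing, or None if nothing appears."""
    started = clock()
    last_text = ""
    last_change = started
    unchanged = 0
    while True:
        text = read_text()
        now = clock()
        if text != last_text:
            # New content arrived: reset both stability measures.
            last_text, last_change, unchanged = text, now, 0
        elif text:
            unchanged += 1
        if not last_text and now - started >= initial_wait:
            return None  # the reply never started
        silent_long_enough = now - last_change >= silence_timeout
        if last_text and unchanged >= stable_checks and silent_long_enough:
            return last_text
        sleep(poll_interval)
```

Under this reading, slow reasoning models need larger values because they pause mid-answer: a short silence timeout would cut the reply off during a pause, which matches the direction of the table above.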
Image extraction decides how generated images are captured from the page.
{
"enabled": true,
"selector": "img",
"container_selector": ".img-grid",
"download_blobs": true,
"mode": "all",
"max_size_mb": 10
}

- Images you send are stored in `image/`
- Images captured from the site are stored in `download_images/`
When an input is too long, the system can write it to a temporary .txt file and try to upload that file to the website instead of pasting everything into the input box.
Important selectors for this feature:
- `upload_btn`
- `file_input`
- `drop_zone`
By default, this is already enabled for:
- `aistudio.google.com`
- `chatgpt.com`
- `chat.deepseek.com`
- `www.doubao.com`
- `chat.qwen.ai`
If the site does not truly support file uploads, the system will fall back to normal text input.
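The decision between pasting and uploading can be sketched as follows. This illustrates the fallback described above and is not the project's code: the 30,000-character threshold, the `prepare_prompt` name, and the choice to type nothing when a file is attached are assumptions.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple

MAX_INLINE_CHARS = 30_000  # hypothetical threshold; real limits vary per site


def prepare_prompt(
    prompt: str,
    site_supports_upload: bool,
    limit: int = MAX_INLINE_CHARS,
) -> Tuple[str, Optional[Path]]:
    """Return (text_to_type, file_to_upload) for one request."""
    if len(prompt) <= limit or not site_supports_upload:
        # Short prompt, or no upload support: fall back to normal text input.
        return prompt, None
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(prompt)
    return "", Path(handle.name)
```

The caller is responsible for deleting the temporary file after the upload, for example with `path.unlink()`.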
Since v2.5.6, the project includes an automation strategy engine for recovery, routing, and alerts.
Common triggers:
- `request_count`
- `error_count`
- `idle_timeout`
- `page_check`
- `command_triggered`
- `command_result_match`
- `network_request_error`
Common actions:
- `clear_cookies`
- `refresh_page`
- `new_chat`
- `wait`
- `run_js`
- `execute_preset`
- `execute_workflow`
- `navigate`
- `switch_proxy`
- `send_webhook`
- `abort_task`
Recommended pattern for network failures:
switch_proxy -> wait(1~2s) -> refresh_page -> send_webhook
Add abort_task if you want a hard stop.
Stealth mode adds randomized delay and human-like browser behavior to reduce detection risk.
| Behavior | Normal mode | Stealth mode |
|---|---|---|
| Mouse click | Direct CDP click | Human-like press and release |
| Mouse movement | Instant jump | Curved motion |
| Idle state | No action | Small random drift |
| Action interval | Minimal delay | Randomized delay |
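To make "curved motion" and "randomized delay" concrete, the sketch below generates a bent mouse path with a quadratic Bezier curve and draws a random pause. It only illustrates the general technique; how the patched DrissionPage actually moves the pointer is not documented here, and every name and default value in the code is hypothetical.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def curved_path(
    start: Point,
    end: Point,
    steps: int = 20,
    bend: float = 0.25,
    rng: Optional[random.Random] = None,
) -> List[Point]:
    """Return points along a quadratic Bezier curve from start to end."""
    rng = rng or random.Random()
    (x0, y0), (x2, y2) = start, end
    dx, dy = x2 - x0, y2 - y0
    # Shift the control point sideways, perpendicular to the straight line.
    offset = rng.uniform(-bend, bend)
    x1 = (x0 + x2) / 2 - dy * offset
    y1 = (y0 + y2) / 2 + dx * offset
    points: List[Point] = []
    for i in range(steps + 1):
        t = i / steps
        x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * x1 + t ** 2 * x2
        y = (1 - t) ** 2 * y0 + 2 * (1 - t) * t * y1 + t ** 2 * y2
        points.append((x, y))
    return points


def random_delay(
    low: float = 0.05,
    high: float = 0.3,
    rng: Optional[random.Random] = None,
) -> float:
    """Pick a pause in seconds between two actions."""
    return (rng or random.Random()).uniform(low, high)
```

Each call bends the path by a different random amount, so two moves between the same pair of points rarely follow the same trajectory, at the cost of extra time per action.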
Recommended usage:
- turn it on for Cloudflare-protected sites such as `chatgpt.com` or `arena.ai`
- leave it off for lower-protection sites when speed matters
Patch commands:
python patch_drissionpage.py
python patch_drissionpage.py --restore

Re-apply the patch after every DrissionPage upgrade.
If a site is not in the built-in list, the system can call a helper AI to identify:
- `input_box`
- `send_btn`
- `result_container`
- `new_chat_btn`
- `message_wrapper`
- `generating_indicator`
Typical flow:
- configure your OpenAI-style helper API in the dashboard
- open the unsupported site and stay on the real chat page
- make the first request to that unknown domain
- let the system spend about `8000` tokens to analyze the page
- review the generated config in `config/sites.json`
Environment settings are changed in Settings -> Environment and require a restart.
| Category | Item | Default | Description |
|---|---|---|---|
| Service | Listen host | `127.0.0.1` | Use `0.0.0.0` for external access |
| Service | Listen port | `8199` | HTTP service port |
| Service | Debug mode | On | Enables /docs |
| Auth | Enable auth | Off | Require Bearer token or not |
| Proxy | Proxy URL | None | Supports socks5:// or http:// |
| Browser | Chrome debug port | `9222` | Remote debugging port |
Browser constants are changed in Settings -> Browser Constants and take effect immediately.
For the full parameter reference, see 参数解释.md. That file is currently Chinese-only.
The update whitelist controls which files and folders are preserved during the next automatic update.
Default preserved items include:
- `config/sites.local.json`
- `config/commands.local.json`
- `chrome_profile/`
- `venv/`
- `logs/`
- `image/`
- `updater.py`
- `.git/`
- `__pycache__/`
- `*.pyc`
- `backup_*/`
If you only want to keep login state and personal configuration, the default selection is usually enough.
- Do not click buttons manually, especially the send button
- Do not switch tabs or resize the page in unusual ways
- Do not collapse page sections used by the workflow
- Manual interference can confuse the automation logic and cause deadlocks
- If the script is truly stuck, close the terminal window and restart `start.bat`
In DOM extraction mode:
- code blocks, LaTeX, and hyperlinks may not always be captured perfectly
- for plain-text usage, DOM mode is usually enough
- for coding-heavy usage, network interception mode is often more stable on adapted sites
The project is still limited by how much text the website input box accepts at once. File attach can bypass some of that, but not all of it.
- DeepSeek has reply-reading issues in thinking mode
- the VSCode Codex plugin is not compatible for now
- the DrissionPage patch must be re-applied after every upgrade
- `arena.ai` is highly sensitive to IP quality
The program searches browsers in this order:
Chrome -> Edge -> Brave -> Vivaldi -> Opera
If none are found, set one manually in .env:
BROWSER_PATH=C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe

- Make sure the terminal window is still open
- Make sure port `8199` is not already in use
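One quick way to check the port from Python is to try connecting to it. This is a small standalone sketch using the standard `socket` module; the `port_in_use` helper is not part of the project.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True when something already accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0
```

Calling `port_in_use(8199)` before launching `start.bat` tells you whether another process holds the port. If it returns `True` while the dashboard still does not load, the listener is probably a different program.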
- Check the controlled browser
- Refresh the page if it has not fully loaded
- Solve captcha manually if one appears
- Increase workflow wait time if needed
- Make sure you did not click anything during execution
Check in this order:
- manual interference
- collapsed UI or layout changes
- site-side issues such as captcha or blocked content
- network instability or proxy switching
- extra unrelated tabs in the controlled browser
- outdated site config after a UI change
That usually depends on the limits of the website account you logged into. For example, free ChatGPT accounts often have a much smaller effective context window.
- make sure the page has fully loaded
- wait 2 to 3 seconds and refresh the list
- restart the script if needed
Most of the time, you do not need to. The defaults are already tuned for common usage.
- turning web AI access into an OpenAI-style API
- inspecting how websites build context
- running multiple tabs and presets in parallel
- bypassing some long-input limits through file attach
Universal-Web-API/
├── app/ # Python backend core
│ ├── api/ # HTTP routes
│ ├── core/ # Browser automation and low-level logic
│ │ ├── extractors/ # Extraction strategies
│ │ ├── parsers/ # Site-specific parsers
│ │ └── workflow/ # Workflow engine
│ ├── models/ # Pydantic models
│ ├── services/ # Config engine and request scheduling
│ └── utils/ # Clipboard, image, and helper utilities
├── config/ # Configuration files
├── static/ # Web UI assets
├── .env # Environment variables
├── main.py # Program entry point
├── start.bat # One-click Windows launcher
├── requirements.txt # Python dependencies
└── 参数解释.md # Detailed parameter reference
Backend
- FastAPI
- uvicorn
- DrissionPage
- beautifulsoup4
- pydantic
Frontend
| Plan | Status |
|---|---|
| Improve concurrency | In progress |
| Cookie simulation mode with lower resource usage and no browser | Planned |
| Bug fixes | Ongoing |
Planned long-term modes:
| Mode | Best for | Advantages | Trade-offs | Status |
|---|---|---|---|---|
| Browser automation mode | Strictly protected sites such as ChatGPT, Claude, or Grok | Real-user simulation, higher compatibility | More resource usage, slower | Supported now |
| Cookie simulation mode | Lower-protection sites or local deployments | Lower resource usage, faster, no browser | Easier to detect, needs request analysis | Planned |
Please read the following carefully before using this project.
This project is provided for learning, research, and technical discussion only.
- You are responsible for complying with the Terms of Service and laws that apply to any third-party website you access with this project.
- Many websites prohibit or restrict automated access. Using this project may lead to account bans, IP blocks, or legal risk.
- Recommended practices:
- use it only where automation is clearly allowed
- prefer official APIs whenever possible
- limit request frequency
- avoid commercial or large-scale automated use
- your account may be restricted or banned
- automation may cause data loss or data leakage
- some jurisdictions may treat this behavior as unlawful
- third-party dependencies may contain security vulnerabilities
- the project runs locally and does not actively upload your data
- you are responsible for any helper AI API you configure
- do not use this project in production or with sensitive data
- the authors and contributors are not responsible for direct or indirect damage caused by usage, bugs, or third-party policy violations
This project is provided AS IS without express or implied warranty.
This project uses the AGPL-3.0 license. If you modify it and provide it as a network service, you must also publish the corresponding source code under the same license.
- do not use this project for illegal purposes
- do not use it to evade paid services or infringe intellectual property
- do not send high-frequency or malicious traffic to target sites
- prefer testing environments over production usage
- consult legal counsel before commercial use
Using this project means you understand and accept the risks above. If you disagree with any part of this disclaimer, stop using the project immediately.

