OpenCode-native proxy for LLM WebUIs (Qwen, Gemini, ChatGPT, DeepSeek) — no API key, just cookies.

web2cli

Local OpenAI-compatible adapter that connects opencode to web UI backends.

web2cli runs a FastAPI server that accepts OpenAI-style chat completion requests and forwards them to browser/WebUI providers such as Qwen, DeepSeek, Gemini, and ChatGPT. It keeps a local conversation ledger so opencode-style append, edit, fork, stop, and temporary-chat workflows can be mapped onto each provider's native chat model.

Features

  • OpenAI-compatible /v1/models and /v1/chat/completions endpoints.
  • Local chat management endpoints for state inspection, deletion, message deletion, fork, and stop.
  • Dynamic provider discovery from src/providers/<name>/provider.py.
  • Encrypted credential storage with age under ~/.local/share/web2cli/creds/.
  • opencode custom provider support through @ai-sdk/openai-compatible.
  • Multimodal input forwarding for providers/models that support attachments.

Requirements

  • Python 3.13+
  • uv
  • age for encrypted credentials
  • Firefox/browser cookies for automatic credential refresh against real WebUI providers

uv sync
brew install age

Credential encryption uses ~/.ssh/id_ed25519.pub as the age recipient by default. If that key is not available, create ~/.local/share/web2cli/recipients.txt with an age public key.
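
The recipient lookup described above can be sketched as follows. This is only an illustration of the documented order (SSH public key first, recipients.txt second); the function name resolve_recipients_file is hypothetical and the real lookup in web2cli may differ.

```python
from pathlib import Path
from typing import Optional


def resolve_recipients_file(home: Path) -> Optional[Path]:
    """Return the file to use as the age recipient, or None if neither exists.

    Hypothetical helper: mirrors the order documented in this README,
    not the project's actual code.
    """
    candidates = (
        home / ".ssh" / "id_ed25519.pub",                        # default recipient
        home / ".local" / "share" / "web2cli" / "recipients.txt",  # documented fallback
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Calling resolve_recipients_file(Path.home()) returns None when neither file exists, which is the case where recipients.txt has to be created by hand.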

Quick Start

uv run web2cli creds refresh
uv run web2cli serve
uv run web2cli models

OpenAI-compatible request:

curl http://127.0.0.1:4981/v1/chat/completions \
  -H 'content-type: application/json' \
  -d '{
    "model": "qwen3.6-plus-fast",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": false
  }'
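
The same request can be built from Python with only the standard library. This is a minimal sketch: the helper names build_chat_request and send are made up for illustration, and the response handling assumes the server returns the usual OpenAI chat completion shape.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:4981"  # default host and port


def build_chat_request(model: str, prompt: str,
                       session_id: Optional[str] = None) -> urllib.request.Request:
    """Build a non-streaming chat completion request (hypothetical helper)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    headers = {"content-type": "application/json"}
    if session_id is not None:
        # Optional: pins the request to a persistent chat (see HTTP API).
        headers["X-Session-ID"] = session_id
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant text.

    Assumes the standard OpenAI response shape (choices[0].message.content).
    """
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

With the server running, send(build_chat_request("qwen3.6-plus-fast", "hello")) performs the same call as the curl command.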

opencode request after adding the config below:

opencode -m web2cli/qwen3.6-plus-fast run "hello"

Configuration

Runtime settings are loaded from environment variables with the WEB2CLI_ prefix and from .env.

| Variable | Default | Purpose |
| --- | --- | --- |
| WEB2CLI_HOST | 127.0.0.1 | Server bind host |
| WEB2CLI_PORT | 4981 | Server bind port |
| WEB2CLI_STATE_PATH | ~/.local/share/web2cli/state.sqlite3 | SQLite ledger path |
| WEB2CLI_MIN_REQUEST_INTERVAL | 0.5 | Provider request pacing in seconds |
| WEB2CLI_QWEN_TOKEN | unset | Manual Qwen token override |
| WEB2CLI_DEEPSEEK_TOKEN | unset | Manual DeepSeek token override |
| WEB2CLI_GEMINI_AT | unset | Manual Gemini at token override |
| WEB2CLI_CHATGPT_TOKEN | unset | Manual ChatGPT token override |

Manual token overrides bypass the encrypted credential store. If an override expires, update it manually.
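
To make the prefix convention concrete, here is a rough sketch of reading a few of these variables with their documented defaults. The Settings class and load_settings function are illustrative names, not the project's actual settings code, and .env loading is left out.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

PREFIX = "WEB2CLI_"


@dataclass(frozen=True)
class Settings:
    """Subset of the documented settings (illustrative, not the real class)."""
    host: str = "127.0.0.1"
    port: int = 4981
    state_path: Path = Path("~/.local/share/web2cli/state.sqlite3")
    min_request_interval: float = 0.5
    qwen_token: Optional[str] = None  # unset means: use the encrypted store


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read WEB2CLI_* variables, falling back to the documented defaults."""
    defaults = Settings()
    raw_path = env.get(PREFIX + "STATE_PATH", str(defaults.state_path))
    return Settings(
        host=env.get(PREFIX + "HOST", defaults.host),
        port=int(env.get(PREFIX + "PORT", defaults.port)),
        state_path=Path(raw_path).expanduser(),
        min_request_interval=float(
            env.get(PREFIX + "MIN_REQUEST_INTERVAL", defaults.min_request_interval)
        ),
        qwen_token=env.get(PREFIX + "QWEN_TOKEN"),
    )
```

For example, starting the process with WEB2CLI_PORT=5000 in the environment would change only the port and leave every other value at its default.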

CLI

Run commands through uv run web2cli ... from the repo, or web2cli ... when installed.

| Command | Purpose |
| --- | --- |
| web2cli serve [--host HOST] [--port PORT] | Start the FastAPI server. |
| web2cli models | Print discovered models with provider, family, mode, and capabilities. |
| web2cli providers | Print providers, model counts, provider capabilities, and credential status. |
| web2cli creds refresh [provider] | Fetch fresh credentials from Firefox cookies. Omit provider or use all for every real provider. |
| web2cli creds check [provider] | Print credential status without fetching anything. |
| web2cli creds set-token <provider> <token> | Store a bearer token manually. |
| web2cli creds set-cookies <provider> <cookies> | Store a raw cookie string manually. |
| web2cli creds delete <provider> | Remove stored credentials for one provider. |

HTTP API

Default base URL:

http://127.0.0.1:4981

GET /health

Returns:

{ "status": "ok" }

GET /v1/models

OpenAI-compatible model listing.

{
  "object": "list",
  "data": [
    {
      "id": "qwen3.6-plus-fast",
      "object": "model",
      "created": 0,
      "owned_by": "web2cli"
    }
  ]
}

POST /v1/chat/completions

Persistent OpenAI-compatible chat completion. The server stores a local ledger keyed by session ID.

Accepted fields:

  • model: required model ID from /v1/models.
  • messages: OpenAI-style messages.
  • tools: optional OpenAI-style function tools.
  • stream: true for SSE, false for normal JSON.
  • temperature, max_tokens, top_p: accepted and stored in the internal request object.

Session ID resolution order:

  • X-Session-ID header.
  • x-session-affinity header.
  • JSON body session_id, sessionId, or id.
  • Stable fingerprint of the model and first user message.
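
The resolution order above can be sketched as a single function. The header and body keys come from the list; the fingerprint itself is an assumption (SHA-256 over the model and first user message), since this README does not say how web2cli actually computes it.

```python
import hashlib
import json
from typing import Any, Mapping


def resolve_session_id(headers: Mapping[str, str], body: Mapping[str, Any]) -> str:
    """Pick a session ID using the documented precedence (illustrative sketch)."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in ("x-session-id", "x-session-affinity"):
        if lowered.get(name):
            return lowered[name]

    for key in ("session_id", "sessionId", "id"):
        value = body.get(key)
        if isinstance(value, str) and value:
            return value

    # Fallback: hypothetical fingerprint of the model and first user message.
    first_user = next(
        (m.get("content") for m in body.get("messages", []) if m.get("role") == "user"),
        "",
    )
    material = json.dumps([body.get("model"), first_user], sort_keys=True, default=str)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:32]
```

Because the fallback depends only on the model and the first user message, later turns of the same conversation map to the same ID without the client sending one.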

Streaming responses use OpenAI-compatible data: ... SSE chunks and terminate with data: [DONE].
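
A client can consume that stream line by line. The sketch below assumes each chunk follows the OpenAI chat.completion.chunk shape (choices[].delta.content); iter_sse_deltas is an illustrative name, and the input can be any iterable of decoded lines.

```python
import json
from typing import Iterable, Iterator


def iter_sse_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines until data: [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Joining the yielded pieces with "".join(...) reconstructs the full assistant message.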

POST /v1/chat/completions/temp

Ephemeral one-shot completion. The provider is asked to create a temporary, unlisted chat, and no ledger row is persisted. Requests for providers that lack the temp_chat capability are rejected.

GET /v1/chats/{chat_id}/state

Return the local ledger for a persistent chat, including provider, model_family, provider_chat_id, provider_state, client_turns, and remote_turns.

DELETE /v1/chats/{chat_id}

Delete a chat upstream and remove the local ledger row.

DELETE /v1/chats/{chat_id}/messages/{message_id}

Delete a provider-side message when the provider implements standalone message deletion.

POST /v1/chats/{chat_id}/fork

Create a new local chat by forking an existing provider chat at an assistant turn.

{
  "from_sid": "demo-1",
  "at_turn_index": 3
}

The URL chat_id is the new local session ID. The source chat must exist, the target turn must be an assistant turn, and the provider must support fork_chat.
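
The three preconditions can be expressed as a small check. This is a sketch under assumptions: the Turn type and validate_fork function are hypothetical, and at_turn_index is treated as zero-based, which this README does not confirm.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    role: str      # "user" or "assistant"
    content: str


def validate_fork(chats: Dict[str, List[Turn]], from_sid: str,
                  at_turn_index: int, can_fork: bool) -> None:
    """Raise if a fork request violates one of the documented preconditions."""
    if not can_fork:
        raise ValueError("provider does not support fork_chat")
    turns = chats.get(from_sid)
    if turns is None:
        raise KeyError(f"source chat {from_sid!r} does not exist")
    # Assumption: indices are zero-based positions in the turn list.
    if not 0 <= at_turn_index < len(turns):
        raise IndexError(f"turn index {at_turn_index} is out of range")
    if turns[at_turn_index].role != "assistant":
        raise ValueError("fork target must be an assistant turn")
```

Under the zero-based assumption, index 3 in a strictly alternating chat is the second assistant reply.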

POST /v1/chats/{chat_id}/stop

Interrupt active generation for a persistent chat.

Models

The public model ID is the exact string sent in the OpenAI model field and used as the key under provider.web2cli.models in opencode.

Qwen

| Model | Family | Mode | Capabilities |
| --- | --- | --- | --- |
| qwen3.7-plus-auto | qwen3.7-plus | auto | thinking, fast, image, video |
| qwen3.7-plus-fast | qwen3.7-plus | fast | fast, image, video |
| qwen3.7-plus-thinking | qwen3.7-plus | thinking | thinking, image, video |
| qwen3.6-plus-auto | qwen3.6-plus | auto | thinking, fast, image, video |
| qwen3.6-plus-fast | qwen3.6-plus | fast | fast, image, video |
| qwen3.6-plus-thinking | qwen3.6-plus | thinking | thinking, image, video |
| qwen3.7-max-fast | qwen3.7-max | fast | fast |
| qwen3.7-max-thinking | qwen3.7-max | thinking | thinking |
| qwen3.6-max-preview-fast | qwen3.6-max-preview | fast | fast |
| qwen3.6-max-preview-thinking | qwen3.6-max-preview | thinking | thinking |
| qwen3.6-27b-fast | qwen3.6-27b | fast | fast, image, video |
| qwen3.6-27b-thinking | qwen3.6-27b | thinking | thinking, image, video |

DeepSeek

| Model | Family | Mode | Capabilities |
| --- | --- | --- | --- |
| deepseek-instant-fast | default | fast | fast |
| deepseek-instant-thinking | default | thinking | thinking |
| deepseek-expert-fast | expert | fast | fast |
| deepseek-expert-thinking | expert | thinking | thinking |

Gemini

| Model | Family | Mode | Capabilities |
| --- | --- | --- | --- |
| gemini-3.1-flash-lite-fast | gemini-3.1-flash-lite | fast | fast, image, audio, video |
| gemini-3.1-flash-lite-thinking | gemini-3.1-flash-lite | thinking | thinking, image, audio, video |
| gemini-3.5-flash-fast | gemini-3.5-flash | fast | fast, image, audio, video |
| gemini-3.5-flash-thinking | gemini-3.5-flash | thinking | thinking, image, audio, video |
| gemini-3.1-pro-fast | gemini-3.1-pro | fast | fast, image, audio, video |
| gemini-3.1-pro-thinking | gemini-3.1-pro | thinking | thinking, image, audio, video |

ChatGPT

| Model | Family | Mode | Capabilities |
| --- | --- | --- | --- |
| chatgpt-5.5-fast | gpt-5-5 | fast | thinking, fast, image, audio, video |
| chatgpt-5.5-thinking | gpt-5-5 | thinking | thinking, fast, image, audio, video |

Fake

| Model | Family | Mode | Capabilities |
| --- | --- | --- | --- |
| fake-model-auto | fake-model | auto | thinking, fast, image, audio, video |
| fake-model-fast | fake-model | fast | thinking, fast, image, audio, video |
| fake-model-thinking | fake-model | thinking | thinking, fast, image, audio, video |

Provider Capabilities

| Provider | Models | Chat Ops | Notes |
| --- | --- | --- | --- |
| qwen | 12 | edit, delete, fork, temp, stop | Supports image/video-capable model families. |
| deepseek | 4 | edit, stop | No standalone delete, fork, or temp chat. |
| gemini | 6 | edit-last-only, temp, stop | Uses a fresh upstream conversation for middle edits. |
| chatgpt | 2 | edit, fork, temp, stop | Supports uploads and provider-native fork flow. |
| fake | 3 | edit, delete, fork, temp, stop | In-memory test provider. |
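
One way to picture how these capabilities gate the management endpoints is a flag set per provider that is checked before any upstream call. The sketch below is an assumption about shape only: temp_chat and fork_chat are names used in this README, while the other field names, the UnsupportedOperation exception, and the require helper are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderCapabilities:
    """Illustrative flag set; the real class may have different fields."""
    edit: bool = False
    delete_message: bool = False
    fork_chat: bool = False
    temp_chat: bool = False
    stop: bool = False


class UnsupportedOperation(Exception):
    """Raised before contacting a provider that lacks a capability."""


# Values transcribed from the table above. Gemini's edit-last-only
# behaviour is simplified to edit=True in this sketch.
CAPABILITIES = {
    "qwen": ProviderCapabilities(edit=True, delete_message=True, fork_chat=True,
                                 temp_chat=True, stop=True),
    "deepseek": ProviderCapabilities(edit=True, stop=True),
    "gemini": ProviderCapabilities(edit=True, temp_chat=True, stop=True),
    "chatgpt": ProviderCapabilities(edit=True, fork_chat=True, temp_chat=True, stop=True),
}


def require(provider: str, capability: str) -> None:
    """Reject an operation early if the provider does not support it."""
    if not getattr(CAPABILITIES[provider], capability):
        raise UnsupportedOperation(f"{provider} does not support {capability}")
```

With this table, require("deepseek", "temp_chat") raises, while require("qwen", "fork_chat") returns without error.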

opencode

Add a custom OpenAI-compatible provider to ~/.config/opencode/opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "web2cli": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "web2cli",
      "options": {
        "baseURL": "http://localhost:4981/v1",
        "apiKey": "dummy"
      },
      "models": {
        "qwen3.6-plus-fast": {
          "id": "qwen3.6-plus-fast",
          "name": "Qwen 3.6 Plus (Fast)",
          "reasoning": false,
          "tool_call": true,
          "attachment": true,
          "modalities": {
            "input": ["text", "image", "video"],
            "output": ["text"]
          }
        },
        "gemini-3.1-pro-thinking": {
          "id": "gemini-3.1-pro-thinking",
          "name": "Gemini 3.1 Pro (Thinking)",
          "reasoning": true,
          "tool_call": true,
          "attachment": true,
          "modalities": {
            "input": ["text", "image", "audio", "video", "pdf"],
            "output": ["text"]
          }
        },
        "chatgpt-5.5-fast": {
          "id": "chatgpt-5.5-fast",
          "name": "ChatGPT 5.5 (Fast)",
          "reasoning": true,
          "tool_call": true,
          "attachment": true,
          "modalities": {
            "input": ["text", "image", "audio", "video", "pdf"],
            "output": ["text"]
          }
        }
      }
    }
  }
}

Add every model you want to see in opencode under provider.web2cli.models. The model key must match /v1/models exactly.

opencode -m web2cli/chatgpt-5.5-fast
opencode -m web2cli/gemini-3.1-pro-thinking
opencode -m web2cli/qwen3.6-plus-fast

Restart opencode after changing opencode.json; provider config is loaded at startup.

Tool Calls

WebUI providers do not expose the same native tool-calling interface as API providers. web2cli accepts OpenAI-style function tools from opencode and injects a system prompt that asks the model to return caller-executed tool calls as JSON. The response parser converts those JSON calls back into OpenAI tool_calls for opencode.
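
The parsing step might look roughly like this. The JSON shape the model is asked to emit ({"tool_calls": [{"name": ..., "arguments": {...}}]}) is an assumption made for this sketch, as is the parse_tool_calls name; only the output shape follows the OpenAI tool_calls format.

```python
import json
import uuid
from typing import Any, Dict, List


def parse_tool_calls(model_text: str) -> List[Dict[str, Any]]:
    """Convert a JSON tool-call reply into OpenAI-style tool_calls entries.

    Returns an empty list when the reply is ordinary prose.
    """
    try:
        parsed = json.loads(model_text)
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, dict) or not isinstance(parsed.get("tool_calls"), list):
        return []

    result = []
    for call in parsed["tool_calls"]:
        if not isinstance(call, dict) or "name" not in call:
            continue  # ignore malformed entries
        result.append({
            "id": f"call_{uuid.uuid4().hex[:24]}",
            "type": "function",
            "function": {
                "name": call["name"],
                # OpenAI encodes arguments as a JSON string, not an object.
                "arguments": json.dumps(call.get("arguments", {})),
            },
        })
    return result
```

A production parser would likely also need to handle JSON wrapped in prose or code fences, which this sketch leaves out.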

Attachments

Accepted OpenAI content parts:

  • text
  • image_url
  • image
  • audio
  • video
  • file

Media parts must use data URIs. Providers upload supported media before sending the prompt upstream. Where a provider cannot upload a given media type, the part is represented in the serialized prompt as an attachment marker instead.
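
Since media must arrive as data URIs, a client has to encode files before sending them. Below is a small sketch for the image_url part type, using the standard OpenAI content part shape; the helper names are illustrative.

```python
import base64
import mimetypes
from pathlib import Path


def data_uri(raw: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_part_from_file(path: Path) -> dict:
    """Build an OpenAI-style image_url content part from a local image file."""
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"{path.name} does not look like an image")
    return {
        "type": "image_url",
        "image_url": {"url": data_uri(path.read_bytes(), mime_type)},
    }
```

The resulting dict goes into the content list of a user message, next to a {"type": "text", "text": "..."} part.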

State

Persistent chat state is stored in SQLite at WEB2CLI_STATE_PATH, defaulting to:

~/.local/share/web2cli/state.sqlite3

Credentials are stored separately as age-encrypted JSON blobs:

~/.local/share/web2cli/creds/<provider>.age

Development

uv run pytest tests/unit tests/contract -q
uv run ruff check src/ tests/
uv run python tests/live/run_live.py --provider qwen
uv run python tests/live/run_live.py --provider gemini --media
uv run python tests/live/run_live.py --provider chatgpt --only chat_basic chat_diff

Live tests start the app, run chat scenarios, write logs under tests/live/logs/<timestamp>/<provider>, and clean up created chats unless --no-cleanup is passed.

Adding a Provider

Create a package under src/providers/<name>/ with one concrete Provider subclass in provider.py. The factory imports src.providers.<name>.provider and requires exactly one non-abstract subclass of Provider.

Required methods:

  • list_models
  • create_chat
  • apply_sync
  • complete
  • interrupt
  • delete_chat

Optional methods:

  • delete_message
  • fork_chat
  • close

Set ProviderCapabilities accurately so the sync planner and management endpoints can reject unsupported flows before touching the upstream provider.
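
The "exactly one non-abstract subclass" rule can be illustrated with the standard inspect module. Everything here is a stand-in: the Provider base class is trimmed to two of the required methods with guessed signatures, and find_provider_class is a hypothetical name for what the factory does.

```python
import inspect
from abc import ABC, abstractmethod
from types import ModuleType
from typing import List


class Provider(ABC):
    """Stand-in for the project's base class; signatures are guesses."""

    @abstractmethod
    def list_models(self) -> List[str]: ...

    @abstractmethod
    def complete(self, chat_id: str, prompt: str) -> str: ...


def find_provider_class(module: ModuleType, base: type = Provider) -> type:
    """Return the single concrete subclass of base defined in module."""
    candidates = [
        obj
        for _, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, base) and obj is not base and not inspect.isabstract(obj)
    ]
    if len(candidates) != 1:
        raise RuntimeError(
            f"{module.__name__} must define exactly one concrete Provider, "
            f"found {len(candidates)}"
        )
    return candidates[0]
```

A subclass that leaves any abstract method unimplemented stays abstract, so it is skipped rather than counted, which is why every required method has to be defined for a provider to be discovered.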