OpenCode-native proxy for LLM WebUIs (Qwen, Gemini, ChatGPT, DeepSeek) — no API key, just cookies.

Timofei Shelestov c0d44936f3 feat: chromium cookie support, auto age key gen, opencode config sync		2026-06-03 04:40:47 +02:00

web2cli

Local OpenAI-compatible adapter that exposes WebUI backends to opencode.

web2cli runs a FastAPI server that accepts OpenAI-style chat completion requests and forwards them to browser/WebUI providers such as Qwen, DeepSeek, Gemini, and ChatGPT. It keeps a local conversation ledger so opencode-style append, edit, fork, stop, and temporary-chat workflows can be mapped onto each provider's native chat model.

Features

OpenAI-compatible /v1/models and /v1/chat/completions endpoints.
Local chat management endpoints for state inspection, deletion, message deletion, fork, and stop.
Dynamic provider discovery from src/providers/<name>/provider.py.
Encrypted credential storage with age under ~/.local/share/web2cli/creds/.
opencode custom provider support through @ai-sdk/openai-compatible.
Multimodal input forwarding for providers/models that support attachments.

Requirements

Python 3.13+
uv
age for encrypted credentials
Firefox/browser cookies for automatic credential refresh against real WebUI providers

uv sync
brew install age

Credential encryption uses ~/.ssh/id_ed25519.pub as the age recipient by default. If that key is not available, create ~/.local/share/web2cli/recipients.txt with an age public key.

Quick Start

uv run web2cli creds refresh
uv run web2cli serve
uv run web2cli models

OpenAI-compatible request:

curl http://127.0.0.1:4981/v1/chat/completions \
  -H 'content-type: application/json' \
  -d '{
    "model": "qwen3.6-plus-fast",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": false
  }'
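
The same request can be built from Python with only the standard library. This is a minimal sketch, assuming the server runs on the default host and port; `build_chat_request` and `send` are illustrative helper names, not part of web2cli.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:4981"  # default WEB2CLI_HOST and WEB2CLI_PORT


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request to a running web2cli server and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `send(build_chat_request("qwen3.6-plus-fast", "hello"))` should return the decoded completion object.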

opencode request after adding the config below:

opencode -m web2cli/qwen3.6-plus-fast run "hello"

Configuration

Runtime settings are loaded from environment variables with the WEB2CLI_ prefix and from .env.

Variable	Default	Purpose
`WEB2CLI_HOST`	`127.0.0.1`	Server bind host
`WEB2CLI_PORT`	`4981`	Server bind port
`WEB2CLI_STATE_PATH`	`~/.local/share/web2cli/state.sqlite3`	SQLite ledger path
`WEB2CLI_MIN_REQUEST_INTERVAL`	`0.5`	Provider request pacing in seconds
`WEB2CLI_QWEN_TOKEN`	unset	Manual Qwen token override
`WEB2CLI_DEEPSEEK_TOKEN`	unset	Manual DeepSeek token override
`WEB2CLI_GEMINI_AT`	unset	Manual Gemini `at` token override
`WEB2CLI_CHATGPT_TOKEN`	unset	Manual ChatGPT token override

Manual token overrides bypass the encrypted credential store. If an override expires, update it manually.
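
To illustrate how the defaults in the table combine with environment overrides, here is a rough sketch. The `Settings` class and `load_settings` function are hypothetical and do not mirror web2cli's actual settings code; the sketch reads only a mapping of environment variables and does not parse .env.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    state_path: Path
    min_request_interval: float
    qwen_token: Optional[str]  # None means "use the encrypted credential store"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read WEB2CLI_* variables, falling back to the documented defaults."""
    state_path = env.get("WEB2CLI_STATE_PATH", "~/.local/share/web2cli/state.sqlite3")
    return Settings(
        host=env.get("WEB2CLI_HOST", "127.0.0.1"),
        port=int(env.get("WEB2CLI_PORT", "4981")),
        state_path=Path(os.path.expanduser(state_path)),
        min_request_interval=float(env.get("WEB2CLI_MIN_REQUEST_INTERVAL", "0.5")),
        qwen_token=env.get("WEB2CLI_QWEN_TOKEN"),
    )
```

Passing an explicit mapping, for example `load_settings({"WEB2CLI_PORT": "5000"})`, makes the override behaviour easy to check without touching the real environment.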

CLI

Run commands through uv run web2cli ... from the repo, or web2cli ... when installed.

Command	Purpose
`web2cli serve [--host HOST] [--port PORT]`	Start the FastAPI server.
`web2cli models`	Print discovered models with provider, family, mode, and capabilities.
`web2cli providers`	Print providers, model counts, provider capabilities, and credential status.
`web2cli creds refresh [provider]`	Fetch fresh credentials from Firefox cookies. Omit provider or use `all` for every real provider.
`web2cli creds check [provider]`	Print credential status without fetching anything.
`web2cli creds set-token <provider> <token>`	Store a bearer token manually.
`web2cli creds set-cookies <provider> <cookies>`	Store a raw cookie string manually.
`web2cli creds delete <provider>`	Remove stored credentials for one provider.

HTTP API

Default base URL:

http://127.0.0.1:4981

`GET /health`

Returns:

{ "status": "ok" }

`GET /v1/models`

OpenAI-compatible model listing.

{
  "object": "list",
  "data": [
    {
      "id": "qwen3.6-plus-fast",
      "object": "model",
      "created": 0,
      "owned_by": "web2cli"
    }
  ]
}

`POST /v1/chat/completions`

Persistent OpenAI-compatible chat completion. The server stores a local ledger keyed by session ID.

Accepted fields:

model: required model ID from /v1/models.
messages: OpenAI-style messages.
tools: optional OpenAI-style function tools.
stream: true for SSE, false for normal JSON.
temperature, max_tokens, top_p: accepted and stored in the internal request object.

Session ID resolution order:

1. X-Session-ID header.
2. x-session-affinity header.
3. JSON body session_id, sessionId, or id.
4. Stable fingerprint of the model and first user message.
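
The resolution order above can be sketched as a single function. This is an illustration only: `resolve_session_id` is a hypothetical name, and the fingerprint (a SHA-256 digest over the model and the first user message) is an assumption, since the actual hashing scheme is not documented here.

```python
import hashlib
import json
from typing import Any, Mapping


def resolve_session_id(headers: Mapping[str, str], body: Mapping[str, Any]) -> str:
    """Pick a session ID from headers, then the body, then a fingerprint."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in ("x-session-id", "x-session-affinity"):
        if lowered.get(name):
            return lowered[name]

    for key in ("session_id", "sessionId", "id"):
        value = body.get(key)
        if isinstance(value, str) and value:
            return value

    # Assumed fallback: hash the model and the first user message.
    first_user = next(
        (m.get("content") for m in body.get("messages", []) if m.get("role") == "user"),
        "",
    )
    material = json.dumps([body.get("model"), first_user], sort_keys=True)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:32]
```

Because the fallback depends only on the model and the first user message, two requests that share both resolve to the same session unless an explicit ID is supplied.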

Streaming responses use OpenAI-compatible data: ... SSE chunks and terminate with data: [DONE].
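
A client can consume the stream by reading `data:` lines until the `[DONE]` sentinel. The sketch below assumes each chunk follows the usual OpenAI streaming layout (`choices[].delta.content`); `iter_sse_chunks` and `collect_text` are illustrative names.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines, stopping at data: [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the text deltas of a streamed completion."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

`lines` can be any iterable of decoded text lines, such as the lines of a streaming HTTP response body.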

`POST /v1/chat/completions/temp`

Ephemeral one-shot completion. The provider is asked to create a temporary/non-listed chat when supported, and no ledger row is persisted. Requests for providers without the temp_chat capability are rejected.

`GET /v1/chats/{chat_id}/state`

Return the local ledger for a persistent chat, including provider, model_family, provider_chat_id, provider_state, client_turns, and remote_turns.

`DELETE /v1/chats/{chat_id}`

Delete a chat upstream and remove the local ledger row.

`DELETE /v1/chats/{chat_id}/messages/{message_id}`

Delete a provider-side message when the provider implements standalone message deletion.

`POST /v1/chats/{chat_id}/fork`

Create a new local chat by forking an existing provider chat at an assistant turn.

{
  "from_sid": "demo-1",
  "at_turn_index": 3
}

The URL chat_id is the new local session ID. The source chat must exist, the target turn must be an assistant turn, and the provider must support fork_chat.
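
As a rough client-side sketch, the fork call can be prepared like this. The helper names are hypothetical, and the pre-check assumes turn indices are zero-based and that the caller already has the list of turn roles; neither assumption is confirmed by this README, and the server remains the authority on what is accepted.

```python
import json
import urllib.parse
import urllib.request
from typing import Sequence

BASE_URL = "http://127.0.0.1:4981"  # default host and port


def is_valid_fork_target(roles: Sequence[str], at_turn_index: int) -> bool:
    """Return True when the chosen turn exists and is an assistant turn."""
    return 0 <= at_turn_index < len(roles) and roles[at_turn_index] == "assistant"


def build_fork_request(new_chat_id: str, from_sid: str, at_turn_index: int) -> urllib.request.Request:
    """Build (but do not send) a fork request; new_chat_id goes in the URL."""
    url = f"{BASE_URL}/v1/chats/{urllib.parse.quote(new_chat_id, safe='')}/fork"
    body = {"from_sid": from_sid, "at_turn_index": at_turn_index}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )
```

The resulting request can be sent with `urllib.request.urlopen` once the server is running.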

`POST /v1/chats/{chat_id}/stop`

Interrupt active generation for a persistent chat.

Models

The public model ID is the exact string sent in the OpenAI model field and used as the key under provider.web2cli.models in opencode.

Qwen

Model	Family	Mode	Capabilities
`qwen3.7-plus-auto`	`qwen3.7-plus`	auto	thinking, fast, image, video
`qwen3.7-plus-fast`	`qwen3.7-plus`	fast	fast, image, video
`qwen3.7-plus-thinking`	`qwen3.7-plus`	thinking	thinking, image, video
`qwen3.6-plus-auto`	`qwen3.6-plus`	auto	thinking, fast, image, video
`qwen3.6-plus-fast`	`qwen3.6-plus`	fast	fast, image, video
`qwen3.6-plus-thinking`	`qwen3.6-plus`	thinking	thinking, image, video
`qwen3.7-max-fast`	`qwen3.7-max`	fast	fast
`qwen3.7-max-thinking`	`qwen3.7-max`	thinking	thinking
`qwen3.6-max-preview-fast`	`qwen3.6-max-preview`	fast	fast
`qwen3.6-max-preview-thinking`	`qwen3.6-max-preview`	thinking	thinking
`qwen3.6-27b-fast`	`qwen3.6-27b`	fast	fast, image, video
`qwen3.6-27b-thinking`	`qwen3.6-27b`	thinking	thinking, image, video

DeepSeek

Model	Family	Mode	Capabilities
`deepseek-instant-fast`	`default`	fast	fast
`deepseek-instant-thinking`	`default`	thinking	thinking
`deepseek-expert-fast`	`expert`	fast	fast
`deepseek-expert-thinking`	`expert`	thinking	thinking

Gemini

Model	Family	Mode	Capabilities
`gemini-3.1-flash-lite-fast`	`gemini-3.1-flash-lite`	fast	fast, image, audio, video
`gemini-3.1-flash-lite-thinking`	`gemini-3.1-flash-lite`	thinking	thinking, image, audio, video
`gemini-3.5-flash-fast`	`gemini-3.5-flash`	fast	fast, image, audio, video
`gemini-3.5-flash-thinking`	`gemini-3.5-flash`	thinking	thinking, image, audio, video
`gemini-3.1-pro-fast`	`gemini-3.1-pro`	fast	fast, image, audio, video
`gemini-3.1-pro-thinking`	`gemini-3.1-pro`	thinking	thinking, image, audio, video

ChatGPT

Model	Family	Mode	Capabilities
`chatgpt-5.5-fast`	`gpt-5-5`	fast	thinking, fast, image, audio, video
`chatgpt-5.5-thinking`	`gpt-5-5`	thinking	thinking, fast, image, audio, video

Fake

Model	Family	Mode	Capabilities
`fake-model-auto`	`fake-model`	auto	thinking, fast, image, audio, video
`fake-model-fast`	`fake-model`	fast	thinking, fast, image, audio, video
`fake-model-thinking`	`fake-model`	thinking	thinking, fast, image, audio, video

Provider Capabilities

Provider	Models	Chat Ops	Notes
`qwen`	12	edit, delete, fork, temp, stop	Supports image/video-capable model families.
`deepseek`	4	edit, stop	No standalone delete, fork, or temp chat.
`gemini`	6	edit-last-only, temp, stop	Uses a fresh upstream conversation for middle edits.
`chatgpt`	2	edit, fork, temp, stop	Supports uploads and provider-native fork flow.
`fake`	3	edit, delete, fork, temp, stop	In-memory test provider.

opencode

Add a custom OpenAI-compatible provider to ~/.config/opencode/opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "web2cli": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "web2cli",
      "options": {
        "baseURL": "http://localhost:4981/v1",
        "apiKey": "dummy"
      },
      "models": {
        "qwen3.6-plus-fast": {
          "id": "qwen3.6-plus-fast",
          "name": "Qwen 3.6 Plus (Fast)",
          "reasoning": false,
          "tool_call": true,
          "attachment": true,
          "modalities": {
            "input": ["text", "image", "video"],
            "output": ["text"]
          }
        },
        "gemini-3.1-pro-thinking": {
          "id": "gemini-3.1-pro-thinking",
          "name": "Gemini 3.1 Pro (Thinking)",
          "reasoning": true,
          "tool_call": true,
          "attachment": true,
          "modalities": {
            "input": ["text", "image", "audio", "video", "pdf"],
            "output": ["text"]
          }
        },
        "chatgpt-5.5-fast": {
          "id": "chatgpt-5.5-fast",
          "name": "ChatGPT 5.5 (Fast)",
          "reasoning": true,
          "tool_call": true,
          "attachment": true,
          "modalities": {
            "input": ["text", "image", "audio", "video", "pdf"],
            "output": ["text"]
          }
        }
      }
    }
  }
}

Add every model you want to see in opencode under provider.web2cli.models. The model key must match /v1/models exactly.
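
Writing many model entries by hand is repetitive, so a small script can derive them from the capability lists in the Models tables. This is a sketch under assumptions: `opencode_model_entry` and `build_provider_block` are hypothetical helpers, the mapping from capabilities to `reasoning`, `attachment`, and `modalities` is inferred from the examples above, and inputs such as `pdf` that do not appear in the capability tables must still be added by hand.

```python
import json
from typing import Iterable, Set, Tuple


def opencode_model_entry(model_id: str, name: str, capabilities: Set[str]) -> dict:
    """Build one provider.web2cli.models entry from a capability set."""
    media = [kind for kind in ("image", "audio", "video") if kind in capabilities]
    return {
        "id": model_id,
        "name": name,
        "reasoning": "thinking" in capabilities,
        "tool_call": True,
        "attachment": bool(media),
        "modalities": {"input": ["text", *media], "output": ["text"]},
    }


def build_provider_block(models: Iterable[Tuple[str, str, Set[str]]]) -> dict:
    """Assemble the full opencode config with one entry per model."""
    return {
        "$schema": "https://opencode.ai/config.json",
        "provider": {
            "web2cli": {
                "npm": "@ai-sdk/openai-compatible",
                "name": "web2cli",
                "options": {"baseURL": "http://localhost:4981/v1", "apiKey": "dummy"},
                "models": {
                    model_id: opencode_model_entry(model_id, name, caps)
                    for model_id, name, caps in models
                },
            }
        },
    }


config = build_provider_block(
    [("qwen3.6-plus-fast", "Qwen 3.6 Plus (Fast)", {"fast", "image", "video"})]
)
config_text = json.dumps(config, indent=2)
```

`config_text` can then be reviewed and written to opencode.json.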

opencode -m web2cli/chatgpt-5.5-fast
opencode -m web2cli/gemini-3.1-pro-thinking
opencode -m web2cli/qwen3.6-plus-fast

Restart opencode after changing opencode.json; provider config is loaded at startup.

Tool Calls

WebUI providers do not expose the same native tool-calling interface as API providers. web2cli accepts OpenAI-style function tools from opencode and injects a system prompt that asks the model to return caller-executed tool calls as JSON. The response parser converts those JSON calls back into OpenAI tool_calls for opencode.
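
The parsing half of that round trip might look roughly like the sketch below. The JSON shape the model is asked to return (`{"tool_calls": [{"name": ..., "arguments": {...}}]}`) is an assumption made for illustration, as is the `parse_tool_calls` name; only the output shape follows the OpenAI `tool_calls` format.

```python
import json
import uuid
from typing import List


def parse_tool_calls(text: str) -> List[dict]:
    """Convert a model reply holding JSON tool calls into OpenAI tool_calls.

    Returns an empty list when the reply is plain text or malformed JSON.
    """
    try:
        payload = json.loads(text.strip())
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict) or not isinstance(payload.get("tool_calls"), list):
        return []

    result = []
    for call in payload["tool_calls"]:
        if not isinstance(call, dict) or not isinstance(call.get("name"), str):
            continue  # skip entries without a usable function name
        result.append(
            {
                "id": f"call_{uuid.uuid4().hex[:24]}",
                "type": "function",
                "function": {
                    "name": call["name"],
                    # OpenAI clients expect arguments as a JSON-encoded string.
                    "arguments": json.dumps(call.get("arguments", {})),
                },
            }
        )
    return result
```

A reply that is ordinary prose yields an empty list, so the caller can fall back to returning it as normal assistant content.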

Attachments

Accepted OpenAI content parts:

text
image_url
image
audio
video
file

Media parts must use data URIs. Providers upload supported media before sending the prompt upstream. Where a provider cannot upload a given media type, the part is represented in the serialized prompt as an attachment marker instead.
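
Since media must arrive as data URIs, a client has to base64-encode files before sending them. The sketch below builds an `image_url` part in the standard OpenAI shape and decodes a data URI back into bytes; `image_part` and `decode_data_uri` are illustrative names, and the exact shapes web2cli accepts for the other part types are not covered here.

```python
import base64
from typing import Tuple


def image_part(data: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes as an OpenAI-style image_url content part."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def decode_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a base64 data URI into its MIME type and decoded bytes."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, _, payload = uri[len("data:"):].partition(",")
    mime_type, _, encoding = header.partition(";")
    if encoding != "base64":
        raise ValueError("only base64 data URIs are handled in this sketch")
    return mime_type, base64.b64decode(payload)
```

A message then combines parts, for example `{"role": "user", "content": [{"type": "text", "text": "describe this"}, image_part(raw_bytes)]}`.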

State

Persistent chat state is stored in SQLite at WEB2CLI_STATE_PATH, defaulting to:

~/.local/share/web2cli/state.sqlite3

Credentials are stored separately as age-encrypted JSON blobs:

~/.local/share/web2cli/creds/<provider>.age

Development

uv run pytest tests/unit tests/contract -q
uv run ruff check src/ tests/
uv run python tests/live/run_live.py --provider qwen
uv run python tests/live/run_live.py --provider gemini --media
uv run python tests/live/run_live.py --provider chatgpt --only chat_basic chat_diff

Live tests start the app, run chat scenarios, write logs under tests/live/logs/<timestamp>/<provider>, and clean up created chats unless --no-cleanup is passed.

Adding a Provider

Create a package under src/providers/<name>/ with one concrete Provider subclass in provider.py. The factory imports src.providers.<name>.provider and requires exactly one non-abstract subclass of Provider.

Required methods:

list_models
create_chat
apply_sync
complete
interrupt
delete_chat

Optional methods:

delete_message
fork_chat
close

Set ProviderCapabilities accurately so the sync planner and management endpoints can reject unsupported flows before touching the upstream provider.
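
To make the shape of a provider concrete, here is a self-contained sketch. The `Provider` class below is a stand-in for web2cli's real base class, and every method signature is a guess (the real methods may well be async and take different arguments). `find_provider_class` illustrates the "exactly one non-abstract subclass" rule and is not the project's factory code.

```python
import inspect
from abc import ABC, abstractmethod
from types import SimpleNamespace


class Provider(ABC):
    """Stand-in base class listing the required methods; signatures are guesses."""

    @abstractmethod
    def list_models(self) -> list: ...
    @abstractmethod
    def create_chat(self, model: str) -> str: ...
    @abstractmethod
    def apply_sync(self, chat_id: str, plan: list) -> None: ...
    @abstractmethod
    def complete(self, chat_id: str, prompt: str) -> str: ...
    @abstractmethod
    def interrupt(self, chat_id: str) -> None: ...
    @abstractmethod
    def delete_chat(self, chat_id: str) -> None: ...


class EchoProvider(Provider):
    """Toy in-memory provider that echoes every prompt back."""

    def __init__(self) -> None:
        self.chats = {}

    def list_models(self) -> list:
        return ["echo-fast"]

    def create_chat(self, model: str) -> str:
        chat_id = f"echo-{len(self.chats) + 1}"
        self.chats[chat_id] = []
        return chat_id

    def apply_sync(self, chat_id: str, plan: list) -> None:
        self.chats[chat_id].extend(plan)

    def complete(self, chat_id: str, prompt: str) -> str:
        reply = f"echo: {prompt}"
        self.chats[chat_id] += [prompt, reply]
        return reply

    def interrupt(self, chat_id: str) -> None:
        pass  # nothing runs in the background in this toy provider

    def delete_chat(self, chat_id: str) -> None:
        self.chats.pop(chat_id, None)


def find_provider_class(module) -> type:
    """Return the single concrete Provider subclass defined on a module."""
    candidates = [
        obj
        for _, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, Provider) and not inspect.isabstract(obj)
    ]
    if len(candidates) != 1:
        raise RuntimeError(f"expected exactly one concrete Provider, found {len(candidates)}")
    return candidates[0]


# A namespace object plays the role of an imported provider.py module here.
fake_module = SimpleNamespace(Provider=Provider, EchoProvider=EchoProvider)
provider = find_provider_class(fake_module)()
```

The abstract base is filtered out by `inspect.isabstract`, so a module that defines a second concrete subclass, or none, fails fast instead of picking one arbitrarily.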