An interactive Vite + React client paired with a Fastify streaming API for exploring token-level log probabilities from OpenAI-compatible chat models. Use it to inspect how models score alternative completions, tune parameters, and compare model behaviors in real time.
- Node.js 20 (LTS) and npm 10
- `.env.local` containing `OPENAI_API_KEY` and optional overrides (`OPENAI_BASE_URL`, `PORT`, `HOST`)
- Docker Desktop (for containerized workflows)
- Azure CLI (`az`) for publishing to Azure Container Apps
- `src/` – React application code (components, hooks, pages, shared utilities)
- `server/` – Fastify backend (`index.ts`) plus `models.json` describing available models
- `public/` – Static assets bundled by Vite
- `docker/` – Container scripts and configuration
- `scripts/publish-azure.sh` – Azure Container Apps deployment helper
- `dist/` – Production build artifacts (generated)
```mermaid
graph TD
  subgraph Client
    UI[React Components]
    Hooks[Data Hooks]
    Router[React Router]
  end
  subgraph Build
    Vite[Vite Dev Server]
    Tailwind[Tailwind + PostCSS]
  end
  subgraph Server
    Fastify[Fastify API]
    Zod[Zod Validation]
    Models[models.json]
  end
  subgraph External
    OpenAI[OpenAI-Compatible API]
  end
  UI -->|HTTP /api/models| Fastify
  UI -->|Stream /api/complete/stream| Fastify
  Fastify -->|chat.completions.stream| OpenAI
  Fastify --> Models
  Vite --> UI
  Tailwind --> UI
```
```mermaid
sequenceDiagram
  participant User
  participant UI as React UI
  participant API as Fastify API
  participant LLM as OpenAI API
  User->>UI: Configure prompt & parameters
  UI->>API: POST /api/complete/stream (NDJSON)
  API->>API: Validate payload with Zod
  API->>LLM: Forward streaming request
  LLM-->>API: Emit token deltas + logprobs
  API->>API: Annotate & normalize logprob data
  API-->>UI: Send NDJSON chunks per token
  UI->>UI: Accumulate tokens, update charts/tables
  UI-->>User: Render interactive visualization
```
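The per-token NDJSON wire format is not spelled out in this README. As a hypothetical sketch of the client-side accumulation step, a reader function could split the streamed buffer on newlines and keep any trailing partial line for the next read (the field names `token`, `logprob`, and `topLogprobs` are assumptions, not the repository's actual schema):

```typescript
// Hypothetical shape of one NDJSON chunk (field names are assumptions,
// not taken from this repository's actual wire format).
interface TokenChunk {
  token: string;
  logprob: number; // natural-log probability of the emitted token
  topLogprobs: Record<string, number>; // alternatives and their logprobs
}

// Split a streamed buffer on newlines and parse each complete line;
// the last piece may be an incomplete line, so hand it back as `rest`.
function parseNdjson(buffer: string): { chunks: TokenChunk[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? "";
  const chunks = lines
    .filter((line) => line.trim())
    .map((line) => JSON.parse(line) as TokenChunk);
  return { chunks, rest };
}

const result = parseNdjson(
  '{"token":"Hello","logprob":-0.1,"topLogprobs":{"Hello":-0.1}}\n{"token":" wor',
);
console.log(result.chunks.length); // 1 complete chunk; ' wor…' is kept in rest
```

Carrying `rest` forward between reads is what lets the UI render tokens progressively even when a chunk boundary falls mid-line.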
- Copy `.env.local.example` to `.env.local` (create it if absent) and set `OPENAI_API_KEY`; optionally override `OPENAI_BASE_URL`, `PORT`, or `HOST`.
- Install dependencies: `npm install`
- Run the stack:
  - `npm run dev` – Vite dev server on `http://localhost:5173`
  - `npm run server` – Fastify API on `http://localhost:8787`
  - `npm run dev:all` – Concurrent client + API with shared logging
- Client uses React Router for views, TanStack Query hooks for data fetching, and Tailwind tokens for theming.
- `/api/models` exposes selectable models from `server/models.json`; update this file to add providers or rename models.
- `/api/complete/stream` validates requests (Zod), forwards them to the OpenAI SDK, and streams NDJSON chunks with token text, probability, and top alternatives.
- UI accumulates streamed tokens, computes percent/odds deltas, and renders charts, tables, and textual overlays.
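The README does not define the percent/odds math; one plausible reading, sketched here as an assumption rather than the app's actual formulas, converts each token's natural-log probability into a percentage and an odds ratio:

```typescript
// Convert a natural-log probability into display-friendly numbers.
// The exact delta definitions are an assumption, not taken from the source.
function describeLogprob(logprob: number): { percent: number; odds: number } {
  const p = Math.exp(logprob); // probability in [0, 1]
  const percent = 100 * p;     // e.g. a logprob of ln(0.9) displays as ~90%
  const odds = p / (1 - p);    // odds in favor of this token vs. all others
  return { percent, odds };
}

console.log(describeLogprob(Math.log(0.9)));
```

Comparing these values across two runs (e.g. with different temperatures) gives the per-token deltas the UI charts.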
- UI primitives live in `src/components`; follow PascalCase filenames and import Tailwind classes via semantic tokens defined in `tailwind.config.ts`.
- Shared logic belongs in `src/hooks` and `src/lib`; prefer discriminated unions or branded types for state.
- Add routed views under `src/pages` and register them in `src/main.tsx`.
- Extend the API in `server/index.ts`; update Zod schemas and client-side request contracts together to preserve type safety.
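As an illustration of the discriminated-union guidance above (the names `StreamState` and its variants are hypothetical, not from the codebase), a `status` tag lets the compiler narrow which fields exist in each branch:

```typescript
// A discriminated union for streaming state: the `status` tag lets the
// compiler prove which fields are available in each branch.
type StreamState =
  | { status: "idle" }
  | { status: "streaming"; tokens: string[] }
  | { status: "done"; tokens: string[]; finishReason: string }
  | { status: "error"; message: string };

function summarize(state: StreamState): string {
  switch (state.status) {
    case "idle":
      return "waiting";
    case "streaming":
      return `received ${state.tokens.length} tokens`;
    case "done":
      return `finished (${state.finishReason}) with ${state.tokens.length} tokens`;
    case "error":
      return `failed: ${state.message}`;
  }
}

console.log(summarize({ status: "streaming", tokens: ["Hel", "lo"] }));
```

Because every variant carries exactly the fields it needs, impossible states (e.g. an error message alongside a finish reason) cannot be represented.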
- Type safety: `npm run typecheck`
- Static analysis & formatting: `npm run lint`, `npm run lint:fix`, `npm run pretty`
- Manual smoke: `npm run preview` (after `npm run build`) and hit `/api/health`
- When introducing automated tests, colocate `*.test.ts(x)` files near their modules and wire them into Vitest.
- Production bundle: `npm run build` (or `npm run build:dev` for a debug bundle) – artifacts in `dist/`
- Local production preview: `npm run preview`
- `npm run docker:build` – builds `logprob-visualizer:demo`
- Ensure `.env.local` contains runtime variables, or pass them via `--env` flags
- `npm run docker` – runs the container (maps port `8000` to the container’s `80` by default)
`npm run publish` invokes `scripts/publish-azure.sh`.

Required:

- `APP_NAME` – Container App name

Optional:

- `AZ_SUBSCRIPTION`, `RESOURCE_GROUP`, `ENVIRONMENT`, `LOCATION`, `INGRESS`, `TARGET_PORT`, `ENV_ARGS`

Example:

```bash
APP_NAME=my-logprob-viewer RESOURCE_GROUP=rg-llm ENVIRONMENT=aca-env \
LOCATION=eastus TARGET_PORT=8000 npm run publish
```

If `ENV_ARGS` is omitted, the script converts `.env.local` entries into `--env-vars` pairs automatically.
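The fallback conversion lives in `scripts/publish-azure.sh`; as an illustrative TypeScript sketch only (the function name `envFileToArgs` and the exact filtering rules are assumptions), it amounts to collecting non-comment `KEY=VALUE` lines into `--env-vars` arguments:

```typescript
// Illustrative sketch of converting .env.local contents into
// Azure CLI `--env-vars` arguments; the real logic is a shell script.
function envFileToArgs(contents: string): string[] {
  const pairs = contents
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith("#") && line.includes("="));
  return pairs.length ? ["--env-vars", ...pairs] : [];
}

console.log(envFileToArgs("OPENAI_API_KEY=sk-test\n# comment\nPORT=8787\n"));
// → ["--env-vars", "OPENAI_API_KEY=sk-test", "PORT=8787"]
```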
- Logging: Fastify emits structured logs with request IDs derived from headers or generated UUIDs.
- Rate limiting and CORS are centralized in `server/index.ts`; modify them there to maintain consistent hardening.
- Streaming: The server writes NDJSON to keep latency low while the client progressively renders tokens.
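The request-ID behavior described above could look roughly like this (the header name `x-request-id` is an assumption; the actual header and helper live in `server/index.ts`):

```typescript
import { randomUUID } from "node:crypto";

// Reuse an inbound request ID when a header supplies one;
// otherwise mint a fresh UUID so every log line is correlatable.
function requestId(headers: Record<string, string | undefined>): string {
  return headers["x-request-id"] ?? randomUUID();
}

console.log(requestId({ "x-request-id": "abc-123" })); // reuses "abc-123"
console.log(requestId({})); // freshly generated UUID
```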
- `az account show` failing → run `az login` before publishing.
- 500 errors from `/api/complete/stream` → verify `OPENAI_API_KEY` and network reachability to the provider.
- Missing models in the UI → ensure each entry in `server/models.json` includes `id` and `name`.
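A small standalone check in the spirit of that last item (not the app's actual validation) can flag which `models.json` entries are missing `id` or `name` before they silently disappear from the UI:

```typescript
// Minimal shape of a models.json entry; both fields must be non-empty.
interface ModelEntry {
  id?: string;
  name?: string;
}

// Return the indexes of entries that would be dropped by the UI.
function invalidModels(models: ModelEntry[]): number[] {
  return models
    .map((entry, index) => (entry.id && entry.name ? -1 : index))
    .filter((index) => index >= 0);
}

console.log(invalidModels([{ id: "gpt-4o", name: "GPT-4o" }, { id: "broken" }])); // index 1 is invalid
```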
Consult AGENTS.md for coding standards, branching strategy, and review expectations before opening a pull request.