feat(csn): separate text-to-SQL LLM from narrator LLM + per-message UI picker + per-step api_usage logging
reckg7c1hQxMxVXV0
- Project
- LEBBA
- Build Model
- opus
- Review Model
- codex-5.5
- Runtime Model
- sonnet
- Rounds
- 1
- Work Item
- Open in Airtable
- Output
- Open PR
Time by Role
Orchestrator190.0s(13.2%)
Builder919.8s(63.7%)
Reviewer334.2s(23.1%)
Total: 1444.0s
Feature Waterfall (3 features)
F1 Backend: separate SQL LLM + per-step usage logging
0s - APPROVE
F2 Frontend: dual model picker + sql_model_id wiring
0s - APPROVE
F3 Surface sql_default_model_id on tenant-config + frontend defaults
0s - APPROVE
Builder
Reviewer
Agent Dispatches
Orchestrator5
Builder1
Reviewer3
Tokens by Role
Orchestrator45.8k(11.5%)
Builder116.2k(29.1%)
Reviewer237.9k(59.5%)
Total: 399.9k tokens
Token breakdown by type pending; only aggregate usage is available.
Code
+1,734
Lines Added
-1,850
Lines Removed
52
Files Changed
Source vs Test
Source532(30.7%)Test1,202(69.3%)
Coverage
Coverage: not reported
Features
Validation Contract
# Validation Contract — recPyXqkU9XZb8k0s feat(csn): separate text-to-SQL LLM from narrator LLM + per-message UI picker + per-step api_usage logging
Generated from: https://airtable.com/appqmGXa65GwGjJzU/tblAjO4JA5EcXkMjA/recPyXqkU9XZb8k0s
Generated at: 2026-05-31T09:14:44Z
## Assertions
1. [A] `lebba_core/config/dataclasses.py` exposes an optional `sql_model: str | None = None` field on the generation-provider dataclass; `tenants/csn/config.yaml` sets `sql_model: "anthropic/claude-haiku-4.5"` under `infrastructure.generation`.
2. [A] `QueryRequest` in `lebba_core/api/models.py` declares `sql_model_id: str | None = Field(None, ...)` next to the existing `model_id`.
3. [A] `SqlLLMBackend.generate_sql` accepts `model_override: str | None = None` in both `GeminiSqlBackend` and `OpenRouterSqlBackend`; when supplied, the override replaces `_get_model()` in the outbound call. Verified by unit tests in `tests/test_generation/test_sql_llm_openrouter.py` + new `tests/test_generation/test_sql_llm_gemini.py`.
4. [A] `generate_sql` returns a result object exposing `model` / `input_tokens` / `output_tokens` parsed from the OpenRouter `usage` block (Gemini: `usage_metadata`). Verified by mocked-response unit tests.
5. [A] `SqlRetrievalStrategy.search` accepts `sql_model_override: str | None = None`, forwards it to `sql_backend.generate_sql`, and stashes the usage block on `RetrievalResult.metadata["sql_usage"]`. Verified by a unit test using a fake `SqlLLMBackend`.
6. [A] A CSN `POST /api/v2/query` produces TWO `_record_usage` calls — one with `service="{provider}_sql_generation", operation="sql_translation"` and one with `service="{provider}_generation", operation="rag_query"`. Verified by an integration test that mocks the strategy + asserts the two distinct calls.
7. [A] `payload.sql_model_id` with an id not in `available_models` returns HTTP 400 with the existing model-validation error shape; a valid id is threaded into the SQL backend. Verified by API tests.
8. [A] When `payload.sql_model_id` is omitted, resolution falls back to `config.generation_provider.sql_model`, then to `config.generation_provider.model`. Verified by unit tests on the fallback chain.
9. [A] `frontend/src/lib/api.ts` query-request body type adds `sql_model_id?: string | null`; `frontend/src/hooks/useChat.ts` sends `sql_model_id` on the request body. Verified by source check + tsc.
10. [A] A new `useSelectedSqlModel` (or parametrized `useSelectedModel`) persists the SQL choice under localStorage key `lebba_sql_model_id:<tenant>` independently of `lebba_model_id:<tenant>`. Verified by a vitest unit test in `frontend/src/test/`.
11. [A] `frontend/src/components/ChatPanel.tsx` renders a second `<ModelTune>` labeled via `chat.sqlModel`, conditioned on `retrieval_strategy === 'sql_retrieval'`. Verified by an RTL test mounting ChatPanel under both conditions.
12. [A] Locale files `frontend/src/i18n/locales/chat.en.json` and `frontend/src/i18n/locales/chat.pt.json` carry `chat.sqlModel` and `chat.sqlModelAria` keys. Verified by file/content check. (Note: ticket originally cited `frontend/src/locales/*/common.json` but planner verified actual locale path is `frontend/src/i18n/locales/chat.*.json`.)
13. [A] All verification commands pass cleanly: `python -m ruff check lebba_core tests scripts tenants`, `python -m ruff format --check lebba_core tests scripts tenants`, `python -m pytest -q`, `python -m mypy lebba_core/ --ignore-missing-imports`, `cd frontend && npm run typecheck && npm run build`.
14. [R] With `python scripts/dev.py` + frontend dev server up, the CSN chat tune-strip shows TWO model pickers side-by-side. Narrator default matches the existing `is_default: true` entry (Claude Opus 4.7); SQL picker default matches `sql_model` YAML (Claude Haiku 4.5).
15. [R] Submitting a CSN query with narrator=Opus, SQL=Haiku writes TWO new `api_usage` rows correlated to the same request: one with `service='openrouter_sql_generation'` and `model='anthropic/claude-haiku-4.5'`, one with `service='openrouter_generation'` and `model='anthropic/claude-opus-4.7'`.
16. [R] Switching to a non-SQL tenant (`edut710`) hides the SQL picker — only the narrator picker is visible. Reloading the page restores both pickers' selections independently from localStorage.
## Coverage Rules
- `[A]` assertions MUST have a corresponding failing test (or grep/file check) in the dev's first commit.
- `[R]` assertions MUST be verified by the runtime validator against a running instance.
- The reviewer MUST reject any first commit that has no `[A]` verification per `[A]` assertion.Event Timeline (9 events)
| # | Role | Event Type | Summary | Verdict | Time |
|---|---|---|---|---|---|
| 1 | Orchestrator | contract_generated | Validation contract approved (13 [A] / 3 [R] assertions) | 5/31/2026, 9:14:44 AM | |
| 2 | Orchestrator | dispatch_started | Dispatching F1 (backend) + F2 (frontend) in parallel | 5/31/2026, 9:17:00 AM | |
| 3 | Reviewer | validator_verdict | F1 backend APPROVE — 8/8 [A] covered, clean SqlGenerationResult(str) backwards-compat, 2 non-blocking NITs | Approve | 5/31/2026, 9:38:30 AM |
| 4 | Reviewer | validator_verdict | F2 frontend APPROVE — 4/4 [A] covered, one notable NIT on SQL picker visible default | Approve | 5/31/2026, 9:38:30 AM |
| 5 | Orchestrator | completed | [merger] F1+F2 merged into mission (strategy A: FF to F2, then merge F1) | 5/31/2026, 9:53:00 AM | |
| 6 | Builder | dispatch_started | F3 dev (sql_default_model_id surfacing) completed | 5/31/2026, 10:09:00 AM | |
| 7 | Reviewer | validator_verdict | F3 APPROVE — 4/4 [A] covered, no findings beyond explanatory NIT on getattr defensive read | Approve | 5/31/2026, 10:11:00 AM |
| 8 | Orchestrator | pr_opened | PR #144 opened against staging; auto-merge (squash) enabled | 5/31/2026, 10:18:00 AM | |
| 9 | Orchestrator | completed | Run complete — PR #144 open with auto-merge enabled | 5/31/2026, 10:18:00 AM |