feat(csn): separate text-to-SQL LLM from narrator LLM + per-message UI picker + per-step api_usage logging

reckg7c1hQxMxVXV0

Approved

Project: LEBBA
Build Model: opus
Review Model: codex-5.5
Runtime Model: sonnet
Rounds: 1
Work Item: Open in Airtable
Output: Open PR

Time by Role

Orchestrator190.0s(13.2%)

Builder919.8s(63.7%)

Reviewer334.2s(23.1%)

Total: 1444.0s

Feature Waterfall (3 features)

F1 Backend: separate SQL LLM + per-step usage logging

0s - APPROVE

F2 Frontend: dual model picker + sql_model_id wiring

0s - APPROVE

F3 Surface sql_default_model_id on tenant-config + frontend defaults

0s - APPROVE

Builder

Reviewer

Agent Dispatches

Orchestrator5

Builder1

Reviewer3

Tokens by Role

Orchestrator45.8k(11.5%)

Builder116.2k(29.1%)

Reviewer237.9k(59.5%)

Total: 399.9k tokens

Token breakdown by type pending; only aggregate usage is available.

Code

+1,734

Lines Added

-1,850

Lines Removed

Files Changed

Source vs Test

Source532(30.7%)Test1,202(69.3%)

Coverage

Coverage: not reported

Features

Validation Contract

# Validation Contract — recPyXqkU9XZb8k0s feat(csn): separate text-to-SQL LLM from narrator LLM + per-message UI picker + per-step api_usage logging

Generated from: https://airtable.com/appqmGXa65GwGjJzU/tblAjO4JA5EcXkMjA/recPyXqkU9XZb8k0s
Generated at: 2026-05-31T09:14:44Z

## Assertions

1. [A] `lebba_core/config/dataclasses.py` exposes an optional `sql_model: str | None = None` field on the generation-provider dataclass; `tenants/csn/config.yaml` sets `sql_model: "anthropic/claude-haiku-4.5"` under `infrastructure.generation`.

2. [A] `QueryRequest` in `lebba_core/api/models.py` declares `sql_model_id: str | None = Field(None, ...)` next to the existing `model_id`.

3. [A] `SqlLLMBackend.generate_sql` accepts `model_override: str | None = None` in both `GeminiSqlBackend` and `OpenRouterSqlBackend`; when supplied, the override replaces `_get_model()` in the outbound call. Verified by unit tests in `tests/test_generation/test_sql_llm_openrouter.py` + new `tests/test_generation/test_sql_llm_gemini.py`.

4. [A] `generate_sql` returns a result object exposing `model` / `input_tokens` / `output_tokens` parsed from the OpenRouter `usage` block (Gemini: `usage_metadata`). Verified by mocked-response unit tests.

5. [A] `SqlRetrievalStrategy.search` accepts `sql_model_override: str | None = None`, forwards it to `sql_backend.generate_sql`, and stashes the usage block on `RetrievalResult.metadata["sql_usage"]`. Verified by a unit test using a fake `SqlLLMBackend`.

6. [A] A CSN `POST /api/v2/query` produces TWO `_record_usage` calls — one with `service="{provider}_sql_generation", operation="sql_translation"` and one with `service="{provider}_generation", operation="rag_query"`. Verified by an integration test that mocks the strategy + asserts the two distinct calls.

7. [A] `payload.sql_model_id` with an id not in `available_models` returns HTTP 400 with the existing model-validation error shape; a valid id is threaded into the SQL backend. Verified by API tests.

8. [A] When `payload.sql_model_id` is omitted, resolution falls back to `config.generation_provider.sql_model`, then to `config.generation_provider.model`. Verified by unit tests on the fallback chain.

9. [A] `frontend/src/lib/api.ts` query-request body type adds `sql_model_id?: string | null`; `frontend/src/hooks/useChat.ts` sends `sql_model_id` on the request body. Verified by source check + tsc.

10. [A] A new `useSelectedSqlModel` (or parametrized `useSelectedModel`) persists the SQL choice under localStorage key `lebba_sql_model_id:<tenant>` independently of `lebba_model_id:<tenant>`. Verified by a vitest unit test in `frontend/src/test/`.

11. [A] `frontend/src/components/ChatPanel.tsx` renders a second `<ModelTune>` labeled via `chat.sqlModel`, conditioned on `retrieval_strategy === 'sql_retrieval'`. Verified by an RTL test mounting ChatPanel under both conditions.

12. [A] Locale files `frontend/src/i18n/locales/chat.en.json` and `frontend/src/i18n/locales/chat.pt.json` carry `chat.sqlModel` and `chat.sqlModelAria` keys. Verified by file/content check. (Note: ticket originally cited `frontend/src/locales/*/common.json` but planner verified actual locale path is `frontend/src/i18n/locales/chat.*.json`.)

13. [A] All verification commands pass cleanly: `python -m ruff check lebba_core tests scripts tenants`, `python -m ruff format --check lebba_core tests scripts tenants`, `python -m pytest -q`, `python -m mypy lebba_core/ --ignore-missing-imports`, `cd frontend && npm run typecheck && npm run build`.

14. [R] With `python scripts/dev.py` + frontend dev server up, the CSN chat tune-strip shows TWO model pickers side-by-side. Narrator default matches the existing `is_default: true` entry (Claude Opus 4.7); SQL picker default matches `sql_model` YAML (Claude Haiku 4.5).

15. [R] Submitting a CSN query with narrator=Opus, SQL=Haiku writes TWO new `api_usage` rows correlated to the same request: one with `service='openrouter_sql_generation'` and `model='anthropic/claude-haiku-4.5'`, one with `service='openrouter_generation'` and `model='anthropic/claude-opus-4.7'`.

16. [R] Switching to a non-SQL tenant (`edut710`) hides the SQL picker — only the narrator picker is visible. Reloading the page restores both pickers' selections independently from localStorage.

## Coverage Rules

- `[A]` assertions MUST have a corresponding failing test (or grep/file check) in the dev's first commit.
- `[R]` assertions MUST be verified by the runtime validator against a running instance.
- The reviewer MUST reject any first commit that has no `[A]` verification per `[A]` assertion.

Event Timeline (9 events)

#	Role	Event Type	Summary	Verdict	Time
1	Orchestrator	contract_generated	Validation contract approved (13 [A] / 3 [R] assertions)		5/31/2026, 9:14:44 AM
2	Orchestrator	dispatch_started	Dispatching F1 (backend) + F2 (frontend) in parallel		5/31/2026, 9:17:00 AM
3	Reviewer	validator_verdict	F1 backend APPROVE — 8/8 [A] covered, clean SqlGenerationResult(str) backwards-compat, 2 non-blocking NITs	Approve	5/31/2026, 9:38:30 AM
4	Reviewer	validator_verdict	F2 frontend APPROVE — 4/4 [A] covered, one notable NIT on SQL picker visible default	Approve	5/31/2026, 9:38:30 AM
5	Orchestrator	completed	[merger] F1+F2 merged into mission (strategy A: FF to F2, then merge F1)		5/31/2026, 9:53:00 AM
6	Builder	dispatch_started	F3 dev (sql_default_model_id surfacing) completed		5/31/2026, 10:09:00 AM
7	Reviewer	validator_verdict	F3 APPROVE — 4/4 [A] covered, no findings beyond explanatory NIT on getattr defensive read	Approve	5/31/2026, 10:11:00 AM
8	Orchestrator	pr_opened	PR #144 opened against staging; auto-merge (squash) enabled		5/31/2026, 10:18:00 AM
9	Orchestrator	completed	Run complete — PR #144 open with auto-merge enabled		5/31/2026, 10:18:00 AM