D9 cleanup: delete legacy NLU/handlers/machine/replyTemplates + enable agent + prompt caching

After validating the agent end-to-end with DeepSeek, the legacy code is dead. 51 files changed (most of them deleted); the tool-calling agent is now the only engine.

Deleted (~3500 LOC):
- src/modules/3-turn-engine/nlu/ (router + 4 specialists + promptLoader + schemas + humanFallback + 6 default prompts): replaced by systemPrompt.js
- src/modules/3-turn-engine/stateHandlers/ (cart.js, cartHelpers.js, idle.js, shipping.js, utils.js, index.js): replaced by the agent's tools
- src/modules/3-turn-engine/stateHandlers.js (re-export shim)
- src/modules/3-turn-engine/openai.js (classic NLU v3 + jsonCompletion + llmRecommendWriter + llmPlanningRecommend): the agent creates its own OpenAI client with native tools
- src/modules/3-turn-engine/replyRewriter.js (LLM rewriting): the agent writes say directly, so no rewriting is needed
- src/modules/3-turn-engine/replyTemplates.js + test (variant rotation): the agent varies its wording naturally with tool_choice=required + temperature
- src/modules/3-turn-engine/recommendations.js (cross-sell + planning): the agent decides when to recommend via tool calls
- src/modules/3-turn-engine/machine/ (the full XState v5 machine + 19 tests): replaced by the pruned FSM in fsm.js + agent/runTurn.js
- src/modules/3-turn-engine/turnEngineV3.helpers.js, .units.js, .pendingSelection.js (legacy helpers)
- src/modules/0-ui/controllers/prompts.js, handlers/prompts.js, db/promptsRepo.js: NLU prompt admin (there are no editable prompts anymore)
- public/components/prompts-crud.js + nav entry in ops-shell

turnEngineV3.js is reduced to a thin wrapper that exports runTurnV3 (alias of runTurnAgent) + safeNextState (re-export from fsm.js). It keeps the public signature so pipeline.js does not have to change.

Activated:
- AGENT_MAX_TOOL_CALLS=10 and AGENT_TURN_TIMEOUT_MS=25000 are the only flags.
  Removed flags: USE_MODULAR_NLU, USE_XSTATE, XSTATE_SHADOW, XSTATE_SETTLE_MS, REPLY_REWRITER, REPLY_REWRITER_TIMEOUT_MS, TURN_ENGINE, AGENT_TURN_ENGINE, AGENT_TURN_ENGINE_SHADOW (the agent is the default).

DeepSeek prompt caching:
- systemPrompt.js: used to be a function with storeName interpolated; it is now export const SYSTEM_PROMPT (100% static). storeName is passed in the user message via working_memory.store.name. Any change to the system prompt invalidates the cache, which is why it is strictly static.
- runTurn.js: captures usage.prompt_cache_hit_tokens (DeepSeek) or prompt_tokens_details.cached_tokens (OpenAI-compatible) and adds it to the metrics.
- /api/metrics/agent now reports prompt_tokens_total, completion_tokens_total, prompt_cache_hit_tokens, cache_hit_ratio.
- 3-turn smoke test: cache_hit_ratio = 0.72 (17664 cached / 24546 total prompt tokens). Direct cost saving: ~$0.02/M cached vs $0.27/M uncached on DeepSeek.

Tests: 148/148 (we lost 90 tests from the legacy XState/replyTemplates code that no longer apply). E2E sim flow confirmed: hola → agent responds, multi-turn with a warm cache.

If we later need to go back to the legacy engine: git revert this commit (c c9c69cf8 is the last green state with both engines).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
58
CLAUDE.md
@@ -47,36 +47,51 @@ The DB schema retains `tenant_id` columns (it was originally multi-tenant) but t
 ### Request flow

 ```
-WhatsApp → Evolution API webhook → /webhook/evolution
+WhatsApp → Evolution API webhook → /webhook/evolution (or /sim/send)
 ↓
 1-intake: route & normalize message
 ↓
-3-turn-engine: NLU → FSM → state handler
+2-identity/pipeline.processMessage (idempotency, history, side effects)
+↓
+3-turn-engine/agent: tool-calling LLM loop
 ↓
 Response persisted to DB + sent back via Evolution API
 ```

+### Turn engine: tool-calling agent (DeepSeek)
+
+`src/modules/3-turn-engine/agent/` is the only engine. Each turn builds a **WorkingMemory** (cart, pending, last_shown_options, store, truncated history, customer_profile, preparsed quantity) and passes it to the LLM as the user message. The LLM decides which tools to call:
+
+- `search_catalog`, `add_to_cart`, `set_quantity`, `select_candidate`, `remove_from_cart`
+- `set_shipping`, `set_address`, `confirm_order`
+- `pause`, `escalate_to_human`
+- `say` (always last: it is the reply to the user)
+
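The turn loop described above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the code in `agent/runTurn.js`: `runTurnSketch`, the injected `callModel` function, the `{ name, args }` call shape, and the `fallback` flag are all assumptions made for the example. Only the tool names and the "`say` comes last" rule come from the text.

```javascript
// Hypothetical sketch of a tool-calling turn loop.
// callModel stands in for the LLM request and returns one { name, args } call.
function runTurnSketch(callModel, handlers, workingMemory, maxToolCalls = 10) {
  const trace = [];
  for (let i = 0; i < maxToolCalls; i++) {
    const call = callModel(workingMemory, trace);
    if (call.name === "say") {
      // `say` closes the turn: its text is the reply sent to the user.
      return { reply: call.args.text, trace, fallback: false };
    }
    const handler = handlers[call.name];
    // Unknown tools produce an error result instead of throwing,
    // so the model can see the problem on its next step.
    const result = handler
      ? handler(call.args, workingMemory)
      : { error: `unknown tool: ${call.name}` };
    trace.push({ name: call.name, args: call.args, result });
  }
  // Cap reached without a `say`: report a fallback instead of looping forever.
  return { reply: null, trace, fallback: true };
}
```

Passing the model as a function keeps the loop testable without network access; a real implementation would be asynchronous and would also enforce the turn timeout.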
+The system prompt is **static** (the `SYSTEM_PROMPT` const in `agent/systemPrompt.js`) so that DeepSeek's automatic prefix caching can reuse it. Typical cache hit ratio is ≥70% after 2 turns. The quantity parser (`agent/quantityParser.js`) preprocesses the text, and its output is passed as `working_memory.preparsed` (fractions, "media docena" (half a dozen), "cuarto kilo" (a quarter kilo), etc.).
+
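To illustrate why a static system prompt helps prefix caching, here is a small sketch of how the messages for one turn might be assembled. The prompt text, the `buildMessages` name, and the JSON envelope with a `text` field are assumptions for illustration; only the `SYSTEM_PROMPT` const and the idea of carrying the store name in `working_memory.store.name` come from the text.

```javascript
// Assumed stand-in for the constant exported by agent/systemPrompt.js.
const SYSTEM_PROMPT =
  "You are the store's ordering assistant. Use the tools and finish with say.";

// Hypothetical helper: builds the chat messages for one turn.
function buildMessages(workingMemory, userText) {
  return [
    // Byte-identical on every request, so the provider can reuse the cached prefix.
    { role: "system", content: SYSTEM_PROMPT },
    // Everything that varies per store or per turn travels in the user message.
    {
      role: "user",
      content: JSON.stringify({ working_memory: workingMemory, text: userText }),
    },
  ];
}
```

With this shape, two stores with different names still send the same first message, which is the property the cache depends on.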
+The FSM (`fsm.js`) remains as a guardrail: states `IDLE / CART / SHIPPING / PAUSED / AWAITING_HUMAN` with validated transitions. PAUSED has a 7-day TTL (the cart is preserved for "después te digo", i.e. "I'll tell you later").
+
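A guardrail of this kind could look like the sketch below. The transition table is an assumption made up for the example (the text only names the states), and `safeNextStateSketch`, `pausedUntil`, and `isPauseExpired` are hypothetical names, not the functions exported by `fsm.js`.

```javascript
// Assumed transition table: the real one in fsm.js may differ.
const TRANSITIONS = {
  IDLE: ["IDLE", "CART", "PAUSED", "AWAITING_HUMAN"],
  CART: ["CART", "SHIPPING", "IDLE", "PAUSED", "AWAITING_HUMAN"],
  SHIPPING: ["SHIPPING", "CART", "IDLE", "PAUSED", "AWAITING_HUMAN"],
  PAUSED: ["PAUSED", "IDLE", "CART", "SHIPPING", "AWAITING_HUMAN"],
  AWAITING_HUMAN: ["AWAITING_HUMAN", "IDLE"],
};

// 7-day TTL for PAUSED, in milliseconds.
const PAUSE_TTL_MS = 7 * 24 * 60 * 60 * 1000;

// Returns the proposed state if the transition is allowed, else stays put.
function safeNextStateSketch(current, proposed) {
  const allowed = TRANSITIONS[current] || [];
  return allowed.includes(proposed) ? proposed : current;
}

// Timestamp to store as paused_until when entering PAUSED.
function pausedUntil(nowMs) {
  return nowMs + PAUSE_TTL_MS;
}

// True once the pause has outlived its TTL.
function isPauseExpired(pausedUntilMs, nowMs) {
  return nowMs >= pausedUntilMs;
}
```

Falling back to the current state on an invalid proposal means a bad tool call from the model cannot move the conversation into an unreachable state.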
 ### Module structure (numbered layers)

-- **`src/modules/0-UI/`** — Admin dashboard: REST controllers for products, conversations, settings, prompts, takeovers, recommendations, aliases. Each controller has a `db/` sub-layer for persistence.
+- **`src/modules/0-UI/`** — Admin dashboard: REST controllers for products, conversations, settings, takeovers, recommendations, aliases.

-- **`src/modules/1-intake/`** — Message ingestion. Routes: `/simulator` (dev UI), `/webhook/evolution` (WhatsApp). Normalizes incoming messages before passing to turn engine.
+- **`src/modules/1-intake/`** — Message ingestion. Routes: `/simulator` (dev UI), `/webhook/evolution` (WhatsApp).

-- **`src/modules/2-identity/`** — Tenant and user management. Maps WhatsApp numbers to WooCommerce customers. Stores encrypted WooCommerce credentials per tenant in `tenant_ecommerce_config`. Routes WooCommerce webhooks.
+- **`src/modules/2-identity/`** — User mapping (WhatsApp ↔ WooCommerce customer), encrypted WooCommerce credentials, pipeline orchestrator.

-- **`src/modules/3-turn-engine/`** — Core logic. NLU classifies intents; FSM transitions states (`IDLE → CART → SHIPPING → PAYMENT → WAITING_WEBHOOKS`). Two NLU versions controlled by `USE_MODULAR_NLU` env flag. Two turn engine versions controlled by `TURN_ENGINE` env flag. State handlers map to FSM states.
+- **`src/modules/3-turn-engine/`** — Tool-calling agent (`agent/`), FSM (`fsm.js`), order model (`orderModel.js`), catalog retrieval (`catalogRetrieval.js`), store context (`storeContext.js`).

-- **`src/modules/4-woo-orders/`** — WooCommerce order sync. Fetches and caches customer order history for conversation context.
+- **`src/modules/4-woo-orders/`** — WooCommerce order sync (read side). The bot creates new orders via `wooOrders.createOrder` from `pipeline.js` when it emits the `create_order` action.

-- **`src/modules/shared/`** — DB pool (PostgreSQL via `pg`), SSE for real-time admin UI updates, WooSnapshot (product catalog cache), debug utilities.
+- **`src/modules/shared/`** — DB pool, SSE, WooSnapshot, tenant resolver (`getTenantId()`), debug.

 ### Key integrations

 | System | Purpose | Config |
 |--------|---------|--------|
-| OpenAI | NLU intent classification & response generation | `OPENAI_API_KEY`, `OPENAI_MODEL` |
-| Evolution API | WhatsApp send/receive | `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE_NAME`, `EVOLUTION_SEND_ENABLED` |
-| WooCommerce REST API | Products, orders, customers | `WOO_*` env vars or per-tenant in DB |
+| LLM (DeepSeek) | Tool-calling agent (the only engine) | `OPENAI_API_KEY`, `OPENAI_BASE_URL=https://api.deepseek.com/v1`, `OPENAI_MODEL=deepseek-chat` |
+| Evolution API | WhatsApp send/receive | `EVOLUTION_*`, `EVOLUTION_SEND_ENABLED` |
+| WooCommerce REST API | Products, orders, customers | `WOO_BASE_URL`, `WOO_CONSUMER_KEY`, `WOO_CONSUMER_SECRET` |
 | PostgreSQL | Primary database | `DATABASE_URL` |

 ### Database
@@ -84,18 +99,23 @@ WhatsApp → Evolution API webhook → /webhook/evolution
 Migrations live in `db/migrations/` as timestamped SQL files managed by `dbmate`. Key tables:
 - `tenants`, `tenant_config`, `tenant_settings`, `tenant_ecommerce_config`, `tenant_channels`
 - `wa_identity_map` — WhatsApp ↔ WooCommerce customer mapping
-- `wa_conversation_state` — FSM state + context per conversation
-- `wa_messages` — Message history
-- `woo_products_snapshot` — Cached product catalog
-- `prompt_templates` — Versioned LLM prompts
+- `wa_conversation_state` — FSM state + context (cart, pending, last_shown_options, paused_until) as JSONB
+- `wa_messages` — Message history (idempotency by message_id)
+- `woo_products_snapshot` — Cached product catalog (with pg_trgm indexes on aliases)
+- `product_aliases`, `alias_product_mappings` — fuzzy alias resolution
+- `woo_orders_cache` + `woo_order_items` — order sync for customer_profile / stats
 - `human_takeovers`, `audit_log`, `conversation_runs`

 ### Feature flags (env vars)

-- `TURN_ENGINE=v1|v2` — Which turn engine version to use
-- `USE_MODULAR_NLU=1` — Use modular NLU (prompt templates from DB) vs. v3 hardcoded
-- `EVOLUTION_SEND_ENABLED=1` — Actually send messages to WhatsApp (disable in dev/test)
-- `DEBUG_PERF`, `DEBUG_WOO_HTTP`, `DEBUG_LLM`, `DEBUG_EVOLUTION` — Granular debug logging
+- `AGENT_MAX_TOOL_CALLS=10` — cap on tool calls per turn
+- `AGENT_TURN_TIMEOUT_MS=25000` — total timeout for the turn
+- `EVOLUTION_SEND_ENABLED=1` — send to real WhatsApp (off in dev)
+- `DEBUG_PERF`, `DEBUG_WOO_HTTP`, `DEBUG_LLM`, `DEBUG_EVOLUTION` — granular debug logs

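As a rough illustration of how the flags above might be read, the sketch below parses them from an environment object with the documented defaults. The `readAgentFlags` name, the returned field names, and the choice to fall back to the default on invalid input are assumptions, not behaviour confirmed by the repository.

```javascript
// Hypothetical helper: reads the agent flags from an env-like object.
function readAgentFlags(env) {
  // Accept only positive integers; anything else falls back to the default.
  const toPositiveInt = (raw, fallback) => {
    const n = Number.parseInt(raw, 10);
    return Number.isInteger(n) && n > 0 ? n : fallback;
  };
  return {
    maxToolCalls: toPositiveInt(env.AGENT_MAX_TOOL_CALLS, 10),
    turnTimeoutMs: toPositiveInt(env.AGENT_TURN_TIMEOUT_MS, 25000),
    // Sending to WhatsApp is opt-in: only the literal "1" enables it.
    sendEnabled: env.EVOLUTION_SEND_ENABLED === "1",
  };
}
```

In a Node process this would typically be called once at startup as `readAgentFlags(process.env)`.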
+### Metrics
+
+- `GET /api/metrics/agent` — turns, avg tool calls, fallback rate, escalations, **cache_hit_ratio** (DeepSeek prompt caching)
+
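The `cache_hit_ratio` figure can be derived from the `usage` object of each completion. The sketch below is an assumption about how such an accumulator might look, not the code behind `/api/metrics/agent`: `cachedPromptTokens` and `createAgentMetrics` are hypothetical names, while the `usage` field names are the ones quoted in the commit message.

```javascript
// Reads the cached-token count from either usage shape.
function cachedPromptTokens(usage) {
  if (!usage) return 0;
  // DeepSeek reports cache hits at the top level of usage.
  if (typeof usage.prompt_cache_hit_tokens === "number") {
    return usage.prompt_cache_hit_tokens;
  }
  // OpenAI-compatible responses nest them under prompt_tokens_details.
  const details = usage.prompt_tokens_details;
  return details && typeof details.cached_tokens === "number"
    ? details.cached_tokens
    : 0;
}

// Hypothetical in-memory accumulator for the agent metrics.
function createAgentMetrics() {
  const totals = {
    prompt_tokens_total: 0,
    completion_tokens_total: 0,
    prompt_cache_hit_tokens: 0,
  };
  return {
    record(usage) {
      if (!usage) return;
      totals.prompt_tokens_total += usage.prompt_tokens || 0;
      totals.completion_tokens_total += usage.completion_tokens || 0;
      totals.prompt_cache_hit_tokens += cachedPromptTokens(usage);
    },
    snapshot() {
      const ratio =
        totals.prompt_tokens_total > 0
          ? totals.prompt_cache_hit_tokens / totals.prompt_tokens_total
          : 0;
      // Rounded to two decimals for reporting.
      return { ...totals, cache_hit_ratio: Number(ratio.toFixed(2)) };
    },
  };
}
```

Fed the totals from the commit's smoke test (17664 cached out of 24546 prompt tokens), `snapshot()` reports a `cache_hit_ratio` of 0.72.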
 ### Local development
