Semantic Query Deep Dive

Text -> Intent -> SQLDB semantic model (ce_semantic_*)Read-only + validation + retry

One-line mental model

Semantic runtime is not direct text-to-SQL.
It is Question -> CanonicalIntent -> constrained SQL -> execution -> answer.

Pipeline Architecture

Semantic Runtime

Planner -> Interpret -> Query -> Postgres -> Answer

Minimal anchors

Chat + Trace

What to look at	Why it matters
Tool call order + stage transitions	Tells you where behavior diverged

Canonical Intent + SQL

What to inspect	Why it matters
Intent JSON and compiled SQL	Explains exactly why this SQL was produced

Config + Table Dependencies

What to inspect	Why it matters
`ce_semantic_*` rows used in this request	Most wrong outputs are metadata issues

Prompt Suite

What to inspect	Why it matters
Interpret prompt + SQL prompt + retry prompt	Shows what the LLM was allowed to do

Running example

User question

TEXT

find all requests where notes text says cancelled due to duplicate

User

find all requests where notes text says cancelled due to duplicate

Runtime tables used in this trace (sample rows)

`ce_semantic_concept` (10 sample rows)

ce_semantic_concept

ID	CONCEPT_CODE	DESCRIPTION
1	DisconnectRequest	Customer disconnect request
2	DisconnectOrder	Downstream disconnect order
3	Account	Customer account
4	SubAccount	Sub-account
5	Connection	Physical connection
6	Billing	Billing records
7	Payment	Payment records
8	SupportTicket	Support cases
9	Installation	Install request
10	ServicePlan	Service plan

`ce_semantic_synonym` (10 sample rows)

ce_semantic_synonym

ID	SYNONYM	MAPS_TO
1	request	DisconnectRequest
2	requests	DisconnectRequest
3	notes	notesText
4	note	notesText
5	remarks	notesText
6	comments	notesText
7	cancelled	CANCELLED_SIGNAL
8	duplicate	DUPLICATE_SIGNAL
9	last day	LAST_24_HOURS
10	today	TODAY

`ce_semantic_query_class` (10 sample rows)

ce_semantic_query_class

ID	QUERY_CLASS	DESCRIPTION
1	LIST	list results
2	FAILURE_ANALYSIS	failure analytics
3	COUNT	count
4	TREND	trend
5	SUMMARY	summary
6	DETAIL	detail
7	JOIN_VIEW	join-heavy lookup
8	ALERT	alert lookup
9	STATUS	status lookup
10	RECENT	recent lookup

`ce_semantic_mapping` (10 sample rows)

ce_semantic_mapping

ID	ENTITY	FIELD	TABLE	COLUMN
1	DisconnectRequest	requestId	zp_disco_request	zp_request_id
2	DisconnectRequest	status	zp_disco_request	status
3	DisconnectRequest	createdAt	zp_disco_request	created_ts
4	DisconnectRequest	subAccountId	zp_disco_request	sub_account_id
5	DisconnectRequest	city	zp_disco_request	city
6	DisconnectTransData	requestId	zp_disco_trans_data	zp_request_id
7	DisconnectTransData	notesText	zp_disco_trans_data	notes_text
8	DisconnectTransData	action	zp_disco_trans_data	action
9	DisconnectTransData	createdAt	zp_disco_trans_data	created_ts
10	DisconnectOrder	orderId	zp_disconnect_order	order_id

`ce_semantic_query_failures` (10 sample rows)

ce_semantic_query_failures

ID	BAD_SQL_PATTERN	CORRECTION_HINT	ROOT_CAUSE
1	r.notes_text	use t.notes_text	WRONG_TABLE_COLUMN
2	missing trans join	join on zp_request_id	MISSING_JOIN
3	SELECT without DISTINCT	use DISTINCT for request-level results	DUPLICATE_ROWS
4	wrong alias for trans table	alias trans table as t consistently	ALIAS_MISMATCH
5	unsafe write keyword	read-only select only	DENY_OPERATION
6	wrong timestamp column	use mapped created_ts	WRONG_TIME_COLUMN
7	missing limit	apply semantic max limit	LIMIT_POLICY
8	invalid field spelling	use mapped field names	FIELD_MAPPING_ERROR
9	bad join direction	follow relationship path	JOIN_PATH_ERROR
10	literal concatenation	use named params	PARAM_BINDING_POLICY

Step-by-step deep trace (who does what)

Step 1: question enters runtime

Java entry

JAVA

// Java does this:
String question = request.getQuestion();

// LLM not called yet.
// DB not read yet.

Current state

Java state	LLM state	DB state
question is present	not called	no read yet

Step 2: planner chooses tool sequence

Planner outcome

JSON

{
  "mcpAction": "CALL_TOOL",
  "toolOrder": [
    "db.semantic.interpret",
    "db.semantic.query",
    "postgres.query"
  ]
}

Who did what

Component	Did	Concluded
Java (McpPlanner)	loaded tool metadata	semantic chain should run
LLM	not deciding SQL here	none
DB	read planner/tool config	available tools + order

Step 3: Java loads semantic metadata for interpret stage

Java metadata load

JAVA

// Java does this:
var concepts = conceptRepo.findAllEnabled();
var synonyms = synonymRepo.findAllEnabled();
var queryClasses = queryClassRepo.findAllEnabled();
var mappings = mappingRepo.findAllEnabled();

// LLM still not called.
// DB returns rows from ce_semantic_* tables.

Tables read now

Table	Used for
ce_semantic_concept	allowed concept vocabulary
ce_semantic_synonym	language normalization hints
ce_semantic_query_class	query shape options
ce_semantic_mapping	allowed semantic fields

Step 4: Java builds interpret prompt

Interpret prompt package

TEXT

System prompt contains:
- Allowed entities from ce_semantic_concept
- Allowed query classes from ce_semantic_query_class
- Field vocabulary from ce_semantic_mapping
- Synonym hints from ce_semantic_synonym

User prompt contains:
find all requests where notes text says cancelled due to duplicate

Who did what

Component	Did	Concluded
Java	assembled strict prompt + output schema	interpret call is ready
LLM	not run yet	none
DB	already provided metadata rows	constraints available

Step 5: LLM interpret call happens

Interpret call

JAVA

// Java does this call.
var llmOutput = llmClient.generateJsonStrict(systemPrompt, userPrompt, interpretSchema);

// LLM does this inside the call:
// - chooses entity
// - chooses queryClass
// - chooses filters
// Java does NOT manually parse words into business meaning here.

LLM interpret output

JSON

{
  "entity": "DisconnectRequest",
  "queryClass": "LIST",
  "filters": [
    {
      "field": "notesText",
      "op": "CONTAINS",
      "value": "cancelled due to duplicate"
    }
  ],
  "confidence": 0.91,
  "needsClarification": false
}

db.semantic.interpret

intent: SEMANTIC_QUERYstate: SEMANTIC_INTERPRET

Interpret completed. Canonical meaning selected from allowed model terms.

Step 6: Java validates interpret output

Validation

JAVA

// Java validates that LLM output is legal against DB metadata.
boolean entityOk = conceptRepo.existsByConceptCode(intent.getEntity());
boolean classOk = queryClassRepo.existsByQueryClass(intent.getQueryClass());
boolean fieldsOk = intent.getFilters().stream()
.allMatch(f -> mappingRepo.existsByFieldCode(f.getField()));

// If invalid -> reject / clarify.

Current state

Java state	LLM state	DB state
canonicalIntent accepted	completed interpret step	validated entity/class/field existence

Step 7: CanonicalIntent becomes official runtime contract

CanonicalIntent

JSON

{
  "entity": "DisconnectRequest",
  "queryClass": "LIST",
  "filters": [
    {
      "field": "notesText",
      "op": "CONTAINS",
      "value": "cancelled due to duplicate"
    }
  ],
  "sort": [
    {
      "field": "createdAt",
      "dir": "DESC"
    }
  ],
  "limit": 100
}

Important

At this point, runtime has semantic meaning only. SQL is still not generated.

Step 8: Java loads query-stage metadata

Query-stage metadata load

JAVA

// Java does this before SQL generation:
var mappings = mappingRepo.findAllEnabled();
var failures = failureRepo.findRecent();

// DB does this:
// - returns field -> table.column mappings
// - returns prior SQL failure/correction memory

Tables read now

Table	Contribution
ce_semantic_mapping	physical column resolution + join feasibility
ce_semantic_query_failures	retry guidance from past mistakes
ce_semantic_relationship	join-path preference when relevant

Step 9: Java builds SQL prompt from CanonicalIntent + metadata

SQL prompt package

TEXT

Includes:
- CanonicalIntent JSON (entity, queryClass, filters)
- mapping rows (field -> table.column)
- failure hints (what previously failed)
- read-only requirement
- strict SQL JSON output schema

Who decided what so far

Decision	Decider
Meaning of user text	LLM in interpret stage
Allowed fields/tables	DB metadata + Java validation
SQL not yet final	pending query-stage LLM call

Step 10: LLM query call generates SQL candidate

SQL generation call

JAVA

// Java does this call:
var sqlOutput = llmClient.generateJsonStrict(sqlSystemPrompt, sqlUserPrompt, sqlSchema);

// LLM does this inside:
// - drafts SQL from CanonicalIntent
// - chooses joins from mapping context
// - returns sql + params + flags

SQL candidate

SQL

select
  DISTINCT r.zp_request_id, r.status,
  r.created_ts
from zp_disco_request r
  JOIN zp_disco_trans_data t
    on r.zp_request_id = t.zp_request_id
where t.notes_text ILIKE :notesText
order by r.created_ts DESC
limit :__limit; -- params -- :notesText = "%cancelled due to duplicate%" -- :__limit = 100

db.semantic.query

intent: SEMANTIC_QUERYstate: SEMANTIC_QUERY

SQL candidate compiled using CanonicalIntent + mappings + failure memory.

Step 11: Java validates SQL candidate

Validation + guardrail

JAVA

// Java safety checks before execution:
sqlGuardrail.assertReadOnly(sqlOutput.getSql());
sqlValidator.validateAgainstMappings(sqlOutput.getSql(), mappings);

// If validation fails:
// Java triggers retry with corrective feedback.

Validation responsibility

Check	Owner
Read-only enforcement	Java guardrail
Column/table legality vs mapping	Java validator + ce_semantic_mapping
Retry decision	Java orchestration

Step 12: Postgres executes named-parameter SQL

Execution

JAVA

// Java executes SQL against Postgres using named params.
var rows = namedParameterJdbcTemplate.queryForList(sqlOutput.getSql(), sqlOutput.getParams());

Current state

Java state	LLM state	DB state
execution complete	not active in this step	business rows returned

Step 13: final answer synthesis

Final Response

intent: SEMANTIC_QUERYstate: COMPLETED

Found 2 requests whose notes text contains cancelled due to duplicate.

Why this SQL was generated (exact reason chain)

Reason chain

Signal	Decided by	SQL effect
`entity = DisconnectRequest`	Interpret-stage LLM	`FROM zp_disco_request r`
`field = notesText`	Interpret-stage LLM	need mapped notes column
`notesText -> zp_disco_trans_data.notes_text`	ce_semantic_mapping	`WHERE t.notes_text ...`
shared request key mapping	ce_semantic_mapping	`JOIN ... ON r.zp_request_id = t.zp_request_id`
duplicate risk memory	ce_semantic_query_failures	`SELECT DISTINCT`
read-only policy	Java guardrail + deny rules	no update/delete SQL allowed

If output is wrong: who decided it

Root-cause map

Observed issue	Primary decider	First place to debug
Wrong entity	Interpret-stage LLM	interpret prompt + `ce_semantic_concept`/`ce_semantic_synonym`
Wrong field in intent	Interpret-stage LLM	intent JSON + `ce_semantic_mapping` coverage
Missing join	Query-stage LLM	SQL prompt + mapping availability
Invalid column blocked	Java validator (correctly)	mapping completeness
Unexpected empty rows	Business data/filter	executed SQL + runtime params
Clarification asked unexpectedly	Interpret stage confidence logic	ambiguity options + intent confidence

Prompt suite (concise)

Interpret prompt shape

TEXT

Goal: return CanonicalIntent JSON only.
Inputs: question + allowed entities/classes/fields/synonyms.
Constraint: choose only from allowed semantic model terms.
Output: entity, queryClass, filters, confidence, needsClarification.

SQL prompt shape

TEXT

Goal: compile read-only SQL from CanonicalIntent.
Inputs: canonicalIntent + mapping + relationship context + failure hints.
Constraint: named params only, read-only only, mapped columns only.
Output: sql, params, needsClarification, unsupported.

Retry prompt shape

TEXT

Goal: fix previous invalid SQL.
Inputs: previous SQL + exact validation error + allowed mapping.
Constraint: preserve intent meaning while fixing legality.
Output: corrected sql + params.

Drill set (clean, unique, no repetition)

Drill 1: clarification path

User

show status for request R1001

Semantic Agent

intent: SEMANTIC_QUERYstate: CLARIFICATION

Do you mean request status or downstream order status?

Drill 2: count path

User

count failed requests in last 24 hours

Semantic Agent

intent: SEMANTIC_QUERYstate: SEMANTIC_QUERY

Resolved queryClass=COUNT and compiled aggregate SQL.

Drill 3: unsupported write intent

User

delete duplicate requests

Semantic Agent

intent: SEMANTIC_QUERYstate: BLOCKED

Write operation rejected. Semantic runtime is read-only.

Drill 4: retry recovery

User

find notes containing duplicate for recent requests

Semantic Agent

intent: SEMANTIC_QUERYstate: RETRY

Attempt 1 failed validation; attempt 2 succeeded after failure-memory hint.

Final checklist

Read this page and you should now be able to

Capability	Outcome
Trace one question end-to-end	Know exactly where meaning and SQL were decided
Debug wrong output quickly	Jump to the right stage/table without guesswork
Explain ownership clearly	Java orchestrates, LLM interprets/compiles, DB constrains
Justify final SQL	Tie each SQL clause to intent + mapping + guardrails

Lock this model

Java orchestrates. LLM interprets and compiles. DB semantic tables constrain and validate. Postgres executes. The answer comes from rows, not guesswork.

Pipeline Architecture​

Minimal anchors​

Chat + Trace

Canonical Intent + SQL

Config + Table Dependencies

Prompt Suite

Running example​

Runtime tables used in this trace (sample rows)​

ce_semantic_concept (10 sample rows)​

ce_semantic_concept

ce_semantic_synonym (10 sample rows)​

ce_semantic_synonym

ce_semantic_query_class (10 sample rows)​

ce_semantic_query_class

ce_semantic_mapping (10 sample rows)​

ce_semantic_mapping

ce_semantic_query_failures (10 sample rows)​

ce_semantic_query_failures

Step-by-step deep trace (who does what)​

Step 1: question enters runtime​

Current state

Step 2: planner chooses tool sequence​

Who did what

Step 3: Java loads semantic metadata for interpret stage​

Tables read now

Step 4: Java builds interpret prompt​

Who did what

Step 5: LLM interpret call happens​

Step 6: Java validates interpret output​

Current state

Step 7: CanonicalIntent becomes official runtime contract​

Step 8: Java loads query-stage metadata​

Tables read now

Step 9: Java builds SQL prompt from CanonicalIntent + metadata​

Who decided what so far

Step 10: LLM query call generates SQL candidate​

Step 11: Java validates SQL candidate​

Validation responsibility

Step 12: Postgres executes named-parameter SQL​

Current state

Step 13: final answer synthesis​

Why this SQL was generated (exact reason chain)​

Reason chain

If output is wrong: who decided it​

Root-cause map

Prompt suite (concise)​

Drill set (clean, unique, no repetition)​

Drill 1: clarification path​

Drill 2: count path​

Drill 3: unsupported write intent​

Drill 4: retry recovery​

Final checklist​

Read this page and you should now be able to

Quick links​

Pipeline Architecture

Minimal anchors

Running example

Runtime tables used in this trace (sample rows)

`ce_semantic_concept` (10 sample rows)

`ce_semantic_synonym` (10 sample rows)

`ce_semantic_query_class` (10 sample rows)

`ce_semantic_mapping` (10 sample rows)

`ce_semantic_query_failures` (10 sample rows)

Step-by-step deep trace (who does what)

Step 1: question enters runtime

Step 2: planner chooses tool sequence

Step 3: Java loads semantic metadata for interpret stage

Step 4: Java builds interpret prompt

Step 5: LLM interpret call happens

Step 6: Java validates interpret output

Step 7: CanonicalIntent becomes official runtime contract

Step 8: Java loads query-stage metadata

Step 9: Java builds SQL prompt from CanonicalIntent + metadata

Step 10: LLM query call generates SQL candidate

Step 11: Java validates SQL candidate

Step 12: Postgres executes named-parameter SQL

Step 13: final answer synthesis

Why this SQL was generated (exact reason chain)

If output is wrong: who decided it

Prompt suite (concise)

Drill set (clean, unique, no repetition)

Drill 1: clarification path

Drill 2: count path

Drill 3: unsupported write intent

Drill 4: retry recovery

Final checklist

Quick links