
Semantic Query AI Tool

Purpose

Use this page to generate high-quality semantic seed SQL (INSERT statements) for ConvEngine by giving an AI agent your real business schema metadata.

Target semantic tables

The AI agent should generate inserts for these tables:

ce_semantic_join_hint
ce_semantic_value_pattern
ce_semantic_concept
ce_semantic_synonym
ce_semantic_concept_embedding
ce_semantic_mapping
ce_semantic_join_path
ce_semantic_query_class
ce_semantic_ambiguity_option
ce_semantic_query_failures
ce_semantic_entity
ce_semantic_relationship

Agent instruction file (semantic_query_agent.md)

Use this markdown as your AI agent instruction file. Paste your extracted schema metadata (from the SQL queries below) into the placeholders.

semantic_query_agent.md
# ConvEngine Semantic SQL Seed Generator

You are generating SQL `INSERT` statements for ConvEngine semantic runtime tables.

## Goal
Generate deterministic, production-safe seed inserts for:
- ce_semantic_join_hint
- ce_semantic_value_pattern
- ce_semantic_concept
- ce_semantic_synonym
- ce_semantic_concept_embedding
- ce_semantic_mapping
- ce_semantic_join_path
- ce_semantic_query_class
- ce_semantic_ambiguity_option
- ce_semantic_query_failures
- ce_semantic_entity
- ce_semantic_relationship

## Business context input
You will receive:
1. Business table list
2. Column metadata (type/nullability/default)
3. PK metadata
4. FK metadata
5. Optional sample values or domain dictionary
6. Business glossary (entity meanings, status code meanings, abbreviations)
7. Query intent examples (real user questions + expected output shape)
8. Join cardinality hints (1:1, 1:N, optional vs mandatory joins)
9. Sensitive/blocked fields and table allow/deny rules
10. Time semantics (created vs updated vs effective timestamps, timezone)
11. Value patterns (ID prefixes/formats and enum normalization)

## Hard rules
1. Generate `INSERT` SQL only (no `DROP`, `DELETE`, `TRUNCATE`, `ALTER`).
2. Respect existing naming conventions used in ConvEngine tables.
3. `ce_semantic_mapping` must map semantic business fields to exact physical `table.column`.
4. Generate join paths only from real FK relationships.
5. Generate synonyms from business vocabulary, abbreviations, and user phrasing.
6. Generate ambiguity options only where multiple fields can plausibly match one phrase.
7. Do not invent columns/tables not present in metadata.
8. Keep SQL idempotent where possible (`ON CONFLICT DO NOTHING` or equivalent strategy section).
9. Use glossary and intent examples to ground `ce_semantic_concept`, `ce_semantic_synonym`, and `ce_semantic_query_class`.
10. Use cardinality + optionality to avoid invalid joins and to prioritize safer join paths.
11. Apply sensitive/blocked-field policy in join hints and ambiguity options.
12. Reflect time semantics in mappings and value patterns (e.g., map "last 24h" to the correct timestamp column).

## Output format
Return sections in this exact order:
1. Assumptions
2. Entity model summary
3. Mapping summary (semantic field -> physical table.column)
4. SQL inserts by table (grouped table-by-table)
5. Validation queries (`SELECT` checks)
6. Known gaps / follow-up questions

## Required quality checks before final output
- Every mapped field points to a real `table.column` in metadata.
- Every join path is backed by PK/FK relationship.
- Query classes cover at least LIST, COUNT, DETAIL, STATUS, RECENT.
- Synonyms include business short forms and user wording.
- Value patterns include ID/prefix patterns when available.
- Entity definitions are aligned with business glossary terms and aliases.
- Ambiguity options exist for overloaded words (for example `status`, `id`, `date`).
- Blocked/sensitive fields are not exposed through semantic mappings.
- Time filters map to the correct business timestamp column.

## Input block starts
### BUSINESS_SCHEMA_TABLES
{{PASTE_TABLE_LIST_HERE}}

### BUSINESS_SCHEMA_COLUMNS
{{PASTE_COLUMN_METADATA_HERE}}

### BUSINESS_SCHEMA_PK
{{PASTE_PK_METADATA_HERE}}

### BUSINESS_SCHEMA_FK
{{PASTE_FK_METADATA_HERE}}

### OPTIONAL_SAMPLE_VALUES
{{PASTE_OPTIONAL_SAMPLE_VALUES_HERE}}

### BUSINESS_GLOSSARY
{{PASTE_BUSINESS_GLOSSARY_HERE}}

### QUERY_INTENT_EXAMPLES
{{PASTE_QUERY_INTENT_EXAMPLES_HERE}}

### JOIN_CARDINALITY_HINTS
{{PASTE_JOIN_CARDINALITY_HINTS_HERE}}

### SENSITIVE_OR_BLOCKED_FIELDS
{{PASTE_SENSITIVE_OR_BLOCKED_FIELDS_HERE}}

### TIME_SEMANTICS
{{PASTE_TIME_SEMANTICS_HERE}}

### VALUE_PATTERNS
{{PASTE_VALUE_PATTERNS_HERE}}
Minimum vs recommended

Minimum to start: tables + columns + PK + FK.
Recommended for production-quality seeds: add glossary, intent examples, cardinality, sensitive-field policy, time semantics, and value patterns.
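
Hard rule 8 asks for idempotent seed SQL. The sketch below shows, in runnable form, what that buys you: re-running the same seed file neither fails nor duplicates rows. It uses SQLite and a made-up `ce_semantic_synonym` layout purely for illustration; the real column layout comes from your ConvEngine schema.

```python
import sqlite3

# Illustrative only: the real ce_semantic_synonym layout comes from your
# ConvEngine installation; this stand-in demonstrates the idempotency
# pattern from hard rule 8.
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table ce_semantic_synonym (
        concept_code text not null,
        synonym      text not null,
        primary key (concept_code, synonym)
    )
""")

seed = """
    insert into ce_semantic_synonym (concept_code, synonym)
    values ('disconnect_order', 'DON')
    on conflict do nothing
"""

# Running the same seed twice must neither error nor duplicate rows.
conn.execute(seed)
conn.execute(seed)

count = conn.execute("select count(*) from ce_semantic_synonym").fetchone()[0]
print(count)  # 1 row despite two executions
```

On engines without `ON CONFLICT`, an equivalent strategy (e.g., `MERGE` or a `WHERE NOT EXISTS` guard) satisfies the same rule.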

Filled example (semantic_query_agent_filled_example.md)

semantic_query_agent_filled_example.md
# ConvEngine Semantic SQL Seed Generator

You are generating SQL `INSERT` statements for ConvEngine semantic runtime tables.

## Goal
Generate deterministic, production-safe seed inserts for:
- ce_semantic_join_hint
- ce_semantic_value_pattern
- ce_semantic_concept
- ce_semantic_synonym
- ce_semantic_concept_embedding
- ce_semantic_mapping
- ce_semantic_join_path
- ce_semantic_query_class
- ce_semantic_ambiguity_option
- ce_semantic_query_failures
- ce_semantic_entity
- ce_semantic_relationship

## Business context input
You will receive:
1. Business table list
2. Column metadata (type/nullability/default)
3. PK metadata
4. FK metadata
5. Optional sample values or domain dictionary
6. Business glossary (entity meanings, status code meanings, abbreviations)
7. Query intent examples (real user questions + expected output shape)
8. Join cardinality hints (1:1, 1:N, optional vs mandatory joins)
9. Sensitive/blocked fields and table allow/deny rules
10. Time semantics (created vs updated vs effective timestamps, timezone)
11. Value patterns (ID prefixes/formats and enum normalization)

## Hard rules
1. Generate `INSERT` SQL only (no `DROP`, `DELETE`, `TRUNCATE`, `ALTER`).
2. Respect existing naming conventions used in ConvEngine tables.
3. `ce_semantic_mapping` must map semantic business fields to exact physical `table.column`.
4. Generate join paths only from real FK relationships.
5. Generate synonyms from business vocabulary, abbreviations, and user phrasing.
6. Generate ambiguity options only where multiple fields can plausibly match one phrase.
7. Do not invent columns/tables not present in metadata.
8. Keep SQL idempotent where possible (`ON CONFLICT DO NOTHING` or equivalent strategy section).
9. Use glossary and intent examples to ground `ce_semantic_concept`, `ce_semantic_synonym`, and `ce_semantic_query_class`.
10. Use cardinality + optionality to avoid invalid joins and to prioritize safer join paths.
11. Apply sensitive/blocked-field policy in join hints and ambiguity options.
12. Reflect time semantics in mappings and value patterns (e.g., map "last 24h" to the correct timestamp column).

## Output format
Return sections in this exact order:
1. Assumptions
2. Entity model summary
3. Mapping summary (semantic field -> physical table.column)
4. SQL inserts by table (grouped table-by-table)
5. Validation queries (`SELECT` checks)
6. Known gaps / follow-up questions

## Required quality checks before final output
- Every mapped field points to a real `table.column` in metadata.
- Every join path is backed by PK/FK relationship.
- Query classes cover at least LIST, COUNT, DETAIL, STATUS, RECENT.
- Synonyms include business short forms and user wording.
- Value patterns include ID/prefix patterns when available.
- Entity definitions are aligned with business glossary terms and aliases.
- Ambiguity options exist for overloaded words (for example `status`, `id`, `date`).
- Blocked/sensitive fields are not exposed through semantic mappings.
- Time filters map to the correct business timestamp column.

## Input block starts
### BUSINESS_SCHEMA_TABLES
public.zp_disco_request
public.zp_disco_trans_data
public.zp_disconnect_order

### BUSINESS_SCHEMA_COLUMNS
public.zp_disco_request.zp_request_id (varchar, not null)
public.zp_disco_request.status (varchar, not null)
public.zp_disco_request.created_ts (timestamp, not null)
public.zp_disco_trans_data.zp_request_id (varchar, not null)
public.zp_disco_trans_data.notes_text (text, null)
public.zp_disconnect_order.order_id (varchar, not null)

### BUSINESS_SCHEMA_PK
zp_disco_request: (zp_request_id)
zp_disco_trans_data: (zp_request_id, created_ts)
zp_disconnect_order: (order_id)

### BUSINESS_SCHEMA_FK
zp_disco_trans_data.zp_request_id -> zp_disco_request.zp_request_id
zp_disco_request.disconnect_order_id -> zp_disconnect_order.order_id

### OPTIONAL_SAMPLE_VALUES
status: FAILED, ERROR, SUCCESS
notes_text examples: "cancelled due to duplicate", "customer requested cancel"

### BUSINESS_GLOSSARY
Disconnect Request: customer-level disconnect workflow request
Disconnect Transaction Data: event/history payload for request lifecycle
DON: Disconnect Order Number

### QUERY_INTENT_EXAMPLES
"find failed requests in last 24 hours" -> list rows
"count requests cancelled due to duplicate" -> count
"show status of request R1001" -> detail/status

### JOIN_CARDINALITY_HINTS
zp_disco_request (1) -> (N) zp_disco_trans_data by zp_request_id
zp_disconnect_order (1) -> (N) zp_disco_request by order_id

### SENSITIVE_OR_BLOCKED_FIELDS
Do not expose: customer_phone, customer_email, ssn
Deny write operations: INSERT/UPDATE/DELETE

### TIME_SEMANTICS
"last 24 hours" maps to zp_disco_request.created_ts
Timezone: America/Chicago

### VALUE_PATTERNS
^DON[0-9]+$ -> disconnect_order_id
^R[0-9]+$ -> zp_request_id
"failed" -> status IN ('FAILED','ERROR')
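
The value patterns in the filled example are what let the runtime route a raw token like `R1001` to a physical column. The sketch below shows that classification step: the regexes come from the example above, but the routing code is illustrative, not ConvEngine's actual matching logic.

```python
import re
from typing import Optional

# Patterns copied from the filled example; the routing function itself is an
# illustrative sketch, not ConvEngine's implementation.
VALUE_PATTERNS = [
    (re.compile(r"^DON[0-9]+$"), "disconnect_order_id"),
    (re.compile(r"^R[0-9]+$"), "zp_request_id"),
]

def classify_token(token: str) -> Optional[str]:
    """Return the physical column a literal user token should filter, if any."""
    for pattern, column in VALUE_PATTERNS:
        if pattern.match(token):
            return column
    return None

print(classify_token("R1001"))  # zp_request_id
print(classify_token("DON42"))  # disconnect_order_id
print(classify_token("hello"))  # None
```

This is why anchored patterns (`^...$`) matter in `ce_semantic_value_pattern`: unanchored regexes would misclassify tokens that merely contain an ID-like substring.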

Extract schema metadata from DB

PostgreSQL (DBeaver / psql)

Postgres: tables
select
  t.table_schema, t.table_name,
  t.table_type
from information_schema.tables t
where t.table_schema not in ('pg_catalog', 'information_schema')
order by t.table_schema, t.table_name;
Postgres: columns
select
  c.table_schema, c.table_name,
  c.ordinal_position, c.column_name,
  c.data_type, c.is_nullable,
  c.column_default, c.character_maximum_length,
  c.numeric_precision, c.numeric_scale
from information_schema.columns c
where c.table_schema not in ('pg_catalog', 'information_schema')
order by c.table_schema, c.table_name, c.ordinal_position;
Postgres: primary keys
select
  tc.table_schema, tc.table_name,
  kcu.column_name, kcu.ordinal_position as key_ordinal
from information_schema.table_constraints tc
join information_schema.key_column_usage kcu
  on tc.constraint_name = kcu.constraint_name
  and tc.table_schema = kcu.table_schema
where tc.constraint_type = 'PRIMARY KEY'
order by tc.table_schema, tc.table_name, kcu.ordinal_position;
Postgres: foreign keys
-- Note: for multi-column foreign keys, constraint_column_usage does not
-- preserve column pairing; this query is reliable for single-column FKs.
select
  tc.table_schema as fk_table_schema, tc.table_name as fk_table_name,
  kcu.column_name as fk_column_name, ccu.table_schema as pk_table_schema,
  ccu.table_name as pk_table_name, ccu.column_name as pk_column_name,
  tc.constraint_name as fk_constraint_name
from information_schema.table_constraints tc
join information_schema.key_column_usage kcu
  on tc.constraint_name = kcu.constraint_name
  and tc.table_schema = kcu.table_schema
join information_schema.constraint_column_usage ccu
  on ccu.constraint_name = tc.constraint_name
  and ccu.table_schema = tc.table_schema
where tc.constraint_type = 'FOREIGN KEY'
order by fk_table_schema, fk_table_name, fk_constraint_name;

Oracle (SQL Developer)

Oracle: tables
select
  owner, table_name,
  tablespace_name, num_rows
from all_tables
where owner = upper(:schema_owner)
order by owner, table_name;
Oracle: columns
select
  owner, table_name,
  column_id, column_name,
  data_type, data_length,
  data_precision, data_scale,
  nullable, data_default
from all_tab_columns
where owner = upper(:schema_owner)
order by owner, table_name, column_id;
Oracle: primary keys
select
  c.owner, c.table_name,
  cc.column_name, cc.position,
  c.constraint_name
from all_constraints c
join all_cons_columns cc
  on c.owner = cc.owner
  and c.constraint_name = cc.constraint_name
where c.constraint_type = 'P'
  and c.owner = upper(:schema_owner)
order by c.owner, c.table_name, c.constraint_name, cc.position;
Oracle: foreign keys
select
  c.owner as fk_owner, c.table_name as fk_table_name,
  cc.column_name as fk_column_name, r.owner as pk_owner,
  r.table_name as pk_table_name, rcc.column_name as pk_column_name,
  c.constraint_name as fk_constraint_name, cc.position
from all_constraints c
join all_cons_columns cc
  on c.owner = cc.owner
  and c.constraint_name = cc.constraint_name
join all_constraints r
  on c.r_owner = r.owner
  and c.r_constraint_name = r.constraint_name
join all_cons_columns rcc
  on r.owner = rcc.owner
  and r.constraint_name = rcc.constraint_name
  and cc.position = rcc.position
where c.constraint_type = 'R'
  and c.owner = upper(:schema_owner)
order by fk_owner, fk_table_name, fk_constraint_name, cc.position;

How to use with ChatGPT / Codex / Copilot

Execution workflow

1. Run the schema metadata SQL in DBeaver or SQL Developer.
2. Paste the outputs into the `semantic_query_agent.md` placeholders.
3. Give the agent file plus metadata to your AI tool.
4. Ask the AI to generate grouped `INSERT` SQL for all semantic tables.
5. Review mapping/join correctness and run the validation selects.
6. Apply the inserts and test real user questions through ConvEngine.
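
Step 2 (pasting query output into the placeholders) can be scripted. The sketch below formats rows from the "Postgres: columns" query into the BUSINESS_SCHEMA_COLUMNS style used in the filled example; the helper name and tuple shape are assumptions for illustration, not part of any official tooling.

```python
# Hypothetical helper: turn rows from the "Postgres: columns" query into the
# BUSINESS_SCHEMA_COLUMNS placeholder format used in the filled example.
def format_columns(rows: list[tuple[str, str, str, str, str]]) -> str:
    lines = []
    for schema, table, column, data_type, is_nullable in rows:
        # information_schema reports nullability as 'YES' / 'NO'.
        nullability = "null" if is_nullable == "YES" else "not null"
        lines.append(f"{schema}.{table}.{column} ({data_type}, {nullability})")
    return "\n".join(lines)

rows = [
    ("public", "zp_disco_request", "zp_request_id", "varchar", "NO"),
    ("public", "zp_disco_request", "created_ts", "timestamp", "NO"),
]
print(format_columns(rows))
# public.zp_disco_request.zp_request_id (varchar, not null)
# public.zp_disco_request.created_ts (timestamp, not null)
```

The same pattern extends to the PK/FK query outputs; consistent placeholder formatting keeps the agent's parsing deterministic.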
Prompt to AI assistant
Use the attached semantic_query_agent.md instructions and the schema metadata I provided.
Generate complete SQL INSERT statements for:
ce_semantic_join_hint,
ce_semantic_value_pattern,
ce_semantic_concept,
ce_semantic_synonym,
ce_semantic_concept_embedding,
ce_semantic_mapping,
ce_semantic_join_path,
ce_semantic_query_class,
ce_semantic_ambiguity_option,
ce_semantic_query_failures,
ce_semantic_entity,
ce_semantic_relationship.

Constraints:
- Do not invent tables/columns
- Use only metadata provided
- Group SQL by table
- Include validation SELECT queries
- Keep output deterministic and runnable

Validation SQL after seeding

Quick coverage check
select 'ce_semantic_concept' as table_name, count(*) as row_count from ce_semantic_concept
union all select 'ce_semantic_synonym', count(*) from ce_semantic_synonym
union all select 'ce_semantic_mapping', count(*) from ce_semantic_mapping
union all select 'ce_semantic_query_class', count(*) from ce_semantic_query_class
union all select 'ce_semantic_entity', count(*) from ce_semantic_entity
union all select 'ce_semantic_relationship', count(*) from ce_semantic_relationship
union all select 'ce_semantic_join_hint', count(*) from ce_semantic_join_hint
union all select 'ce_semantic_value_pattern', count(*) from ce_semantic_value_pattern
union all select 'ce_semantic_join_path', count(*) from ce_semantic_join_path
union all select 'ce_semantic_ambiguity_option', count(*) from ce_semantic_ambiguity_option
union all select 'ce_semantic_query_failures', count(*) from ce_semantic_query_failures;
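
Row counts alone will not catch bad references between the seeded tables. The runnable sketch below shows the kind of orphan check worth including among the validation queries; the `ce_semantic_*` column names here are assumptions for illustration, since the real layouts come from your ConvEngine installation.

```python
import sqlite3

# Illustrative schemas: real ce_semantic_* column layouts come from your
# ConvEngine installation; only the orphan-check pattern is the point here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table ce_semantic_concept (concept_code text primary key);
    create table ce_semantic_synonym (
        concept_code text not null,
        synonym      text not null
    );
    insert into ce_semantic_concept values ('disconnect_request');
    insert into ce_semantic_synonym values ('disconnect_request', 'disco req');
    insert into ce_semantic_synonym values ('missing_concept', 'oops');
""")

# Synonyms pointing at no known concept indicate a bad seed.
orphans = conn.execute("""
    select s.synonym
    from ce_semantic_synonym s
    left join ce_semantic_concept c on c.concept_code = s.concept_code
    where c.concept_code is null
""").fetchall()
print(orphans)  # [('oops',)]
```

The same left-join shape applies to mappings without entities and join paths without relationships.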
Outcome

Once these tables are seeded from your real business schema, ConvEngine semantic query can resolve user intent and compile much more accurate SQL for runtime questions.