AI & Cubes

Why pointing Athena at a cube is safer, more accurate, and more privacy-preserving than letting AI write raw SQL against your database.

Why Cubes Make AI Better
Privacy: What the AI Sees
Asking Questions Through a Cube
Writing Cube DSL With AI
Best Practices

Why Cubes Make AI Better

When you ask an AI assistant "what was our revenue last quarter?" against a raw database, the model has to:

Guess which table holds revenue (orders? invoices? line_items? payments?)
Guess which column is the amount (amount? total? gross_total? net_total?)
Guess whether to include refunds, taxes, currency conversion
Hope all those guesses match how your business actually defines revenue

Even with a perfect schema, this gets wrong answers more often than people admit. And you don't see that it's wrong — you just see a number.

A cube fixes this by giving the AI a vocabulary that's already correct. When revenue is a named measure with one definition, the AI doesn't guess — it picks revenue from the cube and the SQL is generated from your reviewed, named, versioned definition. Same number every time, same number across every consumer.

Privacy: What the AI Sees

DataPallas's data-privacy stance is that the AI never sees your data — only your schema and metadata. Cubes strengthen this:

What the AI sees	When raw SQL	When using a cube
Full database schema (every table, column, type)	Yes	No — only the cube's declared dimensions and measures
Joined-table column lists	Yes	No — only the cube's joined-table aliases
Free-form column names that may hint at PII	Yes	No — the cube's `title` and `description` fields are what the AI reads
Row data	No	No

A cube is effectively a redacted view of your schema specifically for AI. You decide what's in it — exclude sensitive tables, rename columns to user-friendly titles, write descriptions in business language — and the AI works only with that surface.

Asking Questions Through a Cube

In the Chat2DB AI panel, after you've created cubes, you'll see them listed alongside raw database connections. Pick a cube as the context and ask questions in plain English:

"Show monthly revenue trend for the last 6 months, broken down by country."

The AI translates this into a cube selection — dimensions: order_date, country, measures: revenue, segments: recent_6_months — and the cube generates the SQL. You see both the question, the cube selection, and the resulting chart side by side. If the answer surprises you, you can drill into the cube selection itself rather than re-reading 40 lines of SQL.

📷 Screenshot placeholder: 200_40_cube-in-chat.png — Athena chat answering a natural-language question by picking fields on a cube, with the result chart and the cube selection visible.

Writing Cube DSL With AI

Cubes are not only consumed by AI — they're also created with AI. Inside the Cube editor, the Hey AI, Help Me… button:

Reads the live database schema for the selected connection
Lets you describe the cube you want in natural language ("a cube on orders with revenue by region and time-based segments")
Drafts the Groovy DSL for you, ready to review and tweak

This is by far the fastest way to get the first draft of a cube right. Review the generated DSL against the DSL Reference, adjust titles and descriptions to match your business language, and save.

📷 Screenshot placeholder: 200_45_cube-ai-help.png — The "Hey AI, Help Me…" dialog showing a natural-language prompt and the generated cube DSL.

Best Practices