Cube DSL Reference

Every keyword and attribute of the DataPallas Cube DSL — dimensions, measures, joins, segments, hierarchies — with examples.


Table of Contents

Top-Level Cube Block

Every cube starts with a single cube { ... } block. Inside, only one sql_table is required; everything else is optional.

cube {
  sql_table 'public.orders'              // required — the main fact table
  title 'Orders'                         // optional — display name in UI
  description 'Customer orders'          // optional — shown in pickers and AI prompts
 
  // dimension { ... }
  // measure   { ... }
  // join      { ... }
  // segment   { ... }
  // hierarchy { ... }
}
AttributeRequiredDescription
sql_tableYesThe main table the cube is built on. Use schema-qualified names (public.orders) when your database has multiple schemas.
titleNoHuman-readable label shown in <rb-cube-renderer> and in cube pickers. Defaults to the cube's name.
descriptionNoFree-text explanation. AI consumers (Athena) read this — write it as if explaining to a new analyst.

Dimensions

A dimension is a column you can slice or group by. One dimension { ... } block per attribute.

dimension { name 'status'; title 'Order Status'; sql 'status'; type 'string' }
dimension { name 'created_at'; title 'Created At'; sql 'created_at'; type 'time' }
dimension { name 'order_id'; sql 'id'; type 'number'; primary_key true }
AttributeRequiredValuesDescription
nameYesidentifierInternal name. Lowercase + underscores. Must be unique within the cube.
titleNostringDisplay label in <rb-cube-renderer>. Defaults to name.
descriptionNostringTooltip / AI hint.
sqlYesSQL expressionThe column or expression that produces the value. Can reference joined tables (customers.country) or use SQL functions (UPPER(status)).
typeYesstring | number | time | booleanDrives formatting and which operators are valid (date pickers for time, sum/avg only for number, etc.).
primary_keyNotrueMarks the row identifier. Required for joins to work cleanly. Exactly one dimension per cube should have it.

Measures

A measure is an aggregation. Counts, sums, averages, min/max — anything that collapses many rows into one number.

measure { name 'count'; title 'Order Count'; type 'count' }
measure { name 'revenue'; title 'Total Revenue'; sql 'amount'; type 'sum'; format 'currency' }
measure { name 'avg_order'; title 'Average Order'; sql 'amount'; type 'avg'; format 'currency' }
AttributeRequiredValuesDescription
nameYesidentifierInternal name.
titleNostringDisplay label.
descriptionNostringTooltip / AI hint.
sqlConditionalSQL expressionRequired for everything except count and count_distinct-without-column. The column being aggregated.
typeYescount | count_distinct | sum | avg | min | max | numberThe aggregation. number is for arithmetic combinations of other measures.
formatNocurrency | percent | numberRendering hint. The component picks an appropriate format.

Joins

A join declares how the main table connects to a related table. Once joined, you can reference its columns in dimensions (customers.country).

join {
  name 'customers'
  sql '${CUBE}.customer_id = customers.id'
  relationship 'many_to_one'
}
AttributeRequiredValuesDescription
nameYestable nameThe joined table. Used as a prefix in sql expressions.
sqlYesSQL ON clauseThe join condition. Use $\{CUBE\} to reference the main table — see Templating.
relationshipYesmany_to_one | one_to_many | one_to_oneCardinality. Drives whether DataPallas warns about fan-out before producing potentially-misleading aggregates.

Segments

A segment is a reusable, named filter. Think "saved WHERE clause".

segment {
  name 'recent'
  title 'Last 30 Days'
  sql "${CUBE}.created_at >= CURRENT_DATE - INTERVAL '30 days'"
}
 
segment {
  name 'high_value'
  title 'Orders over $1000'
  sql '${CUBE}.amount > 1000'
}
AttributeRequiredDescription
nameYesInternal name.
titleNoDisplay label in the segment chooser.
descriptionNoTooltip.
sqlYesA boolean SQL expression. Combine with AND/OR if needed.

End-users tick segments alongside dimensions and measures in the cube renderer; multiple ticked segments are combined with AND.

Hierarchies

A hierarchy declares a drill-down path through dimensions. The renderer presents these as collapsible groups.

hierarchy {
  name 'geography'
  title 'Customer Geography'
  levels 'country', 'region', 'city'
}
 
hierarchy {
  name 'time'
  title 'Order Time'
  levels 'order_year', 'order_quarter', 'order_month', 'order_day'
}

The values in levels must match name attributes of dimensions defined in the same cube.

Templating: the CUBE Token

The DSL supports two templating conveniences inside any sql expression:

TokenReplaces with
$\{CUBE\}The cube's main sql_table (with the right alias for the current query).
<join_name>.<col>The joined table's column. The name of a join block becomes the prefix.

This keeps the DSL portable — the same cube works whether sql_table 'orders' or sql_table 'public.orders_v2'.

Complete Worked Example

A cube on Northwind-style orders with one join, two segments, two hierarchies:

cube {
  sql_table 'public.orders'
  title 'Orders'
  description 'Customer orders with revenue and geography analysis'
 
  // Primary key
  dimension { name 'order_id'; title 'Order ID'; sql 'id'; type 'number'; primary_key true }
 
  // Main-table dimensions
  dimension { name 'status'; title 'Status'; sql 'status'; type 'string' }
  dimension { name 'created_at'; title 'Created'; sql 'created_at'; type 'time' }
 
  // Joined-table dimensions
  dimension { name 'customer_name'; title 'Customer'; sql 'customers.company_name'; type 'string' }
  dimension { name 'country'; sql 'customers.country'; type 'string' }
  dimension { name 'region'; sql 'customers.region'; type 'string' }
  dimension { name 'city'; sql 'customers.city'; type 'string' }
 
  // Measures
  measure { name 'count'; title 'Order Count'; type 'count' }
  measure { name 'revenue'; title 'Revenue'; sql 'amount'; type 'sum'; format 'currency' }
  measure { name 'avg_order'; title 'Avg Order'; sql 'amount'; type 'avg'; format 'currency' }
 
  // Join
  join { name 'customers'; sql '${CUBE}.customer_id = customers.id'; relationship 'many_to_one' }
 
  // Segments
  segment { name 'recent'; title 'Last 30 Days'; sql "${CUBE}.created_at >= CURRENT_DATE - INTERVAL '30 days'" }
  segment { name 'completed'; title 'Completed Orders'; sql "${CUBE}.status = 'shipped'" }
 
  // Hierarchies
  hierarchy { name 'geography'; title 'Geography'; levels 'country', 'region', 'city' }
}

Tips and Gotchas

  • Match your DB's quoting. PostgreSQL is case-insensitive for unquoted identifiers; SQL Server preserves case. If a column is OrderDate on disk, write it OrderDate (or whatever your driver expects) in the sql attribute.
  • primary_key matters for joins. Without one declared, fan-out warnings won't fire and aggregations across one_to_many joins can silently double-count.
  • Format is a hint, not a converter. format 'currency' does not multiply by 100 or apply a locale — it tells the renderer to display with currency formatting. Make sure the underlying sql already produces the value in the right unit.
  • Don't put business logic in dimensions if you can put it in segments. Dimensions are evaluated for every row; segments only when ticked.
  • Avoid SELECT * in sql_table. Always name a real table or view. Subqueries work but obscure the schema for AI consumers.
  • Use the "Hey AI" button when you don't remember the syntax — it drafts a cube against your live schema given a plain-English description.