-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Description
When no alias is specified for a column in a SQL query, the column name in the result set is the raw expression (e.g., COUNT(*), SUM(quantity), unit_price * quantity). This can cause issues with downstream tools that expect simple, unique column names.
Example
SELECT COUNT(*), SUM(quantity), unit_price * quantity FROM ecommerceCurrent result columns: COUNT(*), SUM(quantity), unit_price * quantity
Expected Behaviour
When no alias is provided, generate a positional alias based on the column's position in the result set:
__c1, __c2, __c3, ...
This ensures:
- Column names are always valid identifiers
- Column names are unique within the result set
- Downstream tools (Superset, DuckDB, JDBC clients) can reference columns unambiguously
Notes
- This should only apply when the user has not specified an explicit
ASalias - The naming convention
__c<N>(double underscore prefix) avoids collisions with user-defined column names - Named columns (e.g.,
SELECT name, age FROM ...) should retain their original names
Resolution
Added fieldsWithComputedAliases as a separate lazy val in Select that generates positional aliases __c1, __c2, ... for expression fields (aggregates, arithmetic, transforms, window functions) without explicit AS aliases. The original fields property is unchanged, preserving SQL round-trip safety.
Detection rule
A field gets a computed alias when fieldAlias.isEmpty && identifier.functions.nonEmpty.
Strategy: separate property, not mutation
fieldsWithComputedAliases is used only where output names matter (aggregation naming, script fields, output column names). Select.sql uses original fields — no AS __cN injected into SQL representation.
Additional fix: dotted field names in GROUP BY
Removed the historical replace(".", "_") in Bucket.path — ES allows dots in aggregation names. This ensures dotted fields like profile.city keep their original name as the bucket key, matching the output column name from extractOutputFieldNames.
Modified files
| File | Change |
|---|---|
sql/.../query/Select.scala |
Added fieldsWithComputedAliases, updated fieldAliases to use it |
sql/.../query/package.scala |
Use fieldsWithComputedAliases in aggregates, scriptFields, windowFields, nestedHitsMappings |
sql/.../query/GroupBy.scala |
Removed replace(".", "_") from Bucket.path |
core/.../client/SearchApi.scala |
Use fieldsWithComputedAliases in extractOutputFieldNames |
Tests
- Parser unit tests:
ParserSpec— 4 new tests covering computed aliases for aggregates, mixed named/unnamed fields, explicit alias preservation, arithmetic expressions - Integration test:
ReplGatewayIntegrationSpec— end-to-end test withprofile.city,COUNT(*),AVG(age)without aliases, verifying__c2and__c3column names