
Rewrite VARBINARY to BINARY in SparkSqlRewriter #593

Open
c-h-afzal wants to merge 1 commit into linkedin:master from
c-h-afzal:afzal/fix-varbinary-cast-spark-dialect

Conversation

@c-h-afzal

Calcite's type system uses VARBINARY for binary data, but Spark's SQL parser does not recognize VARBINARY as a valid cast target; it uses BINARY instead. When Hive views call base64() on string columns, Calcite inserts an implicit CAST(... AS VARBINARY), which produces unparseable Spark SQL. This causes CoralSpark translation failures and, downstream, function registry poisoning via the DaliSpark Hive fallback path.

This follows the same pattern already used by TrinoSqlRewriter, which rewrites BINARY/VARBINARY to Trino's VARBINARY in its convertTypeSpec method. Here we do the equivalent for Spark: rewrite VARBINARY to BINARY in SparkSqlRewriter.visit(SqlDataTypeSpec).
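For readers unfamiliar with the rewrite, here is a minimal, dependency-free sketch of the idea. It is not the code in this PR: the actual change operates on Calcite's `SqlDataTypeSpec` nodes inside `SparkSqlRewriter`, whereas this illustration works on plain strings. The class name `VarbinaryRewriteSketch` and both helper methods are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; the real rewrite happens on the SQL AST,
// not on SQL text.
public class VarbinaryRewriteSketch {

  // Matches the target type of a cast such as "AS VARBINARY" (case-insensitive).
  private static final Pattern CAST_TO_VARBINARY =
      Pattern.compile("\\bAS\\s+VARBINARY\\b", Pattern.CASE_INSENSITIVE);

  // Maps a Calcite type name to the name Spark expects; other names pass through.
  static String toSparkTypeName(String calciteTypeName) {
    return "VARBINARY".equalsIgnoreCase(calciteTypeName) ? "BINARY" : calciteTypeName;
  }

  // Toy text-level rewrite of cast targets. Assumes the SQL contains no string
  // literals or identifiers that happen to contain "AS VARBINARY".
  static String rewriteCasts(String sql) {
    return CAST_TO_VARBINARY.matcher(sql).replaceAll("AS BINARY");
  }

  public static void main(String[] args) {
    String before = "SELECT base64(CAST(name AS VARBINARY)) FROM t";
    // prints: SELECT base64(CAST(name AS BINARY)) FROM t
    System.out.println(rewriteCasts(before));
  }
}
```

Doing the rewrite at the AST level, as the PR does, avoids the false matches that a text-level replacement like this one could produce.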

Verified that the existing test cases pass and added a new test case.

@c-h-afzal changed the title from "Rewrite VARBINARY to BINARY in SparkSqlRewriter (incident-10814)" to "Rewrite VARBINARY to BINARY in SparkSqlRewriter" on Apr 9, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@c-h-afzal force-pushed the afzal/fix-varbinary-cast-spark-dialect branch from a4e1adf to f137382 on April 11, 2026 03:14
