
@asl3 (Contributor) commented Dec 29, 2025

What changes were proposed in this pull request?

Add documentation noting that Arrow optimization is enabled by default in Spark 4.2, for this page: https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.udf.html

Also add an example showing how to opt out of Arrow optimization at both the per-UDF and the per-session level.
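For readers of this PR, the two opt-out levels can be sketched roughly as follows. This is an illustrative sketch, not necessarily the example added to the docs: it assumes the `useArrow` keyword of `pyspark.sql.functions.udf` and the session config key `spark.sql.execution.pythonUDF.arrow.enabled`, neither of which is spelled out in this description. The names `add_one` and `demo_opt_out` are hypothetical.

```python
# Assumed session-level switch for Arrow-optimized Python UDFs
# (key name is not quoted in this PR description).
ARROW_UDF_CONF = "spark.sql.execution.pythonUDF.arrow.enabled"


def add_one(x):
    """Plain Python function to wrap as a UDF; passes NULLs through."""
    return None if x is None else x + 1


def demo_opt_out():
    """Show per-UDF and per-session opt-out. Requires PySpark and a JVM."""
    # Imported inside the function so the helper above stays usable
    # in environments without PySpark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import LongType

    spark = SparkSession.builder.getOrCreate()

    # Per-UDF opt-out: this UDF alone skips Arrow optimization,
    # regardless of the session default.
    add_one_no_arrow = udf(add_one, LongType(), useArrow=False)

    # Per-session opt-out: UDFs created without an explicit useArrow
    # follow this setting.
    spark.conf.set(ARROW_UDF_CONF, "false")
    add_one_session_default = udf(add_one, LongType())

    df = spark.range(3)  # single LongType column named "id"
    df.select(
        add_one_no_arrow("id").alias("per_udf"),
        add_one_session_default("id").alias("per_session"),
    ).show()

    spark.stop()
```

Calling `demo_opt_out()` in an environment with PySpark available runs both variants. The per-UDF flag is the narrower choice when only one function depends on the non-Arrow behavior, while the session setting restores the previous default everywhere.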

Why are the changes needed?

In Spark 4.2.0, Arrow optimization for Python UD(T)Fs will be enabled by default (see SPARK-54555). The docs should be updated to note the change and include more code examples.

Does this PR introduce any user-facing change?

No, this is a documentation-only update.

How was this patch tested?

Docs build tests

Was this patch authored or co-authored using generative AI tooling?

No

@asl3 changed the title from "[SPARK-54859] Arrow by default API reference doc for PySpark UDFs (Spark 4.2)" to "[SPARK-54859] Arrow by default PySpark UDF API reference doc" on Dec 29, 2025
@asl3 changed the title from "[SPARK-54859] Arrow by default PySpark UDF API reference doc" to "[SPARK-54859][PYTHON] Arrow by default PySpark UDF API reference doc" on Dec 29, 2025
