[SPARK-54859][PYTHON] Arrow by default PySpark UDF API reference doc #53632
What changes were proposed in this pull request?
Add documentation about the Arrow-by-default enablement in Spark 4.2 to this page: https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.udf.html
Also add examples showing how to opt out of Arrow optimization, both per-UDF and per-session (see the sketch below).
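For reference, a minimal sketch of what the documented opt-out could look like, assuming the existing `useArrow` parameter on `pyspark.sql.functions.udf` and the `spark.sql.execution.pythonUDF.arrow.enabled` session configuration; the exact examples added to the docs may differ:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

# Per-UDF opt-out: pass useArrow=False so this UDF keeps the non-Arrow (pickled) execution path.
@udf(returnType=StringType(), useArrow=False)
def to_upper(s):
    return s.upper() if s is not None else None

# Per-session opt-out: disable Arrow optimization for all Python UDFs in this session.
spark.conf.set("spark.sql.execution.pythonUDF.arrow.enabled", "false")

df = spark.createDataFrame([("hello",)], ["text"])
df.select(to_upper("text")).show()
```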
Why are the changes needed?
In Spark 4.2.0, Arrow optimization will be enabled by default for Python UD(T)Fs (see SPARK-54555). The docs should be updated to note the change and include more code examples.
Does this PR introduce any user-facing change?
No, this is a documentation-only update.
How was this patch tested?
Documentation build tests.
Was this patch authored or co-authored using generative AI tooling?
No