
Commit fc49dbd

zhengruifeng authored and LuciferYang committed

[SPARK-54211][PYTHON][FOLLOW-UP] Fix doctests of mapInArrow
### What changes were proposed in this pull request?
Fix doctests of mapInArrow.

### Why are the changes needed?
To make CI happy. The doctest calls
```
batch.filter(pa.compute.field("id") == 1)
```
and the expression input `pa.compute.field("id") == 1` is only supported since pyarrow 17.0.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
PR builder with
```
default: '{"PYSPARK_IMAGE_TO_TEST": "python-minimum", "PYTHON_TO_TEST": "python3.10"}'
```
See https://github.com/zhengruifeng/spark/actions/runs/19222092639/job/54941916951

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #52965 from zhengruifeng/fix_map_in_arrow_doctest.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: yangjie01 <yangjie01@baidu.com>
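For context, here is a minimal sketch of the kind of `mapInArrow` example the doctest exercises. It is illustrative only: the function name, the filter on the `id` column, and the local SparkSession setup are assumptions, not the PR's exact doctest.

```python
# Illustrative sketch only -- not the exact doctest touched by this PR.
import pyarrow.compute as pc

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").getOrCreate()
df = spark.range(5)  # single long column named "id"

def keep_id_one(iterator):
    for batch in iterator:
        # Passing an Expression (pc.field("id") == 1) to RecordBatch.filter
        # requires pyarrow >= 17.0; older releases need a boolean mask instead,
        # e.g. batch.filter(pc.equal(batch.column("id"), 1)).
        yield batch.filter(pc.field("id") == 1)

# The output schema is unchanged because the function only filters rows.
df.mapInArrow(keep_id_one, df.schema).show()
```

Against pyarrow older than 17.0 the Expression form is not accepted, which is why the doctest is now skipped there.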
1 parent: 21c3122

2 files changed: +16 additions, −2 deletions


python/pyspark/sql/classic/dataframe.py

Lines changed: 6 additions & 0 deletions

```diff
@@ -1976,6 +1976,12 @@ def _test() -> None:
     if not have_pyarrow:
         del pyspark.sql.dataframe.DataFrame.toArrow.__doc__
         del pyspark.sql.dataframe.DataFrame.mapInArrow.__doc__
+    else:
+        import pyarrow as pa
+        from pyspark.loose_version import LooseVersion
+
+        if LooseVersion(pa.__version__) < LooseVersion("17.0.0"):
+            del pyspark.sql.dataframe.DataFrame.mapInArrow.__doc__
 
     spark = (
         SparkSession.builder.master("local[4]").appName("sql.classic.dataframe tests").getOrCreate()
```

python/pyspark/sql/connect/dataframe.py

Lines changed: 10 additions & 2 deletions

```diff
@@ -2363,12 +2363,20 @@ def _test() -> None:
         del pyspark.sql.dataframe.DataFrame.rdd.__doc__
 
     if not have_pandas or not have_pyarrow:
-        del pyspark.sql.dataframe.DataFrame.toArrow.__doc__
         del pyspark.sql.dataframe.DataFrame.toPandas.__doc__
-        del pyspark.sql.dataframe.DataFrame.mapInArrow.__doc__
         del pyspark.sql.dataframe.DataFrame.mapInPandas.__doc__
         del pyspark.sql.dataframe.DataFrame.pandas_api.__doc__
 
+    if not have_pyarrow:
+        del pyspark.sql.dataframe.DataFrame.toArrow.__doc__
+        del pyspark.sql.dataframe.DataFrame.mapInArrow.__doc__
+    else:
+        import pyarrow as pa
+        from pyspark.loose_version import LooseVersion
+
+        if LooseVersion(pa.__version__) < LooseVersion("17.0.0"):
+            del pyspark.sql.dataframe.DataFrame.mapInArrow.__doc__
+
     globs["spark"] = (
         PySparkSession.builder.appName("sql.connect.dataframe tests")
         .remote(os.environ.get("SPARK_CONNECT_TESTING_REMOTE", "local[4]"))
```
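Both hunks rely on the same mechanism: `doctest` collects examples from docstrings, so deleting an object's `__doc__` inside `_test()` means its examples are never collected for that run. A standalone sketch of that gating pattern follows; the boolean flag is an illustrative stand-in for the real `LooseVersion` check.

```python
import doctest

def add(a: int, b: int) -> int:
    """
    >>> add(1, 2)
    3
    """
    return a + b

# Stand-in for `LooseVersion(pa.__version__) < LooseVersion("17.0.0")`.
PYARROW_TOO_OLD = True

if PYARROW_TOO_OLD:
    # With no docstring, doctest finds no examples on add(), so the
    # example above is skipped rather than failing on an old dependency.
    del add.__doc__

print(doctest.testmod())  # TestResults(failed=0, attempted=0) when gated
```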
