Commit 5fb072e
[SPARK-54285][PYTHON] Cache timezone info to avoid expensive timestamp conversion
### What changes were proposed in this pull request?
We cache the local machine's tzinfo for timestamp conversion to avoid the extra latency of calling `datetime.datetime.fromtimestamp()`.
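A minimal sketch of the idea, not the code from this patch: the helper names `cached_local_tzinfo` and `from_internal` are hypothetical, and the sketch assumes a fixed-offset tzinfo captured once is acceptable. Passing an explicit `tz` to `fromtimestamp()` makes CPython compute the result from UTC plus the tzinfo's offset instead of going through the C library's local-time lookup.

```python
import datetime
import functools


@functools.lru_cache(maxsize=1)
def cached_local_tzinfo():
    # Hypothetical helper: capture the local UTC offset once per process.
    # Caveat: astimezone() yields a fixed-offset tzinfo, so this sketch
    # does not follow DST transitions that happen after it is cached.
    return datetime.datetime.now().astimezone().tzinfo


def from_internal(ts_us, tz=None):
    # Hypothetical converter: microseconds since the epoch -> naive datetime.
    if tz is None:
        tz = cached_local_tzinfo()
    # Integer arithmetic avoids float precision loss in the microseconds.
    seconds, micros = divmod(ts_us, 1_000_000)
    aware = datetime.datetime.fromtimestamp(seconds, tz)
    # Drop tzinfo so the result is naive, like fromtimestamp() without tz.
    return aware.replace(microsecond=micros, tzinfo=None)
```

Taking `tz` as an optional parameter keeps the sketch easy to check with a known offset; a real implementation would have to decide how to handle DST and timezone changes during the process lifetime.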
### Why are the changes needed?
In Python, a forked process on Unix (one that uses glibc, I believe) ends up with a bad lock/cache state for the timezone, which results in an extremely slow `datetime.datetime.fromtimestamp()` (2000x slower on my machine).
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
It was tested locally by hand to confirm the timestamp result is the same and the performance is normal.
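A hand check along these lines could look like the sketch below. It is an illustration only, not the script used for this patch: `LOCAL_TZ`, `convert_cached`, and `convert_plain` are hypothetical names, and the comparison assumes the local UTC offset does not change between capturing `LOCAL_TZ` and converting the timestamp.

```python
import datetime
import time
import timeit

# Hypothetical stand-in for the cached tzinfo: fixed offset captured once.
LOCAL_TZ = datetime.datetime.now().astimezone().tzinfo


def convert_cached(ts_us):
    # Conversion with an explicit, cached tzinfo, returned as a naive datetime.
    seconds, micros = divmod(ts_us, 1_000_000)
    return datetime.datetime.fromtimestamp(seconds, LOCAL_TZ).replace(
        microsecond=micros, tzinfo=None
    )


def convert_plain(ts_us):
    # Conversion that lets fromtimestamp() resolve local time on every call.
    return datetime.datetime.fromtimestamp(ts_us // 1_000_000).replace(
        microsecond=ts_us % 1_000_000
    )


now_us = time.time_ns() // 1_000
same_result = convert_cached(now_us) == convert_plain(now_us)

# Timings vary by machine and process state, so no expected numbers are given.
cached_s = timeit.timeit(lambda: convert_cached(now_us), number=10_000)
plain_s = timeit.timeit(lambda: convert_plain(now_us), number=10_000)
print(f"same result: {same_result}; cached {cached_s:.4f}s, plain {plain_s:.4f}s")
```

To observe the slowdown described above, the timing would need to run inside a forked worker process; in a fresh interpreter both paths are expected to be fast.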
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #52980 from gaogaotiantian/fix-timestamp-convert.
Authored-by: Tian Gao <gaogaotiantian@hotmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
1 parent d823ccf commit 5fb072e
2 files changed, +19 -2 lines changed