@Alvaro-Kothe
Member


This patch decrements all references obtained from PyObject_GetAttrString to fix the memory leak.

I used this script to reproduce the memory leak:

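import pandas as pd

# Repeatedly serialize a frame with a tz-aware DatetimeIndex;
# memory usage grows with every to_json() call while the leak is present.
for _ in range(10_000):
    df = pd.DataFrame({'col1': [12.34]},
                      index=pd.date_range('1/1/2019', '10/1/2019', freq="D", tz="UTC"))
    result = df.reset_index().to_json()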

and pinpointed the problem with valgrind, which reported the following:

==59316== 87,674,752 bytes in 2,739,836 blocks are definitely lost in loss record 32,561 of 32,561
==59316==    at 0x4842B26: malloc (vg_replace_malloc.c:446)
==59316==    by 0x49AC853: UnknownInlinedFun (obmalloc.c:62)
==59316==    by 0x49AC853: UnknownInlinedFun (obmalloc.c:982)
==59316==    by 0x49AC853: UnknownInlinedFun (obmalloc.c:2238)
==59316==    by 0x49AC853: UnknownInlinedFun (obmalloc.c:1400)
==59316==    by 0x49AC853: UnknownInlinedFun (longobject.c:209)
==59316==    by 0x49AC853: PyLong_FromLong (longobject.c:305)
==59316==    by 0x43559F5C: __pyx_getprop_6pandas_5_libs_6tslibs_10timestamps_10_Timestamp_year (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/tslibs/timestamps.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x49E765C: _PyObject_GenericGetAttrWithDict (object.c:1665)
==59316==    by 0x49B8B4A: UnknownInlinedFun (object.c:1751)
==59316==    by 0x49B8B4A: PyObject_GetAttr (object.c:1261)
==59316==    by 0x49F6741: PyObject_GetAttrString (object.c:1131)
==59316==    by 0x42F9C027: convert_pydatetime_to_datetimestruct (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/pandas_datetime.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x42F9C2C1: PyDateTimeToEpoch (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/pandas_datetime.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x52ACA92D: Object_beginTypeContext (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/json.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x52ACCB9C: encode (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/json.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x52ACCE41: encode (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/json.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x52ACCE41: encode (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/json.cpython-313-x86_64-linux-gnu.so)
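
The trace above points at objects returned by PyObject_GetAttrString that are never released. As a rough illustration of that mechanism (not the code changed in this patch), the sketch below calls PyObject_GetAttrString through ctypes on CPython and watches the reference count of the returned object. The `Holder` class, the `payload` attribute, and the variable names are hypothetical and exist only for this demonstration; the return type is deliberately declared as a raw pointer so that ctypes does not manage the new reference for us.

```python
import ctypes
import sys

# CPython-only sketch: call the C API directly through ctypes.
get_attr = ctypes.pythonapi.PyObject_GetAttrString
get_attr.argtypes = [ctypes.py_object, ctypes.c_char_p]
# Declare the result as a raw pointer so ctypes does NOT take ownership
# of the new reference (this mimics C code that forgets to release it).
get_attr.restype = ctypes.c_void_p

dec_ref = ctypes.pythonapi.Py_DecRef  # function form of Py_DECREF
dec_ref.argtypes = [ctypes.c_void_p]
dec_ref.restype = None


class Holder:
    """Hypothetical object with one attribute to look up."""


value = [1, 2, 3]
holder = Holder()
holder.payload = value

before = sys.getrefcount(value)

# PyObject_GetAttrString returns a NEW reference; nothing releases it yet.
raw = get_attr(holder, b"payload")
leaked = sys.getrefcount(value)

# The fix: release the reference once the value is no longer needed.
dec_ref(raw)
after = sys.getrefcount(value)

print(leaked - before, after - before)
```

If the final `dec_ref` call is omitted, the count stays one too high and the object can never be freed, which is the pattern valgrind flags as "definitely lost" when it repeats for every datetime value being encoded.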

@mroeschke mroeschke added this to the 3.0 milestone Aug 28, 2025
@mroeschke mroeschke added the "Performance" (Memory or execution speed performance) and "IO JSON" (read_json, to_json, json_normalize) labels Aug 28, 2025
@mroeschke mroeschke merged commit 0e21777 into pandas-dev:main Aug 28, 2025
46 checks passed
@mroeschke
Member

Thanks @Alvaro-Kothe

@swt2c
Contributor

swt2c commented Sep 4, 2025

@meeseeksdev backport 2.3.x

@lumberbot-app

lumberbot-app bot commented Sep 4, 2025

Awww, sorry swt2c you do not seem to be allowed to do that, please ask a repository maintainer.

swt2c pushed a commit to swt2c/pandas that referenced this pull request Sep 4, 2025
jorisvandenbossche pushed a commit that referenced this pull request Sep 5, 2025
@jorisvandenbossche jorisvandenbossche modified the milestones: 3.0, 2.3.3 Sep 5, 2025
eicchen pushed a commit to eicchen/pandas that referenced this pull request Oct 18, 2025

Development

Successfully merging this pull request may close these issues.

BUG: memory leak in to_json when converting DateTime values

4 participants