For really huge dictionaries used in multi-process Python programs.

Here is an article describing the Python copy-on-write memory problem: https://karbachinsky.medium.com/python-server-is-gradually-running-out-of-memory-1de8d2b7ef29
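A minimal sketch of the effect (Unix-only; the dict size and value shapes are illustrative): a large dict built before fork() is logically shared with child processes, but merely reading its objects updates their reference counts, which dirties memory pages and forces the OS to copy them per worker.

import os

big = {str(i): "x" * 100 for i in range(1_000_000)}  # built once in the parent

pid = os.fork()
if pid == 0:
    # Child: iterating touches every object's refcount, dirtying pages,
    # so resident memory grows even though nothing is written.
    total = sum(len(v) for v in big.values())
    os._exit(0)
os.waitpid(pid, 0)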
Usage:
from fork_aware_dict import ForkAwareDict
# Creating index
filename: str = ForkAwareDict.create(
    {
        "foo": "aaa",
        "bar": "bbbb",
        "baz": "ccccc",
    }.items()
)
# Reading and using
index = ForkAwareDict(filename=filename)
assert index.get("bar") == "bbbb"
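A hedged sketch of the intended multi-process use (assuming the default fork start method on Unix, and that an index opened in the parent can be read from forked workers):

import multiprocessing as mp

index = ForkAwareDict(filename=filename)  # opened once in the parent

def lookup(key):
    # Workers inherit the open index via fork and read from it without
    # each paying for a private in-memory copy of the data.
    return index.get(key)

if __name__ == "__main__":
    with mp.Pool(processes=4) as pool:
        print(pool.map(lookup, ["foo", "bar", "baz"]))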
Or, using complex iterable data:

import json
data = [
    {"word": "foo", "data": {"x": 0, "y": 0, "payload": "x" * 1000}},
    {"word": "bar", "data": {"x": 0, "y": 1, "payload": "y" * 1000}},
    {"word": "baz", "data": {"x": 1, "y": 0, "payload": "z" * 1000}},
]
filename: str = ForkAwareDict.create(
    data,
    key_function=lambda entry: entry["word"],
    encoder=lambda entry: json.dumps(entry["data"]).encode("utf-8"),
)
index = ForkAwareDict(
    filename=filename,
    decoder=lambda data: json.loads(data),
)
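The decoded value should round-trip back to the original entry's "data" dict; a quick check (assuming get returns the decoded object, as in the first example):

assert index.get("foo") == {"x": 0, "y": 0, "payload": "x" * 1000}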
Run tests:
python -m pytest tests.py
TODO
- Support non-string keys.
- Think about how keys are stored and whether they can leak memory.