In-Place Timsort? #25

@haneefmubarak

I was wondering whether an in-place Timsort is possible. I was thinking of something along the lines of running Timsort's run-finding stage first, followed by an in-place mergesort.

I've yet to write it up (I intend to), but I ran extensive tests of the sorts here against 1+ GB of data (the same data for all tests, fetched once from `/dev/urandom`). What I found, unsurprisingly, was that multithreading speeds up the process (I suppose for large volumes you could even go Hadoop-style Big Data). When multithreading, nested mergesorts were faster than any other sort.

What did surprise me, though, was that an in-place mergesort was the fastest, with Timsort second and normal mergesort third. It occurred to me, however, that the reason is cache locality.

I was wondering, though: if we did the run finding from Timsort and then an in-place mergesort, perhaps that would be the fastest? It could take advantage of both cache locality and pre-existing runs.
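To make the idea concrete, here is a minimal sketch in Python of what I have in mind. It is an illustration under assumptions, not code from this repository: the names `find_runs`, `merge_in_place`, and `run_merge_sort` are hypothetical, the in-place merge is a rotation-based one (my choice; other in-place merges exist), and Timsort's minimum run length and galloping are left out entirely. Note that the list of run boundaries still takes space proportional to the number of runs, and a rotation-based merge does more element moves than a buffered merge, so whether this wins in practice would need measuring.

```python
from bisect import bisect_left, bisect_right


def reverse(a, lo, hi):
    """Reverse a[lo:hi] in place by swapping from both ends."""
    hi -= 1
    while lo < hi:
        a[lo], a[hi] = a[hi], a[lo]
        lo += 1
        hi -= 1


def rotate(a, lo, mid, hi):
    """Rotate a[lo:hi] in place so a[mid:hi] ends up before a[lo:mid]."""
    reverse(a, lo, mid)
    reverse(a, mid, hi)
    reverse(a, lo, hi)


def find_runs(a):
    """Return half-open (start, end) bounds of natural runs.

    Strictly descending runs are reversed in place, so every
    reported run is ascending (as in Timsort's run-finding stage).
    """
    runs = []
    n = len(a)
    i = 0
    while i < n:
        j = i + 1
        if j < n and a[j] < a[i]:
            # Strictly descending: strictness keeps the reversal stable.
            while j < n and a[j] < a[j - 1]:
                j += 1
            reverse(a, i, j)
        else:
            while j < n and a[j] >= a[j - 1]:
                j += 1
        runs.append((i, j))
        i = j
    return runs


def merge_in_place(a, lo, mid, hi):
    """Merge sorted a[lo:mid] and a[mid:hi] using rotations, no buffer."""
    len1, len2 = mid - lo, hi - mid
    if len1 == 0 or len2 == 0:
        return
    if len1 + len2 == 2:
        if a[mid] < a[lo]:
            a[lo], a[mid] = a[mid], a[lo]
        return
    if len1 > len2:
        # Split the longer (left) run at its middle, then find the
        # matching split point in the right run by binary search.
        cut1 = lo + len1 // 2
        cut2 = bisect_left(a, a[cut1], mid, hi)
    else:
        cut2 = mid + len2 // 2
        cut1 = bisect_right(a, a[cut2], lo, mid)
    # Swap the two middle blocks, then merge each half recursively.
    rotate(a, cut1, mid, cut2)
    new_mid = cut1 + (cut2 - mid)
    merge_in_place(a, lo, cut1, new_mid)
    merge_in_place(a, new_mid, cut2, hi)


def run_merge_sort(a):
    """Find natural runs, then merge adjacent runs pairwise in place."""
    runs = find_runs(a)
    while len(runs) > 1:
        merged = []
        for k in range(0, len(runs) - 1, 2):
            lo, mid = runs[k]
            _, hi = runs[k + 1]
            merge_in_place(a, lo, mid, hi)
            merged.append((lo, hi))
        if len(runs) % 2:
            merged.append(runs[-1])  # odd run out carries over
        runs = merged
    return a
```

For example, `run_merge_sort([5, 4, 3, 1, 2, 9])` first turns the descending prefix `5, 4, 3` into the ascending run `3, 4, 5`, finds a second run `1, 2, 9`, and then merges the two in place to give `[1, 2, 3, 4, 5, 9]`. Already-sorted input is detected as a single run and needs no merging at all, which is the part of Timsort I would hope to keep.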
