Description
I was wondering if it would be possible to do an in-place Timsort. I was thinking something along the lines of doing the run-finding stage at the beginning, followed by an in-place mergesort.
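To make the idea concrete, here is a minimal sketch (in Python, just for illustration) of Timsort-style run detection: scan for maximal non-decreasing stretches, and reverse strictly decreasing stretches in place so every run ends up ascending. The function name `find_runs` is hypothetical, not from this library.

```python
def find_runs(a):
    """Timsort-style run detection: each run is a maximal non-decreasing
    stretch; a strictly decreasing stretch is reversed in place so that
    every run ends up ascending. Returns a list of (start, end) bounds."""
    runs = []
    i, n = 0, len(a)
    while i < n:
        j = i + 1
        if j < n and a[j] < a[i]:
            # strictly descending run: extend it, then reverse in place
            while j < n and a[j] < a[j - 1]:
                j += 1
            a[i:j] = a[i:j][::-1]
        else:
            # non-decreasing run: extend as far as it goes
            while j < n and a[j] >= a[j - 1]:
                j += 1
        runs.append((i, j))
        i = j
    return runs
```

After this pass the array is a sequence of ascending runs, which is exactly the state an in-place merge stage could start from.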
I've yet to write it up (I intend to), but I did some extensive tests using the sorts here against 1+ GB of data (the same data for all tests, fetched once from `/dev/urandom`). What I found, unsurprisingly, was that you can use multithreading to speed up the process (I suppose for large volumes you could even go Hadoop-style Big Data). When multithreading, nested mergesorts were faster than any other sorts.
What did surprise me, though, was that an in-place mergesort was the fastest of the single sorts: Timsort came second and normal mergesort third. My best explanation is cache locality.
So I was wondering: if we did the run finding from Timsort and then an in-place mergesort, perhaps that would be the fastest of all? It could take advantage of cache locality AND pre-existing runs.
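A rough sketch of the hybrid, again in Python purely for illustration (the names `merge_in_place` and `run_merge_sort` are hypothetical): detect non-decreasing runs, then merge adjacent runs pairwise, bottom-up, using a naive constant-space merge. The merge shown here is quadratic in the worst case, so a real implementation would want a proper in-place merge algorithm, but it shows the shape of the proposal.

```python
def merge_in_place(a, lo, mid, hi):
    """Naive O(1)-extra-space merge of the adjacent sorted runs a[lo:mid]
    and a[mid:hi] by shifting; quadratic moves in the worst case, but all
    memory accesses are sequential, which is friendly to the cache."""
    while lo < mid and mid < hi:
        if a[lo] <= a[mid]:
            lo += 1
        else:
            v = a[mid]
            a[lo + 1:mid + 1] = a[lo:mid]  # shift the left run right by one
            a[lo] = v
            lo += 1
            mid += 1

def run_merge_sort(a):
    """Hypothetical hybrid: detect pre-sorted (non-decreasing) runs,
    then merge adjacent runs pairwise until one run remains."""
    # run detection (ascending runs only, for brevity)
    bounds, i, n = [0], 0, len(a)
    while i < n:
        i += 1
        while i < n and a[i] >= a[i - 1]:
            i += 1
        bounds.append(i)
    # bottom-up pairwise merging of adjacent runs
    while len(bounds) > 2:
        nxt = []
        for k in range(0, len(bounds) - 1, 2):
            nxt.append(bounds[k])
            if k + 2 < len(bounds):
                merge_in_place(a, bounds[k], bounds[k + 1], bounds[k + 2])
        nxt.append(bounds[-1])
        bounds = nxt
```

On already-sorted or mostly-sorted input, run detection collapses the work to a handful of merges, which is where this could beat a plain in-place mergesort.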