Better way to manage bulk inserts #75

@coderdj

Sort of a late-night mongo performance brain dump.

Currently, bulk inserts are controlled by the min_insert_size parameter, which tells each thread how many documents must accumulate in its buffer before it performs a bulk insert. This number is difficult to tune and must be optimized separately for each combination of data rate and document size.

A better way to do this might be to introduce something like ms_between_inserts: the minimum number of milliseconds that must pass between two inserts from the same thread, which effectively caps the rate at which inserts can be performed. Each thread tracks when its last insert took place. If less than this amount of time has passed, it stores documents in its buffer rather than sending them to the DB.
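
To make the idea concrete, here is a rough Python sketch of what a per-thread, time-throttled buffer could look like. It is not the project's code: the class name ThrottledInserter, the flush callback, and the injectable clock are all hypothetical, and the sketch assumes the first flush should also wait one full interval. Only ms_between_inserts comes from the proposal above.

```python
import time
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]


class ThrottledInserter:
    """Hypothetical per-thread buffer that flushes at most once per interval."""

    def __init__(
        self,
        flush: Callable[[List[Document]], None],
        ms_between_inserts: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush                      # performs the actual bulk insert
        self._interval_s = ms_between_inserts / 1000.0
        self._clock = clock                      # injectable so it can be tested
        self._buffer: List[Document] = []
        self._last_insert = clock()              # assumption: wait one interval first

    def add(self, doc: Document) -> bool:
        """Buffer doc; flush the whole buffer if the interval has elapsed.

        Returns True if a bulk insert was triggered by this call.
        """
        self._buffer.append(doc)
        now = self._clock()
        if now - self._last_insert < self._interval_s:
            return False                         # too soon: keep buffering
        self._flush(self._buffer)
        self._buffer = []                        # fresh list; flushed batch is untouched
        self._last_insert = now
        return True

    def close(self) -> None:
        """Flush whatever is left, e.g. at the end of a run."""
        if self._buffer:
            self._flush(self._buffer)
            self._buffer = []
```

With something like this, the batch size adapts by itself: at high data rates each flush carries many documents, at low rates only a few, and the insert rate per thread never exceeds 1000 / ms_between_inserts per second. The flush callback could be a driver's bulk insert call (for instance pymongo's Collection.insert_many, if Python were used). One thing the sketch does not handle is a thread that stops receiving documents: the buffer then sits until the next add() or close(), so a real version might also want a timer or a maximum buffer size.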
