@tdenniston
Hello,

We needed the ability to parse larger-than-memory CSV files, so this is my attempt at implementing that (issue #45). It's used something like this:

ParaText::CSV::ColBasedLoader loader;
ParaText::ParseParams params;
params.num_threads = 4;
params.chunked_file_reading = true;
params.file_chunk_size = 1024 * 1024; // Approximate number of bytes to read from the input file per chunk (1 MiB here)

loader.load(inputfile, params);
do {
  std::vector<float> col0vals;
  auto inserter = std::back_inserter(col0vals);
  loader.copy_column<decltype(inserter), size_t>(0, inserter);
} while (loader.load_next());
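
To make the chunking idea concrete for reviewers, here is a minimal, self-contained sketch of reading a CSV stream in roughly fixed-size pieces without splitting a row. It is not the code in this PR and does not use ParaText: `read_chunk` is a hypothetical helper, and it assumes that no quoted field contains an embedded newline.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of ParaText). Reads about `approx_bytes`
// bytes from `in`, then continues to the end of the current line so that
// no CSV row is split across two chunks. Returns false once the input is
// exhausted. Assumption: quoted fields never contain newlines.
bool read_chunk(std::istream& in, std::size_t approx_bytes, std::string& chunk) {
  chunk.assign(approx_bytes, '\0');
  in.read(&chunk[0], static_cast<std::streamsize>(approx_bytes));
  chunk.resize(static_cast<std::size_t>(in.gcount()));
  if (chunk.empty()) {
    return false;  // nothing left to read (or approx_bytes was 0)
  }
  if (chunk.back() != '\n') {
    // The read stopped mid-row: pull in the remainder of that row.
    std::string rest;
    std::getline(in, rest);
    chunk += rest;
    if (!in.eof()) {
      chunk += '\n';  // getline consumed the newline; put it back in the chunk
    }
  }
  return true;
}
```

A caller would loop with `while (read_chunk(in, size, chunk)) { /* parse the rows in chunk */ }`, which plays the same role as the `load` / `load_next` loop above. This is also why the chunk size is only approximate: each chunk is extended to the next row boundary.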

I'd be grateful for any feedback on this, and I'm happy to make any changes you'd like.

This allows for reading through larger-than-memory CSV files.