Conversation
```python
if self.readline_buffer:
    result, self.readline_buffer = self.readline_buffer[:size], self.readline_buffer[size:]
    return result
chunk = self.buffer or self.stream.read(size)
```
I've noticed a new issue here: when the `buffering` argument is passed, the string reader still reads `DEFAULT_BUFFER_SIZE` from the stream. Working on a fix...
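A fix along the lines described above might look like the following minimal sketch. The class and attribute names (`BufferedStringReader`, `self.buffering`) are illustrative assumptions, not json-stream's actual implementation; the point is only that the configured buffering size, rather than `io.DEFAULT_BUFFER_SIZE`, is used when reading from the underlying stream:

```python
import io


class BufferedStringReader:
    # Hedged sketch (illustrative names, not json-stream's actual code):
    # honor a caller-supplied buffering size instead of always reading
    # io.DEFAULT_BUFFER_SIZE from the underlying stream.
    def __init__(self, stream, buffering=-1):
        self.stream = stream
        # Fall back to the io default only when no explicit size is given
        self.buffering = buffering if buffering > 0 else io.DEFAULT_BUFFER_SIZE
        self.buffer = ''
        self.readline_buffer = ''

    def read(self, size=-1):
        if size < 0:
            # Use the configured buffering size, not a hard-coded default
            size = self.buffering
        if self.readline_buffer:
            # Serve (and shrink) any leftover readline buffer first
            result, self.readline_buffer = (
                self.readline_buffer[:size],
                self.readline_buffer[size:],
            )
            return result
        chunk, self.buffer = self.buffer or self.stream.read(size), ''
        return chunk
```

For example, `BufferedStringReader(io.StringIO('{"a": 1}'), buffering=4).read()` would return 4 characters per call instead of up to 8192.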
|
While working on a 2nd attempt at implementing this in the Rust tokenizer (smheidrich/py-json-stream-rs-tokenizer#89), I noticed that my benchmarking test (which uses large randomly generated JSON files) exhibits transient failures for the Python tokenizer from this branch: pytest error log

@daggaz Could that be related to the bug you mentioned in #45 (comment)?
|
hmm... I need to get back on this!
|
Maybe worth mentioning: while doing benchmarks to check for performance regressions in smheidrich/py-json-stream-rs-tokenizer#91, I noticed that this branch is only ~3-4 times slower than the Rust tokenizer. I thought I had a regression at first, but after testing against the other branches, those remained 10-15 times slower. So I guess doing
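For reference, a relative-speed comparison like the one above can be taken with a small timing harness. This is a hedged sketch of such a harness, not the benchmark actually used in those PRs; it times stdlib `json.load` as a stand-in, and swapping in the json-stream parser (or the Rust tokenizer build) in place of `json.load` would yield the kind of slowdown ratios discussed:

```python
import io
import json
import timeit


def benchmark(parse, payload, repeats=5):
    # Time `parse` over a fresh StringIO of `payload`; return the best
    # of `repeats` runs to reduce noise from transient system load.
    return min(
        timeit.timeit(lambda: parse(io.StringIO(payload)), number=1)
        for _ in range(repeats)
    )


# Randomly-structured JSON document (stand-in for the large generated files)
payload = json.dumps([{"id": i, "value": i * 0.5} for i in range(10_000)])

baseline = benchmark(json.load, payload)
print(f"stdlib json.load, best of 5: {baseline:.4f}s")
```

Dividing one parser's best time by another's gives the "N times slower" figures quoted above.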
In response to #45