jakedorne
fixes #100

Currently, the parquet file batcher calls hasNext while seeking the file, and hasNext itself checks whether seeked == true. As a result, the file reader reads the second batch repeatedly and never completes. Using the existing hasNextRecord instead fixes this, and I assume it was originally intended to be used here.
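
To illustrate the failure mode, here is a minimal, self-contained sketch. It is a toy model, not the project's actual code: the `Reader` class, its fields, the batch-boundary rule inside `hasNext`, and the `run` polling loop are all assumptions made up for illustration. Only the names hasNext, hasNextRecord, and seeked come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of a batched reader; not the real ParquetFileReader.
class BatchSeekSketch {

    static class Reader {
        private final int total;      // number of records in the "file"
        private final int batchSize;  // records returned per poll
        private final boolean fixed;  // true: seek uses hasNextRecord()
        private int underlying = 0;   // position in the underlying file
        private int offset = 0;       // records consumed so far
        private boolean seeked = false;

        Reader(int total, int batchSize, boolean fixed) {
            this.total = total;
            this.batchSize = batchSize;
            this.fixed = fixed;
        }

        // Raw check: is there another record in the file?
        boolean hasNextRecord() {
            return underlying < total;
        }

        // Batch-aware check (assumed rule): stop at a batch boundary
        // unless the reader has just been seeked.
        boolean hasNext() {
            boolean atBoundary = offset > 0 && offset % batchSize == 0;
            return (!atBoundary || seeked) && hasNextRecord();
        }

        int next() {
            seeked = false;
            offset++;
            return underlying++;
        }

        // Re-read from the start until the target offset is reached.
        void seek(int target) {
            underlying = 0;
            offset = 0;
            seeked = false;
            // Buggy variant: hasNext() returns false at the first batch
            // boundary because seeked is still false, so the seek stops early.
            while ((fixed ? hasNextRecord() : hasNext()) && offset < target) {
                underlying++;
                offset++;
            }
            seeked = true;
        }

        int offset() {
            return offset;
        }
    }

    // Simulates repeated polls; returns the first record of each batch read.
    static List<Integer> run(boolean fixed, int total, int batchSize, int maxPolls) {
        List<Integer> batchStarts = new ArrayList<>();
        int committed = 0;
        for (int poll = 0; poll < maxPolls; poll++) {
            Reader reader = new Reader(total, batchSize, fixed);
            reader.seek(committed);
            if (!reader.hasNext()) {
                break; // file fully processed
            }
            boolean first = true;
            while (reader.hasNext()) {
                int record = reader.next();
                if (first) {
                    batchStarts.add(record);
                    first = false;
                }
            }
            committed = reader.offset();
        }
        return batchStarts;
    }

    public static void main(String[] args) {
        System.out.println(run(true, 6, 2, 5));  // [0, 2, 4]
        System.out.println(run(false, 6, 2, 5)); // [0, 2, 2, 2, 2]
    }
}
```

In this model, the buggy seek never gets past the first batch boundary, so every poll after the first re-reads the second batch until the poll cap is hit. With hasNextRecord in the seek loop, each poll starts at the committed offset and the run ends once the file is exhausted.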

This PR doesn't contain tests, sorry. To reproduce this in tests I had to replace the mocking with stubs, which broke other tests, and fixing them would be a bigger change than I think this fix warrants. Here is a commit showing what I did to reproduce.

Thanks!

Development

Successfully merging this pull request may close these issues.

ParquetFileReader - Maximum of two batches processed using file_reader.batch_size set to > 0