@@ -304,6 +304,8 @@ Statistics are enabled by default for all columns. You can disable statistics fo
  all columns or specific columns using ``disable_statistics`` on the builder.
  There is a ``max_statistics_size`` which limits the maximum number of bytes that
  may be used for min and max values, useful for types like strings or binary blobs.
+ If a column has enabled page index using ``enable_write_page_index``, then it does
+ not write statistics to the page header because it is duplicated in the ColumnIndex.

  There are also Arrow-specific settings that can be configured with
  :class:`parquet::ArrowWriterProperties`:
@@ -573,20 +575,14 @@ Miscellaneous
  +--------------------------+----------+----------+---------+
  | Feature                  | Reading  | Writing  | Notes   |
  +==========================+==========+==========+=========+
- | Column Index             | ✓        |          | \(1)    |
+ | Column Index             | ✓        | ✓        |         |
  +--------------------------+----------+----------+---------+
- | Offset Index             | ✓        |          | \(1)    |
+ | Offset Index             | ✓        | ✓        |         |
  +--------------------------+----------+----------+---------+
- | Bloom Filter             | ✓        | ✓        | \(2)    |
+ | Bloom Filter             | ✓        | ✓        | \(1)    |
  +--------------------------+----------+----------+---------+
- | CRC checksums            | ✓        | ✓        | \(3)    |
+ | CRC checksums            | ✓        | ✓        |         |
  +--------------------------+----------+----------+---------+

- * \(1) Access to the Column and Offset Index structures is provided, but
-   data read APIs do not currently make any use of them.
-
- * \(2) APIs are provided for creating, serializing and deserializing Bloom
+ * \(1) APIs are provided for creating, serializing and deserializing Bloom
    Filters, but they are not integrated into data read APIs.
-
- * \(3) For now, only the checksums of V1 Data Pages and Dictionary Pages
-   are computed.