Commit 7d80037

update docstring
1 parent 2afc37d commit 7d80037

2 files changed: 11 additions, 12 deletions

cpp/src/parquet/properties.h

Lines changed: 4 additions & 1 deletion
@@ -525,7 +525,8 @@ class PARQUET_EXPORT WriterProperties {
 /// Enable writing page index in general for all columns. Default disabled.
 ///
 /// Page index contains statistics for data pages and can be used to skip pages
-/// when scanning data in ordered and unordered columns.
+/// when scanning data in ordered and unordered columns. Note that it does not
+/// write statistics to the page header once page index is enabled.
 ///
 /// Please check the link below for more details:
 /// https://github.com/apache/parquet-format/blob/master/PageIndex.md
@@ -541,6 +542,8 @@ class PARQUET_EXPORT WriterProperties {
 }
 
 /// Enable writing page index for column specified by `path`. Default disabled.
+/// Note that it does not write statistics to the page header once page index is
+/// enabled.
 Builder* enable_write_page_index(const std::string& path) {
 page_index_enabled_[path] = true;
 return this;

docs/source/cpp/parquet.rst

Lines changed: 7 additions & 11 deletions
@@ -304,6 +304,8 @@ Statistics are enabled by default for all columns. You can disable statistics fo
 all columns or specific columns using ``disable_statistics`` on the builder.
 There is a ``max_statistics_size`` which limits the maximum number of bytes that
 may be used for min and max values, useful for types like strings or binary blobs.
+If a column has enabled page index using ``enable_write_page_index``, then it does
+not write statistics to the page header because it is duplicated in the ColumnIndex.
 
 There are also Arrow-specific settings that can be configured with
 :class:`parquet::ArrowWriterProperties`:
@@ -573,20 +575,14 @@ Miscellaneous
 +--------------------------+----------+----------+---------+
 | Feature | Reading | Writing | Notes |
 +==========================+==========+==========+=========+
-| Column Index || | \(1) |
+| Column Index || | |
 +--------------------------+----------+----------+---------+
-| Offset Index || | \(1) |
+| Offset Index || | |
 +--------------------------+----------+----------+---------+
-| Bloom Filter ||| \(2) |
+| Bloom Filter ||| \(1) |
 +--------------------------+----------+----------+---------+
-| CRC checksums ||| \(3) |
+| CRC checksums ||| |
 +--------------------------+----------+----------+---------+
 
-* \(1) Access to the Column and Offset Index structures is provided, but
-data read APIs do not currently make any use of them.
-
-* \(2) APIs are provided for creating, serializing and deserializing Bloom
+* \(1) APIs are provided for creating, serializing and deserializing Bloom
 Filters, but they are not integrated into data read APIs.
-
-* \(3) For now, only the checksums of V1 Data Pages and Dictionary Pages
-are computed.
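
The rule this commit documents (a column with page index enabled does not also get statistics in its page headers) can be illustrated with a small standalone sketch. This is not the Parquet C++ implementation: `SketchWriterProperties`, `writes_page_header_statistics`, and `check_documented_rule` are hypothetical names invented for illustration, and only the `enable_write_page_index` and `disable_statistics` method names and the `page_index_enabled_` map are taken from the diff above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the builder touched by this commit. It is NOT
// parquet::WriterProperties; it only models the rule stated in the docstring.
class SketchWriterProperties {
 public:
  // Models enable_write_page_index(): page index for all columns.
  SketchWriterProperties* enable_write_page_index() {
    default_page_index_enabled_ = true;
    return this;
  }

  // Models enable_write_page_index(path): page index for one column.
  SketchWriterProperties* enable_write_page_index(const std::string& path) {
    page_index_enabled_[path] = true;
    return this;
  }

  // Models disable_statistics(path): no statistics at all for one column.
  SketchWriterProperties* disable_statistics(const std::string& path) {
    statistics_enabled_[path] = false;
    return this;
  }

  bool page_index_enabled(const std::string& path) const {
    auto it = page_index_enabled_.find(path);
    return it != page_index_enabled_.end() ? it->second
                                           : default_page_index_enabled_;
  }

  // Statistics are enabled by default, as the documentation states.
  bool statistics_enabled(const std::string& path) const {
    auto it = statistics_enabled_.find(path);
    return it != statistics_enabled_.end() ? it->second : true;
  }

  // Assumed reading of the documented rule: page-header statistics are
  // written only when statistics are on and page index is off, because the
  // ColumnIndex would otherwise duplicate them.
  bool writes_page_header_statistics(const std::string& path) const {
    return statistics_enabled(path) && !page_index_enabled(path);
  }

 private:
  bool default_page_index_enabled_ = false;
  std::unordered_map<std::string, bool> page_index_enabled_;
  std::unordered_map<std::string, bool> statistics_enabled_;
};

// Exercises the rule on three hypothetical column paths.
inline void check_documented_rule() {
  SketchWriterProperties props;
  props.enable_write_page_index("indexed")->disable_statistics("no_stats");
  assert(!props.writes_page_header_statistics("indexed"));   // page index on
  assert(props.writes_page_header_statistics("plain"));      // defaults
  assert(!props.writes_page_header_statistics("no_stats"));  // stats off
}
```

Calling `check_documented_rule()` from a `main` function runs the three checks. The sketch keeps the statistics and page index settings as separate per-column maps, which is an assumption about how the two options interact rather than a description of the real writer.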
