docs/source/library-user-guide/upgrading.md (80 additions, 0 deletions)
@@ -182,6 +182,86 @@ let indices = projection_exprs.column_indices();
_execution plan_ of the query. With this release, `DESCRIBE query` now outputs
the computed _schema_ of the query, consistent with the behavior of `DESCRIBE table_name`.

### Refactoring of `FileSource` constructors and `FileScanConfigBuilder` to accept schemas upfront

The way schemas are passed to file sources and scan configurations has been significantly refactored. File sources now require the schema (including partition columns) to be provided at construction time, and `FileScanConfigBuilder` no longer takes a separate schema parameter.

**Who is affected:**

- Users who create `FileScanConfig` or file sources (`ParquetSource`, `CsvSource`, `JsonSource`, `AvroSource`) directly
- Users who implement a custom `FileFormat`

**Key changes:**

1. **FileSource constructors now require TableSchema**: All built-in file sources now take the schema in their constructor:

   ```diff
   - let source = ParquetSource::default();
   + let source = ParquetSource::new(table_schema);
   ```

2. **FileScanConfigBuilder no longer takes schema as a parameter**: The schema is now passed via the FileSource:

   ```diff
   - FileScanConfigBuilder::new(url, schema, source)
   + FileScanConfigBuilder::new(url, source)
   ```

3. **Partition columns are now part of TableSchema**: The `with_table_partition_cols()` method has been removed from `FileScanConfigBuilder`. Partition columns are now passed as part of the `TableSchema` to the FileSource constructor:
```diff
+ let source = Arc::new(CsvSource::new(table_schema).with_csv_options(options));
+ let config = FileScanConfigBuilder::new(url, source)
      .build();
```
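
To make the ownership change concrete, here is a minimal, std-only sketch of the idea behind the three key changes: the schema (file columns plus partition columns) is handed to the source at construction, and the builder reads it from the source instead of taking it as a parameter. Every type below (`Field`, `TableSchema`, `Source`, `ScanConfigBuilder`) is a hypothetical stand-in written for illustration, not the DataFusion definition, and the column ordering (file columns first, then partition columns) is an assumption of this model.

```rust
// Std-only model: every type here is a hypothetical stand-in,
// not a DataFusion definition.

/// A column described by its name and a type label.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: String,
}

impl Field {
    pub fn new(name: &str, data_type: &str) -> Self {
        Field {
            name: name.to_string(),
            data_type: data_type.to_string(),
        }
    }
}

/// Stand-in for `TableSchema`: columns stored in the files plus
/// partition columns.
#[derive(Debug, Clone, PartialEq)]
pub struct TableSchema {
    file_fields: Vec<Field>,
    partition_fields: Vec<Field>,
}

impl TableSchema {
    pub fn new(file_fields: Vec<Field>, partition_fields: Vec<Field>) -> Self {
        TableSchema {
            file_fields,
            partition_fields,
        }
    }

    /// All columns of the table. Assumption of this model: file columns
    /// come first, followed by partition columns.
    pub fn table_fields(&self) -> Vec<Field> {
        self.file_fields
            .iter()
            .chain(self.partition_fields.iter())
            .cloned()
            .collect()
    }
}

/// Stand-in for a file source that receives its schema at construction.
#[derive(Debug)]
pub struct Source {
    table_schema: TableSchema,
}

impl Source {
    pub fn new(table_schema: TableSchema) -> Self {
        Source { table_schema }
    }
}

/// Stand-in for the scan config builder: it takes only a location and a
/// source, and reads the schema from the source.
#[derive(Debug)]
pub struct ScanConfigBuilder {
    url: String,
    source: Source,
}

impl ScanConfigBuilder {
    pub fn new(url: &str, source: Source) -> Self {
        ScanConfigBuilder {
            url: url.to_string(),
            source,
        }
    }

    pub fn url(&self) -> &str {
        &self.url
    }

    /// Column names visible to the scan, taken from the source's schema.
    pub fn column_names(&self) -> Vec<String> {
        self.source
            .table_schema
            .table_fields()
            .into_iter()
            .map(|f| f.name)
            .collect()
    }
}
```

In this model, a caller builds a `TableSchema` once, passes it to `Source::new`, and then calls `ScanConfigBuilder::new(url, source)`. There is no separate schema argument and no partition-column setter on the builder, which mirrors the shape of the migration described above.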
### Introduction of `TableSchema` and changes to `FileSource::with_schema()` method
A new `TableSchema` struct has been introduced in the `datafusion-datasource` crate to better manage table schemas with partition columns. This struct helps distinguish between: