-
Notifications
You must be signed in to change notification settings - Fork 20
feat: Add support for interpolator from files #335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for running the interpolator on GRIB files from forecaster models, enabling interpolation on file-based inputs rather than just analysis data.
Key changes:
- Added support for interpolation from GRIB files with different reference date handling
- Implemented distributed computing support with process group initialization and cleanup
- Enhanced pre-processors to work with State objects instead of just FieldList objects
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.
Show a summary per file
File | Description |
---|---|
src/anemoi/inference/runners/interpolator.py | Added file-based interpolation support and device management |
src/anemoi/inference/runners/default.py | Added distributed process group cleanup and reference date handling |
src/anemoi/inference/runner.py | Enhanced with distributed computing, improved tensor operations, and better device handling |
src/anemoi/inference/pre_processors/no_missing_values.py | Updated to process State objects instead of FieldList |
src/anemoi/inference/pre_processors/forward_transform_filter.py | Updated to process State objects instead of FieldList |
src/anemoi/inference/pre_processors/extract.py | New processor for extracting subsets of data using masks or slices |
src/anemoi/inference/inputs/gribfile.py | Added support for reference date index parameter |
src/anemoi/inference/inputs/ekd.py | Enhanced state creation with reference date support and improved pre-processing |
src/anemoi/inference/inputs/cutout.py | Added kwargs support for input state creation |
src/anemoi/inference/device.py | Enhanced device selection with distributed computing support |
src/anemoi/inference/config/init.py | Relaxed configuration validation to allow extra fields |
pyproject.toml | Updated omegaconf version constraint |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
||
super().__init__(config) | ||
|
||
self.from_analysis = True if hasattr(config.input, "test") else False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use direct boolean assignment instead of conditional expression. This can be simplified to self.from_analysis = hasattr(config.input, "test")
.
self.from_analysis = True if hasattr(config.input, "test") else False | |
self.from_analysis = hasattr(config.input, "test") |
Copilot uses AI. Check for mistakes.
input_state = input.create_input_state( | ||
date=window_start_date, ref_date_index=0 | ||
) # for interpolator, the first date is present and the last is future. For AR models with multiple input states, the last date is the current date. This is why the distinction is made here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inline comment is too long and detailed for a single line. Consider moving this explanation to a docstring or a separate comment block above the code for better readability.
input_state = input.create_input_state( | |
date=window_start_date, ref_date_index=0 | |
) # for interpolator, the first date is present and the last is future. For AR models with multiple input states, the last date is the current date. This is why the distinction is made here. | |
# For interpolator, the first date is present and the last is future. | |
# For AR models with multiple input states, the last date is the current date. | |
# This is why the distinction is made here. | |
input_state = input.create_input_state( | |
date=window_start_date, ref_date_index=0 | |
) |
Copilot uses AI. Check for mistakes.
# No need to prepare next input tensor if we are at the last step | ||
if is_last_step: | ||
break | ||
break # noqa: F701 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The noqa comment for F701 (break outside loop) is incorrect here since this break statement is properly inside a for loop. This noqa directive should be removed.
break # noqa: F701 | |
break |
Copilot uses AI. Check for mistakes.
if check[pmask_in_np].any(): | ||
# Report which ones are conflicting | ||
conflicting = [self._input_tensor_by_name[i] for i in pmask_in_np[check[pmask_in_np]]] | ||
raise AssertionError(f"Overwriting prognostic input slots for {conflicting}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message could be more descriptive. Consider adding context about what this means and potential solutions, such as 'Attempting to overwrite prognostic input slots that have already been set for variables: {conflicting}. This may indicate a configuration issue with variable mappings.'
raise AssertionError(f"Overwriting prognostic input slots for {conflicting}") | |
raise AssertionError( | |
f"Attempting to overwrite prognostic input slots that have already been set for variables: {conflicting}. " | |
"This may indicate a configuration issue with variable mappings. " | |
"Please check your variable configuration to ensure that each prognostic input slot is only set once." | |
) |
Copilot uses AI. Check for mistakes.
dates = sorted([to_datetime(d) for d in dates]) | ||
date_to_index = {d.isoformat(): i for i, d in enumerate(dates)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code duplicates the sorting and date_to_index creation that was already done at lines 235-237. The duplicated logic should be removed to avoid redundancy and potential inconsistencies.
dates = sorted([to_datetime(d) for d in dates]) | |
date_to_index = {d.isoformat(): i for i, d in enumerate(dates)} | |
# dates and date_to_index are already sorted and created earlier |
Copilot uses AI. Check for mistakes.
Thanks for this! As you say, I think we first need to discuss how to bring this functionality into the codebase. I think this would be a good topic for in two weeks when Harrison and I are in Zurich ? |
Description
This PR adds support for running the interpolator on (GRIB) files from a forecaster model.
What issue or task does this change relate to?
N/A
Additional notes
This draft includes all the changes introduced by @OpheliaMiralles in anemoi-inference in order to run interpolation. We should have a discussion on what changes should be factored out in a separate PR (e.g. support for model sharding).
As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/
By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.