Skip to content

Conversation

dnerini
Copy link

@dnerini dnerini commented Sep 29, 2025

Description

This PR adds support for running the interpolator on (GRIB) files from a forecaster model.

What issue or task does this change relate to?

N/A

Additional notes

This draft includes all the changes introduced by @OpheliaMiralles in anemoi-inference in order to run interpolation. We should have a discussion on what changes should be factored out in a separate PR (e.g. support for model sharding).

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

@github-project-automation github-project-automation bot moved this to To be triaged in Anemoi-dev Sep 29, 2025
@github-actions github-actions bot added dependencies Pull requests that update a dependency file config labels Sep 29, 2025
@dnerini dnerini requested a review from frazane September 29, 2025 08:41
@dnerini dnerini changed the title Add support for interpolator from files feat: Add support for interpolator from files Sep 29, 2025
@dnerini dnerini requested a review from Copilot September 29, 2025 11:01
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for running the interpolator on GRIB files from forecaster models, enabling interpolation on file-based inputs rather than just analysis data.

Key changes:

  • Added support for interpolation from GRIB files with different reference date handling
  • Implemented distributed computing support with process group initialization and cleanup
  • Enhanced pre-processors to work with State objects instead of just FieldList objects

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/anemoi/inference/runners/interpolator.py Added file-based interpolation support and device management
src/anemoi/inference/runners/default.py Added distributed process group cleanup and reference date handling
src/anemoi/inference/runner.py Enhanced with distributed computing, improved tensor operations, and better device handling
src/anemoi/inference/pre_processors/no_missing_values.py Updated to process State objects instead of FieldList
src/anemoi/inference/pre_processors/forward_transform_filter.py Updated to process State objects instead of FieldList
src/anemoi/inference/pre_processors/extract.py New processor for extracting subsets of data using masks or slices
src/anemoi/inference/inputs/gribfile.py Added support for reference date index parameter
src/anemoi/inference/inputs/ekd.py Enhanced state creation with reference date support and improved pre-processing
src/anemoi/inference/inputs/cutout.py Added kwargs support for input state creation
src/anemoi/inference/device.py Enhanced device selection with distributed computing support
src/anemoi/inference/config/init.py Relaxed configuration validation to allow extra fields
pyproject.toml Updated omegaconf version constraint

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.


super().__init__(config)

self.from_analysis = True if hasattr(config.input, "test") else False
Copy link
Preview

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use direct boolean assignment instead of conditional expression. This can be simplified to self.from_analysis = hasattr(config.input, "test").

Suggested change
self.from_analysis = True if hasattr(config.input, "test") else False
self.from_analysis = hasattr(config.input, "test")

Copilot uses AI. Check for mistakes.

Comment on lines +166 to +168
input_state = input.create_input_state(
date=window_start_date, ref_date_index=0
) # for interpolator, the first date is present and the last is future. For AR models with multiple input states, the last date is the current date. This is why the distinction is made here.
Copy link
Preview

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inline comment is too long and detailed for a single line. Consider moving this explanation to a docstring or a separate comment block above the code for better readability.

Suggested change
input_state = input.create_input_state(
date=window_start_date, ref_date_index=0
) # for interpolator, the first date is present and the last is future. For AR models with multiple input states, the last date is the current date. This is why the distinction is made here.
# For interpolator, the first date is present and the last is future.
# For AR models with multiple input states, the last date is the current date.
# This is why the distinction is made here.
input_state = input.create_input_state(
date=window_start_date, ref_date_index=0
)

Copilot uses AI. Check for mistakes.

# No need to prepare next input tensor if we are at the last step
if is_last_step:
break
break # noqa: F701
Copy link
Preview

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The noqa comment for F701 (break outside loop) is incorrect here since this break statement is properly inside a for loop. This noqa directive should be removed.

Suggested change
break # noqa: F701
break

Copilot uses AI. Check for mistakes.

if check[pmask_in_np].any():
# Report which ones are conflicting
conflicting = [self._input_tensor_by_name[i] for i in pmask_in_np[check[pmask_in_np]]]
raise AssertionError(f"Overwriting prognostic input slots for {conflicting}")
Copy link
Preview

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message could be more descriptive. Consider adding context about what this means and potential solutions, such as 'Attempting to overwrite prognostic input slots that have already been set for variables: {conflicting}. This may indicate a configuration issue with variable mappings.'

Suggested change
raise AssertionError(f"Overwriting prognostic input slots for {conflicting}")
raise AssertionError(
f"Attempting to overwrite prognostic input slots that have already been set for variables: {conflicting}. "
"This may indicate a configuration issue with variable mappings. "
"Please check your variable configuration to ensure that each prognostic input slot is only set once."
)

Copilot uses AI. Check for mistakes.

Comment on lines +279 to +280
dates = sorted([to_datetime(d) for d in dates])
date_to_index = {d.isoformat(): i for i, d in enumerate(dates)}
Copy link
Preview

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code duplicates the sorting and date_to_index creation that was already done at lines 235-237. The duplicated logic should be removed to avoid redundancy and potential inconsistencies.

Suggested change
dates = sorted([to_datetime(d) for d in dates])
date_to_index = {d.isoformat(): i for i, d in enumerate(dates)}
# dates and date_to_index are already sorted and created earlier

Copilot uses AI. Check for mistakes.

@gmertes
Copy link
Member

gmertes commented Sep 30, 2025

Thanks for this! As you say, I think we first need to discuss how to bring this functionality into the codebase. I think this would be a good topic for in two weeks when Harrison and I are in Zurich ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
config dependencies Pull requests that update a dependency file
Projects
Status: To be triaged
Development

Successfully merging this pull request may close these issues.

3 participants