Skip to content

Conversation

okrause
Copy link
Contributor

@okrause okrause commented Aug 29, 2025

Hello,

here is a PR to add support for Google Cloud NetApp Volume, which provides NFS/SMB shared file-system.

It has two new modules:

  • netapp-storage-pool: A capacity container to host volumes
  • netapp-volume: Volumes are file-systems

The PR contains code and documentation for the two new modules, an example blueprint and updates to the blueprint readme and network_storage.md.

I am unsure if this needs to be a community contribution or a core one. GCNV is a Google first party service and I am part of the GCNV product team. So I consider it "core" contribution, but in the end it depends on how the CT team defines core vs community. Please advise.

I also need guidance if I need to add tests.

Looking forward to the review process.

@okrause okrause requested review from samskillman and a team as code owners August 29, 2025 12:46
@okrause
Copy link
Contributor Author

okrause commented Sep 17, 2025

Hello, I am back from vacation. Pushed an update to fix an merge conflict caused my changes in the develop branch.

@okrause
Copy link
Contributor Author

okrause commented Sep 25, 2025

Question: We did a few changes in the service recently which caused issues in older google provider versions. We recommend users to use 7.4.0+ now. CT seems to have some implicit constraints which do not allow providers >=7. Is this a known issue? When can I use a 7.x based provider?

Copy link
Collaborator

@bytetwin bytetwin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, netapp-storage-pool as well as netapp-volume should be under community/modules instead of modules. Reason being this module is not directly supported by the toolkit team.

I also think there should be an integration test for the blueprint if we are planning to have netapp as a mainstream storage option in toolkit


[gke-g4]: ../examples/gke-g4

### [netapp-volumes.yaml] ![core-badge]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

community badge instead of core

@@ -0,0 +1,65 @@
#!/bin/bash
# Copyright 2023 Google LLC
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2025

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this file used ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unclear to me. But Filestore is shipping it as well, so I decided to add a customized version for Netapp Volumes as well. It is unclear to me how exactly it is used in FileStore, but I thought there will be a reason. I assume is is part of the metadata which is used when VMs are deployed. The name says something about ansible. I assume it is part of the magic which makes CT mount the filesystems in the VMs.

}
}
provider_meta "google" {
module_name = "blueprints/terraform/hpc-toolkit:netapp-volume/v1.60.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v1.67.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please elaborate what this is and how the version numbers come to be? I see I commented this block out in netapp-storage-pool entirely.
I simply try to adhere to the standards in CT, but this is a part of which I don't understand the purpose and how to use it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see response in the other comment

module_name = "blueprints/terraform/hpc-toolkit:netapp-volume/v1.60.0"
}
provider_meta "google-beta" {
module_name = "blueprints/terraform/hpc-toolkit:netapp-volume/v1.60.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v1.67.0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from the current version of the toolkit in main. This helps us understand the versions of the modules that are currently used in the wild and prioritize support.

@okrause
Copy link
Contributor Author

okrause commented Oct 1, 2025

I also think there should be an integration test for the blueprint if we are planning to have netapp as a mainstream storage option in toolkit

Can you please share some info on integration test to get me started? I need to understand how they are done and how they work.

@okrause
Copy link
Contributor Author

okrause commented Oct 1, 2025

IMO, netapp-storage-pool as well as netapp-volume should be under community/modules instead of modules. Reason being this module is not directly supported by the toolkit team.

@bytetwin I understand that reasoning. But I want to understand the role of Filestore here. Is Filestore maintained by the toolkit team? If yes, why not pick up NetApp Volumes as well, since it is a more capable and feature rich native Google NFS and SMB file service?

@cboneti
Copy link
Member

cboneti commented Oct 1, 2025

Hi @okrause, please push the small changes first (version differences), year for the header and let's run the basic tests here. After that, I can guide you on how to write integration tests. We are still debating if this module should be in core or in community.

@okrause
Copy link
Contributor Author

okrause commented Oct 2, 2025

Hi @okrause, please push the small changes first (version differences), year for the header and let's run the basic tests here. After that, I can guide you on how to write integration tests. We are still debating if this module should be in core or in community.

Thank you for the feedback. I did push some changes and looking forward to resolve open questions with you. On my list of item to clarify are:

  • core or community
  • integration tests
  • What is provider_meta information for? How do I know which version number to use?
  • I want to use google>=7.4.0, but currently getting constraint conflicts. How to resolve.

Thank you @cboneti and @bytetwin for supporting this PR review.

@cboneti
Copy link
Member

cboneti commented Oct 2, 2025

@okrause
We are discussing internally the core vs community, when Hanu is back on Monday, we will converge to a final decision.

What is provider_meta information for? How do I know which version number to use?
I believe I answered that in the commands. You should use the current version of main.

I want to use google>=7.4.0, but currently getting constraint conflicts. How to resolve.
I am not sure, I haven't tried. I believe the toolkit team might have the answer for that. In the past, I saw this type of problems when the provider made breaking changes that would require significant rework within the toolkit code. This might be the case here but the team will know better (another topic to be discussed on Monday).

As for regression tests, if this module goes to core, then it will be absolutely required.
Regression tests are defined in tools/cloud-build/daily-tests, PTAL. Unless things changed dramatically there, the documentation is probably not complete. It will also be very hard for you to try things out without the general infrastructure so realistically, it might be a better exercise for us to think of the specific post-deploy tests we should run... like in tools/cloud-build/daily-tests/ansible_playbooks/test-validation (see in tools/cloud-build/daily-tests/builds/lustre-slurm.yaml + tools/cloud-build/daily-tests/tests/lustre-slurm.yml an example of how these are put together through ansible).

@cboneti cboneti added release-key-new-features Added to release notes under the "Key New Features" heading. release-new-modules Added to release notes under the "New Modules" heading. labels Oct 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-key-new-features Added to release notes under the "Key New Features" heading. release-new-modules Added to release notes under the "New Modules" heading.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants