This Terraform module deploys a Qumulo cluster on Oracle Cloud Infrastructure (OCI). It provides a scalable and highly available file storage solution with enterprise-grade features.
The deployment creates the following components:
- Qumulo cluster nodes across multiple fault domains
- Network security groups and rules
- Required IAM policies and groups
- A Linux machine running Terraform version 1.12.2 or later
- OCI CLI and `jq` installed
You will need the following assets provided by Qumulo:
- `qumulo-core.rpm` (version 7.4.3B or later)
OCI Object Storage User Permissions:
This account is used by the cluster nodes to access the Object Storage buckets used for persistent storage.
One of the following options must be satisfied to proceed:
A. Deploying user has the following permissions:
- manage users in TENANCY
- manage groups in TENANCY
- manage policies in TENANCY
B. Precreated User, Group, and Identity Policy
- Create a new User
- Create a Customer Secret Key for this user, retain these values for use below
- Create a Group that includes this user
- Create an Identity Policy that includes the statement:
"Allow group <group created above> to manage object-family in compartment id <cluster deployment compartment> where target.bucket.name = <bucket-prefix in output from persistent storage terraform stack>-bucket-*/"
- Set the variables `custom_secret_key_id` and `custom_secret_key` to the Customer Secret Key OCID and key, respectively.
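In `terraform.tfvars`, those two variables would look like the following sketch; the values shown are placeholders, not real credentials:

```hcl
# Hypothetical values -- substitute the OCID and key returned
# when the Customer Secret Key was created for this user.
custom_secret_key_id = "ocid1.credential.oc1..exampleuniqueid"
custom_secret_key    = "examplebase64secretkeyvalue="
```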
OCI Cluster User Permissions:
These permissions allow the cluster nodes and provisioner node to interact with the resources necessary for cluster operations.
One of the following options must be satisfied to proceed:
A. Deploying user has the following permissions:
- manage dynamic-groups in TENANCY
- manage policies in TENANCY
B. Precreated Dynamic Group and Identity Policy
- Create a Dynamic Group that includes all compute instances in the deployment compartment
- Create an Identity Policy that includes the following permissions. This policy must be created in a Compartment that contains both the Compartment where the cluster resources will be deployed and the Compartment where the target Subnet is deployed.
"Allow dynamic-group <dynamic group name> to read secret-bundles in compartment id <deployment compartment ocid>"
"Allow dynamic-group <dynamic group name> to use secrets in compartment id <deployment compartment ocid>"
"Allow dynamic-group <dynamic group name> to manage virtual-network-family in compartment id <network compartment ocid>"
"Allow dynamic-group <dynamic group name> to use instances in compartment id <deployment compartment ocid>"
- Set the variable `create_dynamic_group_and_identity_policy` to `false` in the `terraform.tfvars` file.
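When using the precreated dynamic group and policy, the corresponding `terraform.tfvars` entry is a single flag:

```hcl
# Tell the module not to create its own dynamic group and identity
# policy, because precreated ones (option B above) will be used.
create_dynamic_group_and_identity_policy = false
```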
OCI Configuration:
- Valid OCI credentials set for the DEFAULT profile
- The region in your profile should match the region of the deployment
- See OCI CLI Configuration Guide
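A minimal `~/.oci/config` DEFAULT profile might look like the sketch below; every value shown is a placeholder, and the OCI CLI Configuration Guide covers the full set of options:

```ini
[DEFAULT]
user        = ocid1.user.oc1..exampleuniqueid
fingerprint = aa:bb:cc:dd:ee:ff:00:11:22:33:44:55:66:77:88:99
key_file    = ~/.oci/oci_api_key.pem
tenancy     = ocid1.tenancy.oc1..exampleuniqueid
region      = us-phoenix-1
```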
Network Requirements:
- Subnet with outbound internet access for dependencies and Qumulo monitoring
- Access to KMS vault for cluster deployment
The persistent storage of the Qumulo cluster is segregated under a subdirectory, `persistent-storage`,
to give you flexibility to scale compute resources without destroying the
persistent storage. The output of the persistent storage deployment will be automatically passed to the compute resource deployment in the next step.
```shell
cd persistent-storage
```
Edit `terraform.tfvars` to set your environment-specific configurations:
```hcl
# Example configuration
region           = "us-phoenix-1"
compartment_ocid = "ocid.xyz"
```
```shell
terraform init
terraform plan
terraform apply
```
Alternatively, you can set all the Terraform variables in `terraform.tfvars` under the parent directory and run `terraform plan/apply -var-file=../terraform.tfvars`.
Warning: make sure that the persistent storage used in this step was freshly deployed. Reusing persistent storage that was associated with any previous compute resources is not supported.
```shell
cd ../
```
Edit `terraform.tfvars` to set your environment-specific configurations:
```hcl
# Example configuration
compartment_id       = "ocid1.compartment.oc1..example"
q_cluster_name       = "qumulo-cluster"
q_node_count         = 3
q_cluster_node_count = 3
```
```shell
terraform init
terraform plan
terraform apply
```
Deployment typically takes 10-20 minutes to complete.
One of Qumulo's advantages is that compute resources can be scaled independently of the storage capacity. If your workload demands additional compute nodes, you can simply add them to your cluster. If they are no longer needed, you can later remove them.
Edit `terraform.tfvars` to set the target number of nodes:
```hcl
# Example configuration; previously both values were 3
q_node_count         = 5
q_cluster_node_count = 5
```
```shell
terraform plan
terraform apply
```
Node add typically takes 10-20 minutes to complete.
Node removal happens in two steps to ensure that all filesystem dependencies are cleaned up before the node resources are torn down.
Edit `terraform.tfvars` to reduce the number of nodes in the cluster:
```hcl
# Example configuration; previously both values were 5
q_node_count         = 5
q_cluster_node_count = 3
```
```shell
terraform plan
terraform apply
```
Removing nodes from the cluster typically takes 5-10 minutes to complete.
Edit `terraform.tfvars` to reduce the number of nodes deployed:
```hcl
# Example configuration
q_node_count         = 3
q_cluster_node_count = 3
```
```shell
terraform plan
terraform apply
```
Tearing down node resources typically takes less than 5 minutes to complete.
The maximum soft capacity limit on the cluster is determined by the number of object storage buckets configured on the backend. Each object storage bucket supports up to 500TB of capacity. If you need capacity beyond the current maximum soft capacity limit, you must add object storage buckets first.
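The ceiling is simple multiplication: bucket count times 500TB. A quick shell sketch of that arithmetic (the bucket count here is an example value, not read from your deployment):

```shell
# Each object storage bucket supports up to 500TB, so the maximum
# soft capacity limit scales linearly with the bucket count.
bucket_count=20
max_soft_capacity_tb=$(( bucket_count * 500 ))
echo "${max_soft_capacity_tb}TB"   # prints 10000TB for 20 buckets
```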
This is a two-step operation.
```shell
cd persistent-storage
```
Edit `terraform.tfvars` to increase the number of buckets:
```hcl
# Example configuration
object_storage_bucket_count = 20
```
```shell
terraform plan
terraform apply
```
Now you have deployed additional buckets, but they are not yet configured for use by the cluster. Let's do that next.
```shell
cd ../
```
Optionally, you can increase the cluster soft capacity limit at the same time. The maximum cluster soft capacity limit is 500TB per object storage bucket.
```hcl
# Example configuration
q_cluster_soft_capacity_limit = 1500
```
```shell
terraform plan
terraform apply
```
The new buckets will be picked up by the provisioning logic automatically. Reconfiguration typically takes less than 5 minutes.
The Terraform state is stored locally by default. For production deployments, consider:
- Using OCI Object Storage for remote state
- Implementing state locking
- Regular state backups
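One common pattern for remote state on OCI is Terraform's `s3` backend pointed at Object Storage's S3-compatible endpoint. The sketch below is illustrative only: the bucket name and region are placeholders, `<namespace>` must be replaced with your tenancy's Object Storage namespace, and the exact set of `skip_*` flags required can vary by Terraform version:

```hcl
terraform {
  backend "s3" {
    bucket = "terraform-state"            # precreated Object Storage bucket (placeholder)
    key    = "qumulo/terraform.tfstate"
    region = "us-phoenix-1"

    # OCI Object Storage S3-compatible endpoint for the deployment region.
    endpoint = "https://<namespace>.compat.objectstorage.us-phoenix-1.oraclecloud.com"

    # The s3 backend is built for AWS, so AWS-specific validation
    # must be disabled when targeting OCI.
    skip_region_validation      = true
    skip_credentials_validation = true
    skip_metadata_api_check     = true
    force_path_style            = true
  }
}
```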
Supported operations:
- Initial cluster deployment
- Adding and removing nodes after the initial cluster deployment
- Adding object storage buckets after the initial cluster deployment
- Increasing the cluster soft capacity limit
- Changing floating IPs on QFSD versions >= 7.5.1
- 4-node clusters with fault domain tolerance are not supported. We recommend an initial cluster of 3 nodes, or 5 or more nodes, for fault domain tolerance.
- Changing a cluster of 3 nodes, or 5 or more nodes, to 4 nodes (and vice versa) is not supported due to fault domain tolerance incompatibility.
- The maximum cluster soft capacity limit is 500TB per object storage bucket.
To deploy a cluster outside the home region, the following changes are required:
- Update the `region` variable in `terraform.tfvars` to the region where you want to deploy the cluster.
- Update the region for the DEFAULT profile in your `~/.oci/config` file to the region where you want to deploy the cluster.
- Update the `subnet_ocid` variable in `terraform.tfvars` to the OCID of the subnet where you want to deploy the cluster.
- The Secrets Vaults for the cluster and persistent storage must be in the same region as the cluster.
- You must use a precreated dynamic group and identity policy created in the home region for the cluster. Set `create_dynamic_group_and_identity_policy` to `false` in `terraform.tfvars`, then follow the instructions in Prerequisites section 4.B to create the dynamic group and identity policy.
For issues beyond the scope of this module:
- Contact Qumulo support
- Consult OCI documentation
- Review Terraform logs
See LICENSE file for details.