ACS with two Ceph clusters (primary and DR Ceph cluster with RBD mirroring/replication). How to switch VMs to DR Ceph when primary goes down? #11809
lernerlinux-hash started this conversation in General
Replies: 1 comment
-
@lernerlinux-hash One option to consider: for storage types where the replicated volume has a different object identifier (path) than the source volume, create a pair of instances with static IP configuration across the clusters and primary storages. Set up replication between the instances' volumes and keep the DR instance stopped. To fail over, start the instance at the DR site. Primary storage is cluster-wide in this setup, and both the main and the DR site storages are presented to CloudStack.
0 replies
-
Hi team,
This is my first post. Please excuse me if I missed anything or asked some basic questions.
We are running ACS version 4.20.0.0, with primary storage as Ceph (RBD) and secondary storage as traditional NFS.
Currently, VMs deploy and their volumes are created on Ceph. I have created a pool in Ceph and added it to ACS as primary storage.
My query is about setting up disaster recovery (DR). I want to keep a DR Ceph cluster in a different location and use snapshot-based RBD mirroring for replication.
Right now, I am testing on a lab VM before applying this in production.
Our setup is as follows:
ACS on VM1
NFS as secondary storage
Primary Ceph cluster on 3 VMs (where my VMs are deployed)
DR Ceph cluster on 3 VMs (this cluster is up, and the RBD mirroring daemon is configured with one-way replication from Primary Ceph to DR Ceph)
So far, I have been manually demoting the image on the primary cluster, promoting it on the DR cluster, and then stopping and starting the VM. However, the VM gets stuck at the GRUB bootloader and does not boot properly.
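For reference, the manual sequence described above can be sketched as a small Python helper that only builds the `rbd` command lines. This is a rough sketch, not a verified procedure: the cluster names `primary` and `dr` (resolved through `--cluster`, so matching config files are assumed to exist under /etc/ceph), the pool name, and the image name are all placeholders. It assumes the VM is already stopped in ACS, and it adds a status check between demote and promote, since with snapshot-based mirroring the DR copy is only as current as the last mirror snapshot that finished syncing.

```python
import subprocess


def planned_failover_commands(pool, image, primary="primary", dr="dr"):
    """Build the rbd commands for a planned failover of one image.

    'primary' and 'dr' are hypothetical cluster names passed to --cluster.
    The VM using the image is assumed to be stopped before any of this runs.
    """
    spec = f"{pool}/{image}"
    return [
        # Take a final mirror snapshot so the latest writes get replicated.
        ["rbd", "--cluster", primary, "mirror", "image", "snapshot", spec],
        # Demote the image on the primary cluster.
        ["rbd", "--cluster", primary, "mirror", "image", "demote", spec],
        # Inspect replication state on the DR side; read the output and
        # repeat until the image has caught up before promoting.
        ["rbd", "--cluster", dr, "mirror", "image", "status", spec],
        # Promote the image on the DR cluster.
        ["rbd", "--cluster", dr, "mirror", "image", "promote", spec],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run with placeholder pool and image names: prints the commands only.
run(planned_failover_commands("cloudstack", "example-volume"))
```

If the primary cluster is unreachable, the demote step cannot run and the promote on the DR side would need `--force`; that case is not covered by this sketch.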
I am not sure where I am going wrong and would appreciate help understanding this scenario.
I believe this is a common production use case, so I might be missing some key steps.
Please let me know if any more details are needed; I will be happy to provide them.
Thank you.