This post is similar to one I published a few days ago. In that post I detailed an experience I had with a recovered DC that refused to boot in Azure, and how I solved the issue.

This post focuses on creating a clean room, restoring your domain controllers into it so you can test them (in particular, to confirm they are replicating with each other), and then restoring them to the original Resource Group. In this post we use Commvault as the backup platform and restore to Azure through a Commvault Media Agent.

Step 1 – Create a Clean Room in Azure

The clean room in Azure is a Resource Group into which Commvault will restore backups of the Active Directory Domain Controllers. This Resource Group should be created in the same region as the production environment.

Resource Group:    RG-CACENTRAL-AD-Cleanroom

Within the Resource Group create:

Virtual Network:   VNET-CACENTRAL-AD-Cleanroom
Address space:     10.200.0.0/16
Subnet 1:          SNET-CACENTRAL-AD-Cleanroom (10.200.0.0/24)
Subnet 2:          AzureBastionSubnet (10.200.254.0/26)
Storage Account:   cleanroomstorage1
Public IP:         PIP-Bastion-Cleanroom

NOTES

• Subnet 2 is automatically created when Bastion is deployed, and the name of the subnet must be AzureBastionSubnet
• Commvault requires the storage account as a placeholder for the recovery VHDs
• Commvault also needs the names of the virtual network and subnet when you configure the restore
• Throughout this exercise the vnet was never given its own security group or public IP address. This is a sandbox area with no internet access, so there can be no tampering with the domain controllers while they’re being recovered.
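
Before creating anything, it can help to sanity-check the address plan. The following is a minimal sketch in Python, using only the standard library, that checks the subnets from the table above sit inside the vnet, do not overlap, and that AzureBastionSubnet is /26 or larger (the minimum size Azure Bastion accepts). The function name check_address_plan is my own and not part of any Azure tooling.

```python
import ipaddress

# Address plan copied from the table above.
VNET = ipaddress.ip_network("10.200.0.0/16")
SUBNETS = {
    "SNET-CACENTRAL-AD-Cleanroom": ipaddress.ip_network("10.200.0.0/24"),
    "AzureBastionSubnet": ipaddress.ip_network("10.200.254.0/26"),
}


def check_address_plan(vnet, subnets):
    """Return a list of problems found in the plan (an empty list means OK)."""
    problems = []
    # Every subnet must fall inside the vnet address space.
    for name, net in subnets.items():
        if not net.subnet_of(vnet):
            problems.append(f"{name} ({net}) is outside the vnet {vnet}")
    # No two subnets may overlap.
    names = list(subnets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")
    # Bastion needs a subnet with this exact name, sized /26 or larger.
    bastion = subnets.get("AzureBastionSubnet")
    if bastion is None:
        problems.append("no subnet named AzureBastionSubnet")
    elif bastion.prefixlen > 26:
        problems.append(
            f"AzureBastionSubnet must be /26 or larger, got /{bastion.prefixlen}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_address_plan(VNET, SUBNETS):
        print(problem)
```

With the values from this post the script prints nothing, which means the plan passed all three checks.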

Step 2 – Add Bastion

• In the resource group click on Create
• Enter “Bastion” in the search field
• Select Microsoft Bastion
• Click on Create to add the Bastion Plan to the Resource Group

o Subscription: pre-selected
o Resource Group: pre-selected
o Name: RG-CACENTRAL-CLEANROOM-BASTION
o Region: pre-selected (but make sure it matches)
o Tier: Basic
o Virtual Network: select VNET-CACENTRAL-AD-Cleanroom
o Subnet: select AzureBastionSubnet
o Public IP: select ‘Create new’
o Public IP Name: PIP-Bastion-Cleanroom

It can take a few minutes for Bastion to deploy.

The Clean Room has now been set up. Please note that the only public IP in this environment belongs to Bastion, and Bastion can only be accessed through the Azure Portal.

Step 3 – Restoring the VMs to the Clean Room

The Commvault restore creates replica VMs in the clean room Resource Group.

• Use the option to not start the VM when the restore is completed so you can check each one individually for corruption.
• Try to restore from the same time frame for each domain controller
For example, if you restore one from Nov 11 2022 8 PM, every other one should be restored from the same time or close to it
• I recommend you start with your PDCE role holder and restore each subsequent domain controller from a time or date after the PDCE
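
If you have several domain controllers, a quick check of the restore points you picked can catch a mistake before you start. This is a rough sketch only: the DC names are hypothetical, and the 24-hour tolerance is my own illustrative value, not a Commvault or Active Directory limit.

```python
from datetime import datetime, timedelta


def check_restore_points(restore_points, pdce, max_spread=timedelta(hours=24)):
    """Return a list of warnings about the chosen restore points.

    restore_points maps a DC name to the backup time chosen for it.
    pdce is the name of the PDCE role holder.
    max_spread is an illustrative tolerance, not a product limit.
    """
    warnings = []
    pdce_time = restore_points[pdce]
    # No DC should be restored from a point earlier than the PDCE.
    for name, when in restore_points.items():
        if when < pdce_time:
            warnings.append(f"{name} is restored from before the PDCE ({pdce})")
    # All restore points should sit in roughly the same time frame.
    spread = max(restore_points.values()) - min(restore_points.values())
    if spread > max_spread:
        warnings.append(f"restore points span {spread}, more than {max_spread}")
    return warnings


if __name__ == "__main__":
    # Hypothetical DC names; DC01 holds the PDCE role in this example.
    points = {
        "DC01": datetime(2022, 11, 11, 20, 0),
        "DC02": datetime(2022, 11, 11, 21, 0),
    }
    for warning in check_restore_points(points, pdce="DC01"):
        print(warning)
```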

Step 4 – Verifying Azure Cleanroom VMs

The restored VMs can be booted, but they may need additional changes to function in the new vnet. DNS servers can be set on either the vnet object or each network interface object; the IP configuration is set on the network interface.

• Update the Network Interface for each VM
o IP configurations – set to Static and note the IP address
o DNS Servers – add the static IP addresses assigned to the VMs as DNS servers
o Update the root A record of the domain to the new IP address, then ping the domain name to confirm it resolves to the correct address

Use Bastion to connect to each domain controller. Note that your username should not be your pre-Windows 2000 logon name (DOMAIN\username) but rather your user principal name (username@internaldomain.loc).

• I suggest you open Active Directory Sites and Services, add the SNET-CACENTRAL-AD-Cleanroom subnet, and link it to the site of the original domain controller

Testing AD

In this environment I was able to connect to each domain controller and immediately execute commands to verify AD replication. 

• Open an elevated Command Prompt
• Execute: repadmin /syncall
• Execute: repadmin /showrepl
• Execute: repadmin /replsummary
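
With more than a couple of controllers, reading the showrepl output by eye gets tedious. As a sketch, assuming you capture the output with repadmin /showrepl * /csv and copy it somewhere you can run Python, the script below lists only the replication links that report failures. The sample rows and DC names are made up, and the column names are the ones I expect repadmin to emit, so compare them with the header row of your own output.

```python
import csv
import io

# Hypothetical sample of `repadmin /showrepl * /csv` output. Server names and
# times are invented; the column names are an assumption to verify locally.
SAMPLE = """\
showrepl_COLUMNS,Destination DSA Site,Destination DSA,Naming Context,Source DSA Site,Source DSA,Transport Type,Number of Failures,Last Failure Time,Last Success Time,Last Failure Status
showrepl_INFO,Default-First-Site-Name,DC01,"DC=internaldomain,DC=loc",Default-First-Site-Name,DC02,RPC,0,0,2022-11-12 09:15:02,0
showrepl_INFO,Default-First-Site-Name,DC02,"DC=internaldomain,DC=loc",Default-First-Site-Name,DC01,RPC,3,2022-11-12 09:10:44,2022-11-11 20:01:13,1722
"""


def failing_links(csv_text):
    """Return (source, destination, naming context, failures) per failing link."""
    failures = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get("Number of Failures") or "").strip()
        # Skip rows without a numeric failure count (for example error rows).
        if raw.isdigit() and int(raw) > 0:
            failures.append(
                (
                    row["Source DSA"],
                    row["Destination DSA"],
                    row["Naming Context"],
                    int(raw),
                )
            )
    return failures


if __name__ == "__main__":
    for source, dest, context, count in failing_links(SAMPLE):
        print(f"{source} -> {dest} [{context}]: {count} failure(s)")
```

To run it against real output, replace SAMPLE with the contents of your own CSV file.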

I also created a bogus DNS entry under our domain and confirmed that it replicated between controllers. Once confirmed, be sure to remove this entry.

There was an issue with NETLOGON/SYSVOL replication over DFS Replication. In DNS, the root A (host) record of the local domain still holds the original IP address of the PDCE and doesn’t update automatically. We resolved this in our test environment by changing that record from the original IP to the new IP of the PDCE. Immediately, all servers showed the NETLOGON and SYSVOL folders, and policies were being replicated between the domain controllers.

After moving back to the production environment, make sure this DNS record is set back to its original value.
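
Since this record has to be changed in the clean room and then changed back in production, a small check is handy in both places. The sketch below resolves the domain root and compares the answer with the addresses you expect. The IP addresses are hypothetical, and the function name root_record_matches is my own. The resolver is passed in as a parameter so the logic can be tried without a live DNS server.

```python
import socket


def root_record_matches(domain, expected_ips, resolve=socket.gethostbyname_ex):
    """Return (ok, resolved) for the root A record(s) of the domain.

    ok is True when at least one address resolves and every resolved
    address is in expected_ips. resolved is the sorted list of addresses.
    """
    _name, _aliases, addresses = resolve(domain)
    resolved = set(addresses)
    ok = bool(resolved) and resolved.issubset(set(expected_ips))
    return ok, sorted(resolved)


if __name__ == "__main__":
    # Hypothetical clean room addresses for the restored DCs.
    clean_room_ips = ["10.200.0.4", "10.200.0.5"]
    try:
        ok, resolved = root_record_matches("internaldomain.loc", clean_room_ips)
        print("OK" if ok else "STALE", resolved)
    except socket.gaierror as err:
        print("could not resolve the domain:", err)
```

Run it from one of the domain controllers (or any machine using them for DNS). A STALE result in the clean room means the root record still points at a production address.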

Step 5 – Migration back to Production Environment

Migrating the VMs back to the production environment is simpler if you migrate only the managed disks. The easiest way to move a disk is to take a snapshot of it and then create a new disk from that snapshot in the target Resource Group.

• Shut down the VMs in the Clean Room
• Click on the disk to replicate in the Clean Room
• Click on Create Snapshot
• Give the snapshot a name but leave it in the same Resource group
• Set the networking to ‘Disable public and private access’
• Review + create
• Create

Now click on the snapshot in the Resource group

• Click on Create Disk
• Give the disk a name that identifies it as the replacement OS disk for the DC being restored
• Select the Resource group where the DC is being restored
• Select an appropriate size for the disk
• Set the networking to ‘Disable public and private access’
• Review + create
• Create

Go back to your production Resource group and ensure the servers you are about to restore are shut down

• Start with your PDCE Virtual machine
• Click on Disks
• In the OS Disk section, click on Swap OS Disk

Pay attention to the disk name, size, and which resource group it is in, to ensure correct selection.

Note:
You must confirm the OS Disk Swap by typing in the name of the VM

Click on OK at the bottom of the screen and wait for deployment to complete.

You will need to repeat this process for each VM being restored.
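
If you have many domain controllers, the portal steps above can also be scripted. I did this exercise in the portal, so treat the following as an untested sketch: it only builds the Azure CLI commands for one VM and prints them for review, without running anything. The disk name, production Resource Group name, subscription ID, and the naming pattern for the snapshot and new disk are all hypothetical. It does not set the ‘Disable public and private access’ networking option, so check the CLI reference for that before relying on it.

```python
import shlex


def disk_swap_commands(disk, clean_rg, prod_rg, vm, subscription):
    """Build the az CLI commands for one VM as argument lists (nothing is run)."""
    snapshot = f"{disk}-snap"            # hypothetical naming pattern
    new_disk = f"{vm}-osdisk-restored"   # hypothetical naming pattern
    base = f"/subscriptions/{subscription}/resourceGroups"
    disk_id = f"{base}/{clean_rg}/providers/Microsoft.Compute/disks/{disk}"
    snapshot_id = f"{base}/{clean_rg}/providers/Microsoft.Compute/snapshots/{snapshot}"
    new_disk_id = f"{base}/{prod_rg}/providers/Microsoft.Compute/disks/{new_disk}"
    return [
        # 1. Snapshot the restored disk, keeping the snapshot in the clean room.
        ["az", "snapshot", "create", "-g", clean_rg, "-n", snapshot,
         "--source", disk_id],
        # 2. Create a new managed disk from the snapshot in production.
        ["az", "disk", "create", "-g", prod_rg, "-n", new_disk,
         "--source", snapshot_id],
        # 3. Make sure the production VM is shut down before the swap.
        ["az", "vm", "deallocate", "-g", prod_rg, "-n", vm],
        # 4. Swap the OS disk on the existing production VM.
        ["az", "vm", "update", "-g", prod_rg, "-n", vm,
         "--os-disk", new_disk_id],
    ]


if __name__ == "__main__":
    commands = disk_swap_commands(
        disk="DC01_OsDisk",                       # hypothetical disk name
        clean_rg="RG-CACENTRAL-AD-Cleanroom",
        prod_rg="RG-CACENTRAL-AD-Prod",           # hypothetical production RG
        vm="DC01",                                # hypothetical VM name
        subscription="00000000-0000-0000-0000-000000000000",
    )
    for command in commands:
        print(shlex.join(command))
```

Printing the commands first lets you check the disk name and Resource Group for each VM before anything changes, which is the same care the portal asks for when you type in the VM name to confirm the swap.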

Final Steps

Once you have restored all the domain controllers in this fashion, start them up one at a time starting with your PDCE, and execute your AD tests again.

Because you are only swapping disks and not recreating the VMs, you do not need to worry about configuration changes to the VM itself. All that information is retained. The analogy is that you have restored an earlier point in time onto a new drive, removed the corrupted or locked drive, and put the new drive in its place.

Tech Notes:

  • I found it can take Azure a little while (sometimes a few minutes) before the disks appear in the dropdowns
  • I strongly recommend that you test this process biannually up to the point of restoring the disks to the production VMs.
  • Bastion also has its own costs, so if you aren’t going to use it regularly you can add it when needed and remove it when your recovery process has concluded. Keeping the Clean Room Resource Group doesn’t really cost anything, but it’s so easy to recreate that you might as well remove it too.
  • After recovery you should ensure your backups are continuing as expected – check your backup logs and run new backups manually as a test