Introduction
In this post, I’ll demonstrate how to do maintenance on a two-node, single-site Exchange 2016 Database Availability Group (DAG).
For more information on Exchange 2016 Database Availability Groups, see here.
Lab Setup
In this lab, I have two Exchange 2016 servers in a DAG with mailbox databases replicated between them for high availability. The Exchange servers are:
- LITEX01
- LITEX02
We have four mailbox databases:
- MDB01
- MDB02
- MDB03
- MDB04
There’s a copy of each of these mailbox databases on LITEX01 and LITEX02.
Put a mailbox server into maintenance mode
We’ll start by putting LITEX01 into maintenance mode so we can install Exchange updates and Windows updates, carry out hardware maintenance and so on.
In Exchange 2016, the Mailbox server role includes the client access services as well as the mailbox services, so there is no separate CAS role. If your Exchange 2016 server is providing client access services for your clients, you should remove it from the load-balanced array before you start. How you do this will depend on how you have configured load balancing.
Also note that your incoming and outgoing external messages need to be able to route through either server, so that putting one into maintenance mode won’t stop external message delivery. How you achieve this will depend on how you have your message routing configured.
Other than the client access services, the server will be performing the functions below:
- Message delivery
- Unified Messaging (Call routing)
- Cluster services (Primary Active Manager)
- Mailbox service (either active or passive mailbox databases)
Message Delivery
The HubTransport component on LITEX01 needs to be drained. To do this, we put the HubTransport component into a draining state, restart the Transport service, then redirect messages that are pending delivery to LITEX02. Log on to LITEX01 and run these commands from the Exchange Management Shell, running as administrator:
Set-ServerComponentState LITEX01 -Component HubTransport -State Draining -Requester Maintenance
Restart-Service MSExchangeTransport
Redirect-Message -Server LITEX01 -Target LITEX02.litwareinc.com
Press y when prompted
The server should no longer be involved in message transport. We can confirm this by checking that the HubTransport component on LITEX01 is draining:
Get-ServerComponentState LITEX01 -Component HubTransport
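As an extra check, you could also look at the transport queues on LITEX01 to see whether any messages are still waiting there. This is a minimal sketch rather than a required step; it assumes the standard Get-Queue cmdlet is available in your Exchange Management Shell session:

```powershell
# List the transport queues on LITEX01 with their status and message count.
# After draining and redirecting, the delivery queues should be empty or close to it.
Get-Queue -Server LITEX01 | ft Identity,Status,MessageCount -AutoSize
```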
Unified Messaging
You may or may not be using the server for Unified Messaging but if you are, just run this command to prevent the server handling calls. Calls will be drained which means that ongoing calls will complete:
Set-ServerComponentState LITEX01 -Component UMCallRouter -State Draining -Requester Maintenance
Confirm that the UMCallRouter component is draining (maintenance mode):
Get-ServerComponentState LITEX01 -Component UMCallRouter
Cluster Services
If you’re wondering what the Primary Active Manager (PAM) is, it’s the term given to the DAG member that owns the cluster quorum resource and reacts to server failures. Although a failure of the server that holds the PAM role causes a failover to a Standby Active Manager (SAM), it’s best to fail this over gracefully. To do this, we need to pause the cluster node, LITEX01. This not only moves the PAM from LITEX01 to LITEX02 but also prevents LITEX01 from owning this role until the cluster node is resumed.
First, let’s confirm where our PAM is located:
Get-DatabaseAvailabilityGroup -Status | fl Name,PrimaryActiveManager
Here we see that it’s currently on LITEX01 which means we need to move it (yes, more work, excellent!).
Right, let’s move it to LITEX02 by running this command:
Move-ClusterGroup "Cluster Group" -Node LITEX02
We also need to prevent LITEX01 becoming the PAM by pausing the cluster node. You need to run this command from an elevated PowerShell window:
Suspend-ClusterNode LITEX01
We’ll just confirm this has in fact worked:
Get-DatabaseAvailabilityGroup -Status | fl Name,PrimaryActiveManager
Ok, the PAM has been moved just fine and the cluster node LITEX01 is paused. We can move on to the next step.
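If you’d like to see the pause itself rather than infer it from the PAM location, the cluster node state can be queried directly. A small sketch, assuming the FailoverClusters PowerShell module (which also provides Suspend-ClusterNode) is loaded in your elevated window:

```powershell
# Show the cluster state of LITEX01; a paused node reports its State as Paused.
Get-ClusterNode LITEX01 | fl Name,State
```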
Mailbox service
We need to move any active mailbox databases off LITEX01. They should fail over when we shut down the server or when the services stop, but we’ll move them off manually, which is the recommended approach.
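Before moving anything, it can be worth checking that the passive copies on LITEX02 are healthy and caught up, since a copy with a long queue will take longer to activate. This is only a sketch; which queue lengths count as acceptable is an assumption you’ll need to make for your own environment:

```powershell
# Show the state of each database copy on LITEX02.
# Passive copies should report Healthy, with low copy and replay queue lengths.
Get-MailboxDatabaseCopyStatus -Server LITEX02 | ft Name,Status,CopyQueueLength,ReplayQueueLength,ContentIndexState -AutoSize
```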
Let’s just see what databases are mounted on LITEX01 before we start this step:
Get-MailboxDatabaseCopyStatus -Server LITEX01
Ok, we can see mailbox databases MDB01 and MDB02 are mounted on LITEX01. To move these to LITEX02, we use this command:
Get-MailboxDatabaseCopyStatus -Server LITEX01 | ? {$_.Status -eq "Mounted"} | % {Move-ActiveMailboxDatabase $_.DatabaseName -ActivateOnServer LITEX02 -Confirm:$false}
We can now confirm our databases have been moved to LITEX02:
Get-MailboxDatabaseCopyStatus -Server LITEX02
All our mailbox databases are mounted on LITEX02.
The next step is to prevent LITEX01 from automatically mounting the databases in case of a problem with LITEX02. To do this, we set the DatabaseCopyAutoActivationPolicy property to Blocked on LITEX01:
Set-MailboxServer LITEX01 -DatabaseCopyAutoActivationPolicy Blocked
We can confirm that this was done by running this command:
Get-MailboxServer LITEX01 | ft Name,DatabaseCopyAutoActivationPolicy
Our mailbox service on LITEX01 is now in maintenance mode.
We then put the server itself into maintenance mode:
Set-ServerComponentState LITEX01 -Component ServerWideOffline -State Inactive -Requester Maintenance
We can confirm that LITEX01 is now inactive by running the command below:
Get-ServerComponentState LITEX01 -Component ServerWideOffline
Congratulations! Your server is now in maintenance mode and we can do the required work on it.
Take a mailbox server out of maintenance mode
When we’re done with our maintenance, we can take LITEX01 out of maintenance mode. We’ll reverse the changes we’ve made to put the server into maintenance mode.
Set the mailbox server as active
Set-ServerComponentState LITEX01 -Component ServerWideOffline -State Active -Requester Maintenance
Confirm this has worked:
Get-ServerComponentState LITEX01 -Component ServerWideOffline
Set the Unified Messaging component to active
Set-ServerComponentState LITEX01 -Component UMCallRouter -State Active -Requester Maintenance
Confirm this has worked:
Get-ServerComponentState LITEX01 -Component UMCallRouter
Resume the cluster node
Run this command from a PowerShell window with elevated permissions:
Resume-ClusterNode LITEX01
Confirm the node is now up in the cluster:
Get-ClusterNode
Set the mailbox server DatabaseCopyAutoActivationPolicy
Here we set the DatabaseCopyAutoActivationPolicy property to Unrestricted to allow LITEX01 to mount databases automatically if needed:
Set-MailboxServer LITEX01 -DatabaseCopyAutoActivationPolicy Unrestricted
We can confirm this has worked by running this command:
Get-MailboxServer LITEX01 | ft Name,DatabaseCopyAutoActivationPolicy
Set the HubTransport component to active
Set-ServerComponentState LITEX01 -Component HubTransport -State Active -Requester Maintenance
Restart-Service MSExchangeTransport
Confirm that the HubTransport component is active:
Get-ServerComponentState LITEX01 -Component HubTransport
Confirm that our server is not in maintenance mode
To confirm that our server is no longer in maintenance mode, we can run the command below to check that all required components are active:
Get-ServerComponentState LITEX01 | ft Component,State -AutoSize
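Beyond the component states, you may also want to check that replication on LITEX01 has recovered after the maintenance. The sketch below is an optional extra, assuming the standard Test-ReplicationHealth cmdlet; the copy status command is the same one used earlier with a few more columns selected:

```powershell
# Run the built-in DAG replication checks against LITEX01.
Test-ReplicationHealth -Identity LITEX01 | ft Server,Check,Result -AutoSize

# Confirm the database copies on LITEX01 are healthy and their queues are draining.
Get-MailboxDatabaseCopyStatus -Server LITEX01 | ft Name,Status,CopyQueueLength,ReplayQueueLength -AutoSize
```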
Optional tasks
Optionally, you can re-balance your mailbox databases, since after these steps all of them are mounted on LITEX02. Instructions on how to do this are here.
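As a rough sketch of a manual re-balance, the same Move-ActiveMailboxDatabase cmdlet used earlier can move databases back. This assumes you want MDB01 and MDB02 active on LITEX01 again, as they were at the start of this lab; adjust the database names to suit your own layout:

```powershell
# Move MDB01 and MDB02 back to LITEX01, leaving MDB03 and MDB04 active on LITEX02.
"MDB01","MDB02" | % { Move-ActiveMailboxDatabase $_ -ActivateOnServer LITEX01 -Confirm:$false }
```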
You can now repeat the above tasks to do maintenance on LITEX02.
Conclusion
In this post, I’ve done a run-through of how you can perform maintenance on your DAG members without downtime.