NSX-V Site Failover/Failback Plan: Part 2


This blog post continues the series on “planned” or “unplanned” failover of NSX-V components (NSX Manager, controllers and universal distributed logical routers) in an Active/Passive datacentre scenario, i.e. one where all North/South routing flows via one site’s ESG(s).

To recap, I have split this topic into three parts:

    1. Part 1 (here), talks about:
        1. Use Cases
        2. Assumptions
        3. Current state and Target State i.e. before and after failover
        4. Pre-requisites
        5. Summary of the Failover Plan
    2. Part 2 (this blog post), talks about the failover configuration steps to make Site-B “Primary”
    3. Part 3 (here), talks about the configuration steps required after Site-A comes back online to avoid conflicts.

I would encourage you to read Part 1 first and get familiar with the assumptions, the before and after failover states, and the pre-requisites of this Cross-vCenter NSX design before proceeding.

The diagrams below show the placement of the NSX-V components and the routing that will be in place after following the steps in this “Part 2” of the Failover Plan:

[Figure: Location of the NSX-V components, after failover]

[Figure: North/South routing of NSX-V components, after failover]

Site-A (only in case of a planned failover):
    1. Shut down all ESGs/DLRs/UDLRs
    2. Shut down the Controllers
    3. Shut down the NSX Manager
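
The shutdown order above (edges and logical routers first, then controllers, then the NSX Manager) can be sketched as a small sort over a VM inventory. This is only an illustration: the VM names, the component-type labels and the `shutdown_order` helper are hypothetical and are not part of any NSX tooling.

```python
# Priority derived from the planned-failover list: lower number = shut down earlier.
# The component-type labels are hypothetical names chosen for this sketch.
SHUTDOWN_PRIORITY = {"esg": 0, "dlr": 0, "udlr": 0, "controller": 1, "nsx_manager": 2}


def shutdown_order(inventory):
    """Return VM names sorted into the planned-failover shutdown order.

    inventory: list of (vm_name, component_type) tuples.
    """
    unknown = sorted({kind for _, kind in inventory if kind not in SHUTDOWN_PRIORITY})
    if unknown:
        raise ValueError(f"unknown component type(s): {unknown}")
    # sorted() is stable, so VMs with equal priority keep their inventory order.
    ordered = sorted(inventory, key=lambda item: SHUTDOWN_PRIORITY[item[1]])
    return [name for name, _ in ordered]


# Hypothetical Site-A inventory, deliberately listed out of order.
SITE_A_INVENTORY = [
    ("nsxmgr-a", "nsx_manager"),
    ("ctrl-a-1", "controller"),
    ("esg-a-1", "esg"),
    ("udlr-a-0", "udlr"),
    ("ctrl-a-2", "controller"),
]

if __name__ == "__main__":
    for name in shutdown_order(SITE_A_INVENTORY):
        print(name)
```

Keeping the order in a single table makes it easy to review the sequence before anything is actually powered off.
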
Site-B:
    1. Disconnect the Secondary NSX Manager from the Primary:
        1. Go to “Network and Security” Plugin in the vSphere Client
        2. Installation and Upgrade -> Management -> NSX Managers
        3. Select the Secondary NSX Manager
        4. Click “Actions” -> “Disconnect from the Primary NSX Manager”
        5. The NSX Manager will now be in Transit mode.
    1. Promote/Assign the NSX Manager (now in Transit mode) as Primary:
        1. Select the NSX Manager (now in Transit mode)
        2. Click “Actions” -> “Assign Primary Role”.
    1. Deploy the Universal NSX-V Controllers:
        1. Go to “Network and Security” Plugin in the vSphere Client
        2. Installation and Upgrade -> Management -> NSX Controller Nodes
        3. Click Add and deploy the three Universal NSX Controller nodes, using the same configuration for each
        4. Deploy the first controller, wait for it to deploy successfully, and deploy the next two only once its status shows Connected.
        5. Create DRS anti-affinity rules so that the controller VMs run on separate ESXi hosts.
    1. Deploy UDLR Control VMs:
        1. Go to “Network and Security” Plugin in the vSphere Client
        2. NSX Edges -> Double click the respective UDLR
        3. Settings -> Configuration (for NSX-V 6.4.5: Settings -> Appliance Settings)
        4. Add “NSX Edge Appliance” and specify the Datacenter, Cluster/Resource pool and Datastore
        5. Click the “Add” icon to deploy another NSX Edge device with the same configuration
        6. Configure HA for UDLR as necessary.
        7. Change CLI credentials as necessary
            1. Go to “Network and Security” Plugin in the vSphere Client
            2. NSX Edges -> Right-click the respective UDLR and click “Change CLI credentials”
            3. Enter the credentials and click “OK”
    1. Verify “Global Configuration” on the UDLR:
        1. Go to “Network and Security” Plugin in the vSphere Client
        2. NSX Edges -> Double click the respective UDLR
        3. Click Manage -> Routing
        4. Verify Configuration as documented before (in pre-requisites)
            1. Verify ECMP (if configured previously)
            2. Verify Router ID.
    1. Verify and amend “Dynamic Routing” configuration for the UDLR control VM(s):
        1. Go to “Network and Security” Plugin in the vSphere Client
        2. NSX Edges -> Double click the respective UDLR
        3. Click Manage -> Routing
        4. Verify the configuration as documented before (in the pre-requisites) and amend as necessary:
            1. Verify BGP configuration status, AS numbers, neighbors, etc.
            2. Amend the BGP neighbor weights so that the “Site-B” ESG neighbors have a higher weight than the “Site-A” ESG neighbors, e.g. if the “Site-A” ESG neighbors have a weight of 60, set the “Site-B” ESG neighbors to 120.
            3. If configured, amend any BGP filters as necessary to permit or deny network routes
            4. Verify Route Redistribution
            5. Open the console of the UDLR control VM and log in with the “admin” credentials
            6. Verify that the BGP neighbor status is “Established” and “UP” for the Site-B ESG IPs, by running the following command:

                show ip bgp neighbors

            7. Verify that the routes (or the “Default” route) are being received from the Site-B ESGs, by running the following command:

                show ip route

Note: Repeat the steps above for each UDLR instance, as necessary.
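
To make the weight change in the BGP step concrete, here is a minimal sketch that takes a list of BGP neighbors and raises every Site-B neighbor above the highest Site-A weight, doubling it as in the 60 -> 120 example. The neighbor IPs, the `site` labels and the `prefer_site_b` helper are assumptions made for illustration; the sketch only computes the new values and does not talk to NSX.

```python
def prefer_site_b(neighbors, multiplier=2):
    """Return a copy of `neighbors` with Site-B weights above all Site-A weights.

    neighbors: list of dicts with "ip", "site" ("A" or "B") and "weight" keys.
    Site-B weights are raised to at least multiplier x the highest Site-A
    weight (and always at least one above it). The input is not modified.
    """
    site_a_weights = [n["weight"] for n in neighbors if n["site"] == "A"]
    if not site_a_weights:
        # Nothing to out-rank, so return an unchanged copy.
        return [dict(n) for n in neighbors]

    highest_a = max(site_a_weights)
    target = max(highest_a * multiplier, highest_a + 1)

    updated = []
    for n in neighbors:
        n = dict(n)  # copy so the caller's data stays untouched
        if n["site"] == "B" and n["weight"] < target:
            n["weight"] = target
        updated.append(n)
    return updated


# Hypothetical neighbor table (documentation-range IPs, not real ESG addresses).
NEIGHBORS = [
    {"ip": "192.0.2.1", "site": "A", "weight": 60},
    {"ip": "192.0.2.2", "site": "A", "weight": 60},
    {"ip": "198.51.100.1", "site": "B", "weight": 30},
    {"ip": "198.51.100.2", "site": "B", "weight": 30},
]

if __name__ == "__main__":
    for n in prefer_site_b(NEIGHBORS):
        print(n["ip"], n["site"], n["weight"])
```

The computed weights still have to be entered against each neighbor under Manage -> Routing, as described in the steps above.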

    1. Amend any dynamic routing configuration on ESGs, as necessary:
        1. If configured, amend any filters as necessary to permit or deny network routes to the physical switch neighbors
        2. Check that the BGP neighbor status is “Established” and “UP”
        3. Verify that routes are being exchanged (both physical routes and UDLR routes), by running the following commands:

            show ip bgp neighbors

            show ip route
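
If the output of “show ip bgp neighbors” is saved to a text file, the “Established” check can be scripted. The sketch below is a rough illustration only: it assumes each neighbor block contains a “BGP neighbor is <IP>” line followed by a “BGP state = <state>” line, and the sample text is hand-written for the example rather than captured from a real edge. Adjust the patterns if your NSX-V version prints something different.

```python
import re

# Assumed line formats; verify against the output of your own edges.
NEIGHBOR_RE = re.compile(r"BGP neighbor is\s+([0-9a-fA-F.:]+)")
STATE_RE = re.compile(r"BGP state\s*=\s*(\w+)")


def parse_bgp_states(cli_output):
    """Map each neighbor IP found in the CLI output to its BGP state."""
    states = {}
    current = None
    for line in cli_output.splitlines():
        match = NEIGHBOR_RE.search(line)
        if match:
            current = match.group(1)
            continue
        match = STATE_RE.search(line)
        if match and current is not None:
            states[current] = match.group(1)
    return states


def not_established(cli_output, expected_neighbors):
    """Return the expected neighbor IPs that are missing or not Established."""
    states = parse_bgp_states(cli_output)
    return [ip for ip in expected_neighbors if states.get(ip) != "Established"]


# Hand-written sample in the assumed format (hypothetical IPs and AS numbers).
SAMPLE_OUTPUT = """\
BGP neighbor is 198.51.100.1,   remote AS 65002,
BGP state = Established, up
Hold time is 180, Keep alive interval is 60 seconds

BGP neighbor is 192.0.2.1,   remote AS 65001,
BGP state = Active
"""

if __name__ == "__main__":
    site_b_peers = ["198.51.100.1", "198.51.100.2"]  # hypothetical Site-B peer IPs
    print(not_established(SAMPLE_OUTPUT, site_b_peers))
```

An empty list means every expected neighbor is Established; anything else names the neighbors to investigate before moving on.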

    1. Optional: If “Site-B” will be the “Primary” for the foreseeable future, update the syslog, NTP and DNS IPs on the following components to point to the Site-B servers:
        1. UDLR
        2. NSX controllers.
    1. If deployed, enable network connectivity for any “One-Arm” Load Balancer ESG(s) in Site-B (enable the interface)

This completes Part 2 of the NSX-V Site Failover/Failback Plan. Part 3 walks step by step through the configuration required when “Site-A” (the previous primary) comes back online.
