
Step by Step: Adding Your Second Lync Standard Edition Server 2013 & Creating an Associated Backup Pool for Resiliency Part 4

We are on a journey installing various Lync Server 2013 roles. In today’s step by step, we will set up our 2nd Lync Server Standard Edition pool and then configure it as a Backup Registrar so automatic failover can happen. We will also look at Lync Server 2013’s new failover capabilities that allow full client capability to be restored in the event of a disaster. The only other lab in this series you need to have completed before following this post is Part 1.


Prepare the 2nd Front End Server: Prerequisites

Install the Lync Server 2013 prerequisites first. Installing your 2nd Lync pool is much like installing the first, so we will go over the steps below briefly, with special notes. For detailed notes on installing an FE server, refer to the Part 1 post in this series.

Install Lync Server 2013

Insert the Lync Server 2013 CD, and when you see the popup below, click Yes.

c-install-note_thumb

Once the Deployment Wizard appears we are done here for now.

deploy-fresh_thumb2

Open Topology Builder to Add Your 2nd Front End Server/Pool

Right Click on “Standard Edition Front End Servers” | New Front End Pool

NOTE: While the topology builder and this blog refer to a Standard Edition Front End Pool, just be aware that a Standard Edition Front End Pool really is just one Front End Server, because there only can be one server in a Standard Edition Pool.

topology-new-front-end_thumb2

Next | Enter our Backup Front End FQDN (FE02.lab.local) | Next

fqdn-pool_thumb2

Check Conferencing and Enterprise Voice. (Note: you will not be able to check CAC, because only one pool per site can host it.)

pool-options_thumb2

Now instead of screenshots for each screen, we’ll just note what we want to check.

  • Collocate Mediation Server = Yes | Next
  • Enable an Edge Pool = No | Next
  • Leave defaults | Next
  • Leave defaults (Note: you need to create this file share yourself, just like your original share) | Next
  • Leave defaults | Next
  • Leave defaults | Next
  • Action | Topology | Publish
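
The file share mentioned in the list above has to exist before you publish. Here is a minimal sketch of creating it from PowerShell on FE02. The folder path and share name are placeholders I made up (use whatever you entered in Topology Builder), and it assumes Windows Server 2012 or later, since New-SmbShare is not available on older versions.

```powershell
# Run on FE02 in an elevated PowerShell session.
# Assumptions (not from the original lab): Windows Server 2012 or later,
# and a hypothetical share called "LyncShare" at C:\LyncShare.
$sharePath = 'C:\LyncShare'   # hypothetical local folder
$shareName = 'LyncShare'      # hypothetical share name; match Topology Builder

# Create the folder if it does not exist yet.
if (-not (Test-Path -Path $sharePath)) {
    New-Item -Path $sharePath -ItemType Directory | Out-Null
}

# Share the folder and give the local Administrators group full control.
New-SmbShare -Name $shareName -Path $sharePath -FullAccess 'BUILTIN\Administrators'
```

On an older server you can create the same share through Explorer, just as you did for the first server.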

Go to the Primary (FE01.lab.local) Standard Edition Server and Open the Lync Server 2013 Deployment Wizard

Click on “Install or Update Lync Server System”

Step 2: Run

After it completes, click Finish.

Now Go to the Backup (FE02.lab.local) Standard Edition Server and Open the Lync Server 2013 Deployment Wizard

Click on “Install or Update Lync Server System”

  • Step 1: Run (15-30 minute wait) | Finish
  • Step 2: Run | Next (10 minute wait)
  • Step 3
  • Step 4

We’ll Test Our 2nd Pool/Server By Moving Users to It

To test, log into the Lync Server Control Panel (LSCP). Notice you will now be asked which Lync pool you want to log in to. Let’s select FE01.lab.local.

lscp-which-fe-pool-login_thumb2

Once the LSCP is open, we’ll click Users | Find | Select u1@lab.local | Action | Move Selected Users to Pool…

move-users_thumb2

Now let’s select our new Pool/Server (FE02.lab.local) and click OK.

select-pool_thumb2

After you move a user there is no need to refresh the user list; this is done for you automatically. And, sure enough, u1@lab.local is now on FE02.lab.local! Great.

user-moved-to-fe02_thumb3
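
If you would rather script the move than click through the Control Panel, something like the sketch below should do the same thing from the Lync Server Management Shell. It uses the lab's own user and pool names; treat it as a starting point rather than the only way to do it.

```powershell
# Run in the Lync Server Management Shell.
# Move the lab user u1@lab.local to the new pool/server.
Move-CsUser -Identity 'sip:u1@lab.local' -Target 'FE02.lab.local' -Confirm:$false

# Confirm where the user is now homed.
Get-CsUser -Identity 'sip:u1@lab.local' |
    Select-Object DisplayName, SipAddress, RegistrarPool
```

After the move, the RegistrarPool column should show the pool the user is homed on.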

Now let’s open the Lync 2013 client and log in as u1@lab.local, the user we just moved to our 2nd Standard Edition Front End Pool/Server (FE02.lab.local). Good, our new pool works!

What Happens when we change Pools During an Active Conversation or Call?

Since we could easily move user(s) to our new Pool/Server with no sweat, now let’s get dangerous. Call someone using u1@lab.local and CHANGE POOLS DURING THE CALL. Let’s repeat the steps we just took above, but do it during a live call and see what happens.

Below is a screenshot of what happens if you change pools/servers during a peer to peer call:

  • The Lync 2013 client will momentarily log out and back in again
  • During this time (as you see below) the call continues
  • Sharing continues
  • Video continues
  • As noted in the conversation window, functionality is momentarily limited:
    • Video cannot be started during the momentary logout/login
    • Sharing is limited, and the items below will be interrupted:
      • Polls
      • Whiteboard
      • PowerPoint

audiocall-during-pool-change_thumb1

That’s pretty cool, right? Yeah.

Setup a Resilient Pool (aka Associated Backup Pool)

Now let’s set up our 2nd Front End Pool/Server as an associated backup pool so that if our 1st Front End Pool goes down, the clients can automatically fail over to the 2nd Front End Pool.

Open Topology Builder and download the topology.

Next, we’ll edit the primary server (FE01.lab.local) under “Standard Edition Front End Servers” by right-clicking it and clicking “Edit Properties”.

edit-fe01-backup-pool_thumb2

Now we can define our Resiliency settings

  • Associated backup pool = FE02.lab.local (Note the warning about having both FEs in the same site. For our lab, and in some production deployments, we can ignore this.)
  • Automatic failover and failback for Voice = Checked
  • Failover = 30 secs (fine for a lab; this would be short for production)
  • Failback = 30 secs (for lab purposes)
  • Then click OK to finish.

resiliency-settings_thumb2

Let’s Publish the Topology by clicking: Action | Topology | Publish | Next |

Open the text file to see what you should do next. In our case we are instructed to run the Deployment Wizard’s “Install or Update Lync Server System” steps on FE01 and FE02. Now click Finish.

finished-topology-publish_thumb2

Based on our “next steps” instructions noted above, let’s open the Lync Server Deployment Wizard on FE01.lab.local and click on “Install or Update Lync Server System”

  • Step 2: Run | Next | Next
  • Step 4: Run | Next (this gets our new Lync Server Backup Service running)

Let’s open the Lync Server Deployment Wizard on FE02.lab.local and click on “Install or Update Lync Server System”

  • Step 2: Run | Next
    • NOTE: If Step 2 fails with a “Can not update database XDS” error, manually install the Central Management database into the local rtc instance using the PowerShell command below:
    • Install-CsDatabase -CentralManagementDatabase -SqlServerFqdn FE02.lab.local -SqlInstanceName rtc
    • Now run Step 2 again.
  • Step 3 (if necessary)
  • Step 4

Run the PowerShell commands below on FE01.lab.local to ensure conferencing data is replicated:

  • Invoke-CsBackupServiceSync -PoolFqdn FE01.lab.local
  • Invoke-CsBackupServiceSync -PoolFqdn FE02.lab.local
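
If you want to check that the Backup Service is actually healthy after forcing the sync, a small loop like the one below may help. Get-CsBackupServiceStatus is not used elsewhere in this lab, so treat this as an optional extra check.

```powershell
# Run in the Lync Server Management Shell on FE01.lab.local.
$pools = 'FE01.lab.local', 'FE02.lab.local'

# Force a backup sync for each pool in the pairing.
foreach ($pool in $pools) {
    Invoke-CsBackupServiceSync -PoolFqdn $pool
}

# Report the Backup Service state for each pool.
foreach ($pool in $pools) {
    Get-CsBackupServiceStatus -PoolFqdn $pool
}
```

Give the sync a minute or two before reading too much into the status output.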

Add DNS SRV Record for Backup Pool/Server

Now let’s go into DNS and add a record for our Backup Pool/Server. This SRV record is necessary so that if the first server (FE01.lab.local in our lab) goes down, the client can find the backup Pool/Server.

So let’s open DNS Manager on the DNS server and add the SRV record. The important settings are:

  • Service = _sipinternaltls
  • Protocol = _tcp
  • Priority = 10 (take note: this value is different than your initial SRV record)
  • Weight = 10 (take note: this value is different than your initial SRV record)
  • Port number = 5061
  • Host offering this service = FE02.lab.local

backup-sipinternaltls_srv_record_thu
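
The same record can be added from PowerShell instead of the DNS console. This is a sketch that assumes the DNS server runs Windows Server 2012 or later (for the DnsServer module) and hosts lab.local as a primary zone; the values are the ones listed above.

```powershell
# Run on the DNS server in an elevated PowerShell session.
# Assumes the DnsServer module (Windows Server 2012+) and a lab.local zone.
Add-DnsServerResourceRecord -Srv `
    -ZoneName 'lab.local' `
    -Name '_sipinternaltls._tcp' `
    -DomainName 'FE02.lab.local' `
    -Priority 10 -Weight 10 -Port 5061

# List the SRV records now registered under that name.
Get-DnsServerResourceRecord -ZoneName 'lab.local' -Name '_sipinternaltls._tcp' -RRType Srv
```

You should see two records: the original one for FE01 and the new one for FE02.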

After you have added this DNS record, you might want to verify it has taken effect by running NSLookup on the client PCs you will be testing with.

  • NSLookup
  • set type=srv
  • _sipinternaltls._tcp.lab.local

test-that-srv-is-working_thumb3
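
On newer clients you can run the same check from PowerShell. This sketch assumes Windows 8 / Windows Server 2012 or later, where Resolve-DnsName is available; on older clients stick with NSLookup.

```powershell
# Run on the test client.
# Query the SRV records and show them in the order a client would prefer them
# (lowest priority value first).
Resolve-DnsName -Name '_sipinternaltls._tcp.lab.local' -Type SRV |
    Where-Object { $_.Type -eq 'SRV' } |   # skip any extra A records in the answer
    Sort-Object Priority |
    Select-Object NameTarget, Priority, Weight, Port
```

Both FE01.lab.local and FE02.lab.local should be listed, with FE02 carrying the higher priority value.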

You Might Need This Step, But Only do it if Needed: Remove The Cert Without the Backup Server Name in it

NOTE: Please take a minute and thank Dustin Hannifin and Jason Lee for providing this crucial step in this blog post.

With both the Primary and Backup Front End Servers running, do the following:

Exit the Lync 2013 client on the client machine.

On the same client machine: Open MMC

File | Add/Remove Snap-in… | Certificates | My User Account | Ok

Navigate to Personal | Certificates and delete the cert that has the same name as your Lync username.

delete-certificate_thumb2
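
If you have several test clients, the MMC steps can be approximated in PowerShell. This is only a sketch: it assumes PowerShell 3.0 or later (needed to delete through the Cert: drive) and that the cached cert's subject contains the user's sign-in address, which is my reading of "named same as your Lync username". Look at what it lists before deleting anything.

```powershell
# Run as the logged-on Lync user, with the Lync client closed.
# Assumption: the cert subject contains the sign-in address (u1@lab.local here).
$signIn = 'u1@lab.local'

$certs = Get-ChildItem -Path Cert:\CurrentUser\My |
    Where-Object { $_.Subject -like "*$signIn*" }

# Review what matched before removing anything.
$certs | Format-List Subject, Issuer, NotAfter, Thumbprint

# Preview the deletion; remove -WhatIf once the list looks right.
if ($certs) {
    $certs | Remove-Item -WhatIf
}
```

The -WhatIf switch keeps this harmless until you have confirmed that only the Lync client cert matched.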

Now let’s log back into the Lync 2013 client.

Now, Let’s Test Resiliency by Disabling NIC on Primary Front End (FE01.lab.local)

Make sure all your users (that you want to test resiliency for) are homed on FE01.lab.local. Next, we’ll simulate our FE01.lab.local machine being down by disabling the NIC.

diable-nic_thumb2

After about 30 seconds, our client(s) should log out. Sure enough!

logout-at-30seconds_thumb1

Now they will try to log in to the backup pool (in this case FE02.lab.local)…

NOTE: We set up our failover to happen in 30 seconds. I’ve noticed in my lab that the failing Lync clients will log out very near 30 seconds, but it can take several minutes until the clients are able to log back in to the Associated Backup Pool/Server (FE02.lab.local), i.e., be fully failed over. I haven’t taken the time to investigate whether this is my lowly lab’s performance or something built into Lync. (If someone knows, please post a comment.)

But sure enough, it logged into the backup pool! You will notice the Lync 2013 client lets you know you have some limitations:

  • Contact List is unavailable
  • Call Forwarding may not be working
  • Delegates and Team-Call may not be receiving calls
  • Limited chat room access
  • Etc.

logged-into-backup-pool_thumb3

Now if we enable the NIC on FE01.lab.local, the clients should fail back to FE01.lab.local in 30 seconds. (NOTE: in my lab some clients would fail back in as little as 10 seconds.)

Next We Will Take a Look at New Lync Server 2013 Failover Options

Much of what we have discussed in this blog so far is functionality you will also find in Lync Server 2010. (I suspect you could use most of the above steps in Lync 2010.) But with Lync Server 2013, the Lync Server administrator can now fail over the CMS and the failed pool so that the “Limited Functionality due to outage” notice is removed. Let’s get started with our failover.

Our first step is to find out where the active Central Management Database is hosted. To do this we run the following PowerShell command:

  • Get-CsService -CentralManagement

As shown below, FE01.lab.local is the PoolFqdn (we will refer to this as $CMS_Pool) of the currently Active CMS.

get-csservice-active-cms

The next step is to check whether the $CMS_Pool is running Lync Server 2013. You can do this in Topology Builder (in our lab we know it is, but in a live environment we might not). If the $CMS_Pool is running Lync Server 2013, we can use this PowerShell command to see which pool is its backup:

Get-CsPoolBackupRelationship -PoolFqdn $CMS_Pool

As shown below we can see the $Backup_Pool is FE02.lab.local

get-cspoolbackuprelationship
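
The two lookups above can be captured in variables so you do not have to retype the names later. A rough sketch follows. $CMS_Pool and $Backup_Pool are the placeholder names used in this post, and I am assuming the property names (Active, PoolFqdn, BackupPoolFqdn) match what the cmdlets print in your environment; pipe the cmdlet to Format-List * to check if they do not.

```powershell
# Run in the Lync Server Management Shell.
# Assumption: the output objects expose Active, PoolFqdn and BackupPoolFqdn.

# Pool currently hosting the active Central Management store.
$CMS_Pool = (Get-CsService -CentralManagement |
    Where-Object { $_.Active }).PoolFqdn

# Pool paired with it as its backup.
$Backup_Pool = (Get-CsPoolBackupRelationship -PoolFqdn $CMS_Pool).BackupPoolFqdn

"Active CMS pool: $CMS_Pool"
"Backup pool:     $Backup_Pool"
```

In this lab the two values should come back as FE01.lab.local and FE02.lab.local.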

Next we will see if the $CMS_Pool is available right now:

Get-CsManagementStoreReplicationStatus -CentralManagementStoreStatus

Below is an example of how this command’s output looks with the $CMS_Pool available.

replication-status-with-cms-available

Now let’s disable the NIC on $CMS_Pool (i.e., FE01.lab.local) to simulate a server failure. Our primary Lync FE is now down! (shown below)

smoking_drive

Now run the Get-CsManagementStoreReplicationStatus -CentralManagementStoreStatus command again. Note that the command will fail with an error if the $CMS_Pool (FE01.lab.local) is not available.

(NOTE: If this is an Enterprise Edition pool, you will need to check which Back End holds the primary CMS using: Get-CsDatabaseMirrorState -DatabaseType CentralMgmt -PoolFqdn <Backup_Pool Fqdn>. Running this command on Standard Edition will fail. On a Standard Edition server there is only one server, so we know which it is.)

Next we will run the command to fail over the Central Management Server to our Backup Server:

  • Invoke-CsManagementServerFailover -BackupSqlServerFqdn FE02.lab.local -BackupSqlInstanceName RTC -Force

invoke-cmsfailover

Now let’s verify the move happened by running:

  • Get-CsManagementStoreReplicationStatus -CentralManagementStoreStatus

Sure enough! The new ActiveMasterFQDN is now FE02.lab.local (as shown below). Great!

verify-that-cms-moved-to-backup

Now we can fail over the Pool by running:

  • Invoke-CsPoolFailOver -PoolFqdn FE01.lab.local -DisasterMode -Verbose

After running… Voila! Full Lync services are automatically restored to the Lync 2013 client, and the “Limited Functionality” notice disappears with no user interaction!

full-lync-services-restored

Notes:

  • On my 3-user lab this command took about 50 seconds to complete. After it completed I waited a little over a minute until full capability was restored to the Lync client!
  • The Chat service was not restored because resiliency was not set up for this service in our lab.
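
For reference, here is the whole disaster failover from this section pulled together into one sketch. The commands and values are the ones used above; the variable names and the confirmation prompt are my additions so that nothing runs by accident.

```powershell
# Run in the Lync Server Management Shell on the surviving server (FE02.lab.local).
# Lab values from this post; variable names and the prompt are illustrative only.
$failedPool    = 'FE01.lab.local'
$backupSqlFqdn = 'FE02.lab.local'
$backupSqlInst = 'RTC'

$answer = Read-Host "Fail over CMS and pool $failedPool to $backupSqlFqdn? (y/n)"
if ($answer -eq 'y') {
    # 1. Move the Central Management Server to the backup server's SQL instance.
    Invoke-CsManagementServerFailover -BackupSqlServerFqdn $backupSqlFqdn `
        -BackupSqlInstanceName $backupSqlInst -Force

    # 2. Confirm which server is now the active master.
    Get-CsManagementStoreReplicationStatus -CentralManagementStoreStatus

    # 3. Fail over the users of the failed pool.
    Invoke-CsPoolFailOver -PoolFqdn $failedPool -DisasterMode -Verbose
}
```

Only run this when the primary really is down (or its NIC is disabled, as in this lab).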

Conclusion

Well--yahoo! We have successfully set up a Lync Standard Edition Associated Backup Pool and demonstrated Lync Server 2013’s much improved, complete failover resiliency.



Special Thanks to Elan Shudnow and his great article on Lync 2010 Resiliency:
http://www.shudnow.net/2012/05/04/lync-2010-central-site-resilience-w-backup-registrars-failovers-and-failbacks-part-3/

http://social.technet.microsoft.com/wiki/contents/articles/9289.second-lync-standard-edition-server-to-provide-a-limited-high-availability-en-us.aspx

http://jasonmlee.net/archives/459

If you want to Fail Back to FE01.lab.local

  • Invoke-CsPoolFailback -PoolFqdn FE01.lab.local -Verbose (may take 10-15 minutes; the Lync client will log out and back in near the end)
  • Invoke-CsManagementServerFailover -BackupSqlServerFqdn FE01.lab.local -BackupSqlInstanceName RTC -Force (this takes just 10 secs; the target here is FE01’s SQL instance, because we are moving the CMS back to FE01)

19 comments:

  1. Any trouble shooting ideas for when you cannot move users to the STD edition server?

    1. Hello, What is the error message/event your getting?

  2. Interesting, on step "Setup Resilient Pool" on FE02 step 2 I got different error "exception of type 'microsoft.rtc.management.deployment.deploymentexception' was thrown"
    still same solution applies: "install-csdatabase –centralmanagementdatabase –sqlserverfqdn FE02.lab.local –sqlinstancename rtc". After running this cmdlet step 2 ran completed successfully!

    1. Red Phone,
      Thanks for that observation. (this might be a slight difference between Preview and RTM? not sure)

      but once again, thanks for that.

  3. Its the same error in RTM. Hopefully its addressed in a later update.

  4. Hi Matt, Enjoy your work. You have always been very helpful for me in my endeavour to get the most out of Lync. My question which I think I know the answer for is. "Can I setup automatic failover for Lync 2013 SE?" or does there have to be a manual element. I knwo you need to have SQL mirroring and witnesses setup for automation. Does SE only allow SQL express or can I move my DB's to a full SQL box and setup Mirroring and failover ?

  6. Matt, used this post to troubleshoot part of our 2010 to 2013 migration. Our Topology already had the 2013 backup registrar setup, however when we migrated the Central Management store from 2010 to the 2013 pool the backup 2013 server wouldn't replicate or report a "happy" status in CSCP.
    Had to re-run the deployment on the 2013 Backup, delete the CMS database on the 2013 backup and then run the powershell to re-create the database and reboot and then everyone was happy.

    -Jason (jasonmlee.com)

  7. Matt, hello. Can you help me? When i do "Invoke-CSBackupServiceSync –PoolFqdn FE02.lab.local" (on FE01.lab.local), i get error The HTTP request is unauthorized with client authentication scheme 'Ntlm'. The authentication header received from the server was 'Negotiate, oXsweaADCgEBonIEcGBuBgkqhkiG9xIBAgIDAH5fMF2gAwIB
    BaEDAgEepBEYDzIwMTMwNjEyMTYxMzMwWqUFAgMJkkKmAwIBKakZGxdDRU5URVIuQVNTT1JUSS5LT01
    JLkNPTaoXMBWgAwIBAaEOMAwbCmx5bmMyMDEzMiQ=".'
    Sync dont work...

  8. http://technet.microsoft.com/en-us/library/gg398976.aspx and it works!

  9. Does any command require after addition or removal of server from a pool?

  10. Hi All,
    Can someone tell me if you always have to failover the CMS database manually (commandline) when there is an outage of the primary pool server ? Isn't there a way to do this automtically ?
    Thank you

  11. Hi Matt,

    Thank you for the great guide! Any idea why I can see messages from users on server A in pool A, but I cannot respond and get an error that user from server A is offline, even though they are not and I see it OK. I am in server B/pool B. Thanks!!

  12. Hi Matt,

    i think you have a mistype on this part....

    "If you want to Fail Back to FE01.lab.local

    Invoke-CsPoolFailback -PoolFQDN FE01.lab.local –Verbose (may take 10-15minutes; Lync will logout/in near end)
    Invoke-CsManagementServerFailover -BackupSqlServerFqdn FE02.lab.local BackupSqlInstanceName RTC –Force ( this just takes 10secs)"

    If you want to failover the CMServer to FE01/Pool01, you have to point to the SQL Server of the FE01.

    Correct me if i'm wrong :)

  13. How you configure pool failover in a deployment with 2 central sites, each site containing 2 standard edition servers and the servers needs to failover to the secondary server in its site ?

  15. Is this article applicable for adding second server in SfB 2015 Standard edition server as well?

  17. What about removing all of this? I am about to decom my DR and voice resiliency config form the topology and want to know what ramifications (if any) it will have on production and users. I know the setup of this requires to run the deployment wizard on each FE server in the pool, but what needs to be done to remove it?

    Cheers!
