It was the end of the day for me, completely drained out after giving a training session of 7200 seconds (yeah, that’s straight three hours 😉). I promised myself that once the session is over, I would celebrate with a long drive and a can of beer. Got out of the office, jumped in the car, pressed on the accelerator, opened the can of beer. Just before my first sip, the life of DBA happened….

There was a notification on my phone, our alert mechanism sent a mail with subject highlighted in bold ‘Production server is unavailable. Please check’.

At this point I knew that this will be a short drive and beer will not see the light of day.

The next moment, Teams notifications started.

I opened the Cluster resource manager and saw a red failure icon on SQL services CEIP (MSSQLServer) service.

Problem Statement


Our primary node in FCI became unavailable so SQL services fail overed to node 2, but when it moved to node 2, it threw its bomb and shows ‘The cluster resource ‘SQL Server CEIP MSSQLSERVER of type generic service SQL Server (MSSQLSERVER) Failed’






Solution


SQL server CEIP Service is the Customer Experience Improvement Program that sends telemetry data back to Microsoft. CEIP Service gets installed with SQL Services. We can’t uncheck it during installation. After few basic checks, I gave it a thought to give a shot to disable the CEIP service.

And yes, I DID that. Later, I felt it wasn’t a good option as it was part of cluster, so I undid the changes.

No disabling the CEIP didn’t help. If you guys want to see how we can disable the CEIP service, we will release a short blog quickly on this.

What really helped me to bring the services up was to refresh the password and add account of the CEIP service as logon as service on node 2.

Below are the steps how I did it
1. Go to Services.msc
2. Scroll down to SQL Server CEIP service, Right click and select ‘properties’
3. Click tab ‘Log On’

Here I identified the account under which service is set to run is Virtual Account. These are created during the installation of SQL Server. These accounts are managed by the Operating System itself.



We also don’t know the password so we can’t type that into the appropriate box.
4. Easiest way is, just type the account and leave the password blank. Yes, that’s how easy it is.
5. Click ‘Ok’ then Windows will apply the correct password and give the virtual account “login as a service” permissions.

After this I tried to bring the failed cluster service up and it came online.
(And Yes, That beer is still in my refrigerator).