Fix ConfigMgr CMG Stuck in Starting State
On my Configuration Manager production server, I noticed the ConfigMgr CMG was stuck in the Starting state. Running the CMG connection analyzer showed a warning at the step “Check the CMG service is in ready state”, while CloudMgr.log showed “Failed to get access token to resource endpoint”. Let’s troubleshoot this issue and fix it.
I also have to admit that I had not looked at my CMG state for a long time. When you set up a cloud management gateway for the first time, the initial state of the CMG is Provisioning started. It then changes to Provisioning completed and finally to Ready. I assumed my CMG was working fine.
However, one day I noticed the CMG was stuck in the Starting state with no option to stop or start the service; both options were greyed out. I could still synchronize the CMG configuration, so I ran the CMG connection analyzer and reviewed the CMG log files to troubleshoot the issue.
My CMG instance was up and running. I could RDP into the CMG instance, and content distribution to the CMG from Configuration Manager was working fine. What bothered me was that the CMG was still stuck in the Starting state, and I had to find out why.
Fix ConfigMgr CMG Stuck in Starting State
If you notice your ConfigMgr CMG stuck in the Starting state, you can try the steps covered in this post, although the solution may differ because the errors and warnings in your case could be different. I typically use the Cloud Management Gateway connection analyzer for real-time verification. The CMG connection analyzer checks the current status of the service and the communication channel through the CMG connection point to any management points that allow CMG traffic.
Running the CMG connection analyzer showed a warning at the first step, “Check the CMG service is in ready state”: State of the CMG service is ‘2’. For more information, see CloudMgr.log on Service Connection Point on CMG deployment progress.
Reviewing CloudMgr.log on the service connection point, I could see a set of warnings and an error, all pointing to the same cause: Failed to get access token to resource endpoint.
Here are the relevant lines picked from the CloudMgr.log file on the service connection point.
WARNING: Warning: Exception during cloud service monitoring task for service PRAJWALCMG SMS_CLOUD_SERVICES_MANAGER
WARNING: Exception Microsoft.ConfigurationManager.AzureManagement.FailedToCommunicateToServiceException:Failed to get access token to resource endpoint SMS_CLOUD_SERVICES_MANAGER
WARNING: Stack trace: at Microsoft.ConfigurationManager.AzureManagement.ResourceManager.Initialization Microsoft ConfigurationManager CloudServicesManager ServiceMonitorTask MonitorCloudDistributionPoint Microsoft.ConfigurationManager.CloudServicesManager.ServiceMonitorTask.Start(Object taskState) SMS_CLOUD_SERVICES_MANAGER
WARNING: Inner exception System.AggregateException:One or more errors occurred.
WARNING: Inner exception stack trace: at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)~~ at Microsoft.ConfigurationManager.AzureManagement.ResourceManager.Initialize()
TaskManager: 1 task(s) running, 0 task(s) waiting to start.
ERROR: TaskManager: Task [AnalyticsCollectionTask: Service PRAJWALCMG] has failed. Exception Microsoft.ConfigurationManager.AzureManagement.FailedToCommunicateToServiceException, Failed to get access token to resource endpoint.
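If you would rather not scroll through the whole log, a small script can pull out just the token failures. The following is only a minimal Python sketch, assuming you have copied the log text off the service connection point and that its lines look like the ones above. The function name find_token_failures and the SAMPLE text are my own illustration and not part of Configuration Manager.

```python
import re

# The message seen in CloudMgr.log when the server app cannot get a token.
TOKEN_ERROR = "Failed to get access token to resource endpoint"


def find_token_failures(log_text):
    """Return (line_number, severity, line) for every line that mentions
    the token error. Severity is the first WARNING/ERROR tag on the line,
    or "UNKNOWN" if the line carries neither tag."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if TOKEN_ERROR not in line:
            continue
        match = re.search(r"\b(WARNING|ERROR):", line)
        severity = match.group(1) if match else "UNKNOWN"
        hits.append((number, severity, line.strip()))
    return hits


# A few lines taken from the excerpt above, used purely as sample input.
SAMPLE = """\
WARNING: Exception Microsoft.ConfigurationManager.AzureManagement.FailedToCommunicateToServiceException:Failed to get access token to resource endpoint SMS_CLOUD_SERVICES_MANAGER
TaskManager: 1 task(s) running, 0 task(s) waiting to start.
ERROR: TaskManager: Task [AnalyticsCollectionTask: Service PRAJWALCMG] has failed. Exception Microsoft.ConfigurationManager.AzureManagement.FailedToCommunicateToServiceException, Failed to get access token to resource endpoint.
"""

if __name__ == "__main__":
    for number, severity, line in find_token_failures(SAMPLE):
        print(f"line {number} [{severity}]: {line[:80]}")
```

To use it on a real log, read your copy of CloudMgr.log into a string and pass that to find_token_failures instead of SAMPLE. If the raw file carries extra formatting around each entry, the substring match should still find the message, but the severity detection may need adjusting.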
The ConfigMgr CMG was stuck in the Starting state because the secret key for the server app had expired. The secret key expiry (UTC) is shown next to the server app. You should find the AAD applications under Administration\Overview\Cloud Services\Azure Active Directory Tenants. Right-click the server app and click Renew Secret Key.
Sign in with your credentials and renew the secret key for the AAD application. If you see a message that the secret key was successfully renewed, the job is complete.
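Since the expiry is just a UTC date shown in the console, it is easy to work out how much time is left before the next renewal. Here is a minimal Python sketch, assuming you note the expiry date from the console yourself. The function names, the 30-day threshold and the dates in the example are hypothetical and only for illustration; this does not read anything from Configuration Manager or Azure.

```python
from datetime import datetime, timezone


def days_until_expiry(expiry_utc, now=None):
    """Whole days from now until the secret key expires.
    The result is negative once the key has already expired."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (expiry_utc - now).days


def needs_renewal(expiry_utc, warn_days=30, now=None):
    """True when the key expires within warn_days (an assumed threshold)
    or has already expired."""
    return days_until_expiry(expiry_utc, now) <= warn_days


if __name__ == "__main__":
    # Hypothetical expiry date typed in from the console (UTC).
    expiry = datetime(2030, 1, 1, tzinfo=timezone.utc)
    print("Days left:", days_until_expiry(expiry))
    print("Renew now?", needs_renewal(expiry))
```

You could run something like this from a scheduled task and send yourself a reminder when needs_renewal returns True, but treat that as an idea to adapt rather than a tested alerting solution.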
After a few minutes, I noticed that my CMG state changed from Starting to Ready. You can also right-click the cloud management gateway and click Synchronize Configuration.
The final step was to run the cloud management gateway connection analyzer and check for any warnings. This time the CMG connection analyzer showed no errors or warnings, and the CMG was healthy again.
I’m unable to find any kind of alert settings for the secret key expiration as noted in this Microsoft doc. Any ideas on how we can set up alerts for this so emails can be sent to the team before this issue surfaces?
You get the console notifications, provided you have enabled them under the Alerts section.