Useful Info on OpsMgr Agent
The agent is one of the most important parts of Operations Manager (OpsMgr): all of the monitoring intelligence runs on the agent. Every agent reports to a management server (MS) in the management group, and this MS is called the primary MS for that agent. If the agent cannot communicate with its primary MS, it fails over to another available MS. It then checks periodically for the primary MS and fails back once the primary MS is available again. The primary MS for any agent is easy to find from the console, or we can check the registry (HKLM\Software\Microsoft\Microsoft Operations Manager\3.0\Agent\Management Groups\<Management Group Name>\Parent Health Service\0\NetworkName). We can also find it by running the command “Netstat | Findstr 5723” in a command prompt on the agent, since 5723 is the port the agent uses to talk to the MS. If the agent has failed over to another available MS, this is the best way to find the MS it is currently reporting to.
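The netstat check above can also be scripted. The following is a minimal sketch, assuming the usual Windows netstat row layout (protocol, local address, foreign address, state). The function name and the sample host names are made up for illustration and are not part of OpsMgr.

```python
def management_servers_from_netstat(netstat_output, port=5723):
    """Return remote hosts of ESTABLISHED TCP connections whose remote port is `port`."""
    servers = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row layout: TCP  <local addr:port>  <foreign addr:port>  <state>
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        if fields[3].upper() != "ESTABLISHED":
            continue
        # rpartition splits on the last colon, so the port is always the tail.
        host, _, remote_port = fields[2].rpartition(":")
        if remote_port == str(port) and host not in servers:
            servers.append(host)
    return servers


# Hypothetical sample output; addresses and host names are invented.
sample = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.5:49712         ms02:5723              ESTABLISHED
  TCP    10.0.0.5:49800         fileserver:445         ESTABLISHED
"""

print(management_servers_from_netstat(sample))
```

On a real agent the text would come from running netstat (for example through `subprocess.run`), and the host shown is the MS the agent is connected to at that moment, which may differ from the primary MS recorded in the registry.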
All configuration data and management packs downloaded from the primary MS are stored in the “Health Service State” folder. The downloaded management packs are in “\Program Files\System Center Operations Manager 2007\Health Service State\Management Pack” and the configuration data is written to “\Program Files\System Center Operations Manager 2007\Health Service State\Connector Configuration Cache\<Management Group Name>\OpsMgrConnector.Config.xml”.
If we stop the System Center Management Service (HealthService) and delete the “Health Service State” folder, the agent will read the MS name from the registry when the service starts again and will download the data from the MS once more.
Every time the agent communicates with the MS, it must authenticate before it can exchange data. Agent authentication in OpsMgr requires Kerberos or certificate authentication. Within an AD domain, Kerberos is the default authentication protocol; in untrusted domains or workgroups, certificate-based authentication is used.
An OpsMgr agent uses a queue file (the “Health Service Store” ESE database, normally under “\Health Service State\Health Service Store”) to store data that needs to be sent to the MS. The queue file is part of normal communication between the agent and an MS. It prevents loss of data when a management server is not available or the agent is not able to communicate with another available MS. When the queue is full, the agent starts dropping data (generally oldest first), because recent monitoring data is more important than older monitoring data.
The default size of the agent queue is 15 MB, and we can change the queue file size in the registry on the agent:
HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\ManagementGroups\<Management Group Name>\MaximumQueueSize
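To make the "drop oldest first when full" behaviour concrete, here is a toy model of a size-limited queue. It is only a sketch under stated assumptions: the class `AgentQueue` and its methods are hypothetical, it lives in memory, and it is not how the ESE-backed Health Service Store is actually implemented.

```python
from collections import deque


class AgentQueue:
    """Toy size-limited queue that discards the oldest items when full."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.items = deque()
        self.dropped = 0  # how many items were discarded so far

    def enqueue(self, item: bytes):
        if len(item) > self.max_bytes:
            # An item larger than the whole queue can never fit.
            self.dropped += 1
            return
        # Make room by discarding from the oldest end of the queue.
        while self.used_bytes + len(item) > self.max_bytes:
            oldest = self.items.popleft()
            self.used_bytes -= len(oldest)
            self.dropped += 1
        self.items.append(item)
        self.used_bytes += len(item)

    def drain(self):
        """Return everything queued, oldest first, as if the MS became reachable."""
        sent = list(self.items)
        self.items.clear()
        self.used_bytes = 0
        return sent


queue = AgentQueue(max_bytes=10)  # tiny limit so the effect is visible
for payload in (b"aaaa", b"bbbb", b"cccc"):
    queue.enqueue(payload)

print(queue.drain())   # the oldest payload was discarded to make room
print(queue.dropped)
```

With a 10-byte limit and three 4-byte payloads, the first payload is discarded when the third arrives, so the two most recent items survive. A larger limit, such as the 15 MB default mentioned above, simply lets the agent ride out a longer outage before it starts losing data.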
In OpsMgr you have the agent node and the watcher node. The agent node is the system with the OpsMgr agent that performs data collection, evaluation, etc. The watcher node is a designated system external to the agent that performs monitoring to ensure the agent of interest is healthy. Heartbeat monitoring is handled by the Health Service watcher node (RMS/management servers). The watcher nodes expect heartbeats from agents; if they do not receive one, the appropriate rules fire to indicate the problem. With the default settings, the watcher node expects a heartbeat from the agent every 60 seconds and tolerates only 3 consecutive missed heartbeats. The fourth missed heartbeat raises a heartbeat failure alert and triggers a ping to the agent; if the agent does not respond to the ping, a 'Failed to Connect to Computer' alert is raised. Agents that are not sending heartbeats to the MS are shown grayed out in the console.
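The default heartbeat arithmetic can be sketched as follows. This is a simplified model, assuming a missed heartbeat is counted for every full 60-second interval that passes without one. The function names and alert strings are illustrative and do not represent the watcher's real implementation.

```python
HEARTBEAT_INTERVAL_SECONDS = 60   # default interval described above
MISSED_HEARTBEATS_ALLOWED = 3     # the fourth consecutive miss raises the alert


def missed_heartbeats(last_heartbeat, now, interval=HEARTBEAT_INTERVAL_SECONDS):
    """Count whole heartbeat intervals that elapsed since the last heartbeat."""
    return max(0, int((now - last_heartbeat) // interval))


def watcher_alerts(last_heartbeat, now, ping_ok,
                   interval=HEARTBEAT_INTERVAL_SECONDS,
                   allowed=MISSED_HEARTBEATS_ALLOWED):
    """Return the alerts the watcher would raise in this simplified model."""
    if missed_heartbeats(last_heartbeat, now, interval) <= allowed:
        return []
    alerts = ["heartbeat failure"]
    # The ping is only attempted once the heartbeat failure has been raised.
    if not ping_ok:
        alerts.append("Failed to Connect to Computer")
    return alerts


# Times are plain seconds; the last heartbeat arrived at t=0.
print(watcher_alerts(0, 180, ping_ok=True))    # 3 misses: still tolerated
print(watcher_alerts(0, 240, ping_ok=True))    # 4 misses: agent down, host reachable
print(watcher_alerts(0, 240, ping_ok=False))   # 4 misses and no ping reply
```

Under this model, a heartbeat failure appears roughly four minutes after the last heartbeat. Getting a heartbeat failure alone versus both alerts helps distinguish a stopped agent on a running computer from a computer that is unreachable.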