Cluster Metrics

The Cluster Agent Dashboard metrics derive from the Kubernetes API, and they report information for the clusters and pods. For any defined set of namespaces, the Cluster Agent reports events on these Kubernetes and hardware resources.

Splunk AppDynamics monitors cluster health and Kubernetes objects in the following categories:

Cluster Agent

Metric Name Description UI Location Metric Path
Availability Availability of the Cluster Agent. This metric helps identify whether the Cluster Agent is down. A value of 100 indicates that the Cluster Agent is active and therefore available. Server > Metric Browser Cluster Agent|Availability

Cluster Summary Metrics

Metric Name Description UI Location Metric Path
Error events count Number of error events Dashboard > Errors Hardware Resources|Cluster|Error events count
Evicted pods count Number of evicted pods Pods > Evicted Hardware Resources|Cluster|Evicted pods count
Eviction threats count Number of events that represent pod evictions Dashboard > Errors Hardware Resources|Cluster|Eviction threats count
Image pull errors Number of image pull errors Dashboard > Issues > Image Issues Hardware Resources|Cluster|Image pull errors
Image pulls Number of image pulls Dashboard > Issues > Image Issues Hardware Resources|Cluster|Image pulls
Info events count Number of informational events Dashboard > Errors Hardware Resources|Cluster|Info events count
Pod errors Number of errors related to pods Dashboard > Issues > Pod Issues Hardware Resources|Cluster|Pod errors
Pod Kills Number of pods that were killed Inventory > Pods > Pod Kills Hardware Resources|Cluster|Pod Kills
Pod restarts Number of times the pods restarted Dashboard > Issues > Pod Issues Hardware Resources|Cluster|Pod restarts
Pods Scaledowns Number of pod scaledowns, which occur when deployments or replica sets are scaled down. Inventory > Pods > Scaledowns Hardware Resources|Cluster|Pods Scaledowns
Pods count Total count of pods Inventory > Pods > Phases > Normal Hardware Resources|Cluster|Pods count
Pods failed Number of failed pods Pods > Failed Hardware Resources|Cluster|Pods failed
Pods pending Number of pods in a pending state. Pending status normally indicates an issue. See the Kubernetes documentation. Pods > Pending Hardware Resources|Cluster|Pods pending
Pods running Number of pods in a running state Pods > running Hardware Resources|Cluster|Pods running
Pods succeeded Number of pods in Succeeded phase Dashboard > Pods By Phase Hardware Resources|Cluster|Pods succeeded
Pods unknown Number of pods in Unknown state Dashboard > Pods By Phase Hardware Resources|Cluster|Pods unknown
Pods with Missing Dependencies - Config Maps and Secrets Number of pods that depend on Config Maps or Secrets that are missing. Inventory > Pods > Missing Dependencies - Config Maps and Secrets Hardware Resources|Cluster|Pods With Missing Dependencies - Config Maps And Secrets (Pod Metrics for Inventory tab)
Pods with Missing Dependencies - Services Number of pods that depend on Services that are missing. Inventory > Pods > Missing Dependencies - Services Hardware Resources|Cluster|Pods With Missing Dependencies (Pod Metrics for Inventory Tab)
Pods with No Limits Number of pods with no limits (on CPU/memory) set. If you specified limits on any pod that you are starting, this metric indicates how many pods do not have a limit defined (Displays in the Inventory tab, under Pod Metrics). Inventory > Pods > No Limits Hardware Resources|Cluster|Pods With No Limits
Pods With No Liveness Probe Number of pods with no liveness probe. If you configured a probe in Kubernetes to monitor liveness, the values display in the Inventory tab, under Pod Metrics. Inventory > Pods > No Probes -Liveness Hardware Resources|Cluster|Pods With No Liveness Probe
Pods With No Readiness Probe Number of pods with no readiness probe. If you configured a probe in Kubernetes to monitor readiness, the values display in the Inventory tab, under Pod Metrics. Inventory > Pods > No Probes -Readiness Hardware Resources|Cluster|Pods With No Readiness Probe
Privileged Pods Number of privileged pods that run with root access (Displays in the Inventory tab, under Pod Metrics). Inventory > Pods > Privileged Hardware Resources|Cluster|Privileged Pods

Storage errors Overall number of errors related to storage for the cluster. Inventory > Pod Metrics Hardware Resources|Cluster|Storage errors
Storage quota violations Number of storage quota violations, that is, the number of times a storage quota was exceeded. Inventory > Pod Metrics Hardware Resources|Cluster|Storage quota violations
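
The phase-based metrics in the table above (Pods pending, Pods running, Pods succeeded, Pods failed, Pods unknown) each count the pods in one Kubernetes pod phase. As a minimal sketch of that relationship, the following Python snippet tallies a list of phases. The function name `count_pods_by_phase` and the sample data are hypothetical, and this is not how the Cluster Agent itself is implemented.

```python
from collections import Counter

# The five pod phases defined by Kubernetes; the table has one metric per phase.
PHASES = ("Pending", "Running", "Succeeded", "Failed", "Unknown")

def count_pods_by_phase(pod_phases):
    """Return a dict mapping each phase to the number of pods in that phase."""
    counts = Counter(pod_phases)
    # Always report every phase, using 0 when no pod is in it.
    return {phase: counts.get(phase, 0) for phase in PHASES}

# Hypothetical sample: the phases of five pods in the monitored namespaces.
sample = ["Running", "Running", "Pending", "Failed", "Running"]
summary = count_pods_by_phase(sample)
print(summary)
```

In this sample the per-phase values add up to five, which is what Pods count would report for the same pods.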

CPU

CPU Capacity

Metric Name Description UI Location Metric Path
Total (MilliCores) Total CPU capacity for the cluster in MilliCores Cluster Capacity > CPU Hardware Resources|Cluster|CPU|Capacity|Total (MilliCores)
Used (MilliCores) CPU capacity already used by the cluster in MilliCores Cluster Capacity > CPU Hardware Resources|Cluster|CPU|Capacity|Used (MilliCores)

CPU Quota

Metric Name Description UI Location Metric Path
Limit Used (%) Percentage of CPU limit quota used Dashboard > Quotas > CPU Limit Hardware Resources|Cluster|CPU|Quota|Limit Used (%)

Limit Used (MilliCores) MilliCores value for CPU limit quota used Dashboard > Quotas > CPU Limit Hardware Resources|Cluster|CPU|Quota|Limit Used (MilliCores)
Request Used (%) Percentage of CPU request quota used Dashboard > Quotas > CPU Request Hardware Resources|Cluster|CPU|Quota|Request Used (%)
Request Used (MilliCores) MilliCores value for CPU request quota used Dashboard > Quotas > CPU Request Hardware Resources|Cluster|CPU|Quota|Request Used (Millicores)
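
The percentage and MilliCores variants in the table above describe the same quota usage in two units. The sketch below shows one plausible way to relate them, assuming the percentage is the used amount divided by the configured quota. The function name `quota_used_percent` and the sample values are hypothetical, and the exact formula the Cluster Agent uses is not stated here.

```python
def quota_used_percent(used, hard):
    """Percentage of a quota that is consumed.

    used and hard must be in the same unit (for example, MilliCores).
    """
    if hard <= 0:
        raise ValueError("hard quota must be positive")
    return 100.0 * used / hard

# Hypothetical values: 1500 MilliCores used against a 4000 MilliCores limit quota.
print(quota_used_percent(1500, 4000))  # 37.5
```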

CPU Utilization

Metric Name Description UI Location Metric Path
Limit (MilliCores)

Limit of CPU which can be used by the pods. Only the pods belonging to monitored namespaces are used to calculate this metric.

If this value is not specified for any pod, then the value is calculated as the CPU limit of the node.

For example:

  • If node limit is 24m with five pods without any CPU limit, then this value is displayed as 24m.
  • If node limit is 24m with five pods and each pod has a limit of 5m, this value displays the limit as 25m.
Dashboard > Utilization > CPU Hardware Resources|Cluster|CPU|Utilization|Limit (MilliCores)
Request (MilliCores) Amount of CPU, in MilliCores, requested by all the pods in monitored namespaces. Dashboard > Utilization > CPU Hardware Resources|Cluster|CPU|Utilization|Request (MilliCores)
Used (MilliCores) Actual CPU which the pods from monitored namespaces are currently using. Dashboard > Utilization > CPU Hardware Resources|Cluster|CPU|Utilization|Used (MilliCores)
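
The two Limit (MilliCores) examples above can be reproduced with a short Python sketch for a single node. The function name `cluster_cpu_limit` is hypothetical. The text does not say what happens when only some pods have a limit, so the sketch raises an error for that case rather than guessing.

```python
from typing import Optional, Sequence

def cluster_cpu_limit(node_limit_m: int, pod_limits_m: Sequence[Optional[int]]) -> int:
    """Limit (MilliCores) for one node, following the two documented examples.

    pod_limits_m has one entry per pod: its CPU limit in MilliCores, or None if unset.
    """
    specified = [limit for limit in pod_limits_m if limit is not None]
    if not specified:
        # No pod declares a limit: fall back to the node limit.
        return node_limit_m
    if len(specified) == len(pod_limits_m):
        # Every pod declares a limit: report the sum, even if it exceeds the node limit.
        return sum(specified)
    # The mixed case is not described in the text; this sketch does not guess.
    raise ValueError("mixed limited and unlimited pods are not covered by this sketch")

print(cluster_cpu_limit(24, [None] * 5))  # 24
print(cluster_cpu_limit(24, [5] * 5))     # 25
```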

DaemonSets

Metric Name Description UI Location Metric Path
Count Number of daemon sets that exist Inventory > Objects > DaemonSets > (Count) Hardware Resources|Cluster|DaemonSets|Count
Nodes Available Number of nodes that should be running the daemon pod and have it running and available Inventory > Objects > DaemonSets > Available Hardware Resources|Cluster|DaemonSets|Nodes Available
Nodes MissScheduled Number of nodes that are running the daemon pod, but should not be running it Inventory > Objects > DaemonSets > MissScheduled Hardware Resources|Cluster|DaemonSets|Nodes MissScheduled
Nodes Unavailable Number of nodes that should be running the daemon pod, but are not running it Inventory > Objects > DaemonSets > Unavailable Hardware Resources|Cluster|DaemonSets|Nodes Unavailable

Deployments

Metric Name Description UI Location Metric Path
Count Number of deployments that exist in the cluster Inventory > Objects > Deployments > (Count) Hardware Resources|Cluster|Deployments|Count
Replicas Number of pod replicas in the cluster that are not in a terminated state Inventory > Objects > Deployments > Available Hardware Resources|Cluster|Deployments|Replicas
Replicas Unavailable Total number of unavailable pod replicas across all deployments in the cluster Inventory > Objects > Deployments > Unavailable Hardware Resources|Cluster|Deployments|ReplicasUnavailable

Endpoints

Metric Name Description UI Location Metric Path
Count Number of endpoints in the cluster Inventory > Services > Endpoints > Count Hardware Resources|Cluster|Endpoints|Count
Not Ready Address Total number of not-ready addresses for all the endpoints in the cluster Inventory > Services > Endpoints without ready IP Hardware Resources|Cluster|Endpoints|Not Ready Address
Orphans Total number of endpoints in the cluster that have neither ready nor not-ready addresses Inventory > Services > Orphan Endpoints with no IP Hardware Resources|Cluster|Endpoints|Orphans
Ready Address Total number of ready addresses for all the endpoints in the cluster Inventory > Services > Endpoints Hardware Resources|Cluster|Endpoints|Ready Address
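
To illustrate how the four endpoint metrics above relate to one another, the following Python sketch derives them from a list of endpoints. The `Endpoint` class, the `endpoint_metrics` function, and the sample addresses are hypothetical stand-ins for the Kubernetes objects, not the Cluster Agent's actual data model.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Endpoint:
    """Simplified endpoint: a name plus its ready and not-ready addresses."""
    name: str
    ready: List[str] = field(default_factory=list)
    not_ready: List[str] = field(default_factory=list)

def endpoint_metrics(endpoints):
    """Compute the four endpoint metrics for a list of Endpoint objects."""
    return {
        "Count": len(endpoints),
        "Ready Address": sum(len(e.ready) for e in endpoints),
        "Not Ready Address": sum(len(e.not_ready) for e in endpoints),
        # An orphan has no addresses at all, ready or not ready.
        "Orphans": sum(1 for e in endpoints if not e.ready and not e.not_ready),
    }

# Hypothetical sample data.
sample = [
    Endpoint("web", ready=["10.0.0.1", "10.0.0.2"]),
    Endpoint("db", ready=["10.0.0.3"], not_ready=["10.0.0.4"]),
    Endpoint("legacy"),
]
metrics = endpoint_metrics(sample)
print(metrics)
```

Here `legacy` is the only orphan, because it has no addresses of either kind.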

Jobs

Metric Name Description UI Location Metric Path

Count Total number of jobs in the cluster. Inventory > Objects > Jobs > (Count) Hardware Resources|Cluster|Jobs|Count
Pods Active Total number of active pods for all the jobs in the cluster. Inventory > Objects > Jobs > Active Hardware Resources|Cluster|Jobs|Pods Active
Pods Failed Total number of pods which reached phase Failed for all the jobs in the cluster. Inventory > Objects > Jobs > Failed Hardware Resources|Cluster|Jobs|Pods Failed
Pods Succeeded Total number of pods which reached phase Succeeded for all the jobs in the cluster. Inventory > Objects > Jobs > Succeeded Hardware Resources|Cluster|Jobs|Pods Succeeded

Memory

Memory Capacity

Metric Name Description UI Location Metric Path
Total (MB) Total Memory capacity for the cluster in MBs. Dashboard > Cluster > Capacity > Memory Hardware Resources|Cluster|Memory|Capacity|Total (MB)
Used (MB) Memory capacity already used by the cluster in MBs Dashboard > Cluster > Capacity > Memory Hardware Resources|Cluster|Memory|Capacity|Used (MB)

Memory Quota

Metric Name Description UI Location Metric Path
Limit Used (%) Percentage of Memory limit quota used Dashboard > Quotas > Memory Limit Hardware Resources|Cluster|Memory|Quota|Limit Used (%)
Limit Used (MB) MB value for Memory limit quota used Dashboard > Quotas > Memory Limit Hardware Resources|Cluster|Memory|Quota|Limit Used (MB)

Request Used (%) Percentage of Memory request quota used Dashboard > Quotas > Memory Request Hardware Resources|Cluster|Memory|Quota|Request Used (%)
Request Used (MB) MB value for Memory request quota used Dashboard > Quotas > Memory Request Hardware Resources|Cluster|Memory|Quota|Request Used (MB)

Memory Utilization

Metric Name Description UI Location Metric Path
Limit (MB)

Limit of Memory which can be used by the pods. Only the pods belonging to monitored namespaces are used to calculate this metric.

If this value is not specified for any pod, then the value is calculated as the memory limit of the node.

For example:

  • If node limit is 24MB with five pods without any memory limit, then this value is displayed as 24MB.
  • If node limit is 24MB with five pods and each pod has a limit of 5MB, this value displays the limit as 25MB.
Dashboard > Utilization > Memory Hardware Resources|Cluster|Memory|Utilization|Limit (MB)
Request (MB) Amount of memory, in MB, requested by all the pods in monitored namespaces. Dashboard > Utilization > Memory Hardware Resources|Cluster|Memory|Utilization|Request (MB)
Used (MB) Actual Memory which the pods from monitored namespaces are currently using. Dashboard > Utilization > Memory Hardware Resources|Cluster|Memory|Utilization|Used (MB)

Nodes

Metric Name Description UI Location Metric Path
Master Count Number of master nodes in the cluster Inventory > Masters Hardware Resources|Cluster|Nodes|Master Count
Worker Count Number of worker nodes in the cluster Inventory > Workers Hardware Resources|Cluster|Nodes|Worker Count
Memory Pressure Count Number of nodes that are under memory pressure in the cluster Inventory > Memory Pressure Hardware Resources|Cluster|Nodes|Memory Pressure Count
Disk Pressure Count Number of nodes that are under disk pressure in the cluster Inventory > Disk Pressure Hardware Resources|Cluster|Nodes|Disk Pressure Count

Pods

Pods Capacity

Metric Name Description UI Location Metric Path
Total Count Total number of pods that a cluster can support Pods > Total Count Hardware Resources|Cluster|Pods|Capacity|Total Count
Used Count Number of pods already created in the cluster Pods > Count Hardware Resources|Cluster|Pods|Capacity|Used Count

Pods CPU Usage

Metric Name Description UI Location Metric Path
%Busy Scaled CPU usage as a percentage of the CPU limit, normalized and scaled to a finer-grained unit. This metric shows how much of the allocated CPU resources (measured in MilliCores) is being used, giving a more precise view of CPU utilization relative to the resource's CPU limit. Server > Metric Browser Root|Individual Nodes|<namespace>/<pod-name>|Hardware Resources|CPU|%Busy Scaled
%Busy The percentage of the CPU used by a pod. If the CPU limit is provided for the pod, the busy % is calculated as the percentage of CPU used relative to the CPU limit of the pod. If CPU limit of the pod is not specified, this is calculated as the percentage of CPU used relative to the CPU limit of the node or cluster, whichever is available. Server > Metric Browser Root|Individual Nodes|<namespace>/<pod-name>|Hardware Resources|CPU|%Busy
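
The %Busy rule above (use the pod's CPU limit when it is set, otherwise the node or cluster limit) can be sketched in Python as follows. The function name `busy_percent`, its parameters, and the sample values are hypothetical. The sketch assumes all amounts are in MilliCores.

```python
from typing import Optional

def busy_percent(used_m: float,
                 pod_limit_m: Optional[float],
                 fallback_limit_m: Optional[float]) -> float:
    """CPU used as a percentage of the applicable limit.

    pod_limit_m: the pod's own CPU limit, or None if not specified.
    fallback_limit_m: the node or cluster CPU limit, whichever is available.
    """
    # Prefer the pod limit; otherwise fall back to the node or cluster limit.
    limit = pod_limit_m if pod_limit_m is not None else fallback_limit_m
    if limit is None or limit <= 0:
        raise ValueError("no positive CPU limit available")
    return 100.0 * used_m / limit

print(busy_percent(250, 500, 4000))   # 50.0
print(busy_percent(250, None, 4000))  # 6.25
```

The same usage of 250 MilliCores reads very differently depending on which limit applies, which is worth keeping in mind when comparing %Busy across pods with and without limits.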

Pods Memory Usage

Metric Name Description UI Location Metric Path
Used (MB) The amount of memory used by a pod. Server > Metric Browser Root|Individual Nodes|<namespace>/<pod-name>|Hardware Resources|Memory|Used (MB)
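
Pod-level metric paths in the tables above follow the template `Root|Individual Nodes|<namespace>/<pod-name>|Hardware Resources|...`, with `|` as the separator. As a small illustration, the Python helper below fills in that template. The function name `pod_metric_path` and the sample namespace and pod name are hypothetical.

```python
def pod_metric_path(namespace: str, pod_name: str, *leaf: str) -> str:
    """Build a pod-level metric path using the '|' separator from the tables above."""
    parts = ["Root", "Individual Nodes", f"{namespace}/{pod_name}", "Hardware Resources", *leaf]
    return "|".join(parts)

# Hypothetical namespace and pod name.
print(pod_metric_path("shop", "cart-7d9f", "Memory", "Used (MB)"))
# Root|Individual Nodes|shop/cart-7d9f|Hardware Resources|Memory|Used (MB)
```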

PVC

PVC Quota

Metric Name Description UI Location Metric Path
Used PVC quota already being used in the cluster (count) Dashboard > Quotas > PVC Hardware Resources|Cluster|PVC|Quota|Used
Used % Percentage of PVC quota already being used in the cluster Dashboard > Quotas > PVC Hardware Resources|Cluster|PVC|Quota|Used (%)

PVC Utilization

Metric Name Description UI Location Metric Path
Capacity (MB) Total PVC available for the pods in the monitored namespaces Dashboard > Utilization > PVCs Hardware Resources|Cluster|PVC|Utilization|Capacity (MB)
Request (MB) Value for PVC requested by pods in monitored namespaces Dashboard > Utilization > PVCs Hardware Resources|Cluster|PVC|Utilization|Request (MB)

ReplicaSets

Metric Name Description UI Location Metric Path
Count Number of replica set resources in the cluster Inventory > Objects > ReplicaSets > Count Hardware Resources|Cluster|Count
Replicas Total number of replicas for all the replica sets in the cluster Inventory > Objects > ReplicaSets > Count Hardware Resources|Cluster|ReplicaSets|Replicas
Replicas Available Total number of available replicas for all the replica sets in the cluster Inventory > Objects > ReplicaSets > Available Hardware Resources|Cluster|ReplicaSets|Replicas Available
Replicas Unavailable Total number of unavailable replicas for all the replica sets in the cluster Inventory > Objects > ReplicaSets > Unavailable Hardware Resources|Cluster|ReplicaSets|Replicas Unavailable

Services

Metric Name Description UI Location Metric Path
Count Total number of Kubernetes Services running in the cluster Inventory > Services > Services Hardware Resources|Cluster|Services|Count

StatefulSets

Metric Name Description UI Location Metric Path
Count Number of statefulsets in monitored namespaces Inventory > Objects > StatefulSets > (Count) Hardware Resources|Cluster|StatefulSets|Count
Replicas Ready Number of replicas in a ready state across all statefulsets in monitored namespaces Inventory > Objects > StatefulSets > Replicas Not Ready Hardware Resources|Cluster|StatefulSets|Replicas Ready
Replicas Desired Number of replicas across all statefulsets in monitored namespaces which are specified as desired in statefulset spec N/A Hardware Resources|Cluster|StatefulSets|Replicas Desired
Replicas Not Ready Number of replicas across all statefulsets in monitored namespaces which are not ready and are yet to be created or started Inventory > Objects > StatefulSets > Replicas Not Ready Hardware Resources|Cluster|StatefulSets|Replicas Not Ready
Collisions Number of hash collisions for statefulsets across all monitored namespaces N/A Hardware Resources|Cluster|StatefulSets|Collisions

Storage Quota

Metric Name Description UI Location Metric Path
Used (MB) Storage quota used by the cluster in MB Dashboard > Quotas > Storage Hardware Resources|Cluster|Storage|Quota|Used (MB)
Used (%) Percentage of storage quota used by the cluster Dashboard > Quotas > Storage Hardware Resources|Cluster|Storage|Quota|Used (%)