Configure the Prometheus receiver to collect Milvus metrics

Collect Milvus metrics with the Splunk Distribution of the OpenTelemetry Collector.

You can monitor the performance of Milvus vector databases by configuring the Splunk Distribution of the OpenTelemetry Collector to send Milvus metrics to Splunk Observability Cloud.

This solution uses the Prometheus receiver to collect metrics from Milvus and its subcomponents. Each Milvus component publishes Prometheus-compatible metrics at the http://<component-host>:9091/metrics endpoint.

To configure the Prometheus receiver to collect metrics from Milvus vector databases, you must first deploy Milvus locally or on a cloud server in either standalone or distributed mode. For instructions, see Overview of Milvus Deployment Options in the Milvus documentation.

  1. Deploy the Splunk Distribution of the OpenTelemetry Collector to your host or container platform.
  2. To activate the Prometheus receiver for Milvus manually in the Collector configuration, make the following changes to your values.yaml configuration file:
    1. Add prometheus/milvus to the receivers section. For example, with Milvus deployed in distributed mode:
      YAML
      prometheus/milvus:
        config:
          global:
            scrape_interval: 10s
          scrape_configs:
            - job_name: 'milvus-scraper'
              metrics_path: /metrics
              static_configs:
                - targets:
                    - 'milvus-proxy:9091'
                    - 'milvus-querynode:9091'
                    - 'milvus-datanode:9091'
                    - 'milvus-indexnode:9091'
                    - 'milvus-rootcoord:9091'
    2. By default, Milvus exposes a large number of metrics. Add the following filter to the processors section to keep only the metrics that the built-in dashboard requires and drop all others:
      YAML
      processors:  
        filter/milvus_metrics:  
          metrics:  
            include:  
              match_type: strict  
              metric_names:  
                - "milvus_datacoord_compaction_task_num"  
                - "milvus_datacoord_datanode_num"  
                - "milvus_datacoord_segment_num"  
                - "milvus_datacoord_stored_binlog_size"  
                - "milvus_datacoord_stored_index_files_size"  
                - "milvus_datanode_compaction_delete_count"  
                - "milvus_datanode_compaction_latency"  
                - "milvus_datanode_compaction_missing_delete_count"  
                - "milvus_datanode_flush_buffer_op_count"  
                - "milvus_datanode_flushed_data_rows"  
                - "milvus_datanode_msg_rows_count"  
                - "milvus_num_node"  
                - "milvus_proxy_cache_hit_count"  
                - "milvus_proxy_delete_vectors_count"  
                - "milvus_proxy_insert_vectors_count"  
                - "milvus_proxy_mutation_latency"  
                - "milvus_proxy_req_count"  
                - "milvus_proxy_req_latency"  
                - "milvus_proxy_search_vectors_count"  
                - "milvus_proxy_sq_latency"  
                - "milvus_querycoord_querynode_num"  
                - "milvus_querycoord_task_num"  
                - "milvus_querynode_entity_num"  
                - "milvus_querynode_read_task_concurrency"  
                - "milvus_querynode_segment_num"  
                - "milvus_querynode_sq_queue_latency"  
                - "milvus_querynode_wait_processing_msg_count"  
                - "milvus_rootcoord_ddl_req_count"  
                - "milvus_rootcoord_ddl_req_latency"  
                - "milvus_rootcoord_force_deny_writing_counter"  
                - "milvus_rootcoord_proxy_num"  
                - "milvus_storage_kv_size"  
                - "milvus_storage_op_count"
    3. Add a new pipeline under the service section for Milvus and export the metrics to the target endpoints. For example:
      YAML
      service: 
        pipelines: 
          metrics/milvus: 
            receivers:  
              - prometheus/milvus 
            processors:  
              - filter/milvus_metrics  
              - memory_limiter 
              - batch 
              - resourcedetection 
              - resource 
            exporters: 
              - signalfx/histograms
  3. Restart the Splunk Distribution of the OpenTelemetry Collector.
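
The receiver example in step 2 lists one target per component of a distributed deployment. For a standalone deployment, the target list would likely shrink to a single entry. The following is a minimal sketch under that assumption: the host name milvus-standalone is hypothetical and must be replaced with the host name of your own instance, while the port and metrics path are taken from the endpoint described earlier.

```yaml
receivers:
  prometheus/milvus:
    config:
      global:
        scrape_interval: 10s
      scrape_configs:
        - job_name: 'milvus-scraper'
          metrics_path: /metrics
          static_configs:
            - targets:
                # Hypothetical host name; replace with your standalone Milvus host.
                - 'milvus-standalone:9091'
```

The filter processor and the metrics/milvus pipeline from the steps above can stay unchanged in this case, because they refer to the receiver by name rather than to individual targets.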

Configuration settings

Learn about the configuration settings for the Prometheus receiver.

To view the configuration options for the Prometheus receiver, see Settings.

Metrics

The following metrics are available for Milvus databases. These metrics fall under the default metric category. For more information on these metrics, see Milvus Metrics Dashboard in the Milvus documentation.

Milvus metrics

| Metric name | Metric type | Description |
| --- | --- | --- |
| milvus_datacoord_compaction_task_num | gauge | Current number of active compaction tasks in DataCoord. |
| milvus_datacoord_datanode_num | gauge | Current number of active DataNodes managed by DataCoord. |
| milvus_datacoord_segment_num | gauge | Current number of segments managed by DataCoord. |
| milvus_datacoord_stored_binlog_size | gauge | Total binlog size (in bytes) of all healthy segments managed by DataCoord. |
| milvus_datacoord_stored_index_files_size | gauge | Total size (in bytes) of index files for all segments managed by DataCoord. |
| milvus_datanode_compaction_delete_count | counter | Total number of delete entries processed during segment compaction in a DataNode. |
| milvus_datanode_compaction_latency | histogram | Latency (in milliseconds) of segment compaction operations in a DataNode. |
| milvus_datanode_compaction_missing_delete_count | counter | Total number of delete entries that were expected but not applied during segment compaction in a DataNode. |
| milvus_datanode_flush_buffer_op_count | counter | Total number of buffer flush operations performed by the DataNode. |
| milvus_datanode_flushed_data_rows | counter | Total number of data rows successfully flushed from memory to storage by the DataNode. |
| milvus_datanode_msg_rows_count | counter | Total number of data rows consumed from the message stream by the DataNode. |
| milvus_num_node | gauge | Current number of active service nodes and coordinators in the Milvus cluster. |
| milvus_proxy_cache_hit_count | counter | Total number of cache hits recorded by the Milvus Proxy. |
| milvus_proxy_delete_vectors_count | counter | Total number of vectors successfully deleted through the Milvus Proxy. |
| milvus_proxy_insert_vectors_count | counter | Total number of vectors successfully inserted through the Milvus Proxy. |
| milvus_proxy_mutation_latency | histogram | Latency (in milliseconds) of successful insert and delete operations handled by the Milvus Proxy. |
| milvus_proxy_req_count | counter | Total number of client operations executed through the Milvus Proxy. |
| milvus_proxy_req_latency | histogram | Latency (in milliseconds) of client requests processed by the Milvus Proxy. |
| milvus_proxy_search_vectors_count | counter | Total number of vectors successfully searched through the Milvus Proxy. |
| milvus_proxy_sq_latency | histogram | Latency (in milliseconds) of successful search and query operations processed by the Milvus Proxy. |
| milvus_querycoord_querynode_num | gauge | Current number of QueryNodes managed by the QueryCoord component. |
| milvus_querycoord_task_num | gauge | Current number of tasks in the QueryCoord scheduler. |
| milvus_querynode_entity_num | gauge | Number of searchable/queryable entities in the QueryNode, grouped by collection, partition, and state. |
| milvus_querynode_read_task_concurrency | gauge | Number of read tasks currently executing concurrently in the QueryNode. |
| milvus_querynode_segment_num | gauge | Number of segments currently loaded in the QueryNode, grouped by collection, partition, state, and number of indexed fields. |
| milvus_querynode_sq_queue_latency | histogram | Latency (in milliseconds) that search and query requests spend waiting in the QueryNode queue. |
| milvus_querynode_wait_processing_msg_count | gauge | Number of messages currently waiting to be processed in the QueryNode. |
| milvus_rootcoord_ddl_req_count | counter | Total number of DDL operations processed by the RootCoord. |
| milvus_rootcoord_ddl_req_latency | histogram | Latency (in milliseconds) of DDL operations processed by the RootCoord. |
| milvus_rootcoord_force_deny_writing_counter | counter | Total number of times Milvus entered a force-deny-writing state enforced by RootCoord. |
| milvus_rootcoord_proxy_num | gauge | Current number of Proxy nodes managed by the RootCoord. |
| milvus_storage_kv_size | histogram | Size (in bytes) of key-value data stored in Milvus storage. |
| milvus_storage_op_count | counter | Total number of persistent data operations performed in Milvus storage. |

Attributes

The following resource attributes are available for Milvus databases.

Milvus attributes

| Attribute name | Description | Values |
| --- | --- | --- |
| node_id | The unique identity of a role. | A globally unique ID generated by Milvus. |
| status | The status of a processed operation or request. | abandon, success, fail |
| query_type | The type of read request. | search, query |
| msg_type | The type of message. | insert, delete, search, query |
| segment_state | The status of a segment. | Sealed, Growing, Flushed, Flushing, Dropped, Importing |
| cache_state | The status of a cached object. | hit, miss |
| cache_name | The name of a cached object. This label is used together with the label cache_state. | Examples of possible values: CollectionID, Schema |
| channel_name | Physical topics in message storage (Pulsar or Kafka). | Examples of possible values: by-dev-rootcoord-dml_0, by-dev-rootcoord-dml_255 |
| function_name | The name of a function that handles certain requests. | Examples of possible values: CreateCollection, CreatePartition, CreateIndex |
| user_name | The username used for authentication. | A username of your preference. |
| index_task_status | The status of an index task in meta storage. | unissued, in-progress, failed, finished, recycled |