Configure your Splunk Observability Cloud account to collect GCP Vertex AI metrics

Learn how to configure your Splunk Observability Cloud account to collect GCP Vertex AI metrics.

You can monitor the performance of Google Cloud Platform (GCP) Vertex AI applications by configuring them to send metrics to Splunk Observability Cloud. This solution creates a cloud connection in your Splunk Observability Cloud account that collects metrics from Google Cloud Monitoring.

Complete the following steps to collect metrics from GCP Vertex AI.

  1. Connect GCP to Splunk Observability Cloud. For more information on the connection methods and instructions for each method, see Connect to Google Cloud Platform.
  2. Run your applications that use GCP Vertex AI models so that they generate metrics for Splunk Observability Cloud to collect, as shown in the sketch after this list.
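
The following is a minimal sketch of step 2, assuming the google-cloud-aiplatform Python SDK and a Gemini publisher model; the project ID, region, and model name are placeholders, not values from this document. Each invocation like this one produces the Cloud Monitoring metrics listed in the next section.

```python
# A minimal sketch, assuming the google-cloud-aiplatform SDK
# (`pip install google-cloud-aiplatform`). The project ID, region,
# and model name are placeholders; replace them with your own values.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-gcp-project", location="us-central1")

# Invoking a publisher model emits prediction and online-serving
# metrics to Google Cloud Monitoring, which the cloud connection
# then forwards to Splunk Observability Cloud.
model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Say hello in one sentence.")
print(response.text)
```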

Metrics

Learn about the available metrics for GCP Vertex AI.

The following metrics and resource attributes are available for GCP Vertex AI applications. These metrics fall under the default metric category.

For more information on these metrics, see Cloud Monitoring metrics for Vertex AI in the Google Cloud documentation.

| Metric name | Unit | Description |
| --- | --- | --- |
| prediction/online/prediction_count | count | Number of online predictions. |
| prediction/online/prediction_latencies | ms | Online prediction latency of the deployed model. |
| prediction/online/response_count | count | Number of different online prediction response codes. |
| prediction/online/prediction_latencies.count | count | Number of online predictions. |
| prediction/online/prediction_latencies.sumOfSquareDeviation | ms | The sum of squared deviation for prediction latencies. |
| publisher/online_serving/model_invocation_count | count | Number of model invocations (prediction requests). |
| publisher/online_serving/model_invocation_latencies.sumOfSquareDeviation | ms | The sum of squared deviation for model invocation latencies. |
| publisher/online_serving/model_invocation_latencies.count | count | Number of model invocations (prediction requests). |
| publisher/online_serving/model_invocation_latencies | ms | Model invocation latencies (prediction latencies). |
| publisher/online_serving/token_count | count | Accumulated input/output token count. |
| publisher/online_serving/consumed_token_throughput | count | Overall throughput used (accounting for burndown rate) in terms of tokens. |
| publisher/online_serving/consumed_throughput | count | Overall throughput used (accounting for burndown rate) in terms of characters. |
| publisher/online_serving/character_count | count | Accumulated input/output character count. |
| publisher/online_serving/first_token_latencies | ms | Duration from request received to first token sent back to the client. |
| publisher/online_serving/first_token_latencies.count | count | Number of first token latencies. |
| publisher/online_serving/first_token_latencies.sumOfSquareDeviation | ms | The sum of squared deviation for first token latencies. |
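
Because these metrics originate in Google Cloud Monitoring, you can verify that your project emits them before they reach Splunk Observability Cloud. The following is a minimal sketch, assuming the google-cloud-monitoring Python client library; the project ID is a placeholder. In Cloud Monitoring, these metric types carry the aiplatform.googleapis.com/ prefix.

```python
# A minimal sketch, assuming the google-cloud-monitoring client
# library (`pip install google-cloud-monitoring`). The project ID
# is a placeholder.
import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
now = int(time.time())

# Read the last hour of online prediction counts. Vertex AI metric
# types are prefixed with "aiplatform.googleapis.com/" in Cloud
# Monitoring.
results = client.list_time_series(
    request={
        "name": "projects/my-gcp-project",  # placeholder project ID
        "filter": (
            'metric.type = '
            '"aiplatform.googleapis.com/prediction/online/prediction_count"'
        ),
        "interval": monitoring_v3.TimeInterval(
            {"start_time": {"seconds": now - 3600}, "end_time": {"seconds": now}}
        ),
    }
)
for series in results:
    print(series.metric.type, dict(series.resource.labels))
```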

Attributes

Learn about the available resource attributes for GCP Vertex AI.

The following resource attributes are available for all GCP Vertex AI metrics:
  • gcp_project_status
  • gcp_project_name
  • gcp_project_label_last_revalidated_by
  • model_user_id
  • gcp_project_number
  • request_type
  • gcp_id
  • gcp_project_label_cloud_registration_id
  • gcp_project_creation_time
  • gcp_project_label_last_revalidated_at
  • input_token_size
  • output_token_size
  • project_id
  • metricTypeDomain
  • gcp_project_label_environment
  • publisher
  • monitored_resource
  • gcp_project_label_account_type
  • gcp_project_label_owner_group
  • service
  • Location
In addition, the type resource attribute is available for the publisher/online_serving/token_count and publisher/online_serving/character_count metrics.
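
In Splunk Observability Cloud, you can use any of these attributes to filter or group a chart or detector. The following is a minimal sketch, assuming the signalfx Python client library and that the metric retains the name shown in the table above; the access token and the project_id value are placeholders.

```python
# A minimal sketch, assuming the signalfx Python client library
# (`pip install signalfx`). The access token and the project_id
# value are placeholders.
import signalfx

# SignalFlow program that narrows online prediction counts to a
# single GCP project by its project_id resource attribute.
program = """
data('prediction/online/prediction_count',
     filter=filter('project_id', 'my-gcp-project')).publish()
"""

with signalfx.SignalFx().signalflow("MY_ACCESS_TOKEN") as flow:
    computation = flow.execute(program)
    for msg in computation.stream():
        print(msg)
```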

Next steps

Learn how to monitor your AI components after you set up Observability for AI.

After you set up data collection from supported AI components to Splunk Observability Cloud, the data populates built-in experiences that you can use to monitor and troubleshoot your AI components.

The following table describes the tools you can use to monitor and troubleshoot your AI components.
| Monitoring tool | Use this tool to | Link to documentation |
| --- | --- | --- |
| Built-in navigators | Orient and explore different layers of your AI tech stack. | |
| Built-in dashboards | Assess service, endpoint, and system health at a glance. | |
| Splunk Application Performance Monitoring (APM) service map and trace view | View all of your LLM service dependency graphs and user interactions in the service map or trace view. | Monitor LLM services with Splunk APM |