Collect metrics and traces from OpenAI services

Learn how to configure and activate the component for OpenAI services.

You can collect metrics and traces from OpenAI services by instrumenting your Python application with the Splunk Distribution of OpenTelemetry Python and the OpenTelemetry OpenAI instrumentation. The instrumentation pushes metrics and traces to the OTLP receiver of the Splunk Distribution of the OpenTelemetry Collector.

Complete the following steps to collect metrics and traces from OpenAI services.

  1. Deploy the Splunk Distribution of the OpenTelemetry Collector to your host or container platform.
  2. Start the Splunk Distribution of the OpenTelemetry Collector.
  3. Deploy the Python agent in your OpenAI service:
    1. Install the Splunk Distribution of OpenTelemetry Python using the guided setup or manual method.
    2. Install the Generative AI/LLM instrumentation by following the steps on OpenTelemetry OpenAI Instrumentation in the OpenTelemetry Python Contrib GitHub repository. You can use zero-code instrumentation or manual instrumentation.
  4. Run the service.

Metrics and attributes

Learn about the monitoring metrics available for OpenAI services.

The following metrics and resource attributes are available for OpenAI services. These metrics fall under the default metric category. For more information on these metrics, see Metrics in the OpenTelemetry documentation.

| Metric name | Histogram function | Instrument type | Unit | Description |
| --- | --- | --- | --- | --- |
| gen_ai.client.token.usage | input | histogram | count | Number of input tokens processed. |
| gen_ai.client.token.usage | output | histogram | count | Number of output tokens processed. |
| gen_ai.client.token.usage | sum | histogram | count | Sum of all tokens processed. |
| gen_ai.client.token.usage | max | histogram | count | Maximum number of total tokens processed. |
| gen_ai.client.token.usage | min | histogram | count | Minimum number of total tokens processed. |
| gen_ai.client.operation.duration | sum | histogram | s | Sum of operation durations. |
| gen_ai.client.operation.duration | max | histogram | s | Maximum operation duration. |
| gen_ai.client.operation.duration | min | histogram | s | Minimum operation duration. |
| gen_ai.client.operation.duration | count | histogram | count | Count of operations. |

All of these metrics report the same resource attributes:

  • server.port
  • server.address
  • gen_ai.application_name
  • gen_ai.system
  • gen_ai.environment
  • gen_ai.operation.name
  • gen_ai.request.model
  • gen_ai.response.model
  • error.type

Next steps

Learn how to monitor your AI components after you set up Observability for AI.

After you set up data collection from supported AI components to Splunk Observability Cloud, the data populates built-in experiences that you can use to monitor and troubleshoot your AI components.

The following table describes the tools you can use to monitor and troubleshoot your AI components.
| Monitoring tool | Use this tool to | Link to documentation |
| --- | --- | --- |
| Built-in navigators | Orient and explore different layers of your AI tech stack. | |
| Built-in dashboards | Assess service, endpoint, and system health at a glance. | |
| Splunk Application Performance Monitoring (APM) service map and trace view | View all of your LLM service dependency graphs and user interactions in the service map or trace view. | Monitor LLM services with Splunk APM |