
Datadog brings OpenAI model monitoring into the fold, launches new integration

Image Credit: Datadog



New York-based Datadog, which provides a cloud observability platform for enterprise applications and infrastructure, today announced an integration to monitor OpenAI models such as GPT-4.

The offering, Datadog says, will help enterprise teams understand how users interact with their GPT-powered applications, ultimately enabling them to fine-tune the models for better performance and cost efficiency.

The announcement comes as OpenAI’s large language models continue to see adoption across a variety of enterprise-specific use cases, including business-critical areas such as customer service and data querying.

How does the OpenAI integration help?

Once up and running, the Datadog-OpenAI integration automatically tracks GPT usage patterns, providing teams with actionable insights into model performance and costs via dashboards and alerts.

For performance, the integration tracks OpenAI API error rates, rate limits and response times, allowing users to identify and isolate issues within their applications. It also offers visibility into OpenAI request volumes, along with metrics, traces and logs containing prompts and their corresponding completions, so teams can understand how end customers are interacting with their applications and gauge the quality of the output their OpenAI models generate.

“Customers can install the integration by instrumenting the OpenAI Python library to emit metrics, traces and logs for requests made to the completions, chat completions and embeddings endpoints. Once instrumented, the metrics, traces and logs will be automatically available in the out-of-the-box dashboard provided by Datadog,” Yrieix Garnier, VP of product at Datadog, told VentureBeat.
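Garnier’s description maps to ddtrace, Datadog’s Python tracing library, which added OpenAI support around the time of this announcement. The sketch below shows roughly what that setup looks like; the patch(openai=True) call and the pre-1.0 ChatCompletion interface are assumptions based on the library versions available at the time, so exact details may vary.

```python
# A minimal sketch of the instrumentation Garnier describes, assuming the
# ddtrace library's OpenAI support and the pre-1.0 openai Python client
# that was current at the time; exact setup may differ by version.
from ddtrace import patch

# Patch the OpenAI library so its requests emit metrics, traces and logs.
patch(openai=True)

import openai

openai.api_key = "sk-..."  # normally read from an environment variable

# Calls to the completions, chat completions and embeddings endpoints are
# now traced and surfaced in Datadog's out-of-the-box dashboard.
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Summarize today's open tickets."}],
)
```

Running the application under ddtrace-run typically enables the same instrumentation without code changes.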

These dashboards can then be customized to drill down further into performance issues and optimize the models for improved user experience, the VP added.

On the costs front, Datadog says, the integration allows users to review token allocation by model or service and analyze the associated costs of OpenAI API calls. This can then be used to manage expenses more effectively and avoid unexpected bills for using the service.
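To make the token-to-cost relationship concrete, here is a minimal, hypothetical per-call estimate built from the usage block the OpenAI API returns with each response. The per-1K-token prices are illustrative placeholders, not current list prices, and Datadog’s integration tracks such figures automatically rather than requiring this manual step.

```python
# A hypothetical per-call cost estimate from the usage block returned by
# the OpenAI API; the prices below are illustrative placeholders.
import openai

openai.api_key = "sk-..."  # normally read from an environment variable

PROMPT_PRICE_PER_1K = 0.03      # assumed USD per 1K prompt tokens
COMPLETION_PRICE_PER_1K = 0.06  # assumed USD per 1K completion tokens

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)

# The API reports how many tokens the prompt and completion consumed.
usage = response["usage"]
cost = (
    usage["prompt_tokens"] / 1000 * PROMPT_PRICE_PER_1K
    + usage["completion_tokens"] / 1000 * COMPLETION_PRICE_PER_1K
)
print(f"Estimated cost of this call: ${cost:.4f}")
```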

While Garnier confirmed that customers of both companies are testing the integration, he did not share specific results so far. The connector currently works with multiple OpenAI models, including the GPT family of LLMs as well as Ada, Babbage, Curie and Davinci.

New Relic offers something similar

New Relic, another player in the observability space, offers a similar OpenAI integration that tracks API response time, average tokens per request and the associated cost. However, Garnier claims Datadog’s offering covers additional elements, such as the response-time-to-prompt-token ratio, as well as metrics providing contextual insights into individual user queries.

“Furthermore, for API response times, API requests and other metrics, we allow users to break this down by model, service and API keys. This is critical in order to understand the primary drivers of usage, token consumption and cost,” he noted.
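For illustration, here is a minimal sketch of how such a per-model, per-service breakdown can be emitted as tagged custom metrics with DogStatsD from the datadog Python library. The metric names, tag values and latency figure are hypothetical; Datadog’s integration emits its own metrics without this manual step.

```python
# A hypothetical example of tagging metrics by model, service and API key
# with DogStatsD (the `datadog` Python library). Metric and tag names are
# illustrative, not the integration's actual metric names.
from datadog import initialize, statsd

# Point the client at a local Datadog Agent's DogStatsD port.
initialize(statsd_host="localhost", statsd_port=8125)

tags = ["model:gpt-4", "service:support-bot", "openai_api_key_hash:ab12cd"]

# Count each request, broken down by the tags above.
statsd.increment("custom.openai.request.count", tags=tags)

# Record the request latency (in seconds) for this call.
statsd.histogram("custom.openai.request.duration", 1.42, tags=tags)
```

Tag-based breakdowns like these are what let a dashboard answer which model, service or API key is driving usage and cost.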

Moving ahead, monitoring solutions like these, including those that specifically track hallucinations, are expected to see increasing demand given the meteoric rise of large language models within enterprises. Companies are either using or planning to use LLMs (most prominently OpenAI’s) to accelerate key business functions, from querying their data stacks to optimizing customer service.