Data can exist in any number of different places across an enterprise. Gathering all that data for analysis in IT operations and security is often a challenging task.

Helping organizations get all their data together for better observability is a core focus for San Francisco-based Cribl. The company, founded in 2017, initially positioned itself as a data observability pipeline provider with its Cribl Stream product. In 2022, it added Cribl Search to its portfolio, making data discovery easier for users. Now in 2024, Cribl is advancing further with a data lake service that debuted in April and a new AI copilot capability announced today at the company’s CriblCon conference.

The overall goal is to make it easier for enterprises of all sizes to obtain, store and analyze data. The new developments come as Cribl aims to reposition itself in the increasingly competitive data observability market as more than just an observability vendor.

“We’ve repositioned the company over the last year in response to our evolution into a multi-product company, and we now call ourselves the data engine for IT and security,” Clint Sharp, Cribl’s cofounder and CEO, told VentureBeat.


What is a data engine, anyway?

The data engine is the term Cribl uses to describe its platform for managing large volumes of data for security and observability use cases. 

At the core of Cribl’s data engine platform is Cribl Stream, which routes and processes data streams. The company has also rapidly built out supplementary products, including Cribl Search, a federated search engine for querying large datasets without moving them; Cribl Edge, which uses lightweight agents for data collection; and Cribl Lake, a data lake for storing and managing data built on top of Amazon S3.

Sharp said that Cribl is not aiming to compete directly against large data platform vendors like Snowflake or Databricks. Rather, the Cribl data engine focuses specifically on enabling data for IT and security within enterprises.

According to Sharp, the data that IT and security teams need is typically loosely structured, if structured at all. In his view, other data platforms don’t work as well for this kind of unstructured log data.

Cribl helps customers route all types of heterogeneous data to various destinations like Splunk or Elasticsearch. Its products also enable searching large datasets without moving them. This differentiates Cribl from general-purpose data platforms and makes it more suited to the challenges of security, observability and analytics on messy technical data streams.
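To give a flavor of what rule-based routing of heterogeneous data looks like, here is a minimal sketch. It is a generic illustration, not Cribl’s actual configuration syntax or API; the field names, rules and destination names are hypothetical.

```python
# Minimal illustration of rule-based event routing, similar in spirit to what a
# stream processor does: match each event against a list of rules and send it
# to the first matching destination. All names here are hypothetical.

from typing import Callable

# Each route pairs a predicate with a destination name.
ROUTES: list[tuple[Callable[[dict], bool], str]] = [
    (lambda e: e.get("sourcetype") == "firewall", "splunk"),
    (lambda e: e.get("sourcetype") == "app_logs", "elasticsearch"),
]
DEFAULT_DESTINATION = "object_store"  # catch-all, e.g. cheap long-term storage

def route(event: dict) -> str:
    """Return the destination for an event based on the first matching rule."""
    for predicate, destination in ROUTES:
        if predicate(event):
            return destination
    return DEFAULT_DESTINATION

events = [
    {"sourcetype": "firewall", "msg": "deny tcp 10.0.0.1 -> 10.0.0.9"},
    {"sourcetype": "app_logs", "msg": "GET /login 200"},
    {"sourcetype": "netflow", "msg": "flow record"},
]
for e in events:
    print(route(e), "<-", e["msg"])
```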

While Cribl helps with observability, the primary functions it enables are data ingestion, processing and management. Rather than being a fully featured monitoring or observability solution, Cribl gets data into technologies like Splunk or Datadog instead of directly analyzing the data itself.

“We’re not a SIEM [Security Information and Event Management], we’re not an observability solution,” Sharp clarified. “But we’re helping them move tracing data, analyze metric data; we complement the solutions in the space and we help them get the data where it needs to be.”

Data engine gets an AI copilot, with accuracy as the top priority

Like many enterprise software vendors, Cribl is now adding an AI copilot to help its users, and it is taking a pragmatic, measured approach as it does so.

Cribl is taking a Retrieval Augmented Generation (RAG)-based approach for its copilot. That approach involves a vector database with access to the company’s vast knowledge base. On top of that sits the large language model (LLM), which at the outset is OpenAI’s GPT-4, though Sharp emphasized that the LLM is not the differentiator here; it’s the fine-tuning and RAG configuration.
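As a rough sketch of the RAG pattern described above: retrieve the most relevant knowledge-base passages, then hand them to the LLM as context for the answer. This is a generic illustration, not Cribl’s implementation; the toy bag-of-words embedding and the call_llm() stub stand in for a real vector database and a real model such as GPT-4.

```python
# Minimal retrieval-augmented generation (RAG) sketch: embed docs, retrieve by
# similarity, and ground the LLM prompt in the retrieved context.

from collections import Counter
import math

KNOWLEDGE_BASE = [
    "Pipelines are composed of functions that transform events in order.",
    "A route directs matching events to a pipeline and then to a destination.",
    "Sources collect data; destinations receive the processed data.",
]

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a learned embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def call_llm(prompt: str) -> str:
    # Hypothetical stub: in practice this would call the chosen LLM (e.g. GPT-4).
    return f"[LLM answer grounded in]:\n{prompt}"

question = "How do I send events from a pipeline to a destination?"
context = "\n".join(retrieve(question))
print(call_llm(f"Context:\n{context}\n\nQuestion: {question}"))
```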

The AI copilot allows Cribl’s users to interact via natural language across the company’s product suite. For example, a user could ask it to generate a pipeline that parses Apache weblogs and turns them into JSON, or to search logs and chart errors over time split by HTTP code. The copilot can also generate dashboards for users and help them figure out how best to visualize and use their data.
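For context, the kind of pipeline described there might look something like the hand-written sketch below: parse Apache access-log lines into JSON and tally errors by HTTP status code. The regex targets the common/combined log format, and the field names are illustrative rather than Cribl’s actual schema or generated output.

```python
# Parse Apache access-log lines into structured JSON events and count errors
# split by HTTP status code. Field names are illustrative.

import json
import re
from collections import Counter

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\S+)'
)

def parse_line(line: str) -> dict | None:
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

lines = [
    '10.0.0.1 - - [09/Jul/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [09/Jul/2024:10:00:01 +0000] "GET /missing HTTP/1.1" 404 512',
    '10.0.0.3 - - [09/Jul/2024:10:00:02 +0000] "POST /api HTTP/1.1" 500 128',
]

errors_by_status: Counter = Counter()
for line in lines:
    event = parse_line(line)
    if event:
        print(json.dumps(event))  # the structured JSON form of the event
        if event["status"].startswith(("4", "5")):
            errors_by_status[event["status"]] += 1  # errors split by HTTP code

print(dict(errors_by_status))
```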

Sharp admitted that it took his company many months to build out and perfect its AI copilot. It took so long, he said, because the initial results were not always accurate.

“The questions that you’re asking a copilot in our space are deeply technical,” he said.

For example, a user might want to understand and build a data pipeline for a Splunk universal forwarder, parse the data in a specific way and forward it to a different location. Sharp said that the Cribl AI copilot can now execute those use cases, which is something it couldn’t do in its early iterations.

“There’s a lot of learning in there in order to meet the kind of quality bar that we felt we needed to have,” he said.
