Before generative AI was the massive industry trend it is today, there was predictive AI, which, as the name implies, uses data to make predictions about future events. But what if you could combine both technologies into one?
That’s the goal of Pecan AI. The eight-year-old startup already offers a predictive analytics platform for enterprises and has raised $116 million in funding since its start, including a $66 million round in Feb. 2022.
Today, the company is launching a new tool, Predictive GenAI, which combines some of the power of modern generative AI capabilities with predictive machine learning.
“While we were working in our side of the neighborhood on the classic machine learning predictive analytics solutions, on the other side of the neighborhood the entire gen AI revolution happened,” Zohar Bronfman, CEO and co-founder, Pecan AI, told VentureBeat. “One thing gen AI is terrible at is creating predictions.”
While gen AI is not ideal for making predictions, predictive machine learning techniques are not particularly user-friendly. Pecan AI’s Predictive GenAI blends both approaches, enabling data scientists to build predictive AI models more easily.
Making predictive AI accessible for business users
A key goal for Pecan AI is to help companies adopt machine learning and AI in the simplest way possible.
Historically, data scientists were the primary users of AI platforms, and in particular, predictive machine learning technology.
Bronfman said that Pecan AI is designed for accessibility and aims to democratize AI capabilities, bringing them to people who are closer to the business side of companies.
There are two parts to Pecan AI’s Predictive GenAI capability.
- Predictive Chat is a feature that allows users to make natural language queries through a chatbot-style interface. Bronfman said that the goal is to guide a user who has a specific business problem toward a predictive framework that suits the business need.
- The new Predictive Notebook uses generative AI to build the data science notebook that serves as the foundation for a predictive model. Bronfman explained that the Predictive Notebook is Pecan AI’s proprietary, SQL-based notebook. It contains generated cells that define the transformation of a company’s native data into an AI-ready dataset for predictive modeling. Each generated cell is responsible for one element of that transformation, such as querying, structuring or joining the data. The cells can be run automatically in Pecan AI’s backend, in a manner that is transparent to the user. Users who want to be more closely involved can tweak the cells using SQL. At the end of the process, the notebook produces a set of queries that are applied to the user’s data tables to transform them from their native state into an AI-ready dataset for Pecan AI’s modeling library.
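To make the idea of SQL cells more concrete, here is a minimal sketch of a three-step pipeline (query, structure, join) run against an in-memory SQLite database using Python’s standard library. It is not Pecan AI’s notebook format, which the company has not published in this article; the table names, column names and the `cells` list are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical raw tables; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, plan TEXT, churned INTEGER);
CREATE TABLE orders (customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'basic', 0), (2, 'pro', 1), (3, 'basic', 0);
INSERT INTO orders VALUES
    (1, 20.0), (1, 30.0), (2, 15.0), (3, 5.0), (3, 5.0), (3, 10.0);
""")

# Each "cell" is one SQL step of the transformation (an assumed layout).
cells = [
    # Query: select the raw columns needed from the source table.
    ("query",
     "CREATE VIEW raw_orders AS "
     "SELECT customer_id, amount FROM orders"),
    # Structure: aggregate many order rows into one row per customer.
    ("structure",
     "CREATE VIEW order_features AS "
     "SELECT customer_id, COUNT(*) AS order_count, SUM(amount) AS total_spend "
     "FROM raw_orders GROUP BY customer_id"),
    # Join: attach the features to the entity table and expose the label.
    ("join",
     "CREATE VIEW ai_ready AS "
     "SELECT c.customer_id, c.plan, f.order_count, f.total_spend, "
     "c.churned AS label "
     "FROM customers c JOIN order_features f ON f.customer_id = c.customer_id"),
]

# Run the cells in order, as a notebook backend might.
for name, sql in cells:
    conn.execute(sql)

rows = conn.execute("SELECT * FROM ai_ready ORDER BY customer_id").fetchall()
for row in rows:
    print(row)
```

Because each step is a separate SQL statement, a user could edit one cell, for example changing the aggregation in the structure step, without touching the others.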
Why regular gen AI can’t predict (well, if at all)
As its users may attest, gen AI is good at a lot of different things, such as building chatbots, summarizing content and writing reports.
In Bronfman’s view, however, gen AI on its own is not the right fit for making predictions, for several reasons.
He told VentureBeat that the datasets gen AI tools are exposed to during training are not in the AI-ready format required for predictive modeling.
Bronfman explained that for a predictive model, the dataset needs to have each row as a distinct entity, with each column representing a specific feature and a label column for the target variable.
However, in real business scenarios, obtaining datasets in this format requires significant data engineering work.
Generative AI models are not good at taking raw tabular data from different sources and transforming it into the flat, two-dimensional format required for predictive modeling. This is a skill that typically requires an experienced data scientist to accomplish.
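The flat format Bronfman describes can be illustrated with a small sketch in plain Python. It takes a raw event log and produces one row per entity, one column per feature and a label column. The event types, user IDs, the `outcomes` mapping and the `flatten` helper are hypothetical; real pipelines involve far more sources and edge cases than this example assumes.

```python
from collections import defaultdict

# Hypothetical raw event log: many rows per user, not yet model-ready.
events = [
    {"user": "a", "type": "login"},
    {"user": "a", "type": "purchase"},
    {"user": "b", "type": "login"},
    {"user": "a", "type": "login"},
]

# Hypothetical target variable per user (for example, 1 = renewed).
outcomes = {"a": 1, "b": 0}


def flatten(events, outcomes):
    """Build one row per user with count features and a label column."""
    counts = defaultdict(lambda: defaultdict(int))
    for event in events:
        counts[event["user"]][event["type"]] += 1

    # One feature column per event type seen in the log.
    feature_names = sorted({event["type"] for event in events})

    table = []
    for user in sorted(outcomes):
        row = {"user": user}
        for feature in feature_names:
            row[feature + "_count"] = counts[user][feature]
        row["label"] = outcomes[user]
        table.append(row)
    return table


table = flatten(events, outcomes)
for row in table:
    print(row)
```

Each resulting row describes a single user, which is the shape a conventional predictive model expects as input.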
The use of a vector database is also not quite enough for full-fledged predictive AI modeling, according to Bronfman.
He explained that while vector databases and embeddings can support basic predictive capabilities by working with a limited set of features, they are not sufficient.
Bronfman said that either the models would have to be very simple, capturing only a limited pattern, or alternatively a data scientist would still need to do relatively complex feature engineering to prepare the data in the proper format before feeding it to a richer predictive model.
Innovations in data preparation help to improve prediction
While the conversational predictive gen AI may be the most visible new capability, Pecan AI is moving forward with its patented innovations around automating data preparation and feature engineering.
Among the data preparation innovations that Pecan AI has been working on is automation to help address issues like data leakage, which can undermine model accuracy. In machine learning, data leakage refers to the use, during training, of information that normally would not be available when a prediction is made.
“It’s not trivial to identify leakage, especially if you’re not a professional data scientist,” Bronfman said. “So we have, for example, automated ways of identifying leakage.”
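Bronfman did not detail how Pecan AI’s leakage detection works. As a rough illustration of the concept only, the sketch below applies one simple, commonly used heuristic: flag any feature whose value only became known after the moment the prediction would be made. The feature names, dates and the `find_leaky_features` helper are hypothetical, and this check is an assumption for illustration, not the company’s method.

```python
from datetime import date


def find_leaky_features(feature_known_on, prediction_date):
    """Return features whose values became known after prediction_date."""
    return sorted(
        name
        for name, known_on in feature_known_on.items()
        if known_on > prediction_date
    )


# Hypothetical record of when each feature's value becomes available.
feature_known_on = {
    "orders_last_30d": date(2024, 1, 31),
    "support_tickets": date(2024, 1, 31),
    # Only recorded after a customer has already churned, so using it
    # to predict churn would leak the outcome into the training data.
    "cancellation_reason": date(2024, 2, 15),
}

leaky = find_leaky_features(feature_known_on, date(2024, 2, 1))
print(leaky)
```

A timestamp check like this catches only one kind of leakage; other forms, such as features that are indirect copies of the label, need different tests.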