Amid a relatively quiet period from OpenAI, rival Anthropic has stolen headlines with the release of its new Claude 3 family of large language models (LLMs). But there’s another foundation model provider to keep an eye on that dropped some significant generative AI news this week: Deci.
VentureBeat last covered the Israeli startup in fall 2023 when it released its DeciDiffusion and DeciLM 6B open source models, fine-tuned variants of Stability’s Stable Diffusion 1.5 and Meta’s LLaMA 2 7B — both open source as well — designed to be faster and require fewer compute resources than their source models. Since then, Deci has released DeciCoder, a code completion LLM, and DeciDiffusion 2.0, though the latter — along with many of Deci’s other models — has since been paused on Hugging Face.
Now, the company is releasing a new, even smaller and less computationally demanding LLM, Deci-Nano, that is closed source, as well as a full Gen AI Development Platform for enterprises and coders, another paid product. Deci-Nano is available exclusively, for now, as part of the Deci Gen AI Development Platform.
Moving away from open source?
The company appears to be moving toward a more fully commercial or blended open-source/closed-source model mix, similar to what we’ve seen Mistral do with its controversial partnership with Microsoft.
Do Deci’s and Mistral’s moves into closed source AI models indicate a waning enthusiasm for open source AI? After all, every private company needs to make money somehow…
Performance, at a (low) price…
If Deci is indeed moving in a more commercial direction, it appears to be easing users and customers into this phase of its existence.
Deci-Nano offers language understanding and reasoning with ultra-fast inference speed, generating 256 tokens in just 4.56 seconds on NVIDIA A100 GPUs.
The company’s blog post announcing Deci-Nano includes charts showing it outperforming Mistral 7B-Instruct and Google’s Gemma 7B-it models.
Deci-Nano is furthermore priced very aggressively at $0.10 per 1 million (input) tokens, compared to $0.50 for OpenAI’s GPT-3.5 Turbo and $0.25 for the new Claude 3 Haiku.
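At those list prices, input-token costs scale linearly with usage, which makes the gap easy to quantify. A minimal sketch of the comparison — the rates are the ones quoted above, while the monthly token volume is an illustrative assumption, not a figure from any of the vendors:

```python
# Per-million-input-token rates as quoted in the article.
RATES_PER_MILLION = {
    "Deci-Nano": 0.10,
    "GPT-3.5 Turbo": 0.50,
    "Claude 3 Haiku": 0.25,
}

def monthly_input_cost(tokens: int, rate_per_million: float) -> float:
    """Dollar cost for a given number of input tokens at a per-1M rate."""
    return tokens / 1_000_000 * rate_per_million

# Hypothetical workload: 500M input tokens per month.
volume = 500_000_000
for model, rate in RATES_PER_MILLION.items():
    print(f"{model}: ${monthly_input_cost(volume, rate):,.2f}")
# → Deci-Nano: $50.00, GPT-3.5 Turbo: $250.00, Claude 3 Haiku: $125.00
```

At that (assumed) volume, Deci-Nano would cost a fifth of GPT-3.5 Turbo on input tokens alone; output-token pricing, which the announcement does not break out in the same way, would shift the totals.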
“Deci-Nano embodies our production-oriented approach, which includes a dedication not only to quality but also to efficiency and cost-effectiveness,” said Yonatan Geifman, Deci co-founder and CEO, in a post on his LinkedIn page. “We’re building architectures and software solutions that squeeze maximum compute power out of existing GPUs.”
But it remains closed source. And Deci hasn’t publicly shared how many parameters it has. VentureBeat reached out to an advisor with the company, who told us: “this model is actually closed source, and Deci has elected to not release any information regarding its size or architecture. It’s meant to generate buzz around the new Gen AI Development Platform they are launching.”
From financial and legal analysis to copywriting and chatbots, Deci-Nano’s affordability and superior capabilities seek to unlock new possibilities for businesses seeking to innovate without the burden of excessive costs.
Deci is offering a number of deployment options for customers: serverless instances for ease and scalability, or dedicated instances for fine-tunability and enhanced privacy. The company says this flexibility ensures that businesses can scale their AI solutions as their needs evolve, seamlessly transitioning between deployment options without compromising on performance or security.
A new platform is born
Though the bulk of Deci’s announcement this week focused on Deci-Nano, the bigger news (no pun intended) may be the company’s move to offer a full Generative AI Platform, which it describes in a news release as a “comprehensive solution designed to meet the efficiency and privacy needs of enterprises.”
What exactly do users of the platform get? “A new series of proprietary, fine-tunable large language models (LLMs), an inference engine, and an AI inference cluster management solution,” according to Deci.
The first proprietary model being offered through the platform is, of course, Deci-Nano. But clearly, Deci plans to offer others, based on the wording of these marketing materials.
The inference engine allows users to deploy Deci-Nano to their specifications, either connecting to Deci’s API and servers, running Deci-Nano on the customer’s virtual private cloud, or deploying it on-premises on the customer’s server.
For customers seeking to manage Deci-Nano themselves in a virtual private cloud (VPC), Deci will simply provide them with their own containerized model. The company can also run managed inference on behalf of the customer in the customer’s Kubernetes cluster.
Finally, Deci’s Generative AI Platform offers a full on-premises deployment solution for customers who want the tech in their data center, not in the cloud. Deci will provide these customers with a virtual container that houses both the Deci-Nano model and Deci’s Infery software development kit, so the customer can build the model into apps and experiences for customers, employees or other end-users.
Pricing has not been publicly listed for the Deci Generative AI Platform and its various installation offerings, but we will update once we obtain that information.