Today, H2O AI, the company working to democratize AI with a range of open-source and proprietary tools, announced the release of Danube, a new super-tiny large language model (LLM) for mobile devices.
Named after the second-longest river in Europe, the open-source model has 1.8 billion parameters and is said to match or outperform similarly sized models across a range of natural language tasks. This puts it in the same category as strong offerings from Microsoft, Stability AI and EleutherAI.
The timing of the announcement makes perfect sense. Enterprises building consumer devices are racing to explore the potential of offline generative AI, where models run locally on the product, giving users quick assistance across functions and eliminating the need to take information out to the cloud.
“We are excited to release H2O-Danube-1.8B as a portable LLM on small devices like your smartphone… The proliferation of smaller, lower-cost hardware and more efficient training now allows modestly-sized models to be accessible to a wider audience… We believe H2O-Danube-1.8B will be a game changer for mobile offline applications,” Sri Ambati, CEO and co-founder of H2O, said in a statement.
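For a rough sense of why a model of this size is a candidate for phones, the weight memory can be estimated from the parameter count alone. The sketch below is a back-of-the-envelope calculation, not a figure from H2O: the byte widths are the usual sizes for each numeric format, the parameter count is rounded to the 1.8 billion cited above, and activations, the KV cache and runtime overhead are ignored.

```python
# Rough weight-memory estimate for a ~1.8B-parameter model.
# Assumption: PARAMS is the rounded figure from the article, not an exact count.
PARAMS = 1_800_000_000

# Assumption: standard storage widths per parameter for common formats.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_gigabytes(params: int, bytes_per_param: float) -> float:
    """Return the size of the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


for name, width in BYTES_PER_PARAM.items():
    print(f"{name:>9}: {weight_gigabytes(PARAMS, width):.1f} GB")
```

Under these assumptions the weights take roughly 3.6 GB at 16-bit precision and under 1 GB at 4-bit, which is why quantization usually comes up alongside on-device deployment.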
What to expect from Danube-1.8B LLM?
While Danube has just been announced, H2O claims it can be fine-tuned to handle a range of natural language applications on small devices, including common sense reasoning, reading comprehension, summarization and translation.
To train the mini model, the company collected a trillion tokens from diverse web sources and utilized techniques refined from Llama 2 and Mistral models to enhance its generation capabilities.
“We adjusted the Llama 2 architecture for a total of around 1.8B parameters. We (then) used the original Llama 2 tokenizer with a vocabulary size of 32,000 and trained our model up to a context length of 16,384. We incorporated the sliding window attention from Mistral with a size of 4,096,” the company noted while describing the model architecture on Hugging Face.
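The sliding window attention mentioned in the quote can be illustrated with a small mask. This is a minimal sketch rather than H2O's or Mistral's implementation: the function names are hypothetical, and it assumes each token attends to itself plus the preceding tokens inside the window (exact boundary conventions vary between implementations).

```python
# Illustrative causal sliding-window attention mask in plain Python.
# Assumption: a query at position i may attend to key positions j
# with 0 <= i - j < window (itself plus the window - 1 tokens before it).


def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query position i may attend to key position j."""
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def visible_keys(position: int, window: int) -> int:
    """Number of key positions a query at `position` (0-based) can attend to."""
    return min(position + 1, window)


def render(mask: list[list[bool]]) -> str:
    """Draw the mask with 'x' for allowed and '.' for blocked positions."""
    return "\n".join("".join("x" if ok else "." for ok in row) for row in mask)


if __name__ == "__main__":
    # Tiny example so the band structure is easy to see.
    print(render(sliding_window_mask(seq_len=6, window=3)))
    # With the sizes quoted above, the last token of a 16,384-token
    # context attends directly to at most 4,096 positions.
    print(visible_keys(position=16_383, window=4_096))
```

The point of the design is that per-token attention cost is bounded by the window size rather than growing with the full context length, while information from earlier tokens can still propagate through stacked layers.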
When tested on benchmarks, the model performed on par with or better than most models in the 1-2B-parameter category.
For example, on the HellaSwag test, which evaluates common sense natural language inference, it scored an accuracy of 69.58%, sitting just behind Stability AI’s 1.6-billion-parameter Stable LM 2 model, which was pre-trained on 2 trillion tokens. Similarly, on the ARC benchmark for advanced question answering, it ranks third with an accuracy of 39.42%, behind Microsoft Phi 1.5 (a 1.3-billion-parameter model) and Stable LM 2.
H2O has released Danube-1.8B under an Apache 2.0 license for commercial use. Any team looking to implement the model for a mobile use case can download it from Hugging Face and perform application-specific fine-tuning.
To make this process easier, the company plans to release additional tooling soon. It has also released a chat-tuned version of the model (H2O-Danube-1.8B-Chat), which can be used for conversational applications.
In the long run, the availability of Danube and similar small-sized models is expected to drive a surge in offline generative AI applications across phones and laptops, helping with tasks like email summarization, typing and image editing. In fact, Samsung has already moved in this direction with the launch of its S24 line of smartphones.