Why it matters: The creative industry needs a constant flow of content to keep fans happy, and that content needs to be created somehow. When it comes to 3D models, AI algorithms could be a big help by slashing the time needed to generate them.
By using a large dataset to train a machine learning algorithm, researchers from Adobe and the Australian National University have created a technology that could do wonders for 3D model creation. The researchers created what they consider the first Large Reconstruction Model (LRM) capable of predicting a 3D model’s shape from a single two-dimensional image, and it can do so within just 5 seconds.
Previous 3D-generation models were trained on small-scale datasets focused on a single image category, the researchers explain in their paper. By contrast, LRM has a highly scalable, transformer-based architecture with 500 million learnable parameters, and it was trained on around 1 million 3D objects from the Objaverse and MVImgNet datasets.
This combination of a high-capacity model and large-scale training data gives LRM a “highly generalizable” content creation capability, the researchers explain. The model was able to produce “high-quality” 3D reconstructions from various test images, including real-world photos, the paper says. Furthermore, LRM accepts both “normal” images and images generated by AI services like DALL-E and Stable Diffusion as its 2D input.
According to the study’s lead author, Yicong Hong, LRM is a significant breakthrough in single-image 3D reconstruction. The AI algorithm can produce detailed geometry from a single input image, preserving complex textures like wood grain.
LRM has potentially “transformative” capabilities, the researchers state, as it could be employed in a vast range of industries including design, entertainment, and gaming. Designers and 3D artists could streamline the 3D modeling process, significantly reducing the time needed to generate assets for video games or animations. Creating 3D content has become a challenge in a rapidly evolving industry, and AI companies are rushing to provide potential solutions like the Stable 3D service recently introduced by Stability AI.
LRM could also democratize 3D modeling, as “normal” users could potentially develop highly detailed models from photos taken with a smartphone. Although LRM still faces challenges, such as blurry textures for the parts of an object that are hidden in the input image, it opens up a world of creative and commercial opportunities. The researchers have provided a page with video demos and interactive 3D meshes to show what LRM can do right now.