It’s unusual for an innovation to make equally large waves in the art and artificial intelligence communities, but that’s what OpenAI accomplished with the release of the DALL-E image generator. Simply type in a description, and DALL-E can make it real. The company’s new algorithm has a similar pitch, but instead of making a 2D image, Point-E creates a 3D model of your description.
Point-E doesn’t go straight from text to a 3D model. It’s actually composed of two different AI models. First, a text-to-image AI processes the prompt to make a standard 2D image. Then, an image-to-3D AI turns that flat rendering into a 3D model. So, if you were to ask Point-E to make a traffic cone, it would start with an image of a striped triangle. It’s then up to the second AI to recognize that traffic cones are conical and produce the correct 3D shape.
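The two-stage hand-off can be sketched in a few lines of Python. This is a toy illustration only: `text_to_image` and `image_to_points` are hypothetical stand-ins, not functions from OpenAI's code, and the real stages are learned models rather than the hard-coded geometry used here. The first stub draws a triangle silhouette for any prompt. The second guesses depth by revolving that silhouette around its vertical axis, which happens to be the right assumption for a traffic cone.

```python
import math
from typing import List, Tuple

Image = List[List[int]]                 # 2D grid, 1 = object pixel, 0 = background
Point = Tuple[float, float, float]      # (x, y, z) coordinate


def text_to_image(prompt: str, size: int = 9) -> Image:
    """Hypothetical stage 1: ignores the prompt and draws a filled triangle."""
    image = []
    center = size // 2
    for row in range(size):
        half_width = row // 2           # silhouette widens toward the bottom
        image.append([1 if abs(col - center) <= half_width else 0
                      for col in range(size)])
    return image


def image_to_points(image: Image, steps: int = 12) -> List[Point]:
    """Hypothetical stage 2: revolve each row's outline around the vertical axis."""
    points = []
    height = len(image)
    for row, pixels in enumerate(image):
        radius = (sum(pixels) - 1) / 2  # half the silhouette width on this row
        y = float(height - 1 - row)     # flip so the tip is at the top
        for k in range(steps):
            angle = 2 * math.pi * k / steps
            points.append((radius * math.cos(angle), y, radius * math.sin(angle)))
    return points


def text_to_3d(prompt: str) -> List[Point]:
    """Chain the two stages: text -> 2D image -> 3D points."""
    return image_to_points(text_to_image(prompt))


cloud = text_to_3d("a traffic cone")
print(len(cloud), "points; tip at", cloud[0])
```

The point of the sketch is the shape of the pipeline: the second stage never sees the text, only the image, so everything it knows about the object has to be inferred from that flat picture.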
This isn’t a completely novel idea. Google has a tool called DreamFusion that can do something similar. However, DreamFusion was designed to run on a machine with four of Google’s custom TPU v4 AI processors, and it takes that hardware 90 minutes to generate a single 3D model. On GPUs, you’re looking at multiple hours of processing time per model. Point-E is much faster, and it can run on a computer with a single GPU.
Many of us have had a blast telling DALL-E to make off-the-wall renderings, but Point-E isn’t yet ready for that instant AI gratification. OpenAI’s paper describes it as a first step toward foundational technology that could eventually become as quick and easy as DALL-E. OpenAI says that Point-E still falls far short of commercial 3D modeling. Its results are point clouds: sets of points in space rather than solid surfaces.
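The gap between a cloud of points and a finished 3D model is easy to see in data-structure terms. The sketch below is illustrative and not taken from Point-E's code, and the class and field names are assumptions. A point cloud stores only positions (here with a color per point), while a mesh also stores faces that say which points are joined into a surface.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PointCloud:
    """Unconnected samples: nothing says which points are neighbors."""
    coords: List[Vec3]
    colors: List[Vec3]                  # one RGB triple per point, 0.0 to 1.0


@dataclass
class Mesh:
    """Same vertices plus triangles, each a triple of vertex indices."""
    vertices: List[Vec3]
    faces: List[Tuple[int, int, int]] = field(default_factory=list)


# Four corners of a tetrahedron, colored orange like a traffic cone.
corners = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
cloud = PointCloud(coords=corners, colors=[(1.0, 0.5, 0.0)] * len(corners))

# Turning the cloud into a surface means choosing the faces.
mesh = Mesh(vertices=cloud.coords,
            faces=[(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)])

print(len(cloud.coords), "points,", len(mesh.faces), "triangles")
```

With four points the faces are obvious. With thousands of generated points, deciding which ones to connect is a separate reconstruction step, which is why raw output looks rougher than a commercial model.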
When smoothed and processed, the Point-E models can produce a passable representation of a real object, as seen above. The real leap forward here is efficiency: OpenAI claims Point-E is “one to two orders of magnitude faster” than existing systems. Perhaps in the future, Point-E will intrude upon the domain of 3D modeling the way DALL-E has in art. Adobe recently decided it would allow AI-generated art in its stock image library, a move that not all artists welcome.
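A quick back-of-the-envelope calculation puts the speed claim in perspective. It assumes, purely for illustration, that the DreamFusion figures quoted earlier (four TPU v4 chips for 90 minutes) are the baseline, and it ignores the difference between TPU and GPU hardware, so treat the results as rough scale rather than benchmarks.

```python
# Baseline from the article: 4 accelerator chips running for 90 minutes.
chips = 4
minutes_per_run = 90
baseline_chip_minutes = chips * minutes_per_run      # total compute per model

# "One to two orders of magnitude faster" means 10x to 100x.
for speedup in (10, 100):
    equivalent = baseline_chip_minutes / speedup
    print(f"{speedup:>3}x faster -> about {equivalent:g} chip-minutes per model")
```

Under those assumptions, a 360 chip-minute job shrinks to somewhere between roughly 36 and 3.6 chip-minutes, which is the difference between needing a small cluster and waiting on a single card.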
If you want to tinker with Point-E, all the code is available on GitHub. You’ll need to have Python installed, as well as some experience with programming and command-line tools, to make it work. Still, the relatively modest hardware requirements mean it’s more accessible than DreamFusion.