The VideoCAD tool could help engineers learn computer-aided design, and could also make experienced designers more efficient.
Most of today’s physical products are designed with computer-aided design, or CAD. Engineers use CAD to turn 2D sketches into 3D models that they can test and refine before sending a final version to a production line. But with thousands of commands to choose from, the software is notoriously difficult to learn, and becoming genuinely proficient takes a great deal of time and practice.
MIT engineers aim to ease CAD’s learning curve with an AI model that uses CAD software much as a human would. Given a 2D sketch of an object, the model quickly generates its 3D version by clicking buttons and selecting file settings, just as an engineer would.
The MIT team’s new dataset, VideoCAD, contains more than 41,000 examples of how 3D models are built in CAD software. By learning from these videos, which show step-by-step how various shapes and objects are constructed, the new AI system can now operate CAD software much like a human user.
With VideoCAD, the team is working toward an AI-enabled “CAD co-pilot.” Beyond generating a 3D model of a design, they envision a tool that could work alongside a human user, suggesting next steps or automatically carrying out build sequences that would otherwise be tedious and time-consuming to click through by hand.
“AI has the potential to boost engineers’ productivity as well as make CAD more accessible to more people,” says Ghadi Nehme, a graduate student in MIT’s Department of Mechanical Engineering.
Faez Ahmed, an associate professor of mechanical engineering at MIT, adds, “This is important because it lowers the barrier to entry for design, helping people without years of CAD training to create 3D models more easily and tap into their creativity.”
Ahmed, Nehme, postdoc Ferdous Alam, and graduate student Brandon Man will present their work in December at the Conference on Neural Information Processing Systems (NeurIPS).
The team’s new work builds on recent advances in AI-driven user interface (UI) agents: tools trained to use software applications to carry out tasks, such as automatically gathering data online and organizing it in an Excel spreadsheet. Ahmed’s team wondered whether such UI agents could be made to use CAD, which has far more features and capabilities, and involves far more complex tasks, than the typical UI agent can handle.
The team set out to build an AI-driven UI agent that takes control of CAD software and, click by click, turns a 2D sketch into a 3D model. To do so, they first examined an existing dataset of objects that humans had designed in CAD. Each object in the dataset includes the sequence of high-level design commands, such as “sketch line,” “circle,” and “extrude,” that was used to build the final object.
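As a rough illustration, a high-level command sequence for a simple part might look like the following. The operation names and format here are hypothetical, invented for the example rather than taken from the dataset’s actual encoding:

```python
# Hypothetical high-level command sequence for a simple cylinder.
# Illustrative only; not the dataset's actual format.
commands = [
    {"op": "sketch_circle", "center": (0.0, 0.0), "radius": 1.0},
    {"op": "extrude", "distance": 0.5},
]
```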
But the researchers found that training an AI agent to use CAD software requires more than these high-level commands. A true agent must also understand the specifics of each action. For example: Which region of the sketch should it select? When should it zoom in? What part of a sketch should it extrude? To close this gap, the researchers developed a way to convert high-level commands into user-interface interactions.
“For instance, let’s say we drew a sketch by drawing a line from point 1 to point 2,” Nehme explains. “We converted those high-level actions to user-interface actions, meaning we say: with the ‘line’ operation selected, go to this pixel location, click, then move to a second pixel location, and click.”
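To make that concrete, here is a minimal sketch, in Python, of how such a conversion might work. Everything here is a hypothetical illustration: the UIAction type, the 40-pixels-per-unit scale, and the canvas origin are invented for the example, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

@dataclass
class UIAction:
    kind: str                          # "select_tool", "move", or "click"
    arg: Union[Tuple[int, int], str]   # pixel coordinates, or a tool name

def sketch_to_pixel(pt: Tuple[float, float]) -> Tuple[int, int]:
    # Hypothetical mapping from sketch coordinates to screen pixels;
    # the 40 px-per-unit scale and (400, 300) canvas origin are placeholders.
    return 400 + int(pt[0] * 40), 300 - int(pt[1] * 40)

def line_to_ui(p1: Tuple[float, float], p2: Tuple[float, float]) -> List[UIAction]:
    """Expand a high-level 'line from p1 to p2' command into the UI actions
    a human would perform: pick the line tool, click the start, click the end."""
    a, b = sketch_to_pixel(p1), sketch_to_pixel(p2)
    return [
        UIAction("select_tool", "line"),
        UIAction("move", a), UIAction("click", a),
        UIAction("move", b), UIAction("click", b),
    ]

print(line_to_ui((0, 0), (2, 1)))
```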
Ultimately, the team produced more than 41,000 videos of humans designing CAD objects, each paired with a real-time record of the exact clicks, mouse movements, and keyboard actions the human originally performed. They then fed all of this data into a model they built to learn the connections between user-interface actions and the CAD objects those actions produce.
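As a rough illustration, one training example in such a dataset might pair the recorded actions with the corresponding screen frames. The field names below are hypothetical, not the actual VideoCAD schema:

```python
# Hypothetical shape of one VideoCAD-style training example; the real
# schema may differ. Each recorded UI action is paired with the screen
# frame on which the human performed it.
example = {
    "sketch_image": "sketch_00042.png",               # the 2D input drawing
    "frames": ["frame_0000.png", "frame_0001.png"],   # screen recording
    "actions": [
        {"t": 0, "kind": "select_tool", "tool": "line"},
        {"t": 1, "kind": "click", "x": 412, "y": 288},
    ],
    "high_level_commands": ["sketch line", "extrude"],
}
```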
Once trained on this dataset, which the team named VideoCAD, the AI model could take a 2D drawing as input and directly operate the CAD program, clicking, dragging, and selecting tools to build the full 3D shape. The objects ranged in complexity from simple brackets to more intricate housing designs. The researchers are now training the model on increasingly complex shapes, and they hope the dataset and model will eventually enable CAD co-pilots for designers across a variety of industries.
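At a high level, such an agent can be pictured as a closed loop: observe the current screen, predict the next UI action, execute it, and repeat until done. The sketch below is an assumption about how that loop might be wired up, not the team’s actual system; `policy.predict` is a hypothetical model API, and pyautogui (a real GUI-automation library) stands in for whatever screen-capture and input-injection backend the real agent uses.

```python
# A hedged sketch of the agent loop that drives CAD software from a 2D
# sketch. `policy` stands in for a trained model mapping
# (input sketch, current screen) -> next UI action.
import pyautogui

def run_agent(policy, sketch_image, max_steps=500):
    for _ in range(max_steps):
        screen = pyautogui.screenshot()                # observe the CAD window
        action = policy.predict(sketch_image, screen)  # hypothetical model API
        if action.kind == "done":                      # model signals completion
            break
        if action.kind == "move":
            pyautogui.moveTo(action.x, action.y)
        elif action.kind == "click":
            pyautogui.click(action.x, action.y)
        elif action.kind == "key":
            pyautogui.press(action.key)                # e.g. a tool hotkey
```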
“VideoCAD is a valuable first step toward AI assistants that help onboard new users and automate the repetitive modeling work that follows familiar patterns,” says Mehdi Ataei, a senior research scientist at Autodesk Research, which develops new design software tools. Ataei, who was not involved in the study, adds: “This is an early foundation, and I look forward to seeing successors that span multiple CAD systems, more realistic, messy human workflows, and richer operations like assemblies and constraints.”