Combining Forge and Machine Learning to Automate Time-Consuming Design Tasks


In our daily office routines, we must accomplish many repetitive and time-consuming tasks to meet client requirements and international standards. Some of these tasks are easy to automate; others seem limited by desktop resources or require human intervention. With the evolution of machine learning and the introduction of the Forge Design Automation API, EMDC Group was able to automate some of those processes, saving tremendous amounts of time and improving team efficiency. This article presents a sample of what we've built in-house to give you an idea of the tasks that can now be automated and the potential that artificial intelligence and cloud computing bring to the AEC industry.

Why Automating Time-Consuming Tasks Is Important

Because projects today are so fast-paced, meeting deadlines is the top priority. However, time-consuming and iterative tasks can burden the project’s planning and leave less time for important decisions. When these tasks are automated, the time saved can be invested in more important work, and the project’s duration and cost can be considerably reduced.

Automation Limitations

Automation is limited by many factors. The main one is desktop resources: in some cases, the automated run takes almost as long as the manual iteration. In addition, some processes still require manual input. Finally, API limitations can complicate automation, or make it impossible when a required functionality is not publicly exposed.

What Is Autodesk Forge

Autodesk Forge is a cloud-based developer platform enabling developers to automate tasks using cloud computing. Cloud apps are starting to dominate the digital world, and they will be the main technological focus in the 2020s, the way mobile platforms were 10 years ago. Forge is Autodesk's introduction of cloud computing into the world of CAD-oriented processes. 

Something to keep in mind is that Forge is not just one solution; it offers several functionalities and exposes several APIs. For example, the BIM 360 API allows developers to automate BIM 360 project setup by initiating projects, assigning permissions, and so on. These tasks are somewhat time-consuming, and the BIM 360 API eliminates several of them.

Another API worth mentioning is the Data Management API, which automates BIM 360 Docs file transfer operations, such as downloads, uploads, and so on. This opens the door to scheduling document uploads to BIM 360 Docs, for example, which is often a necessity.

Several other APIs and functionalities are available with Forge, such as the Forge Viewer, which enables viewing models within web browsers; the Reality Capture API, which handles point clouds and raster images; webhooks; and so on.

However, the Forge API with the most potential for EMDC Group as engineering consultants is the Design Automation API. This is our focus here.

Simply put, the Design Automation API runs Autodesk software instances and executes precompiled add-ins and scripts over cloud servers. Hence, the Design Automation for Revit API, for example, enables developers to compile Revit cloud add-ins that run remotely in the cloud. Several Autodesk platforms are supported by the Design Automation API, from AutoCAD and Revit, which are the main focus of this article, to 3ds Max and Inventor.

Why Use the Design Automation API

The Design Automation API offers several advantages. With it, users avoid tying up desktop workstations on processes with long runtimes, as Forge effectively provides an additional virtual workstation.

Moreover, the Forge add-in runs on high-performance servers that desktop workstations cannot compete with. That increase in performance implies a large decrease in runtime and thus saves tremendous amounts of time.

Another considerable advantage is that running these apps can be initiated from any platform, whether desktop, mobile, or any web browser. So, any user with an Internet connection, whether at home, in the office, or on-site, can access those apps and run them, which is becoming a standard feature to have.

Machine Learning

Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves. 

Machine learning is in fact:

  • A subdivision of artificial intelligence

  • A combination of statistics and probability

It is also important to define what machine learning is not:

  • Machine learning is not a synonym for artificial intelligence

  • Machine learning does not imply machines gaining self-awareness and starting to read and speak

  • Machine learning does not mean that computers will gradually gain control over the world

Machine learning starts with automated statistical algorithms that process input data and find relations within it; this is the “learning” part. Then, once the data is gathered and the algorithm is trained, automated probabilistic processes predict new values based on the previously processed data. So essentially, a probabilistic model is being built, not a perfectly accurate mathematical model. However, even with this uncertainty, the algorithm bases its decision on the highest-scoring outcome, and that makes the model useful.


This reminds us of the famous statistician George Box who said, “All models are wrong, but some are useful.”

When Is Machine Learning Needed

  • When the project presents a shortage of information; in other words, when dealing with nonparametric objects such as 2D CAD drawings, fake BIM models, point clouds, or raster images.


  • When the APIs have limitations; more specifically, when a functionality is not exposed by the API, or when a process cannot be automated through conventional statements (for example, If statements, For loops, or While loops).

In these two scenarios, two solutions are available:

  • Performing manual input, which means dedicating a team of employees to do the required work manually.


  • Training a machine learning algorithm, which means gathering data from previous models, finding a data relationship (the statistical part), and using it to predict the missing parameter values and to automate the process (the probabilistic part).

Not All Automation Is Machine Learning

The term 'machine learning' is widely used. Many users frequently publish automation scripts (Dynamo scripts, macros, add-ins) and claim that their work falls under the machine learning or artificial intelligence categories. It is important to know that not all automation, no matter how advanced it may seem, uses machine learning. Before dealing with real-case scenarios that actually use machine learning, it is useful to start with some samples that can be achieved without it.

Sample 1: Automated P-Trap Connection

We begin with an auto-routing sample in which condensate drain connections and their P-traps are automatically connected to the main runs. It involves a lot of advanced mathematics, mainly vector theory and linear algebra, so that the components are correctly placed and connected. Users may come across auto-routing samples that are associated with machine learning, which is often misleading because such routing can usually be achieved through classic automation. In short, since there is no data gathering, data labeling, algorithm training, or statistically extracted data relation, machine learning is not in play.
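To give a feel for the kind of vector math such explicit routing relies on, here is a minimal Python sketch that finds where a branch could tap into a main run by projecting a point onto a line segment. It is not the EMDC Group add-in: the `Point3` type, the function name, and the coordinates are all hypothetical, and a real add-in would work with Revit API geometry instead.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point3:
    """Hypothetical 3D point; stands in for the geometry types of a CAD API."""
    x: float
    y: float
    z: float

    def __sub__(self, other):
        return Point3(self.x - other.x, self.y - other.y, self.z - other.z)

    def __add__(self, other):
        return Point3(self.x + other.x, self.y + other.y, self.z + other.z)

    def scale(self, k):
        return Point3(self.x * k, self.y * k, self.z * k)

    def dot(self, other):
        return self.x * other.x + self.y * other.y + self.z * other.z


def closest_point_on_segment(p, a, b):
    """Project p onto segment a-b and clamp the result to the segment ends."""
    ab = b - a
    length_sq = ab.dot(ab)
    if length_sq == 0.0:  # degenerate segment: both ends coincide
        return a
    t = (p - a).dot(ab) / length_sq
    t = max(0.0, min(1.0, t))
    return a + ab.scale(t)


# Assumed sample geometry: a straight main run and a P-trap outlet beside it.
main_start = Point3(0.0, 0.0, 3.0)
main_end = Point3(10.0, 0.0, 3.0)
trap_outlet = Point3(4.0, 2.0, 3.0)

tap_point = closest_point_on_segment(trap_outlet, main_start, main_end)
offset = trap_outlet - tap_point
branch_length = math.sqrt(offset.dot(offset))
print(tap_point, branch_length)
```

Every step here is a fixed rule written by the developer. Nothing is learned from data, which is why this kind of routing counts as classic automation.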

Sample 2: Automated Hanger Placement

Another advanced sample is an add-in built at EMDC Group that automatically places pipe or duct hangers and connects them to the nearest structural elements. This process does not involve any aspect of machine learning either.

Sample 3: Automated Attached Hangers

The third sample is also a hanger connection tool that attaches hangers to the nearest hanger above. Although this does involve some advanced mathematics to detect the bearing hanger lines, it can be achieved without the use of machine learning.

The above three samples illustrate explicit automations and do not use any aspect of machine learning. You can learn more about these samples at\samples.

However, a lot of processes do involve some form of machine learning and are being used today in our daily life and in the AEC industry.

What Cannot Be “Conventionally” Automated

Before discussing the real-case EMDC Group custom apps that use machine learning, it is useful to list some general tasks that cannot be explicitly automated. For example:

  • Scanned PDF data extraction

    Example: A scanned PDF drawing from which information is needed, like handwritten comments.

  • Documentation of exploded CAD drawings

    Example: An AutoCAD drawing from which sheet names and numbers need to be extracted and listed in a spreadsheet.

  • Raster image detail conversion

    Example: An installation detail which needs to be translated in an AutoCAD drawing.

  • Point cloud to BIM model conversion

    Example: A point cloud model which needs to be translated into an as-built model.

When these tasks cannot be automated, the result is:

  • Manual input

  • Iterative tasks that are usually out of scope

  • Overtime and additional tasks for employees, and possible cost overruns

How Can These Tasks Be Accomplished

Solutions to the above tasks are few. One solution would be to train any competent operator to do this time-consuming job or... train a machine.

Based on the above, automated solutions for each of the issues can be identified:

  • Scanned PDF data extraction

    Solution: Use automated optical character recognition (OCR), which applies aspects of machine learning to recognize typical patterns. In this case, the letters of the handwritten comments are recognized.

  • Documentation of exploded CAD drawings

    Solution: Use machine learning to find sheet information and document it, based on previous cases. In this case, the sheet names and numbers are found.

  • Raster image detail conversion

    Solution: Use vectorization software, which also applies machine learning to detect lines, circles, and forms. In this case, the pixels are converted into AutoCAD elements.

  • Point cloud to BIM model conversion

    Solution: Use machine learning to automatically detect elements. In this case, groups of cloud points are connected and translated into building elements.

Machine Learning Subfields Involved in EMDC Group’s Customized Workflow

Machine learning is a diverse field that includes clustering, supervised learning, unsupervised learning, classification, regression, decision trees, and more. It also includes complex subfields such as deep learning and neural networks, which operate on several layers and open opportunities to achieve impressive results, such as an algorithm that learns by itself how to complete a console game level.

Therefore, before going through the EMDC Group real case scenarios in detail, it is important to define which machine learning subfields were involved in the development of these customized apps.

Clustering


Clustering is the grouping of unlabeled data based on a shared property. It is a form of unsupervised machine learning: unsupervised because the processed data has not been labeled beforehand, and no training set has been generated from labeled data. The elements are raw, unsorted, and unlabeled, and need to be grouped into wider blocks.


The main challenge that clustering involves is the grouping definition. In simpler terms, the challenge lies in determining what makes a group of elements different from another group, and what defines the beginning of a group, when a group should be trimmed, and when another group of elements should start to be gathered.

Statistical processes make that decision by automatically identifying groups of similar elements. Clustering includes several algorithms, of which we’ll mention two. The first, K-Means, is based on the mean value of the data: a chosen number (K) of groups is formed so that each element belongs to the group whose mean value is closest to it. The second, DBSCAN, is based on data density: neighboring elements are gathered into a group for as long as their density stays above a certain threshold, and the group ends where the density drops below it.

This can be illustrated through a clustering-based process encountered every day: facial recognition. The image of a face is a group of pixels, and the face is recognized by detecting its components, such as the eye, each of which is in turn recognized with a clustering algorithm. To simplify a process that in reality is far more complicated: the pixels of the eye color are grouped until the color's density, which gradually decreases toward the edge, falls below a certain limit. At that point, an eye has been detected. The same process is applied to the other face components, and that way the whole face is recognized.
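As a rough illustration of the K-Means idea described above, the following Python sketch groups a handful of 2D points into K groups. It is a minimal teaching version under stated assumptions: the sample points are made up, and the initial means are simply the first K points so that the run is reproducible (production implementations usually pick them randomly or with a smarter seeding scheme).

```python
import math


def k_means(points, k, iterations=20):
    """Group points into k clusters around their mean values (centroids)."""
    # Assumption: start from the first k points to keep the run deterministic.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: recompute each mean from the points assigned to it.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's mean
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, clusters


# Made-up sample: two visually separate groups of points.
sample = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 10.0)]
centroids, clusters = k_means(sample, k=2)
print(centroids)
print(clusters)
```

Note that K has to be chosen in advance. That is a practical drawback when the number of groups is unknown, which is one reason a density-based algorithm such as DBSCAN can be a better fit for drawings.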

Supervised Learning

The other machine learning subfield involved is supervised learning. We’ll introduce the concepts behind supervised learning with a simple example involving apples and bananas.

  • First, data (in this case pictures of apples and bananas) is labeled and fed into a learning algorithm.

  • Then, a training set is generated, and the relation between the labels, which in this case is the fruit type, and the data, which is the fruit pictures, is statistically found (for example, a high number of red images corresponds to a high number of apples).

  • When an unlabeled item is fed into the algorithm, the trained model is used, and the fruit type is predicted based on how probable the match is with the labeled data. (For example, if it's red, there's a 99% probability that it's an apple.) Of course, as the prediction is based on probability, there is a chance of outliers or, in other terms, wrong results; e.g., if the user feeds in the picture of a rotten, yellowish apple.


Why is it called supervised? Because the data entered to generate the set and statistically find the relation has previously been labeled by a user.
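The apples-and-bananas steps can be sketched in a few lines of Python. This is a deliberately simplified, assumption-laden illustration: each picture is reduced to a single color word, the training data is invented, and the "model" is just a table of label counts per color. Real image classifiers work on pixel data with far richer statistics.

```python
from collections import Counter, defaultdict


def train(labeled_samples):
    """Statistical step: count how often each label occurs for each color."""
    counts = defaultdict(Counter)
    for color, label in labeled_samples:
        counts[color][label] += 1
    return counts


def predict(model, color):
    """Probabilistic step: return the highest-scoring label and its probability."""
    label_counts = model.get(color)
    if not label_counts:
        return None, 0.0  # color never seen during training
    label, hits = label_counts.most_common(1)[0]
    return label, hits / sum(label_counts.values())


# Invented training set, labeled by a person (hence "supervised").
training_set = (
    [("red", "apple")] * 9
    + [("yellow", "banana")] * 8
    + [("yellow", "apple")] * 2  # a few yellowish apples
)
model = train(training_set)

print(predict(model, "red"))     # red pictures were always apples here
print(predict(model, "yellow"))  # mostly bananas, so a yellowish apple is misread
```

With this data, a yellowish apple is classified as a banana with a probability of 0.8. That is exactly the kind of outlier the text warns about: the model returns the most probable label, not a guaranteed one.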

Apple and banana detection might seem silly, but a slightly more advanced version of the same process is now used for groundbreaking advancements such as cancer prediction algorithms, where software can predict, to a certain accuracy, whether someone may have cancer five years later, potentially saving millions of lives. Labeled mammographic data (for example, whether a patient had cancer five years after the corresponding mammography exam) is fed into the system, and aspects of the cancer that were not previously known to pathologists are being found thanks to such processes.

Real Case Scenario 1: Using Clustering to Automate Multileader Grouping

Defining the Problem

  • A client wants to produce detailed AutoCAD drawings based on a BIM model.

  • The first step would be to export the Revit model into AutoCAD drawings as this saves 50% of the work.

  • However, a problem occurs: Tags are exported as exploded multileaders (separate text and leader elements), whereas the client required true multileaders, and in a specific style.


Classic Solution

The classic solution is to create an AutoLISP routine to group text elements and exploded leaders into multileaders.

  • For this purpose, the user selects the elements to be grouped, one group at a time.

  • The problem is that this routine still requires manual input (selecting the text related to each multileader), and automating that selection naively leaves a large margin of error.

  • The solution is to use the DBSCAN clustering algorithm, which groups the exploded elements into multileaders based on their distribution density and similarity around each text, without needing user input. The similarity parameter is the distance separating the lines within a text paragraph.
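To make the density-based grouping more concrete, here is a minimal DBSCAN sketch in Python. It is not the EMDC Group implementation: the input is assumed to be the insertion points of exported text lines, the coordinates are made up, and `eps` and `min_pts` are illustrative values (in the workflow described above, the distance threshold would derive from the line spacing within a text paragraph).

```python
import math

NOISE = -1  # label for elements that belong to no group


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE, based on local density."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already visited
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # too sparse to start a group
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # dense enough: keep expanding
                queue.extend(j_neighbors)
        cluster_id += 1
    return labels


# Made-up insertion points of text lines from two annotations and one stray text.
text_lines = [(0, 0), (0, -3), (0, -6), (50, 20), (50, 17), (200, 200)]
labels = dbscan(text_lines, eps=4.0, min_pts=2)
print(labels)
```

Unlike K-Means, the number of groups does not have to be known in advance, which matters here because each exported sheet contains an arbitrary number of annotations. The stray text ends up labeled as noise instead of being forced into a group.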


How This Solution Is Implemented

The automated solution would be a Revit API-based add-in used to execute the DBSCAN algorithm over exported AutoCAD drawings.

  • This add-in uses AutoCAD interoperability to process drawings as soon as they are exported. In other words, it exports the Revit sheets to AutoCAD drawings, opens the exported drawings and runs the solution directly in AutoCAD.

  • This add-in saves drawing opening time and eliminates manual input.

  • However, this solution has a problem: It is a resource-heavy algorithm, implying longer execution time and requiring dedicated workstations.


Forge Design Automation for Revit

This scenario can be resolved by using Forge Design Automation for Revit. The desktop add-in previously described is converted into a Forge add-in with lower execution time and less user interaction. As a result, the desktop workstations are saved for other tasks.


HTML Interface

As in any software, there are two parts: the user interface, which the user sees and uses to operate the app and which should be user-friendly, and the algorithm hidden underneath, with all the detailed processes involved.

In a Design Automation for Revit app, the app is controlled through HTTP requests, which are what happens underneath whenever a user does anything online, whether logged in or out.

To upload files, run a Forge app, download the resulting files, and so on, a request is posted from the user side to the server side, telling the server to initiate that operation. This is a very important aspect, because HTTP requests are supported by many platforms and programming languages, so the user can build an app, or preferably a web page (which is what has been done in EMDC Group’s case), with whatever tool or language is preferred, whether cURL, JavaScript, ASP.NET, Ruby, Python, etc.

The language used in this case is JavaScript. It is important that such a process is supported across as many devices, platforms, and web browsers as possible, and JavaScript is currently supported by almost all web browsers, so it makes the most sense. Python is a close runner-up. Note that the UI can also be a desktop app showing a classic Windows form, or an interactive mobile app.

Posting the request therefore gives the order to execute a script or an add-in. The current example is a Revit cloud add-in that in turn runs an AutoCAD cloud script to group the multileaders.
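As an illustration of what such a request might look like, the Python sketch below builds (but does not send) a POST request that asks Design Automation to run an activity. Python is used here only because it is one of the options named above; the EMDC Group page itself is written in JavaScript. Treat the details as assumptions to verify against the current Forge documentation: the work-item URL is quoted from memory of the Design Automation v3 API, while the activity id, the argument names (`rvtFile`, `result`), the file URLs, and the token are purely hypothetical placeholders.

```python
import json
import urllib.request

# Assumed Design Automation v3 work-item endpoint; check the Forge docs.
WORKITEMS_URL = "https://developer.api.autodesk.com/da/us-east/v3/workitems"


def build_workitem_request(access_token, activity_id, input_url, output_url):
    """Build the POST request that tells the server to run a cloud add-in."""
    payload = {
        "activityId": activity_id,
        "arguments": {
            # Hypothetical parameter names; the real ones are defined
            # by the activity registered for the add-in.
            "rvtFile": {"url": input_url},
            "result": {"verb": "put", "url": output_url},
        },
    }
    return urllib.request.Request(
        WORKITEMS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_workitem_request(
    access_token="<access-token>",               # placeholder, not a real token
    activity_id="EMDC.GroupMultileaders+prod",   # hypothetical activity id
    input_url="https://example.com/model.rvt",   # placeholder input location
    output_url="https://example.com/result.zip", # placeholder output location
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

The same request could be posted from a web page, a desktop app, or a mobile app, which is what makes the user interface independent of the add-in that runs in the cloud.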


Whenever the Revit cloud add-in receives the corresponding request, it runs the scripts and generates the output files. The add-ins themselves, both for Revit and AutoCAD, are DLL class libraries compiled to operate on top of the .NET environment. These classes can be written in any .NET language, whether C#, F#, VB.NET, and so on. The compiled class libraries are then bundled and pre-uploaded to the Forge servers, where the server constantly listens for HTTP requests in order to run the app in question.

Note that any API references to the Revit or AutoCAD UIs are completely removed when a desktop add-in is converted into a cloud add-in, as the operation is now driven by HTTP requests and the interface is completely separate.

Want more? Download the full class handout to read on.

Rana Zeitouny entered the engineering consultancy industry in 2002, after obtaining her bachelor's degree in Electrical Engineering from the Lebanese American University in Lebanon. Since then, she has developed a deep command of AutoCAD, and in 2005 she standardized the AutoCAD implementation, including layers, blocks, commands, shortcuts, etc., and trained the Electrical department. From 2008 to 2016, she gave university courses on electrical design, including training sessions on AutoCAD. Today, after completing Revit training in 2015, she uses both AutoCAD and Revit in her daily consultancy work.

Majd Makhlouf is a mechanical engineer and design technologist, with a master's degree in Mechanical Engineering. He is an Autodesk Revit Certified Professional and a member of the Autodesk Developer Network. In January 2020, he founded Building Information Researchers and Developers OÜ, a software development company based in Estonia providing services for the AEC sector worldwide. He specializes in BIM management, Autodesk Revit and AutoCAD add-in development, both public and custom developed, Forge web and cloud-based apps, Dynamo Zero Touch Node Packs, and mobile VR/AR applications.