Behind the Scenes: How We Test Fusion
There have been several posts in the Community asking how we test Fusion 360. I will give an overview of the different types of tests we conduct, but focus primarily on the tests we run to model the end-user experience. Over the past year (perhaps a little more), we have adapted our testing based on our interactions with you about how Fusion is being used, the stability and performance challenges you have reported, and environmental factors (bandwidth, etc.). I will talk about three things we do to catch as many ‘experience’ issues as possible before we put a software update in your hands: the first two are test approaches, and the last is the analytics we use to guide our decisions and priorities.
As is common in the software industry, testing paradigms adopted to get a read on quality generally focus on uncovering bugs at the code level. A test team then exercises functions to catch problems missed during the design and implementation phases, the product team works together to resolve the critical bugs that carry high risk or impact to quality, and the process is repeated a few times before the product is released to customers. Granted, I am probably over-simplifying the process (many good test organizations also focus on other aspects such as good test automation, developer peer reviews, stress tests, etc.), but the point is that traditional approaches to tested-in quality typically only gain mileage from a software engineering perspective and do little to drive ‘high quality’ from a Customer perspective.
While we do follow many of the traditional approaches in testing Fusion 360, we also apply rigor in testing from a Customer perspective. The latter is, in most cases, a process of evolution (unless you are talking about software that needs to meet pre-defined policies or transactions in the industry it serves, such as legal, banking or medical), because different customers use the tools provided differently. As a result, we have seen some stability issues slip through, but we are learning and are committed to getting better every day. Our general approach is to develop modular tests that cascade from unit-level components to the workflows used to achieve a set of objectives.
- Level 1: Component tests focused on individual workspaces and functional areas. These are mostly automated tests that developers run on their changes and that run on every build as a litmus test of the build's health.
- Level 2: Client tests focused on the robustness of capabilities added or enhanced, and Cloud Services tests focused on the reliability of back-end servers. These are a combination of automated monitors and manual tests. The goal is to uncover regressions and to try to break the software.
- Level 3: Client tests that validate the experience resulting from the integration of different pieces of the infrastructure, and Cloud Services tests that do the same for the back-end experience. The focus is on performance and scalability as affected by the interaction between software, hardware, OS, network connections, etc.
- Level 4: Putting it all together from a Customer Perspective – Design Exercises.
Within the Fusion 360 team, we conduct Design Exercises prior to releasing each update in order to simulate real end-user workflows. These are non-scripted tests that leverage a combination of the team's cognitive skills and creativity and a good dose of analytics based on what we have learned from customer interactions. The exercises are structured around the workflow paths one might take to complete a set of design objectives. These could include, but are not limited to, one or more of the following:
- Set up projects with specific user roles/permissions and invite the project team for collaboration.
- Decide on whether Bottom-up or Top-down design methods will be adopted.
- Decide what components will be imported and which ones will be created from scratch.
- Concept Design.
- Assemble & create Joints.
The steps and approaches followed for each exercise vary, and generally also include any tools in the product that the workflow calls for (e.g., measure, inspect, selections, versions, moving/copying files in a project, etc.). The exercises produce bugs and CERs, and expose performance, reliability and usability issues.
Below is an example of one such exercise. Issues uncovered are prioritized based on Severity, Frequency of occurrence, Impact to Customer productivity, Repeatability and Risk exposure (of the problem as well as from any side-effect the fix could potentially cause).
And below is a gallery of some other interesting projects the team has conducted over the past several months. These range from work done by individuals to work done by a team of up to five.
Also, as part of the Fusion 360 Quality Assurance effort, we evaluate the performance of the Fusion Client and Cloud Services. In addition to the qualitative performance evaluations we conduct as part of the Design Exercises, we also run quantitative measurements using scripted tests. These are typically evaluated against a known baseline, and where we improve performance, the improved numbers become the new baseline.
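To make the baseline idea concrete, here is a minimal sketch in Python of how a scripted run could be compared against a baseline. It is an illustration only, not our actual tooling: the `Result` record, the `compare_to_baseline` function, the test names and the 5% noise tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Result:
    name: str
    baseline_s: float   # time recorded for the current baseline, in seconds
    measured_s: float   # time measured on the build under test, in seconds


def compare_to_baseline(results, tolerance=0.05):
    """Return (regressions, new_baseline) for a list of Result records.

    `tolerance` is a hypothetical noise band: changes smaller than 5% of
    the baseline are treated as measurement noise and ignored.
    """
    regressions, new_baseline = [], {}
    for r in results:
        change = (r.measured_s - r.baseline_s) / r.baseline_s
        if change > tolerance:
            regressions.append(r.name)           # slower: needs debugging
            new_baseline[r.name] = r.baseline_s  # baseline stays where it was
        elif change < -tolerance:
            new_baseline[r.name] = r.measured_s  # faster: becomes the new baseline
        else:
            new_baseline[r.name] = r.baseline_s  # within noise: no change
    return regressions, new_baseline
```

With this sketch, a test that gets slower is reported as a regression and keeps its old baseline, while a test that gets faster raises the bar for the next release.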
Fusion Client performance tests are run on physical hardware configured to be repeatable for each run. Degradations noted in the results are debugged, and the issues are prioritized for resolution using criteria similar to those noted in the previous sections. Fusion Cloud Services performance tests are run by a crowdsourced team in order to hit non-Autodesk network connections, cover a range of connection types and bandwidths, and cover geographical variations and different file sizes. These tests are run manually, with each tester taking three readings at different times of the day that are averaged to one number per tester. The data is then analyzed to understand performance ranges and to debug anything that looks abnormal.
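The per-tester averaging can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the `summarize_readings` function, the tester names and the rule for what counts as "abnormal" (an average more than twice the median of all tester averages) are assumptions for the example, not the criteria we actually use.

```python
from statistics import mean, median


def summarize_readings(readings_by_tester, abnormal_factor=2.0):
    """Average each tester's readings and flag unusually slow testers.

    `abnormal_factor` is a hypothetical threshold: a tester whose average
    exceeds `abnormal_factor` times the median average is flagged.
    """
    # One number per tester: the mean of that tester's readings.
    averages = {tester: mean(r) for tester, r in readings_by_tester.items()}
    typical = median(averages.values())
    flagged = sorted(
        tester for tester, avg in averages.items()
        if avg > abnormal_factor * typical
    )
    return averages, flagged
```

Each tester contributes one averaged number, and anything far from the rest of the group is surfaced for a closer look.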
We have developed automated dashboards that provide insights into our Production environments and allow us to make informed decisions about the actions to take to improve Customer Experience. These measures are used to focus on correcting issues that impact your productivity and, at the same time, help us learn and adapt by implementing preventive measures.
- Incoming crash reports: How many buckets are getting triaged and resolved? What key areas need attention?
- What commands are most commonly exercised by customers, and are we providing adequate test coverage for them?
- What is the distribution of operating systems among customers, and are we simulating a distribution close to that in our R&D efforts?
- What do our performance measures tell us against the prior release baseline, so that we can identify and address key workflows that may be a bottleneck?
- What is the first experience like for a customer getting started with Fusion? That is, how reliable are our Cloud services in common workflows (such as account creation, login, upload, create, save, open)? What is the uptime of back-end servers, and what percentage of jobs submitted to the cloud are successful?
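As an illustration of the last question, the percentage of successful jobs per workflow could be computed along these lines. This is a minimal sketch in Python, not our dashboard code: the `success_rates` function and the shape of the job log (pairs of workflow name and a success flag) are assumptions made for the example.

```python
from collections import Counter


def success_rates(job_log):
    """Return the percentage of successful jobs for each workflow.

    `job_log` is an iterable of (workflow, succeeded) pairs; this record
    shape is an assumption made for the example.
    """
    totals, successes = Counter(), Counter()
    for workflow, succeeded in job_log:
        totals[workflow] += 1
        if succeeded:
            successes[workflow] += 1
    # Every workflow in `totals` has at least one job, so no division by zero.
    return {w: 100.0 * successes[w] / totals[w] for w in totals}
```

A dashboard built on numbers like these makes it easy to see which common workflow is dragging reliability down.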
We have also developed automated dashboards that provide insights into our development environment so that we can make informed decisions about release readiness and drive focus on key problems. These measures are used to drive and track convergence in quality, as well as to prioritize areas of focus for the engineering team.
This is a process of evolution for us and we are learning every day from you. We are also constantly leveraging opportunities to make the necessary adjustments to our approach and improve it. We would love to hear any comments or suggestions from you.