Posted by MachineWorks
My name is Matthew and I work on the QA team for Polygonica and MachineWorks. We were discussing QA procedures and someone said "that would make a great blog" ... so here it is.
QA Team Structure
The QA team comprises five people, including the build engineer and the QA Manager. The team manages both the Polygonica and MachineWorks products, which share the same infrastructure and procedures.
Polygonica and MachineWorks journaling
Developers who use Polygonica and MachineWorks will be familiar with the API journaling mechanism as our support team request a journal in almost every support case.
The journaling mechanism has been built in since day one back in 1994; it originated in 1989 as part of the Lightworks rendering engine, which was developed by the same company at the time. Every API call that is made can be logged to a journal along with its parameters. The journal can then be replayed through a proprietary interpreter to reproduce exactly the conditions in the customer’s application. As well as greatly reducing support turnaround time, this makes it easy for us to add customer-reported bugs to the regression testing system.
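To make the record-and-replay idea concrete, here is a minimal Python sketch. It is not the Polygonica or MachineWorks journaling format or interpreter; the `journaled` decorator, the `create_box` and `scale` functions and the JSON layout are all invented for illustration.

```python
import json
from functools import wraps

JOURNAL = []  # in-memory journal; a real one would be written to a file


def journaled(func):
    """Record each call's name and arguments, then run the call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        JOURNAL.append({"call": func.__name__, "args": list(args), "kwargs": kwargs})
        return func(*args, **kwargs)
    return wrapper


# Hypothetical API functions, made up for this sketch.
@journaled
def create_box(width, height, depth):
    return width * height * depth


@journaled
def scale(value, factor=1.0):
    return value * factor


API = {"create_box": create_box, "scale": scale}


def replay(journal_text):
    """Re-issue every recorded call in order and return the results."""
    results = []
    for entry in json.loads(journal_text):
        # Call the undecorated function so that replaying does not journal again.
        func = API[entry["call"]].__wrapped__
        results.append(func(*entry["args"], **entry["kwargs"]))
    return results
```

Calling `create_box(2, 3, 4)` and `scale(10, factor=0.5)`, then passing `json.dumps(JOURNAL)` to `replay`, repeats the same two calls with the same parameters. Because the journal is plain data, the same text could be replayed against a different build of the functions.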
The nice thing about journals is that they can be replayed against any build, debug or optimised, on any platform: Mac, Linux or Windows. The developers are also really disciplined in using asserts within the code, so if there is a bug it can quite often be picked up by an assert, either by the support team or by ourselves. This often makes turning around problems much faster.
Automated Regression Testing
Both products have large suites of automated regression tests. The suites combine customer-reported issues, captured using the journaling mechanism, with unit tests created by the developers in an in-house script language. All the tests run continuously for each codeline and branch that will go to customers and, of course, for the developer mainline where new code is added.
The tests are split into single-threaded and multi-threaded groups. The same test often appears in both. Most of the multi-threaded tests are run several times with different numbers of threads set.
Currently there are around 1,700 regression tests for Polygonica, which run on over 14,000 different models. MachineWorks, as it passes its quarter century, now has over 50,000 regression tests.
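Running one test under several thread counts can be sketched as follows. This is only an illustration in Python: `parallel_sum_of_squares` is a made-up stand-in for a multi-threaded library call, and the thread counts 1, 2, 4 and 8 are assumptions, not the values we actually use.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum_of_squares(values, num_threads):
    """Hypothetical stand-in for a multi-threaded library call."""
    chunks = [values[i::num_threads] for i in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(lambda chunk: sum(v * v for v in chunk), chunks))
    return sum(partials)


def run_with_thread_counts(test, thread_counts=(1, 2, 4, 8)):
    """Run the same test once per thread count and collect each result."""
    return {n: test(n) for n in thread_counts}


results = run_with_thread_counts(
    lambda n: parallel_sum_of_squares(list(range(1000)), n)
)
# Every thread count should give the same answer as the single-threaded run.
consistent = len(set(results.values())) == 1
```

The stand-in uses integer arithmetic, so the results match exactly; with floating point work the comparison would need a tolerance.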
Following up errors
Most errors found in the regression tests are picked up automatically as differences against a baseline. We always check for false positives: small floating point differences that result from algorithm or code changes. The floating point difference tolerance we use is very tight. We choose to have the QA team make an informed decision on each difference rather than use a looser tolerance that might let real bugs through.
Once QA have verified the error we run a script which does a binary chop of check-ins to find out which changelist caused it. Then we email the developer and tell them what has regressed due to their changes. The development team are very good at responding: normally within a day we get a fix or a meaningful response, such as accepting the new result as the new baseline.
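The trade-off between a tight and a loose tolerance can be illustrated with a small Python sketch. The function name, the dictionary layout and the tolerance values are assumptions chosen for the example, not our real settings.

```python
import math


def diff_against_baseline(baseline, result, rel_tol=1e-12, abs_tol=1e-12):
    """Return the names whose values differ from the baseline beyond tolerance."""
    flagged = []
    for name in sorted(set(baseline) | set(result)):
        if name not in baseline or name not in result:
            flagged.append(name)  # a missing or extra value always counts as a difference
        elif not math.isclose(baseline[name], result[name],
                              rel_tol=rel_tol, abs_tol=abs_tol):
            flagged.append(name)
    return flagged


baseline = {"volume": 125.0, "area": 150.0}
result = {"volume": 125.0 + 1e-9, "area": 150.0}

tight = diff_against_baseline(baseline, result)                # flags "volume" for review
loose = diff_against_baseline(baseline, result, rel_tol=1e-6)  # flags nothing
```

With the tight setting the tiny change in `volume` is reported and a person decides whether it is harmless. With the loose setting it passes silently, which is convenient for noise but would hide a real bug of the same size.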
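A binary chop over check-ins can be sketched in a few lines of Python. This is an illustration rather than our script: `is_bad` stands in for "build this changelist and run the failing test", and the sketch assumes that the oldest changelist is known good, the newest is known bad, and the failure persists once introduced.

```python
def find_first_bad(changelists, is_bad):
    """Return the first changelist for which is_bad is True.

    changelists is ordered oldest to newest; the first entry is assumed
    good and the last entry is assumed bad.
    """
    lo, hi = 0, len(changelists) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(changelists[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return changelists[hi]
```

Each step halves the range, so a range of 32 changelists needs only 5 builds to pin down the one responsible.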
New Feature Testing
New features are tested both by writing code examples and by generating new scripts in the internal scripting language. The script language does mean we are slightly insulated from the API but it does allow us to generate more realistic system tests much more quickly. The developers provide new scripts based on their own development and this helps us understand how the new features should be used. Once we have the new examples or scripts they are run against the large library of solids we have built up over the years from customers and also from online libraries.
Performance Testing
Prior to any customer release, be it a patch or a major version, we do a set of performance tests. These are run after all the regression tests have passed. For each test both the baseline and the new build are run three times. The fastest time in each case is used for the comparison.
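The "fastest of three" measurement can be sketched in Python like this. The helper name `best_of` is invented for the example, and the real tests time library operations rather than a Python callable.

```python
import time


def best_of(func, runs=3):
    """Run func `runs` times and return the fastest wall-clock time in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return min(times)
```

Taking the minimum rather than the average is a common choice because background activity on the machine can only make a run slower, so the fastest run is the one least disturbed by noise.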
All the data is kept locally to the machine running the scripts to remove any network issues when testing file import or export.
We check for speed-ups as well as slow-downs. A speed-up that the developer can't explain might well indicate a potential problem that didn't get picked up in the regression tests. For slow-downs the developers have to decide whether the slow-down is warranted as a result of, for example, delivering a more accurate result, or whether further optimisations are required. Sometimes the optimisation will be deferred, e.g. if multi-threading the function is already planned for a later release. Normally, though, the routine will be optimised further; performance is taken very seriously.
We do get intermittent problems, almost always related to differences in multi-threaded behaviour. Typically they will occur around 1 in 10 times but we've had cases which are as rare as 1 in 100. The algorithms at the heart of Polygonica and MachineWorks are very complex and predicting when such a problem will occur is almost impossible, particularly when floating point discrepancies are taken into account. That's where the continuous regression testing is a real advantage as we have a much bigger sample size. This helps us pick up these issues early on, although it can then take a lot of QA team effort to isolate the changelist that caused them.
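Flagging changes in both directions can be sketched as below. The 10% threshold and the function name are assumptions made for the example; they are not our actual criteria.

```python
def classify_timing(baseline_s, new_s, threshold=0.10):
    """Label a timing change; changes beyond the threshold in either direction are flagged."""
    change = (new_s - baseline_s) / baseline_s
    if change > threshold:
        return "slow-down"
    if change < -threshold:
        return "speed-up"
    return "unchanged"
```

For instance, `classify_timing(2.0, 1.5)` returns `"speed-up"`, which would be queried with the developer just as a `"slow-down"` would.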
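A quick calculation shows why sample size matters so much for intermittent failures. The sketch below assumes each run fails independently with a fixed probability, which is a simplification, and the 95% confidence level is an arbitrary choice for the example.

```python
import math


def runs_needed(failure_rate, confidence=0.95):
    """Smallest n such that 1 - (1 - failure_rate)**n >= confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


one_in_ten = runs_needed(0.10)      # 29 runs
one_in_hundred = runs_needed(0.01)  # 299 runs
```

Under these assumptions a 1 in 10 failure needs about 29 runs to be seen at least once with 95% confidence, while a 1 in 100 failure needs about 299, far more than anyone would run by hand.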
Even with a roomful of powerful machines it takes a few days for a codeline to go through all the tests required to certify it ready to send to customers. Although we make no promises as to the frequency of patch releases, we're pleased that we are in a position to send fully tested code updates regularly. OK, it's not quite DevOps, but we pride ourselves on being very responsive.
For Polygonica in particular, patch releases are often driven by new functions that customers have requested - as they start using them they find they want modifications and tweaks and of course they find problems that we didn't anticipate.
We really like to work closely with our customers as they and their customers are always pushing the functionality in ways we never expected. They are the domain experts. Be it in AM, CAE, construction, mining, medical implants or dentistry, they are better placed to know what the market requires than we are.