Any time you make a change to software, you run the risk of breaking some part of the system that was previously working. That’s why any well-run software development project has regression tests. You want to make sure that any changes don’t have any negative impacts, and regression tests are designed to catch the bugs that are inadvertently introduced.
A system of any significant size can have many, many regression tests, and they can take a long time to run. If you have 1,200 tests and each one takes an average of 1 second to run, it will take 20 minutes for the tests to complete. You can greatly reduce that time by running these tests in parallel. If you have 5 threads running tests in parallel, those 1,200 tests will take only 4 minutes to complete.
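The arithmetic can be checked with a short sketch. The function name `wall_clock_seconds` is made up for illustration, and the estimate is idealized: it assumes the tests divide evenly among the workers and that there is no scheduling overhead.

```python
import math


def wall_clock_seconds(test_count: int, avg_seconds: float, workers: int) -> float:
    """Idealized run time: tests are spread evenly and workers never sit idle."""
    # Each worker runs at most ceil(test_count / workers) tests back to back.
    return math.ceil(test_count / workers) * avg_seconds


print(wall_clock_seconds(1200, 1.0, 1) / 60)  # 20.0 minutes on a single thread
print(wall_clock_seconds(1200, 1.0, 5) / 60)  # 4.0 minutes on five threads
```

Real suites rarely scale this cleanly, because test durations vary and the system under test has finite resources.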
However, running tests in parallel introduces a problem. How do you ensure that one of your tests does not modify data that another of your tests requires? You can make each test responsible for creating its own data, but this in turn causes a different problem. How do you ensure that your tests do not generate conflicting data?
Conflicting Data Example
Let’s say you have a system that manages Thingamabobs.
- Test #1 validates that the Thingamabob search page properly displays 5 particular Thingamabobs.
- Test #2 validates that you can create a new Thingamabob.
If Test #2 runs before Test #1, it may create a Thingamabob that matches the search parameters of Test #1. Since Test #1 is not expecting this Thingamabob in the search results, Test #1 will fail. Because Test #1 and Test #2 are running in parallel, you’ve introduced a race condition.
This post discusses a mechanism for eliminating race conditions when running tests in parallel: Multi-tenancy.
What Is Multi-Tenancy?
As per Wikipedia:
The term “software multitenancy” refers to a software architecture in which a single instance of software runs on a server and serves multiple tenants. A tenant is a group of users who share a common access with specific privileges to the software instance.
You can define a tenant in various ways. A tenant can consist of a single user or many users. A tenant can be defined by a key that is sent with every request or by a parameter that is part of a user’s profile stored in a database. The relevant concern is that the system must have some way of distinguishing one tenant from another.
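As a rough sketch of those two options, the function below looks for a tenant key on the request first and falls back to the user's profile. The header name `X-Tenant-Id`, the `USER_TENANTS` table, and the function `resolve_tenant` are all assumptions made for illustration, not part of any particular framework.

```python
from typing import Mapping, Optional

# Hypothetical user-profile data: user id -> tenant code.
USER_TENANTS = {"alice": "ABC", "bob": "DEF"}


def resolve_tenant(headers: Mapping[str, str], user_id: Optional[str] = None) -> str:
    """Work out which tenant a request belongs to."""
    # Option 1: a tenant key sent with every request (assumed header name).
    tenant = headers.get("X-Tenant-Id")
    if tenant:
        return tenant
    # Option 2: a tenant stored as part of the user's profile.
    if user_id is not None and user_id in USER_TENANTS:
        return USER_TENANTS[user_id]
    # The system must always be able to tell tenants apart, so refuse to guess.
    raise LookupError("request cannot be attributed to a tenant")


print(resolve_tenant({"X-Tenant-Id": "ABC"}))  # ABC
print(resolve_tenant({}, user_id="bob"))       # DEF
```

Failing loudly when no tenant can be determined is a deliberate choice here: silently falling back to a default tenant would let data from different tenants mix.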
Applying Multi-Tenancy to Testing
To use multi-tenancy with parallel testing:
- Each test (or group of tests, when some tests run in sequence) must have a tenant that is distinct from the tenants of all other tests.
- Data created by a test must include the tenant. In a relational database this could be a column on a table to which the test is adding data.
- Every data access must include the tenant identifier.
- Prior to running a test, you should remove all of the data for the tenant. Data may exist in the data store because the last run of a test failed and the test could not clean up its data.
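The rules above can be sketched against a relational database. This is a minimal illustration using an in-memory SQLite database; the table name, column names, and helper functions are hypothetical and would differ in a real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The tenant column is what ties each row to the test that created it.
conn.execute(
    "CREATE TABLE thingamabob ("
    "id INTEGER PRIMARY KEY, tenant TEXT NOT NULL, name TEXT NOT NULL)"
)


def reset_tenant(conn, tenant):
    """Remove leftovers from an earlier run that failed before cleaning up."""
    conn.execute("DELETE FROM thingamabob WHERE tenant = ?", (tenant,))
    conn.commit()


def create_thingamabob(conn, tenant, name):
    """Every row written by a test carries that test's tenant."""
    conn.execute(
        "INSERT INTO thingamabob (tenant, name) VALUES (?, ?)", (tenant, name)
    )
    conn.commit()


def search_thingamabobs(conn, tenant, prefix):
    """Every read filters on the tenant as well as the search criteria."""
    rows = conn.execute(
        "SELECT name FROM thingamabob WHERE tenant = ? AND name LIKE ? ORDER BY name",
        (tenant, prefix + "%"),
    ).fetchall()
    return [row[0] for row in rows]


create_thingamabob(conn, "ABC", "widget-stale")  # simulated leftover from a failed run
reset_tenant(conn, "ABC")                        # clean the tenant before the test starts
for i in range(5):
    create_thingamabob(conn, "ABC", f"widget-{i}")
create_thingamabob(conn, "DEF", "widget-new")    # another test's data, different tenant

print(search_thingamabobs(conn, "ABC", "widget"))
# ['widget-0', 'widget-1', 'widget-2', 'widget-3', 'widget-4']
```

Resetting only the test's own tenant, rather than truncating the whole table, is what keeps the cleanup step safe to run while other tests are in progress.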
Data No Longer Conflicts
In the example from before, Test #1 is assigned tenant code “ABC” and Test #2 is assigned tenant code “DEF”. Now, when you run Test #1 and Test #2 in parallel, you can be sure that Test #1 will not display the Thingamabob created by Test #2, because Test #1 only retrieves Thingamabobs with tenant code “ABC”.
You can parallelize your tests as much as you want and the tests will not have any data interactions.
Alternatives to Multi-Tenancy
Some commonly-used alternatives to multi-tenancy and their problems are listed below.
| Alternative | Problems |
|---|---|
| Make the tests smarter so that tests with data dependencies don’t run at the same time. | This will make your tests brittle. Any time you add a new test, you have to determine which tests it may conflict with. |
| Have each test use a specific subset of existing data. | Existing data may change and cause your tests to break. As more tests are created, it becomes more complicated to create distinct partitions. |
| Deploy multiple instances of your system and run only one test at a time on each system instance. | It can get expensive to stand up many instances. If third-party systems are required to run regression tests, there may be configuration issues connecting multiple systems to a single third-party system. Note that this strategy can be used in tandem with multi-tenancy to support even higher levels of parallelization because, at some point, you will exhaust system resources by running too many tests in parallel on a single system. |
At Ten Mile Square, we use multi-tenancy for parallelizing the tests we create when building software for our clients. If you’re spending more time than you would like maintaining and running your system’s tests, look into making your system multi-tenant capable. It will probably save you time and money in the long run.