Like all companies in the tech space, Notarize has the ongoing challenge of balancing agility against stability. On one hand, we need to push out new features as quickly as possible. On the other hand, we have to ensure a high level of quality and stability, and maintain the contract with our customers that our solutions will simply work, effectively and consistently.
Ensuring this balance is a job for the entire organization. As a Senior Quality Assurance (QA) Engineer, it’s something I think about every day.
One of the major changes we have made to our QA process at Notarize is the move from a completely manual regression process, where humans mimic user paths to test our software before each release, to a highly automated one. The benefits of automation are numerous, and we’ve already made huge strides toward reducing the number of breaking bugs that get into our release candidates.
The QA team has largely shifted from a team that does repetitive testing of the entire product, hoping to find something before it gets out to our end users, to a true engineering team responsible for the health and maintenance of the robots that do the bulk of our testing. It’s been a real game-changer for us, as it has been for many other companies.
As great as automated testing is, there are some areas of our product that have proven more difficult to cleanly automate. These areas usually rely on something more than direct feedback from our web interface or external APIs. One of these is webhooks.
A webhook is a proactive API message that is sent out by a service to inform a client that some process has been completed, so that the client does not have to continually query the service’s API for up-to-date information.
At Notarize, we use webhooks with our business and real estate clients to inform them when the status of a notarization or eSign transaction has changed. This happens when a transaction has been “created,” then when it is “sent,” and finally when it is “completed.”
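To make that lifecycle concrete, here is a minimal TypeScript sketch of what such a status notification might look like and how a test could check that the statuses arrive in order. Only the three status names come from the description above; the `WebhookEvent` shape, its field names, and the `isValidProgression` helper are illustrative assumptions, not Notarize's actual payload schema.

```typescript
// The three lifecycle statuses named above, in the order they occur.
const STATUS_ORDER = ["created", "sent", "completed"] as const;
type TransactionStatus = (typeof STATUS_ORDER)[number];

// Hypothetical payload shape; the real webhook body may differ.
interface WebhookEvent {
  transactionId: string;
  status: TransactionStatus;
}

// Returns true when each status is strictly later in the lifecycle than
// the one before it (no repeats, no going backwards).
function isValidProgression(events: WebhookEvent[]): boolean {
  let lastIndex = -1;
  for (const event of events) {
    const index = STATUS_ORDER.indexOf(event.status);
    if (index <= lastIndex) return false;
    lastIndex = index;
  }
  return true;
}
```

A check like this is the kind of assertion a client integration (or a test) might make about the sequence of webhooks received for a single transaction.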
Clients integrated with our API use these webhooks to keep track of their transactions as they move through their lifecycle, and rely on them for effective functioning of their internal systems. From a quality perspective, this is a feature we want to protect from breaks that happen during new releases.
When the Notarize QA Team first started testing webhooks, the process was completely manual. Since webhooks are set for an organization using our external API, we set up a Postman library with API calls to set the webhook for a specific listening service.
The first service we used was RequestBin, which provided us with a unique URL where we could send our webhooks, along with a web portal that we could use to look at what was sent.
This worked, but was clunky. RequestBin explicitly designed its individual storage buckets to be highly ephemeral. Users could not rely on them to stick around, and URLs might expire in the middle of a test transaction. RequestBin is a free and very useful service, but it wasn’t ideal for our use case.
The next tool we used was Runscope. We used this tool to run a suite of automated contract tests against our external REST API until we brought our external API testing stack in-house.
Runscope provided functionality similar to RequestBin, with the added benefit of being more reliable and less ephemeral when it came to the longevity of our webhook payloads. However, once we brought our API end-to-end testing into our codebase, which let us test the actual behavior of our APIs through real API requests, Runscope became an unnecessary expense.
Our final third-party request bucket was Beeceptor, which had many similarities to RequestBin, but we generally found it more reliable.
At first, we used Beeceptor with manual tests. We soon realized that if we incorporated calls to our API into our front-end, end-to-end tests, we could perform what I call “partially automated” testing. It worked like this:
This was a huge improvement over the all-manual testing process, and was further improved when we created a Jenkins job to run all of our webhook testing-enabled specs in an automated fashion.
At this point, we had significantly reduced the manual testing effort, but we were still obligated to take time out of every regression cycle to run the “webhooks-staging” Jenkins job, open Beeceptor, and hope that all of the tests ran without errors.
In addition, this meant that our webhooks testing was pushed “to the right” – that is, if there was a breaking bug with our webhooks, it was possible we wouldn’t learn about it until we got to our weekly staging regressions. This would force developers to take time out of their weeks to work on urgent bug fixes instead of important, previously-scheduled work. We really needed a complete solution.
To solve this problem, we built our own homegrown service that could take in our webhooks and be queried on a per-transaction basis. Since our end-to-end framework (WebdriverIO) did not have the capability to consume webhooks natively, we created a lightweight server with very low resource usage using Koa. This gave us a simple, elegant solution that could be run using Node.js, one that could be prototyped in about 90 minutes through paired coding with an experienced engineer.
We shied away from using any kind of database on the server. Everything is handled in-memory, which is fine for this use case since nothing needs to persist more than a few minutes.
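As a rough illustration of how small such a service can be, the sketch below keeps received webhooks in a `Map` keyed by transaction and exposes one path for receiving them and one for querying them. It is a sketch under assumptions, not our actual implementation: the routes (`POST /webhooks`, `GET /webhooks/:transactionId`), the payload fields, and the names `WebhookStore` and `handleRequest` are all hypothetical, and the routing is written as a plain function rather than as Koa middleware so that the example has no dependencies. A Koa handler could delegate to a function like this.

```typescript
// Hypothetical payload shape; only the fields this sketch needs.
interface WebhookEvent {
  transactionId: string;
  status: string;
}

// In-memory store: transaction ID -> webhooks received for it, in arrival
// order. Nothing is persisted, so all data is lost when the process restarts.
class WebhookStore {
  private readonly byTransaction = new Map<string, WebhookEvent[]>();

  record(event: WebhookEvent): void {
    const existing = this.byTransaction.get(event.transactionId) ?? [];
    existing.push(event);
    this.byTransaction.set(event.transactionId, existing);
  }

  forTransaction(transactionId: string): WebhookEvent[] {
    // Return a copy so callers cannot mutate the stored list.
    return [...(this.byTransaction.get(transactionId) ?? [])];
  }
}

interface HandlerResponse {
  status: number;
  body: unknown;
}

const QUERY_PREFIX = "/webhooks/";

// Framework-agnostic request handling (the routes are assumptions).
function handleRequest(
  store: WebhookStore,
  method: string,
  path: string,
  body?: unknown,
): HandlerResponse {
  if (method === "POST" && path === "/webhooks") {
    const { transactionId, status } = (body ?? {}) as Partial<WebhookEvent>;
    if (typeof transactionId !== "string" || typeof status !== "string") {
      return { status: 400, body: { error: "transactionId and status are required" } };
    }
    store.record({ transactionId, status });
    return { status: 204, body: null };
  }
  if (method === "GET" && path.startsWith(QUERY_PREFIX) && path.length > QUERY_PREFIX.length) {
    const transactionId = path.slice(QUERY_PREFIX.length);
    return { status: 200, body: store.forTransaction(transactionId) };
  }
  return { status: 404, body: { error: "not found" } };
}
```

Keying the store by transaction is what makes the per-transaction query cheap: a spec only ever asks for the webhooks belonging to the transaction it just created.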
Since we already use Heroku, it was an easy choice to set the server up on Heroku on the smallest possible computing unit. This was an additional benefit of doing everything in-memory: since we weren’t relying on a database like Postgres, it was simple and cheap to set up our service. Heroku restarts its hosted applications about once a day, so even though we store the received webhooks in-memory, we are not worried about exceeding our memory limit over time.
The final step in implementing our webhook listener service was modifying the end-to-end specs that previously relied on Beeceptor to use our in-house service instead, with the added benefit of checking programmatically that the webhook was sent with the correct status.
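Because a webhook arrives asynchronously, a spec generally has to poll the listener until the expected status shows up or a timeout passes. The helper below is a minimal sketch of that idea. The name `waitForWebhook`, the payload fields, and the default timings are assumptions for illustration, and the function that actually fetches events from the listener is passed in, since the real HTTP call depends on how the service is exposed.

```typescript
// Hypothetical payload shape; only the fields this sketch needs.
interface WebhookEvent {
  transactionId: string;
  status: string;
}

// Returns the webhooks recorded so far for a transaction. In a real spec
// this would be an HTTP request to the listener service.
type FetchEvents = (transactionId: string) => Promise<WebhookEvent[]>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until a webhook with the expected status is found, and throws if
// none has arrived by the time the timeout elapses.
async function waitForWebhook(
  fetchEvents: FetchEvents,
  transactionId: string,
  expectedStatus: string,
  { timeoutMs = 10000, intervalMs = 500 } = {},
): Promise<WebhookEvent> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const events = await fetchEvents(transactionId);
    const match = events.find((event) => event.status === expectedStatus);
    if (match) return match;
    if (Date.now() >= deadline) {
      throw new Error(
        `No "${expectedStatus}" webhook for ${transactionId} within ${timeoutMs} ms`,
      );
    }
    await sleep(intervalMs);
  }
}
```

In a spec, a call such as `await waitForWebhook(fetchFromListener, transactionId, "completed")` (with `fetchFromListener` being whatever function queries the listener) would sit right after the UI steps that complete the transaction, so a missing or incorrect webhook fails the test directly.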
This process closed our loop of automation. After testing to make sure that our new solution was effective, we implemented webhooks testing in our daily end-to-end runs against our master branch, moving our testing “to the left” and freeing up QA engineer resources to do more targeted, exploratory, and automated testing.
We now let the robots do the repetitive tasks, and keep the humans doing what they do best: creativity and innovation.
Notarize’s webhook testing journey reflects our overall journey as a quality-focused engineering organization. Studies have shown that higher degrees of automated testing correlate strongly with success in the technology industry.
It is an incremental process, but it’s one that has delivered huge positive results already, and we have every expectation that it will continue to do so.