We started by writing an end-to-end test. This was a lot of work, but it forced us to answer some basic questions about the application:
- How do we deploy the application?
- What other applications/services are required for this to run? How often do those applications change?
- How do we log in?
- How do we create test data? What's the minimum amount of configuration required for a new user?
- What is different about the configuration of the application in development mode compared to production mode?
Writing that first end-to-end test required us to answer all these questions, and it wasn't easy. We wrote the test using Selenium to drive an Internet Explorer browser, which meant the tests had to wait for the AJAX-heavy application to respond. To make failures in external systems easier to recognize, we had to change the startup scripts, a change that ended up being a helpful diagnostic tool in production.
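Waiting for an AJAX-heavy page usually comes down to polling for a condition until a timeout expires. The following is a minimal Ruby sketch of that idea, not the code we actually used; `wait_until` and `WaitTimeout` are hypothetical names chosen for illustration, and the selenium-webdriver gem provides its own `Selenium::WebDriver::Wait` class for the same purpose.

```ruby
# Hypothetical error type raised when the condition never becomes true.
class WaitTimeout < StandardError; end

# Hypothetical helper: call the block repeatedly until it returns a truthy
# value, or raise WaitTimeout once `timeout` seconds have passed.
def wait_until(timeout: 10, interval: 0.2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout}s"
    end

    sleep interval
  end
end
```

In a browser test, the block would check for whatever signals that the page has finished loading, for example `wait_until { driver.find_elements(id: "main-screen").any? }`, where `driver` is a Selenium driver and `main-screen` is a made-up element id.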
This test and a few others we added later caught many problems that appeared over the next several weeks:
- The required gem dependencies weren't being installed, preventing the application from starting
- A bug in authentication logic prevented anyone from logging in
- The database wasn't being migrated correctly as part of the installation script, preventing the application from starting up
- A bug in an external service caused the user interface of the application to become unresponsive
Yet these test failures were often difficult to diagnose. The test showed that the user never saw the main application screen, but it wasn't always clear why. Since we were running these end-to-end tests after every commit in a continuous integration server, we could at least identify the commit that introduced the bug. We captured the server logs with each build, which enabled us to look for startup or runtime problems.
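Once the server logs are saved with each build, a first pass at diagnosis can be automated by pulling out the lines that look like failures. This is a rough sketch under assumptions: the log format and `ERROR_PATTERN` are invented for illustration, and `suspicious_lines` is a hypothetical helper, not part of our build.

```ruby
# Hypothetical pattern; adjust it to whatever the real server actually logs.
ERROR_PATTERN = /\b(ERROR|FATAL)\b|Exception/

# Return [line_number, text] pairs for log lines that match ERROR_PATTERN.
def suspicious_lines(log_text)
  log_text.each_line.with_index(1)
          .select { |line, _number| line.match?(ERROR_PATTERN) }
          .map { |line, number| [number, line.chomp] }
end
```

Printing these lines next to the failing build's commit id gives a starting point when the test only reports that the main screen never appeared.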
With these end-to-end tests now in place, we had a safety net that could catch a wide variety of deployment, configuration, and user interface errors. Next we started writing unit tests, which ran faster and whose failures were easier to diagnose.