Design, develop, test, deploy, repeat. In a nutshell, that’s a standard software application life-cycle. Depending on the company, each step has its own needs and methods, but every piece of software aims at the same goal: providing a service reliably. While coding practices are vital for developing the product at hand, it’s always advisable to understand the product’s whole life-cycle. Even if the process changes from company to company, it is possible to highlight key stepping stones that help ensure a smooth ride from development to production.
Development: “It works! (on my machine)”
Depending on the team’s methodology, development can occur in different ways. Currently, the most widely accepted approaches divide sets of functionality into manageable tasks. Developers use these tasks as a guide to what the system is supposed to do. During development, deadlines and work ethic come into play: developers commit to delivering code so that the team can make progress. Daily meetings allow them to identify design flaws and obstacles as early as possible, so that the resulting code accurately reflects the initial design and the goals set for the completed product. Code reviews are also commonplace during development.
Ideally, developers test and build upon their code through unit tests. In some cases, it’s possible to deploy code to a test environment, but for smaller projects the norm is to run it on one’s own machine, run local tests, and then push it to a remote repository, where it will most likely undergo another set of tests. This next step is known as staging.
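To make the idea of a unit test concrete, here is a minimal sketch. The function `apply_discount` and its rules are hypothetical, invented only for illustration; the tests are plain functions with `assert` statements, a shape that a runner such as pytest can discover, though they can also be called directly.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_regular_discount():
    # A typical case: 20% off 100.0 should leave 80.0.
    assert apply_discount(100.0, 20) == 80.0


def test_rejects_invalid_percent():
    # Invalid input should fail loudly instead of returning a wrong price.
    try:
        apply_discount(100.0, 120)
    except ValueError:
        return
    raise AssertionError("expected ValueError for percent > 100")
```

Small tests like these run in milliseconds on a developer’s machine, which is what makes them practical to run before every push.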
Staging: “Look ma. No console.log!”
A staging environment mimics production. It’s a place where everything can go wrong. Usually, developers stage their code and another group (the Quality Assurance or User Acceptance teams) takes on a more focused testing role. Besides running the automated tests, staging involves measuring performance and imposing real-world demands on the system. Sometimes, new features unexpectedly slow down the application, for example because of inefficient database queries or algorithms. Automated functional tests usually do not pick up performance issues, so staging allows the team to put new features through their paces and see how they fare when the users’ experience is on the line.
During staging, automated tests are usually run repeatedly, particularly those that check end-to-end connectivity, disk space, request handling, and overall response times. It is also advisable to set up warnings that signal when the architecture will need upgrading.
After the product passes the staging tests, it’s finally time to move on to production.
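One simple way to catch the kind of slowdown described above is to time an operation several times and compare the median against a latency budget. The sketch below assumes hypothetical names (`median_latency_ms`, `within_budget`) and a budget chosen by the team; it is an illustration, not a replacement for a proper load-testing tool.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(operation: Callable[[], object], runs: int = 20) -> float:
    """Run the operation several times and return the median duration in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)


def within_budget(operation: Callable[[], object], budget_ms: float,
                  runs: int = 20) -> bool:
    """Return True if the median latency stays within the given budget."""
    return median_latency_ms(operation, runs) <= budget_ms
```

Using the median rather than a single measurement makes the check less sensitive to one-off hiccups on a shared staging machine.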
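A disk space warning of the kind mentioned above could look roughly like this. The 80% threshold and the function names are assumptions made for the example; `shutil.disk_usage` is part of the Python standard library.

```python
import shutil
from typing import Optional


def usage_warning(used: int, total: int, threshold: float = 0.8) -> Optional[str]:
    """Return a warning message if usage is at or above the threshold."""
    ratio = used / total
    if ratio >= threshold:
        return f"disk usage at {ratio:.0%}, above the {threshold:.0%} threshold"
    return None


def check_disk(path: str = "/", threshold: float = 0.8) -> Optional[str]:
    """Check the disk that holds `path` (the threshold is an assumed default)."""
    usage = shutil.disk_usage(path)
    return usage_warning(usage.used, usage.total, threshold)
```

Run on a schedule, a check like this gives the team advance notice that storage needs to grow, instead of a surprise outage.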
Production: “Never deploy on a Friday”
At this point, the team has tested the system extensively. Nevertheless, errors can still occur. A myriad of user-side variables cannot be tested even by the most proficient QA team. Operating systems, web browsers, mobile keyboards, and even ISP restrictions all play a part in making errors appear.
A great way to handle production without losing too much sleep is to release the newest version of the system to a small subset of users first. That way, it’s possible to catch bugs that could not be picked up in earlier stages.
If serious bugs surface, it’s time to activate the rollback plan. It’s better to have a rollback plan and not need it than to need one and not have it. Ideally, rollbacks are only necessary for system-breaking errors, which usually show up within the first hours after deploying to production. That’s why developers soon learn never to deploy to production on a Friday: serious bugs will have the whole team panicking, which in turn makes debugging much messier and less effective. Deploying to production should be approached like performing in a classical music concert. It may be neither the first time nor the last, but you should have prepared and, if you mess up, you and your team should be able to mitigate the damage easily and keep playing as if nothing had happened. It is also very important to be careful with downtime, as a change in production can affect the system’s existing users. This is usually solved by running several replicas of the system’s components and replacing them one by one until the entire system is updated.
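One common way to pick that smaller set of users is to hash each user’s identifier into a stable bucket and enable the new version only for buckets below a rollout percentage. The function name `in_canary` and the use of string user IDs are assumptions for this sketch.

```python
import hashlib


def in_canary(user_id: str, rollout_percent: int) -> bool:
    """Decide whether a user gets the new version.

    The same user ID always maps to the same bucket (0-99), so a user
    does not flip between versions from one request to the next.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < rollout_percent
```

Raising `rollout_percent` gradually widens the audience, and setting it to 0 turns the new version off for everyone.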
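The replica-by-replica replacement and the rollback plan can be sketched together. This is a simplified in-memory model: the replicas are just a list of version strings and `is_healthy` is a hypothetical health check supplied by the caller. Real deployments usually delegate this to an orchestrator.

```python
from typing import Callable, List


def rolling_update(replicas: List[str], new_version: str,
                   is_healthy: Callable[[int, str], bool]) -> bool:
    """Replace replicas one at a time; restore all of them on the first failure.

    Returns True if every replica ends up on the new version, False if the
    update was rolled back.
    """
    previous = list(replicas)  # snapshot used as the rollback plan
    for index in range(len(replicas)):
        replicas[index] = new_version
        if not is_healthy(index, new_version):
            # Roll back: put every replica back on its previous version.
            replicas[:] = previous
            return False
    return True
```

Because only one replica changes at a time, the remaining replicas keep serving users while the update is in progress.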