|Part of a series on|
In software engineering, continuous integration (CI) is the practice of merging all developers' working copies to a shared mainline several times a day. Grady Booch first proposed the term CI in his 1991 method, although he did not advocate integrating several times a day. Extreme programming (XP) adopted the concept of CI and did advocate integrating more than once per day – perhaps as many as tens of times per day.
When embarking on a change, a developer takes a copy of the current code base on which to work. As other developers submit changed code to the source code repository, this copy gradually ceases to reflect the repository code. Not only can the existing code base change, but new code can be added as well as new libraries, and other resources that create dependencies, and potential conflicts.
The longer development continues on a branch without merging back to the mainline, the greater the risk of multiple integration conflicts and failures when the developer branch is eventually merged back. When developers submit code to the repository they must first update their code to reflect the changes in the repository since they took their copy. The more changes the repository contains, the more work developers must do before submitting their own changes.
Eventually, the repository may become so different from the developers' baselines that they enter what is sometimes referred to as "merge hell", or "integration hell", where the time it takes to integrate exceeds the time it took to make their original changes.
CI is intended to be used in combination with automated unit tests written through the practices of test-driven development. This is done by running and passing all unit tests in the developer's local environment before committing to the mainline. This helps avoid one developer's work-in-progress breaking another developer's copy. Where necessary, partially complete features can be disabled before committing, using feature toggles for instance.
A build server compiles the code periodically or even after every commit and reports the results to the developers. The use of build servers had been introduced outside the XP (extreme programming) community and many organisations have adopted CI without adopting all of XP.
In addition to automated unit tests, organisations using CI typically use a build server to implement continuous processes of applying quality control in general – small pieces of effort, applied frequently. In addition to running the unit and integration tests, such processes run additional static analyses, measure and profile performance, extract and format documentation from the source code and facilitate manual QA processes. On the popular Travis CI service for open-source, only 58.64% of CI jobs execute tests.
This continuous application of quality control aims to improve the quality of software and reduce the time taken to deliver it, by replacing the traditional practice of applying quality control after completing all development. This is very similar to the original idea of integrating more frequently to make integration easier, only applied to QA processes.
Now, CI is often intertwined with continuous delivery or continuous deployment in what is called CI/CD pipeline. "Continuous delivery" makes sure the software checked in on the mainline is always in a state that can be deployed to users and "continuous deployment" makes the deployment process fully automated.
The earliest known work on continuous integration was the Infuse environment developed by G. E. Kaiser, D. E. Perry, and W. M. Schell.
In 1994, Grady Booch used the phrase continuous integration in Object-Oriented Analysis and Design with Applications (2nd edition) to explain how, when developing using micro processes, "internal releases represent a sort of continuous integration of the system, and exist to force closure of the micro process".
In 1997, Kent Beck and Ron Jeffries invented Extreme Programming (XP) while on the Chrysler Comprehensive Compensation System project, including continuous integration.[self-published source] Beck published about continuous integration in 1998, emphasising the importance of face-to-face communication over technological support. In 1999, Beck elaborated more in his first full book on Extreme Programming. CruiseControl, one of the first open-source CI tools,[self-published source] was released in 2001.
In 2010, Timothy Fitz published an article detailing how IMVU's engineering team had built and been using the first practical CI system. While his post was originally met with scepticism, it quickly caught on and found widespread adoption as part of the Lean software development methodology, also based on IMVU.
This section lists best practices suggested by various authors on how to achieve continuous integration, and how to automate this practice. Build automation is a best practice itself.
Continuous integration—the practice of frequently integrating one's new or changed code with the existing code repository —should occur frequently enough that no intervening window remains between commit and build, and such that no errors can arise without developers noticing them and correcting them immediately. Normal practice is to trigger these builds by every commit to a repository, rather than a periodically scheduled build. The practicalities of doing this in a multi-developer environment of rapid commits are such that it is usual to trigger a short time after each commit, then to start a build when either this timer expires or after a rather longer interval since the last build. Note that since each new commit resets the timer used for the short time trigger, this is the same technique used in many button debouncing algorithms. In this way, the commit events are "debounced" to prevent unnecessary builds between a series of rapid-fire commits. Many automated tools offer this scheduling automatically.
Another factor is the need for a version control system that supports atomic commits; i.e., all of a developer's changes may be seen as a single commit operation. There is no point in trying to build from only half of the changed files.
To achieve these objectives, continuous integration relies on the following principles.
Main article: Version control
This practice advocates the use of a revision control system for the project's source code. All artefacts required to build the project should be placed in the repository. In this practice and the revision control community, the convention is that the system should be buildable from a fresh checkout and not require additional dependencies. Extreme Programming advocate Martin Fowler also mentions that where branching is supported by tools, its use should be minimised. Instead, it is preferred for changes to be integrated rather than for multiple versions of the software to be maintained simultaneously. The mainline (or trunk) should be the place for the working version of the software.
Main article: Build automation
A single command should have the capability of building the system. Many build tools, such as make, have existed for many years. Other more recent tools are frequently used in continuous integration environments. Automation of the build should include automating the integration, which often includes deployment into a production-like environment. In many cases, the build script not only compiles binaries but also generates documentation, website pages, statistics and distribution media (such as Debian DEB, Red Hat RPM or Windows MSI files).
Once the code is built, all tests should run to confirm that it behaves as the developers expect it to behave.
By committing regularly, every committer can reduce the number of conflicting changes. Checking in a week's worth of work runs the risk of conflicting with other features and can be very difficult to resolve. Early, small conflicts in an area of the system cause team members to communicate about the change they are making. Committing all changes at least once a day (once per feature built) is generally considered part of the definition of Continuous Integration. In addition, performing a nightly build is generally recommended. These are lower bounds; the typical frequency is expected to be much higher.
The system should build commits to the current working version to verify that they integrate correctly. A common practice is to use Automated Continuous Integration, although this may be done manually. Automated Continuous Integration employs a continuous integration server or daemon to monitor the revision control system for changes, then automatically run the build process.
When fixing a bug, it is a good practice to push a test case that reproduces the bug. This avoids the fix to be reverted, and the bug to reappear, which is known as a regression. Researchers have proposed to automate this task: if a bug-fix commit does not contain a test case, it can be generated from the already existing tests.
The build needs to complete rapidly so that if there is a problem with integration, it is quickly identified.
Main article: Test environment
Having a test environment can lead to failures in tested systems when they deploy in the production environment because the production environment may differ from the test environment in a significant way. However, building a replica of a production environment is cost-prohibitive. Instead, the test environment or a separate pre-production environment ("staging") should be built to be a scalable version of the production environment to alleviate costs while maintaining technology stack composition and nuances. Within these test environments, service virtualisation is commonly used to obtain on-demand access to dependencies (e.g., APIs, third-party applications, services, mainframes, etc.) that are beyond the team's control, still evolving, or too complex to configure in a virtual test lab.
Making builds readily available to stakeholders and testers can reduce the amount of rework necessary when rebuilding a feature that doesn't meet requirements. Additionally, early testing reduces the chances that defects survive until deployment. Finding errors earlier can reduce the amount of work necessary to resolve them.
All programmers should start the day by updating the project from the repository. That way, they will all stay up to date.
It should be easy to find out whether the build breaks and, if so, who made the relevant change and what that change was.
Most CI systems allow the running of scripts after a build finishes. In most situations, it is possible to write a script to deploy the application to a live test server that everyone can look at. A further advance in this way of thinking is continuous deployment, which calls for the software to be deployed directly into production, often with additional automation to prevent defects or regressions.
Continuous integration is intended to produce benefits such as:
With continuous automated testing, benefits can include:
Some downsides of continuous integration can include: