Our team implemented a Continuous Delivery (CD) solution that automates the installation of configurable, secure Cloudera Hadoop environments on cloud platforms (Amazon, OpenStack), ships an application into the newly created BigData environment, runs the required tests (unit, integration, functional, performance, etc.), and delivers the application to production.
With Continuous Delivery, our team now delivers software to production with a shorter cycle time between idea and realization by automating the entire delivery system: build, deployment, test, and release.
Additionally, building a Hadoop cluster from scratch with a flexible configuration topology now takes less time: a developer can have a fully functional work environment in a matter of minutes instead of hours.
The IntroPro DevOps team has unique experience in building and coordinating the CD process for BigData projects. This project expanded our experience with configuration management and orchestration tools for cloud platforms.
Our CD Pipeline Implementation:
1) A developer commits code to the code repository (Git)
2) Jenkins notices the commit and starts building the application
- merges (using Git Flow), compiles the code and builds the package
- runs unit tests and reports the results (JUnit)
- runs static code analysis to report code quality (SonarQube)
- provides one-click, fully automated deployment
3) Nexus artifact manager stores the application binaries for each build
- this makes it possible to build once and deploy in all environments for a given build number
- rollback is easy, because the previous version can simply be fetched again
4) Jenkins orchestrates the rest of the process, calling different actors to do different jobs:
- Work environments are created using Vagrant and Docker containers (Docker lets us keep a base container image set up with the required dependencies, such as orchestration tools and predefined libraries, which leads to a much faster feedback cycle)
- Ansible (a configuration management and orchestration tool) installs all required software inside the Docker containers and configures the properties files, which allows us to run the full install on a fresh instance for every commit
- Puppet deploys the application
5) Once the application is ready, integration, functional and performance tests are executed (using Selenium, CA LISA and JMeter); only builds that pass are delivered.
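The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual Jenkins configuration: `ArtifactStore` is a hypothetical in-memory stand-in for Nexus, `run_pipeline` is a hypothetical stand-in for the orchestration done by Jenkins, and the stage names and package names are made up. It shows two ideas from the steps: the same stored binary is reused for every stage of a given build number, and the pipeline stops at the first failing stage so that only successful builds are delivered.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ArtifactStore:
    """Hypothetical stand-in for Nexus: binaries keyed by build number."""
    artifacts: Dict[int, str] = field(default_factory=dict)

    def publish(self, build_number: int, package: str) -> None:
        # Build once: each build number maps to exactly one stored package.
        self.artifacts[build_number] = package

    def fetch(self, build_number: int) -> str:
        # Deploy everywhere: every environment fetches the same package.
        return self.artifacts[build_number]

    def previous(self, build_number: int) -> Optional[int]:
        """Return the newest stored build older than build_number, if any."""
        older = [n for n in self.artifacts if n < build_number]
        return max(older) if older else None


# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(package: str, stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order on one package; stop at the first failure."""
    executed: List[str] = []
    for name, action in stages:
        executed.append(name)
        if not action(package):
            return False, executed
    return True, executed


if __name__ == "__main__":
    store = ArtifactStore()
    store.publish(41, "app-41.jar")  # hypothetical package names
    store.publish(42, "app-42.jar")

    stages: List[Stage] = [
        ("create environment", lambda pkg: True),
        ("deploy application", lambda pkg: True),
        ("integration tests", lambda pkg: False),  # simulated failure
        ("performance tests", lambda pkg: True),
    ]
    ok, executed = run_pipeline(store.fetch(42), stages)
    print("delivered:", ok, "stages run:", executed)
    print("rollback candidate:", store.previous(42))
```

In this sketch the performance tests never run, because the simulated integration failure ends the pipeline, and build 41 is the candidate to roll back to.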
- Amazon AWS for the public cloud and OpenStack for the private cloud
- Docker - multiple application containers (Cloudera Manager and Hadoop cluster nodes) in one large instance
- Integration with MIT Kerberos and OpenLDAP
- Cloudera CDH5 - Hadoop
- Jenkins - Continuous Integration
- Vagrant with AWS and OpenStack provider plugins - environment creation
- Ansible - configuration management and orchestration
- Shell and Python scripting
- Interaction with Cloudera API
- Selenium and PhantomJS (headless WebKit) to automate UI-only Cloudera configurations
- Git - SCM
- Java - web-service
- Twitter Bootstrap - frontend
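As an illustration of the "Interaction with Cloudera API" item, the sketch below builds (but does not send) a request that lists the services of a cluster through the Cloudera Manager REST API. Everything specific here is an assumption rather than a detail of our setup: the helper name `services_request`, the host name, the credentials, the cluster name, the default port 7180, and the API version `v10` (the version to use depends on the installed Cloudera Manager release).

```python
import base64
from urllib.parse import quote
from urllib.request import Request


def services_request(host: str, cluster: str, user: str, password: str,
                     port: int = 7180, api_version: str = "v10") -> Request:
    """Build a GET request for the services of one cluster (not sent here)."""
    # Cluster names may contain spaces, so they are percent-encoded.
    url = "http://{}:{}/api/{}/clusters/{}/services".format(
        host, port, api_version, quote(cluster, safe="")
    )
    # HTTP Basic authentication header built from the given credentials.
    token = base64.b64encode(
        "{}:{}".format(user, password).encode("utf-8")
    ).decode("ascii")
    return Request(url, headers={"Authorization": "Basic " + token})


if __name__ == "__main__":
    # Hypothetical host, cluster name and credentials, for illustration only.
    req = services_request("cm.example.com", "Cluster 1", "admin", "admin")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the sketch stops short of that so it can be run without a Cloudera Manager instance.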
- Experience in configuration management and orchestration tools
- In-depth SDLC knowledge
- Experience in building CD for BigData projects
- Experience in adapting team roles to the CD process
Anton Zhuk, Project lead, Software Development Lifecycle Solutions