What Ops can learn from Dev

Will Thames

20 April 2016

Overview

About this talk

Why?

  • Developers typically have sophisticated practices for writing and maintaining large codebases.
  • Operations typically aren't as well versed in these practices.
  • More people can use and contribute code when it's easy to access and easy to improve.

About me

  • Systems Engineer at Red Hat, Brisbane.
  • Previously at Suncorp, Brisbane and Betfair, London.
  • Contributor to Ansible.
  • But this talk is intended to be product agnostic.

Writing solid code

Higher level languages

  • Why use configuration management? Surely bash scripts in a for loop over ssh will suffice?
  • Why use Python or Ruby? Surely assembly or C will suffice?

Higher level automation

  • Abstraction of patterns to higher layers
  • Repeatability
  • Error handling
  • Reduction of boilerplate code
  • Templating
  • API calls

Abstraction

  • As with functions, modules, libraries and packages, wrap up common operations into reusable code. This might be a module for installing and configuring Java, or for deploying a particular application type.

  • Chef has cookbooks, Ansible has roles and Puppet has modules for grouping a bunch of operations.
  • Ansible has modules, and Chef and Puppet have providers, for creating new operations.

Repeatability

  • What happens if you run your code twice?
  • What happens if the second time is six months from now?
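
One way to picture the "run it twice" question is an idempotent operation: it describes the desired state and only acts when that state is missing. The sketch below is illustrative only; `ensure_line` and the file name are hypothetical and not taken from any particular tool.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file was changed, False if it was already correct.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if line in content.splitlines():
        return False  # desired state already holds, so do nothing
    # Avoid gluing the new line onto an unterminated last line.
    separator = "" if content == "" or content.endswith("\n") else "\n"
    with open(path, "a") as handle:
        handle.write(separator + line + "\n")
    return True


if __name__ == "__main__":
    # Hypothetical target file in a throwaway directory.
    target = os.path.join(tempfile.mkdtemp(), "sysctl.conf")
    print(ensure_line(target, "vm.swappiness = 10"))  # first run changes the file
    print(ensure_line(target, "vm.swappiness = 10"))  # second run is a no-op
```

A naive `echo ... >> file` in a shell loop would append the line on every run; reporting "changed" versus "unchanged" is what makes a second run safe.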

Versions

  • Give your dependencies version identifiers.
  • Specify the version of each dependency in a suitable place.
  • Also specify versions when pulling things from yum, apt-get, git, mercurial etc.
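
A small check can flag dependencies that are not pinned. This sketch assumes pip-style requirement lines, where `name==version` pins an exact version; the function name and the package versions shown are made up for illustration.

```python
def unpinned(requirements: list[str]) -> list[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    loose = []
    for raw in requirements:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and "==" not in line:
            loose.append(line)
    return loose


if __name__ == "__main__":
    # Illustrative entries only; the versions are not recommendations.
    example = ["ansible==2.0.1.0", "jinja2", "pyyaml>=3.11  # too loose"]
    print(unpinned(example))
```

The same idea applies to package managers and version control checkouts: name a specific version, tag or commit rather than "latest".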

Sharing code

Version control

  • Have some. Which one is relatively unimportant.
  • Find out when something was changed, and by whom.
  • See what changed, and hopefully why (needs good commit messages!)
  • Go back in time — revert changes, compare differences.

Code separate from data

  • Hardcode as little as possible in your templates and task files (beware premature templating though!)
  • This makes the code easier to maintain, and allows you to source configuration from alternative data sources.
  • Using the same tools across all environments reduces likelihood of error.
  • Try and make it so that your code could be shared with the world without giving anything away.
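
As a rough illustration of keeping code separate from data, the template below contains no environment-specific values; everything that varies is passed in. It uses Python's standard `string.Template`; the template text, the `render_vhost` name and the hostnames are hypothetical.

```python
from string import Template

# The "code": a template with nothing hardcoded about any environment.
VHOST_TEMPLATE = Template("server_name $hostname;\nlisten $port;\n")


def render_vhost(data: dict) -> str:
    """Render the template from a data mapping; a missing key raises KeyError."""
    return VHOST_TEMPLATE.substitute(data)


if __name__ == "__main__":
    # The "data": it could equally come from a file, a database or an API.
    print(render_vhost({"hostname": "dev.example.com", "port": 8080}))
    print(render_vhost({"hostname": "www.example.com", "port": 80}))
```

Because the template holds no real hostnames or credentials, it could be published without giving anything away.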

Data Inheritance

  • Only write as much configuration as you need.
  • Some variables will be common to all applications across a particular environment.
  • Some variables will be common to all environments for a particular application.
  • Use Ansible groups, Chef roles and Puppet's profiles to manage the inheritance hierarchy.
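
The inheritance idea can be sketched as layers of variables merged in order of precedence. This is a simplified model, not how any particular tool resolves variables; the layer names, keys and the ordering (common, then environment, then application) are assumptions for illustration.

```python
def resolve(*layers: dict) -> dict:
    """Merge variable layers; later layers override earlier ones (shallow merge)."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical data: each layer only states what differs at that level.
COMMON = {"ntp_server": "ntp.example.com", "log_level": "info"}
ENVIRONMENT = {"dev": {"log_level": "debug"}, "prod": {}}
APPLICATION = {"webapp": {"http_port": 8080}}


if __name__ == "__main__":
    print(resolve(COMMON, ENVIRONMENT["dev"], APPLICATION["webapp"]))
    print(resolve(COMMON, ENVIRONMENT["prod"], APPLICATION["webapp"]))
```

Each value is written once, at the most general level where it is true, which keeps the per-application and per-environment files short.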

Data Inheritance

webapp inheritance graph

Secrets

  • You will need a way to manage secrets. There are many options.
  • ansible-vault, Chef encrypted data bags, eyaml.
  • Hashicorp's Vault, Keywhiz by Square.

Community

  • Separation of code and data (particularly secret data) allows you to share your work with others outside of your organisation.
  • If you are able to share your code, you can include the contributions of others, or set your code free so that others can make improvements which you can then benefit from. Opening the source is the start of the journey.

Community

  • You can also benefit from work others have done — look for modules that others have written before writing your own. They may not be perfect, but they are a start.
  • See Ansible Galaxy, Puppet Forge, Chef Supermarket.

Code quality

Ops hate to hear

Works on my machine

Ops hate to hear

We should have tested that

And if Ops hate to hear them

They really hate to be the ones saying them.

Quality control

  • Use the tools.
  • ansible-lint, puppet-lint, Chef foodcritic.
  • pep8, go fmt etc.
  • dry run mode, diff mode
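
Dry run and diff modes can be pictured as "compute the change, show it, and only write if asked". The sketch below uses Python's standard `difflib`; `apply_file` and its `dry_run` flag are hypothetical names, not the interface of any configuration management tool.

```python
import difflib
import os
import tempfile


def apply_file(path, desired, dry_run=False):
    """Bring the file at `path` to `desired` content and return a unified diff.

    With dry_run=True the diff is computed but nothing is written.
    An empty string means there is nothing to change.
    """
    current = ""
    if os.path.exists(path):
        with open(path) as handle:
            current = handle.read()
    diff = "".join(
        difflib.unified_diff(
            current.splitlines(keepends=True),
            desired.splitlines(keepends=True),
            fromfile=path + " (current)",
            tofile=path + " (desired)",
        )
    )
    if diff and not dry_run:
        with open(path, "w") as handle:
            handle.write(desired)
    return diff


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "motd")  # hypothetical file
    print(apply_file(target, "Welcome\n", dry_run=True))  # shows the diff only
    print(os.path.exists(target))  # the dry run wrote nothing
```

Reviewing the diff before the real run is a cheap way to catch surprises that static checks miss.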

Standards Documentation

  • Best practices are advice on things to consider. Call them guidelines if you prefer.
  • Standards should be testable, preferably automated.
  • We manage our standards and best practices as a git repo using pull requests to achieve consensus.
  • Any changes/additions to best practices and standards must achieve a body of support.
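
A testable standard is one that a program can check. As a hedged example, suppose a team standard says "every task must have a name" (an invented rule for illustration); with tasks already parsed into dictionaries, the check is a few lines.

```python
def unnamed_tasks(tasks: list[dict]) -> list[int]:
    """Return the positions of tasks with a missing or blank 'name'."""
    return [
        index
        for index, task in enumerate(tasks)
        if not str(task.get("name", "")).strip()
    ]


if __name__ == "__main__":
    # Hypothetical task list, as it might look after parsing a task file.
    tasks = [
        {"name": "install java", "yum": {"name": "java"}},
        {"command": "uptime"},
    ]
    print(unnamed_tasks(tasks))
```

A check like this can run on every pull request, so a review discussion points at the standard rather than at personal preference.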

Code reviews

  • All code reviews should be objective. If you're objecting to a style issue, you should be able to point to documentation (internal, or an existing style guide for the language/framework).
  • Have a policy on what level of consensus is required to accept code into the mainline codebase.
  • This will typically be a risk management tradeoff.

Testing

  • Practices such as unit testing and integration testing are currently difficult to achieve for infrastructure code.
  • Which leaves end-to-end testing in production-like environments.
  • Virtual machines: RHEV, VMware, VirtualBox etc.
  • PaaS: Heroku, OpenShift etc.
  • Containers.
  • Public cloud (AWS, Azure, GCE etc) and private cloud (OpenStack)
  • Anything that isn't "your machine"

Continuous Integration

  • Commit
  • Checkout
  • Static analysis
  • Automated provisioning
  • Apply configuration
  • Run test suite (e.g. serverspec)
  • Deploy to production
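
The stages above form a chain in which each step only runs if the previous one succeeded. A minimal sketch of that control flow, with stand-in stage functions (real stages would call a linter, a provisioning tool, a test suite and so on; all names here are hypothetical):

```python
from typing import Callable


def run_pipeline(stages: list[tuple[str, Callable[[], bool]]]) -> list[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            break  # a failed stage blocks everything after it
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Stand-in stages: static analysis fails, so nothing is deployed.
    stages = [
        ("checkout", lambda: True),
        ("static analysis", lambda: False),
        ("apply configuration", lambda: True),
        ("deploy to production", lambda: True),
    ]
    print(run_pipeline(stages))
```

The value is in the ordering: cheap checks run first, and production is only reached when every earlier stage has passed.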

Disclosure

  • I've yet to see the full implementation of the previous slide in practice.
  • Focus on the things that are most likely to eliminate unnecessary errors or effort.

Thanks for listening!

Questions?