Shipping containers: Lessons learnt from adopting Docker
tl;dr: We started dealing with Docker in Q3/2016, pursuing the goal of adopting it for an existing system infrastructure. By now we have several production applications deployed and running as Docker containers. It’s a fairly good approach that helps you get standards into your environment and makes life a little easier. It has a learning curve though, and a load of advanced features that might get you off track pretty quickly because you’re tempted to deal with them even though you don’t need them (yet). Don’t try to build a full-blown container deployment pipeline in one step; rather, try to transform your environment in small incremental steps. Make sure you feel safe with each increment and gain experience with everything involved, in order to avoid losing control of your systems after a big change. Read on for more…
What about it?
Containers are everywhere, and so is Docker, which seems to be the most widespread of these technologies to date. We started looking into this only a year ago, for various reasons. Things always change. Our team structure changed, especially regarding development and operations. The number of technical components increased, and so did the set of different frameworks and stacks involved. Deployment, on the other hand, was supposed to be made easier, not more difficult. Strategic approaches changed, too, moving towards an industry-group-based IT infrastructure with central hosting and teams shipping dedicated services in a common way. All along with this, Docker appeared on our radar and remained there.
By now, we do have a bunch of application modules (to avoid the term “microservices” for a moment) built, shipped and running in a Docker environment. It works well; it surely eased a couple of things while introducing interesting new questions to solve. We’re still far from a fully automated continuous delivery environment where each and every piece of code runs into production automatically. Every tool and every technology-backed solution comes with a learning curve, and handling containers is no different. However, we have already gained a couple of insights, insights I consider worth noting…
Know your problems
Though this sounds blatantly obvious, you should very much focus on figuring out what problems you actually have in your environment and how containers, or more specifically Docker, could help solve them. The simple reason for stating this: There’s a load of vendors, solution providers and consultants trying to figure out what needs their customers might have, and there’s a load of tooling providers trying to sell their idea of what a good container-based environment looks like. In the same way, there are wagonloads of case studies, whitepapers, “best practises” and the like outlining what you really need to do to be successful shipping software in containers.
Most of them aren’t wrong at all. But in the end, it is up to you to decide whether or not their solutions actually address your challenges. If your environment works perfectly well – fine. Don’t fall for a technology like Docker simply because it could make things “even better”. If it ain’t broke, don’t fix it.
That said, our problem scope was rather clear: We want(ed) to straighten out deployment procedures a bit. Having moved from a Java EE application server to multiple standalone JVM applications with embedded HTTP servers a while ago, at some point we had exchanged the complexity of a full-blown, heavy server component for the complexity of maintaining and running multiple standalone applications. It got better, but of course it left a couple of old and new issues unsolved. Docker, on first and second glance, looked fairly well suited to solve at least some of these issues. This is why we started introducing Docker for deployment and operation first, explicitly focusing on keeping everything “as is” for developers. This is a sufficiently complex task in itself; including developers early on would have made it blow up.
Set your boxes
One of the best things Docker can do for existing IT applications is to force you to re-think applications and dependencies. In real-world environments, there’s always a chance of interactions and dependencies between components and systems that aren’t obvious or documented, simply because no one ever thought of them as dependencies. Think about simple things such as your application depending upon some Linux binary that is “just there in /usr/bin” and invoked from within your application: Everything’s fine as long as your application runs on the host it was initially deployed to. Will this still work after moving to a different host with different packages in different versions installed in the base operating system distribution? Or your application uses a mounted network filesystem it expects in /media which, after moving to a new environment, is now mounted in /mnt? Maybe these things are even subject to configuration in your application – are you still sure this will be obvious in a relevant situation? Are you sure you will be able to gracefully handle additional dependencies being introduced at some point in time?
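A Dockerfile is a natural place to make such hidden dependencies explicit. As a minimal sketch (base image choice, package name and paths are hypothetical, picked for illustration), pinning the “just there in /usr/bin” binary and the expected mount point could look like this:

```dockerfile
# Pin the base OS so the binaries in /usr/bin are versioned, not accidental
FROM ubuntu:16.04

# The external binary the application shells out to becomes an explicit dependency
RUN apt-get update \
    && apt-get install -y --no-install-recommends imagemagick \
    && rm -rf /var/lib/apt/lists/*

# The network filesystem the application expects is declared as a mount point
VOLUME /media/shared-data

COPY app.jar /opt/app/app.jar
CMD ["java", "-jar", "/opt/app/app.jar"]
```

Even if the image is never shipped anywhere, writing this file down documents what the application silently assumed about its host so far.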
Here, dealing with Docker has greatly helped raise awareness of these issues by providing that “container” idea. It forces you to think of your applications even more as closed boxes, and of interactions between applications as links between these boxes that need to be made explicit in order to work at all. Docker is not really needed for that, but it helps getting there, both by technologically enforcing this container approach and by providing ways to describe dependencies, communications between applications, and so on: The base image used for your container describes which operating system libs and binaries will be around. Volumes mounted into and ports exposed by your container describe some of its interface. Using something like docker-compose, you’ll be able to describe dependencies between containers in a more strict manner, too (in which order they have to be started to work, how they are linked, …).
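As a hypothetical sketch of that last point (service and image names are made up), a docker-compose file for a JVM application module depending on a database could look like this:

```yaml
version: '2'
services:
  app:
    image: local/app-module:1.0          # hypothetical image name
    ports:
      - "8080:8080"                      # explicit part of the container's interface
    volumes:
      - /mnt/shared-data:/media/shared-data:ro
    depends_on:
      - db                               # start order made explicit
  db:
    image: postgres:9.5
    environment:
      POSTGRES_DB: appdata
```

The file itself is a piece of dependency documentation: anyone can read off which ports, volumes and neighbouring services the application needs without spelunking through host configuration.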
Again: This is nothing Docker is really needed for, but it’s a point where it definitely helps get done some kind of work IT organizations will have to do anyway.
Start safe and fast
Pretty often these days, people are talking about “infrastructure as code”. If so (and I think this really is a good idea in some ways), we should also apply the same rules we want to apply to software development, which should be agile in general: Focus on providing visible value in short iterations, try to improve incrementally, try to start somewhere, learn from real-world use cases in production environments, improve. Translating this approach to Docker-based deployment, the recommendation would be: Make sure you get from mere theory to a working environment quickly. Pick an application candidate, learn how to build an image from it, learn how to run this image in a container. Get this to run in production and make sure everyone on your (Dev)Ops team knows how to handle it, even when you’re off on vacation.
Why? Well, with a full-blown environment featuring, say, Docker, dedicated operating systems such as CoreOS, orchestration tools like Kubernetes, and maybe configuration services such as etcd or local or remote Docker registries, you very quickly end up with too many new things to introduce to your environment at once. There are loads of changes that are very likely to break existing workflows for relevant stakeholders in your IT organization. Your operations people need to learn to manage and get used to all this, and while they do so, your system stability is at risk.
Don’t do that. Try to find a minimum meaningful increment, and get this out and running as fast as possible. In our case, this was all about extending our current deployment procedure based upon common tools such as…
- ssh/scp for copying artifacts between various servers,
- bash scripts for starting / stopping / updating binaries on testing and production environments,
- zip artifacts to ship binaries, start/stop scripts and minimum configuration files,
- stock Ubuntu/Debian servers for running the components, and
- a local gitlab server and gitlab-ci for building “regular” Java / maven artifacts and Docker images all alike
to work with the Docker CLI and Docker images stored as files. Initially, we even run without a local or central Docker registry and just use docker save and docker load for saving images to files and loading them again on the Docker servers. The Docker containers themselves run on the same stock Ubuntu servers as before, side by side with existing applications. Docker containers are still stopped and restarted using bash scripts.
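As a rough sketch, this registry-less workflow boils down to a handful of commands. Host, user and image names below are hypothetical, and a running Docker daemon on both the build and the production server is assumed:

```shell
# On the build server: write the freshly built image to a tarball
docker save -o app-module-1.0.tar local/app-module:1.0

# Ship it the same way as any other artifact
scp app-module-1.0.tar ops@production-host:/tmp/

# On the production server: load the image, then restart the container
ssh ops@production-host "docker load -i /tmp/app-module-1.0.tar"
ssh ops@production-host "docker stop app-module; docker rm app-module; \
  docker run -d --name app-module -p 8080:8080 local/app-module:1.0"
```

Since old image versions remain available as tarballs (and as tagged images on the host), rolling back is just a matter of running the previous tag again.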
It can’t possibly get any more bare-boned than this, but it works. So far, it doesn’t change anything for developers. It changes little for the operations guys. But it already provides a few benefits, including looking into containers, log files and the like using the web-based Portainer UI, or being able to roll back to old versions of production code on live servers way more easily. That’s enough benefit for operations folks to justify dealing with this, and yet the change introduced this way is acceptable.
If you made it here, the remainder will possibly be rather trivial – essentially: re-iterate. Find the thing bugging you the most about the status quo you just reached, figure out how to improve, and actually improve in another small step. Maybe giving your IT teams some time to get acquainted with the new structure and to collect feedback on what the next step should be is a good idea, even if you have a clear agenda of where you want to be heading. In our case, a few next steps that seem desirable and needed are …
- … using or adopting an internal Docker registry for better image handling,
- … dealing with configuration and container start / stop scripts in a smarter manner,
- … figuring out a better way to deal with and deliver large binary data files than having to mount CIFS shares on all servers.
We do have a couple of ideas for each of those, but neither priorities nor actual solutions are finalized yet. Let’s see where this all is heading; maybe I’ll recap this in the near future if there’s something worth noting. Stay tuned.
There’s possibly a load more to write about this, but at some point someone would be required to read all of it, so I’ll keep it at that and get into details if anyone’s interested. Conclusions so far, however: Spending quite some office as well as spare time on Docker and containers has paid off, even though there’s a load of things that can still be improved. Personally, looking at these things, I found great insight and entertainment in listening to podcasts and reading books and articles on DevOps and agile software development at the same time. I found that most developers (well, including myself at times) are likely to push for great, feature-laden new releases or new technologies or new frameworks and stacks with all the bells, whistles and kitchen sink possible these days. At the same time, I perfectly understand each and every operations guy who is supposed to adhere to a way more conservative point of view of keeping any changes to a production environment as small as possible (or, best of all, avoiding them altogether), knowing that change is the enemy of stability. Agile development and DevOps culture seem a good way to resolve this “conflict” – even though not an easy way, for several reasons. But that’s a different thing…