Many companies work with Kubernetes, also known as k8s, though many of them have horror stories as well. Some tried it and concluded that they could not achieve their business outcomes with this technology.
At Fliit, we have been using k8s for more than half a year. The journey was not comfortable, but the experience taught us a lot.
Until the beginning of last year, Fliit ran most of its production load on AWS EC2 instances. The development environment ran locally in containers via docker-compose. To get from local to production, we used Capistrano to deploy to the EC2 machines.
We could have dropped containers and kept the good old Linux machines, but we discarded this option for the following reasons:
- Our primary programming language is Ruby, but we also use Go, Node and Python, which makes maintaining a deploy pipeline for each of them hard.
- We would very often hit out-of-memory (OOM) errors while deploying.
- It was hard to scale, and we had no high availability, monitoring or centralized logs. All of this comes practically out of the box with Kubernetes.
The list can go on, but in general, the industry is moving toward containers for reasons like these. Our local environment already ran on docker-compose because we built in container awareness from day zero of the company. It was only a matter of time before we moved to container orchestration of some sort.
Why not alternatives to Kubernetes?
We were deciding between AWS ECS, Nomad and EKS.
ECS sounded promising because of our small production infrastructure: it is well integrated into the AWS ecosystem, and it is simple. The disadvantage of ECS is that it is too well integrated into AWS. It vendor-locks you, all ECS management happens in the AWS web console, and logging and performance monitoring go through CloudWatch.
Nomad was very promising, and it is very well designed. We love declarative infrastructure as code and the HashiCorp Configuration Language (HCL), and we use it whenever possible. But there were some drawbacks too. For example, it lacked service discovery, for which we would have had to install Consul. For secrets or any key-value store, we would have had to install Vault or etcd.
EKS was not our first choice, but it was our final one. The Kubernetes control plane is managed by AWS, which takes a lot of heavy lifting off our shoulders. EKS runs upstream Kubernetes, a first-class citizen of the Cloud Native Computing Foundation (CNCF), which means that all features come from upstream and vendor lock-in is avoided. The k8s ecosystem is enormous, with a lot of tools and support, and the Kubernetes Slack is VERY useful in times of need.
CI/CD with containers
We separated CI and CD. We use CircleCI for CI (things like unit tests and Cypress) and AWS CodeBuild for CD into k8s.
The deployment pipeline works like this: when you commit and push to master/staging, GitHub triggers two jobs:
1) CircleCI runs the unit tests (CI)
2) AWS CodeBuild runs the deploy (CD)
AWS CodeBuild then builds an image, pushes it to ECR (the AWS Docker registry), updates the image for k8s and notifies us.
There are many CI/CD tools for k8s. We decided to use a custom Python script because we have a special deploy pipeline with migrations and seed data.
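As an illustration, a minimal sketch of what such a deploy script could look like is below. The registry address, service names and the exact commands are hypothetical, not our actual script, which also handles migrations and seed data:

```python
# Hypothetical sketch of a container deploy script: build an image,
# push it to ECR, then point the k8s Deployment at the new tag.
import subprocess

# Placeholder registry address, not a real account.
REGISTRY = "123456789012.dkr.ecr.eu-central-1.amazonaws.com"

def image_ref(service: str, git_sha: str) -> str:
    """Tag images with the commit SHA so every deploy is traceable."""
    return f"{REGISTRY}/{service}:{git_sha}"

def build_and_push_cmds(service: str, git_sha: str) -> list:
    ref = image_ref(service, git_sha)
    return [
        ["docker", "build", "-t", ref, "."],
        ["docker", "push", ref],
    ]

def rollout_cmd(service: str, git_sha: str) -> list:
    # `kubectl set image` triggers a rolling update of the Deployment.
    return ["kubectl", "set", "image", f"deployment/{service}",
            f"{service}={image_ref(service, git_sha)}"]

def deploy(service: str, git_sha: str, run=subprocess.run) -> None:
    # A real pipeline would run migrations/seeds between push and rollout.
    for cmd in build_and_push_cmds(service, git_sha) + [rollout_cmd(service, git_sha)]:
        run(cmd, check=True)
```

Keeping the commands as plain argument lists makes the script easy to test without touching Docker or the cluster.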
We use kubesec to deploy secrets. With kubesec we can edit a password and auto-encrypt it with AWS KMS. This comes in handy when spreading our secrets across microservices. While deploying, we update the secrets on k8s. And because we use AWS KMS, we can control which team manages which secrets. Moreover, we commit the encrypted values to Git without worrying that they will leak somewhere.
We use kustomize for generating our k8s configuration files.
An example of one of our services’ tree content is depicted below:
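A sketch of what such a tree might look like (directory and file names are illustrative, not our actual layout) with a kustomize base plus per-environment overlays:

```
service/
├── base/
│   ├── kustomization.yaml
│   ├── deployment.yaml
│   ├── service.yaml
│   └── job-migrate.yaml
└── overlays/
    ├── staging/
    │   ├── kustomization.yaml
    │   └── secrets.enc.yaml
    └── production/
        ├── kustomization.yaml
        └── secrets.enc.yaml
```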
In this example, we have many services, deployments, jobs and secrets that are generated on each deploy. We did consider Helm, but we concluded that while it is very good when you use community charts, it can be cumbersome for those who are new to k8s. Our goal is to teach every developer how production and k8s work.
For provisioning EKS, we use Terraform with the almighty aws-eks module. With the help of that module, it was straightforward to set our QA/staging environments to run on AWS spot instances. This saves us up to 70% in costs.
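A rough sketch of that setup in HCL, assuming the community terraform-aws-modules/eks module (cluster name and values are illustrative, and worker-group attribute names vary between module versions):

```hcl
module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "qa-cluster"   # illustrative name
  subnets      = var.subnets
  vpc_id       = var.vpc_id

  worker_groups = [
    {
      instance_type = "m5.large"
      asg_max_size  = 5
      # Bidding on spot capacity is what yields the ~70% saving for QA/staging.
      spot_price    = "0.05"
    }
  ]
}
```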
Migration from EC2
This is where most of our magic happened: we had zero production downtime while migrating to k8s.
The first step was to run the k8s cluster alongside the old infrastructure. We used round-robin DNS between the old and new infrastructure. After verifying that the new infrastructure was working well, we flipped the switch to Kubernetes.
“Kubernetes is a platform for building platforms. It’s a better place to start; not the endgame” — Kelsey Hightower
K8s is a complex tool for managing containers! But it solves a lot of the problems you would have with old-style infrastructure. The learning curve was very tricky, but once you understand it, you realize that if you built your own orchestrator, it would probably end up very similar to k8s. Scalability and reliability come out of the box, which is the point!