OpenShift 4.2 on Azure

OpenShift 4.2 on Azure Preview

NOTE: This blog originally appeared on the OpenShift Blog

In this blog we will be showing a video on how to get OpenShift 4 installed on Azure using the full stack automated method. This method differs from the pre-existing infrastructure method, as the full stack automation gets you from zero to a full OpenShift deployment, creating all the required infrastructure components automatically. Currently, installing OpenShift 4 on Azure is under tech preview. It won’t be supported until the GA release of OpenShift 4.2. This blog is meant for those who want to get a preview of what’s coming. Detailed instructions are below if you wish to follow along!


It’s important that you get familiar with the general prerequisites by looking at the official documentation for OpenShift. There you can find specific details about the requirements and installation details for either full-stack automated or for pre-existing infrastructure deployments. I have broken up the prerequisites into sections and have marked those that are optional.


You will need to have a DNS domain already controlled by Azure. The OpenShift installer will configure DNS resolution (internal and external) for the cluster. This can be done by buying a domain on Azure or delegating a domain (or subdomain) to Azure. In either case, make sure the domain is set ahead of time.

During the install, you will provide a $CLUSTERID. This ID will be used as part of the FQDN of the components created for your cluster. In other words, the ID will become part of your DNS name. For example, a $CLUSTERID of ocp4 will yield an OpenShift domain of ocp4.<basedomain> for your cluster, where <basedomain> is the domain you set up in Azure.

Choose wisely.
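To make the naming concrete, here is a minimal shell sketch of how the cluster ID and the base domain combine. The base domain example.com is a placeholder, not a value from this post, and the helper name cluster_domain is made up for illustration.

```shell
# Hypothetical helper: join a cluster ID and a base domain into the
# DNS name the cluster components will live under.
cluster_domain() {
    # $1 = cluster ID, $2 = base domain (both supplied by you)
    printf '%s.%s\n' "$1" "$2"
}

CLUSTERID=ocp4
BASEDOMAIN=example.com   # placeholder; use the domain you set up in Azure

# Print the cluster domain, then the API and wildcard application names.
cluster_domain "$CLUSTERID" "$BASEDOMAIN"
printf 'api.%s\n' "$(cluster_domain "$CLUSTERID" "$BASEDOMAIN")"
printf '*.apps.%s\n' "$(cluster_domain "$CLUSTERID" "$BASEDOMAIN")"
```

Because the ID ends up in every hostname, changing it later means reinstalling, which is why it is worth settling on it up front.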

Azure CLI Tools (Optional)

It’s useful to install the Azure az CLI client. Although you can do all of what you need for Azure from the web UI, it’s helpful to have the CLI tool installed for debugging or streamlining the setup process.

Once you’ve installed the Azure CLI, you will need to log in to set up the CLI for access. Be sure to visit the Getting Started page for more information. Once set up, verify that you have a connection to your account with the following:

az account show

The output should look something like this:

{
  "environmentName": "AzureCloud",
  "isDefault": true,
  "name": "Microsoft Azure Account",
  "state": "Enabled",
  "user": {
    "name": "",
    "type": "user"
  }
}

Again, you don’t need the Azure CLI tool, but it does help.

OpenShift CLI Tools

In order to install and interact with OpenShift, you will need to download some CLI tools. These can be found by going to and logging in with your Red Hat Customer Portal credentials. Click on Azure (note that it’s only Developer Preview currently). You will need to download the following:

You may need the "dev preview" binaries instead, as dev previews are always being updated. Always consult for details.


In this section I will go over installing the OpenShift 4.2 dev preview on Azure, assuming you have an Azure account and have completed all the prerequisites. I will be installing the following:

  • Installer sets up 3 Master nodes, 3 Worker nodes, and 1 bootstrap node.
  • I will be using domain as an example.
  • I will be using openshift4 as my clusterid.
  • I am doing the install from a Linux host.

Creating a Service Principal

A Service Principal needs to be created for the installer to use. A Service Principal can be thought of as a "robot" account for automation on Azure. More information about Service Principals can be found in the Microsoft Docs. To create a service principal, run the following command:

az ad sp create-for-rbac --name chernand-azure-video-sp

When successful, it should output the information about the service principal. Save this information somewhere as the installer will need it to do the install. The information should look something like this.

  "displayName": "chernand-azure-video-sp",
  "name": "http://chernand-azure-video-sp",
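Since the installer later asks for the appId, password, and tenant values from this output, it can help to save the output to a file and read the fields back when needed. The following is a rough sketch only: the file name sp.json and the helper sp_field are assumptions, and the sed expression is a naive line match rather than a real JSON parser.

```shell
# Hypothetical helper: read one top-level string field from the saved
# service principal JSON. Assumes each "key": "value" pair is on its
# own line, which is how the az CLI pretty-prints its output.
sp_field() {
    # $1 = file, $2 = field name (for example appId, password, tenant)
    sed -n 's/^[[:space:]]*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" | head -n 1
}

# Usage (sp.json is an assumed file name; keep it private, it holds a secret):
#   az ad sp create-for-rbac --name chernand-azure-video-sp > sp.json
#   chmod 600 sp.json
#   sp_field sp.json appId
```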

Next, you need to give the service principal the right roles in order to properly install OpenShift. The service principal needs to have at least Contributor and User Access Administrator roles assigned in your subscription.

az role assignment create --assignee \
ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ --role Contributor
az role assignment create --assignee \
ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ --role "User Access Administrator"

NOTE: The UUID passed to --assignee is the appId in the output when you created the service principal.

In order to properly mint credentials for components in the cluster, your service principal needs to request the following application permission before you can deploy OpenShift on Azure: Azure Active Directory Graph -> Application.ReadWrite.OwnedBy

You can request permissions using the Azure portal or the Azure CLI. (You can read more about Azure Active Directory Permissions at the Microsoft Azure website)

az ad app permission add --id ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ \
--api 00000002-0000-0000-c000-000000000000 \
--api-permissions 824c81eb-e3f8-4ee6-8f6d-de7f50d565b7=Role

NOTE: The Application.ReadWrite.OwnedBy permission is granted to the application only after it is provided an "Admin Consent" by the tenant administrator. If you are the tenant administrator, you can run the following to grant this permission.

az ad app permission grant --id ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ \
--api 00000002-0000-0000-c000-000000000000

You will also need your Subscription ID; you can get this by running the following.

az account list --output table

Installing OpenShift

It’s best to create a working directory when creating a cluster. This directory will hold all the install artifacts, including the initial kubeadmin account.

mkdir ~/ocp4

Run the openshift-install create install-config command specifying this working directory. This creates the initial install config (install-config.yaml) and stores it in that directory. You will need information about your service principal you created earlier.

$ openshift-install create install-config --dir=~/ocp4
? SSH Public Key /home/chernand/.ssh/
? Platform azure
? azure subscription id 12345678-1234-1234-1234-123456789012
? azure service principal client id ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ
? azure service principal client secret [? for help] ***********
INFO Saving user credentials to "/home/chernand/.azure/osServicePrincipal.json"
? Region centralus
? Base Domain
? Cluster Name openshift4
? Pull Secret [? for help] ****************************

Let’s go over the Azure specific options.

  • azure subscription id - This is your subscription id. This can be obtained by running: az account list --output table
  • azure tenant id - Your tenant id (this was in the output when you created your service principal)
  • azure service principal client id - This is the appId from the service principal creation output.
  • azure service principal client secret - This is the password from the service principal creation output.

The install-config.yaml file is in the ~/ocp4 working directory. It also creates a ~/.azure/osServicePrincipal.json file. Inspect these files if you wish.

cat ~/ocp4/install-config.yaml
cat ~/.azure/osServicePrincipal.json

After you’ve inspected these files, go ahead and install OpenShift.

openshift-install create cluster --dir=~/ocp4/

When the install is finished, you’ll see the following output.

INFO Consuming "Install Config" from target directory
INFO Creating infrastructure resources...         
INFO Waiting up to 30m0s for the Kubernetes API at
INFO API v1.14.0+8e63b6d up                       
INFO Waiting up to 30m0s for bootstrapping to complete...
INFO Destroying the bootstrap resources...        
INFO Waiting up to 30m0s for the cluster at to initialize...
INFO Waiting up to 10m0s for the openshift-console route to be created...
INFO Install complete!                            
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/chernand/ocp4/auth/kubeconfig'
INFO Access the OpenShift web-console here:
INFO Login to the console with user: kubeadmin, password: 5char-5char-5char-5char

Set the KUBECONFIG environment variable to connect to your cluster.

export KUBECONFIG=$HOME/ocp4/auth/kubeconfig

Verify that your cluster is up and running.

$ oc cluster-info
Kubernetes master is running at

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Post Install

After your cluster is deployed, you may want to do some additional configuration tasks such as:

It’s important to note that the kubeadmin user is meant to be a temporary admin user. You should replace this user with a more permanent admin user when you configure authentication.


In this blog we went over how to install OpenShift 4 on Azure using the full stack automated method. It’s important to note that this method is marked as developer preview, meaning it’s not supported by Red Hat. However, the installer is ready for you to deploy and test for non-production workloads. Please feel free to try it and provide feedback by leaving a comment below or reaching out via the Customer Portal Discussions page.

OpenShift 4.1 Bare Metal Install Quickstart

NOTE: This post originally appeared on the OpenShift Blog site.

In this blog we will go over how to get you quickly up and running with an OpenShift 4.1 Bare Metal install on pre-existing infrastructure. Although this quickstart focuses on the bare metal installer, this can also be seen as a "manual" way to install OpenShift 4.1. Moreover, this is also applicable to installing to any platform which doesn't have the ability to provide ignition pre-boot. For more information about using this generic approach to install on untested platforms, please see this knowledge base article.


OpenShift 4 introduces a new way of installing the platform that is automated, reliable, and repeatable. Based on the Kubernetes Cluster-API SIG, Red Hat has developed an OpenShift installer for full stack automated deployments. This means that the installer not only installs OpenShift, but also installs (and manages) the entire infrastructure, from DNS all the way down the stack to the VM. This provides a fully integrated system that can resize automatically with the needs of your workload. Currently, full stack automated deployment is supported on AWS.

Pre-existing infrastructure deployments are for when you have existing infrastructure that you would like to use to run OpenShift 4. Most are familiar with this method, as it was the default (and only) way to install OpenShift 3. Currently, guides for pre-existing infrastructure installs cover AWS, VMware vSphere, and bare metal, the latter being the "catch all", since you can use the bare metal method for non-tested platforms.

I will be going over installing OpenShift 4 Bare Metal, on a pre-existing infrastructure along with the prerequisites. However, as already stated, you can use this method for other infrastructure, for example VMs running on Red Hat Virtualization.


It's important that you get familiar with the prerequisites by reading the official documentation for OpenShift. There you can find more details about the prerequisites and what they entail. I have broken up the prerequisites into sections and have marked those that are optional.


Proper DNS setup is imperative for a functioning OpenShift cluster. DNS is used for name resolution (A records), certificate generation (PTR records), and service discovery (SRV records). Keep in mind that OpenShift 4 has a concept of a "clusterid" that will be incorporated into your cluster's DNS records. Your DNS records will all have <clusterid>.<basedomain> in them. In other words, your "clusterid" will end up being part of your FQDN. Read the official documentation for more information.

Forward DNS Records

Create forward DNS records for your bootstrap, master, and worker nodes. Also, you'll need to create entries for both api and api-int and point them to their respective load balancers (NOTE both of those entries can point to the same load balancer). You will also need to create a wildcard DNS entry pointing to the load balancer. This entry is used by the OpenShift router. Here is a sample using bind with ocp4 as the <clusterid>.

; The api and api-int can point to the IP of the same load balancer
api.ocp4            IN      A
api-int.ocp4        IN      A
; The wildcard points to the load balancer
*.apps.ocp4        IN      A
; Create entry for the bootstrap host
bootstrap.ocp4        IN      A
; Create entries for the master hosts
master0.ocp4        IN      A
master1.ocp4        IN      A
master2.ocp4        IN      A
; Create entries for the worker hosts
worker0.ocp4        IN      A
worker1.ocp4        IN      A

An example of a DNS zonefile with forward records can be found here.
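If you have more than a handful of hosts, the forward records can be generated rather than typed. The sketch below prints bind A records in the same layout as the sample above. The helper name a_record is made up, and the 192.0.2.x addresses are documentation placeholders that you would swap for your own.

```shell
# Hypothetical helper: print one bind A record, relative to the zone
# origin (your base domain).
# $1 = host label, $2 = cluster ID, $3 = IPv4 address.
a_record() {
    printf '%s.%s\tIN\tA\t%s\n' "$1" "$2" "$3"
}

CLUSTERID=ocp4
LB_IP=192.0.2.5          # placeholder load balancer address

# api, api-int, and the wildcard all point at the load balancer.
a_record api       "$CLUSTERID" "$LB_IP"
a_record api-int   "$CLUSTERID" "$LB_IP"
a_record '*.apps'  "$CLUSTERID" "$LB_IP"
# Node records point at the nodes themselves.
a_record bootstrap "$CLUSTERID" 192.0.2.20
a_record master0   "$CLUSTERID" 192.0.2.21
```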

Reverse DNS Records

Create reverse DNS records for your bootstrap, master, and worker nodes, as well as for api and api-int. The reverse records are important because that is how RHEL CoreOS sets the hostname for all the nodes. Furthermore, these PTR records are used in order to generate the various certificates OpenShift needs to operate. The following is an example using as the <basedomain> and using ocp4 as the <clusterid>. Again, this was done using bind.

; syntax is "last octet" and the host must have fqdn with trailing dot
97        IN      PTR
98          IN      PTR
99          IN      PTR
96          IN      PTR
5           IN      PTR
5           IN      PTR
11          IN      PTR
7           IN      PTR

An example of a DNS zonefile with reverse records can be found here.
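As a small illustration of the "last octet" syntax, the sketch below derives a PTR record from an IP address and a host name, assuming a /24 reverse zone. The helper name ptr_record, the addresses, and the example.com base domain are all placeholders.

```shell
# Hypothetical helper: print a PTR record for a /24 reverse zone.
# $1 = IPv4 address, $2 = fully qualified host name without the trailing dot.
ptr_record() {
    # Keep only what follows the final dot of the address.
    last_octet=${1##*.}
    # The target must be an FQDN ending in a dot.
    printf '%s\tIN\tPTR\t%s.\n' "$last_octet" "$2"
}

ptr_record 192.0.2.20 bootstrap.ocp4.example.com
ptr_record 192.0.2.21 master0.ocp4.example.com
```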

DNS Records for ETCD

Two record types need to be created for ETCD. The forward record needs to point to the IPs of the masters (CNAMEs are fine as well). Also, the names need to be etcd-<index>, where <index> is a number starting at 0, for example etcd-0, etcd-1, and etcd-2. You will also need to create SRV records pointing to the various etcd-<index> entries. You'll need to set these records with a priority of 0, a weight of 10, and port 2380. Below is an example using as the <basedomain> and using ocp4 as the <clusterid>.

; The etcd cluster lives on the masters, so point these to the IPs of the masters
etcd-0.ocp4             IN      A
etcd-1.ocp4             IN      A
etcd-2.ocp4             IN      A
; The SRV records point to FQDN of etcd...note the trailing dot at the end...
_etcd-server-ssl._tcp.ocp4      IN      SRV     0 10 2380
_etcd-server-ssl._tcp.ocp4      IN      SRV     0 10 2380
_etcd-server-ssl._tcp.ocp4      IN      SRV     0 10 2380

An example of these entries can be found in the example zonefile.
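The pairing of one A record and one SRV record per etcd member can be sketched as a small generator. The priority, weight, and port come from the text above. The helper name etcd_records, the example.com base domain, and the 192.0.2.x addresses are placeholders.

```shell
# Hypothetical helper: print the A and SRV records for one etcd member.
# $1 = index, $2 = cluster ID, $3 = base domain, $4 = master IP.
etcd_records() {
    printf 'etcd-%s.%s\tIN\tA\t%s\n' "$1" "$2" "$4"
    # SRV fields: priority 0, weight 10, port 2380, then the target FQDN
    # with its trailing dot.
    printf '_etcd-server-ssl._tcp.%s\tIN\tSRV\t0 10 2380 etcd-%s.%s.%s.\n' "$2" "$1" "$2" "$3"
}

# One pair of records per master, with the index starting at 0.
i=0
for ip in 192.0.2.21 192.0.2.22 192.0.2.23; do
    etcd_records "$i" ocp4 example.com "$ip"
    i=$((i + 1))
done
```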

Load Balancer

You will need a load balancer to frontend the APIs, both internal and external, and the OpenShift router. Although Red Hat has no official recommendation to which Load Balancer to use, one that supports SNI is necessary (most load balancers do this today).

You will need to configure Port 6443 and 22623 to point to the bootstrap and master nodes. The below example is using HAProxy (NOTE that it must be TCP sockets to allow SSL passthrough)

frontend openshift-api-server
    bind *:6443
    default_backend openshift-api-server
    mode tcp
    option tcplog

backend openshift-api-server
    balance source
    mode tcp
    server bootstrap check
    server master0 check
    server master1 check
    server master2 check

frontend machine-config-server
    bind *:22623
    default_backend machine-config-server
    mode tcp
    option tcplog

backend machine-config-server
    balance source
    mode tcp
    server bootstrap check
    server master0 check
    server master1 check
    server master2 check

You will also need to configure 80 and 443 to point to the worker nodes. The HAProxy configuration is below (keeping in mind that we're using TCP sockets).

frontend ingress-http
    bind *:80
    default_backend ingress-http
    mode tcp
    option tcplog

backend ingress-http
    balance source
    mode tcp
    server worker0 check
    server worker1 check

frontend ingress-https
    bind *:443
    default_backend ingress-https
    mode tcp
    option tcplog

backend ingress-https
    balance source
    mode tcp
    server worker0 check
    server worker1 check

A full example of an haproxy.cfg file can be found here.
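Because the four backends differ only in their name, port, and server list, they can be generated from one template. This is a sketch under assumptions: the helper name haproxy_backend is made up, and the 192.0.2.x addresses are placeholders for your node IPs.

```shell
# Hypothetical helper: print an HAProxy TCP backend stanza.
# $1 = backend name, $2 = port, remaining arguments = name=ip pairs.
haproxy_backend() {
    name=$1
    port=$2
    shift 2
    printf 'backend %s\n    balance source\n    mode tcp\n' "$name"
    for pair in "$@"; do
        # Split name=ip and emit one server line with health checking.
        printf '    server %s %s:%s check\n' "${pair%%=*}" "${pair#*=}" "$port"
    done
}

haproxy_backend openshift-api-server 6443 \
    bootstrap=192.0.2.20 master0=192.0.2.21 master1=192.0.2.22 master2=192.0.2.23
haproxy_backend ingress-https 443 worker0=192.0.2.31 worker1=192.0.2.32
```

Generating the stanzas this way also makes it easy to drop the bootstrap entry once bootstrapping is complete.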


A webserver is needed in order to hold the ignition configurations and installation images for when you install RHEL CoreOS. Any webserver will work as long as the webserver can be reached by the bootstrap, master, and worker nodes during installation. I will be using Apache. Download either the metal-bios or the uefi-metal-bios file, depending on what your servers need, from here. For example, this is how I downloaded the metal-bios file to my webserver.

mkdir -p /var/www/html/{ignition,install}
cd /var/www/html/install
curl -J -L -O

Setup DHCP (Optional if doing static ips)

It is recommended to use a DHCP server to manage the nodes' IP addresses for the cluster long-term. Ensure that the DHCP server is configured to provide persistent IP addresses and host names to the cluster machines. Using DHCP with IP reservation ensures the IPs won't change on reboots. For a sample configuration, please see this dhcpd.conf file.
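As an illustration of an IP reservation, the sketch below prints a host entry in ISC dhcpd syntax. It assumes you are using ISC dhcpd. The helper name dhcp_host, the MAC address, and the IP address are placeholders.

```shell
# Hypothetical helper: print an ISC dhcpd host reservation.
# $1 = host name, $2 = MAC address, $3 = fixed IPv4 address.
dhcp_host() {
    printf 'host %s {\n' "$1"
    printf '  hardware ethernet %s;\n' "$2"
    printf '  fixed-address %s;\n' "$3"
    printf '  option host-name "%s";\n' "$1"
    printf '}\n'
}

dhcp_host master0 52:54:00:00:00:01 192.0.2.21
```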

Reconciling Prerequisites

If you plan on installing OpenShift 4 in a "lab" environment (either on bare metal or using VMs), you might want to take a look at the "Helper Node" GitHub page. The "Helper Node" Ansible playbook sets up an "all-in-one" node with all the aforementioned prerequisites. This playbook has two modes: "standard" and "static ips".

Take a look at the quickstart to see if it might be of use. These steps are written for Libvirt, but the playbook is agnostic, so you can run it on your bare metal environment.


Unlike the full stack automated install method, the pre-existing infrastructure install is done in phases. The three main phases are: ignition config creation, bootstrap, and install complete. In this section I will be going over how to install OpenShift 4 on Bare Metal with the assumption that you have all the prerequisites in place. I will be installing the following:

  • 3 Master nodes, 2 Worker nodes, and 1 bootstrap node.
  • I will be using my internal domain.
  • I will be using ocp4 as my clusterid.
  • I will be using static IPs (but will go over DHCP as well)
  • I am doing the install from a "bastion" Linux host
  • Make sure you download the client and installer

Creating The Install Configuration

First (after all the prereqs are done), we need to create an install-config.yaml file. This is the file where we set parameters for our installation. Create a working directory to store all the files.

mkdir ~/ocp4
cd ~/ocp4

Once in this directory, create the install-config.yaml file based on the following template. Substitute your entries where applicable. I will go over the relevant configurations from a high level.

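apiVersion: v1
baseDomain: <basedomain>
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: ocp4
networking:
  clusterNetworks:
  - cidr: <cluster network CIDR>
    hostPrefix: 24
  networkType: OpenShiftSDN
  serviceNetwork:
  - <service network CIDR>
platform:
  none: {}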
pullSecret: '{"auths": ...}'
sshKey: 'ssh-ed25519 AAAA...'

Please note/change the following:

  • baseDomain - This is the domain of your environment
  • metadata.name - This is your clusterid
    • Note: This will effectively put all FQDNs under <clusterid>.<basedomain>
  • pullSecret - This pull secret can be obtained by going to
    • Login with your Red Hat account
    • Click on "Bare Metal"
    • Either "Download Pull Secret" or "Copy Pull Secret"
  • sshKey - This is your public SSH key (e.g.

Note: Setting the worker replicas to 0 doesn't mean you're going to install 0 workers. It means that we are not going to generate machineconfigs for the cluster.

Generate Ignition Configurations

Ignition is a tool for manipulating configuration during early boot, before the operating system starts. This includes things like writing files (regular files, systemd units, networkd units, etc.) and configuring users. Think of it as a cloud-init that runs once (during first boot).

The OpenShift 4 installer generates these ignition configs to prepare the node as an OpenShift bootstrap/master/worker node. From within your working directory (in this example it's ~/ocp4), generate the ignition configs.

cd ~/ocp4
openshift-install create ignition-configs

REMINDER: Your install-config.yaml must be in your working directory (~/ocp4 in this example). Creating the ignition-configs will result in the install-config.yaml file being removed by the installer, so you may want to create a copy and store it outside of this directory.

This will leave the following files in your ~/ocp4 working directory.

tree .
├── auth
│   ├── kubeadmin-password
│   └── kubeconfig
├── bootstrap.ign
├── master.ign
├── metadata.json
└── worker.ign
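As noted in the reminder above, the installer removes install-config.yaml when it generates the ignition configs. A small guard like the following can take a copy first. This is only a sketch: the helper name backup_install_config and the backup path are assumptions.

```shell
# Hypothetical helper: copy install-config.yaml out of the working
# directory before the installer consumes it.
# $1 = working directory, $2 = backup directory.
backup_install_config() {
    if [ ! -f "$1/install-config.yaml" ]; then
        echo "no install-config.yaml in $1" >&2
        return 1
    fi
    mkdir -p "$2" && cp "$1/install-config.yaml" "$2/install-config.yaml"
}

# Usage (the backup path is an assumption):
#   backup_install_config ~/ocp4 ~/ocp4-backup && \
#       openshift-install create ignition-configs --dir="$HOME/ocp4"
```

Chaining the two commands with && means the ignition configs are only generated if the copy succeeded.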

You will need to do one of the following, depending on what kind of installation you're doing.


DHCP

If you're using DHCP, simply copy over the ignition files to your webserver. For example, this is what I did for my installation.

scp ~/ocp4/*.ign

Static IPs

For static IPs, you need to generate new ignition files based on the ones that the OpenShift installer generated. You can use the filetranspiler tool in order to make this process a little easier. When using filetranspiler you first need to create a "fakeroot" filesystem. This is an example from the bootstrap node.

cat <<EOF > bootstrap/etc/sysconfig/network-scripts/ifcfg-enp1s0

NOTE: Your interface WILL probably differ. Be sure to determine the persistent name of the device(s) before creating the network configuration files.
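For reference, a fakeroot tree with a static network configuration might be built along these lines. This is a sketch under assumptions: the helper name write_ifcfg is made up, the keys are the usual ifcfg ones, and every address is a placeholder rather than a value from this post.

```shell
# Hypothetical helper: write a static ifcfg file into a fakeroot tree.
# $1 = fakeroot dir, $2 = interface, $3 = IP, $4 = netmask,
# $5 = gateway, $6 = DNS server.
write_ifcfg() {
    mkdir -p "$1/etc/sysconfig/network-scripts" || return 1
    cat <<EOF > "$1/etc/sysconfig/network-scripts/ifcfg-$2"
DEVICE=$2
BOOTPROTO=none
ONBOOT=yes
IPADDR=$3
NETMASK=$4
GATEWAY=$5
DNS1=$6
EOF
}

# Usage with placeholder addresses, run from the working directory:
#   write_ifcfg bootstrap enp1s0 192.0.2.20 255.255.255.0 192.0.2.1 192.0.2.2
```

The first argument is the fakeroot directory that is later passed to filetranspiler with -f.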

Using filetranspiler, create a new ignition file based on the one created by openshift-install. Continuing with the example of my bootstrap server, it looks like this.

filetranspiler -i bootstrap.ign -f bootstrap -o bootstrap-static.ign

The syntax is: filetranspiler -i $ORIGINALIGN -f $FAKEROOT -o $OUTPUTIGN

NOTE: If you're using the container version of filetranspiler, you need to be in the directory where these files/dirs are. In other words, absolute paths won't work.

Once you have created the new file, copy it over to your webserver:

scp ~/ocp4/bootstrap-static.ign

IMPORTANT: When using static IP addresses, you will need to do this for ALL nodes in your cluster. In my environment I ended up with six ignition files.

tree /var/www/html/ignition/
├── bootstrap-static.ign
├── master0.ign
├── master1.ign
├── master2.ign
├── worker0.ign
└── worker1.ign

0 directories, 6 files

Install Red Hat Enterprise Linux CoreOS

Installing RHEL CoreOS (RHCOS) is a straightforward process. Depending on which method you are using (DHCP or Static IPs), choose one of the following.


DHCP

Boot from the ISO, and you'll be greeted with the following screen.


Once you see this menu, press Tab and append the options needed to the boot line. These include the URL of the BIOS or UEFI image the node needs and the URL of the ignition file created by openshift-install (NOTE: The entries need to be all on one line). Here is an example.


Here is an explanation of the CoreOS options:

  • coreos.inst.install_dev - The block device which RHCOS will install to.
  • coreos.inst.image_url - The URL of the UEFI or BIOS image that you uploaded to the web server.
  • coreos.inst.ignition_url - The URL of the Ignition config file for this machine type.
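To avoid typos when entering these at the boot prompt, it can help to assemble the string ahead of time. The sketch below joins only the three options described above onto one line. The helper name coreos_boot_args, the web server host name, and the file names are placeholders, and your boot line may need other arguments as well.

```shell
# Hypothetical helper: print the three coreos.inst options on one line.
# $1 = install device, $2 = image URL, $3 = ignition URL.
coreos_boot_args() {
    printf 'coreos.inst.install_dev=%s coreos.inst.image_url=%s coreos.inst.ignition_url=%s\n' "$1" "$2" "$3"
}

# Placeholder URLs; point these at your own web server.
coreos_boot_args sda \
    http://webserver.example.com/install/bios.raw.gz \
    http://webserver.example.com/ignition/bootstrap.ign
```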

Static IPs

Just like the DHCP method, boot from the ISO, and you'll be greeted with the following screen.


Once you see this menu, press Tab and enter the options that will image the node using the BIOS file you downloaded, and prepare the node using the ignition file you'll provide. Here is an example that I did for my bootstrap server.


AGAIN: This needs to be all on one line. I only used line breaks for ease of readability. You will need to put it all on one line, like the example below.


Syntax for the ip= portion is: ip=$IP::$DEFAULTGW:$NETMASK:$HOSTNAME:$IFACE:none:$DNSSERVER
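The field order is easy to get wrong, so here is a minimal sketch that fills in the syntax above from named values. The helper name static_ip_arg and all the addresses are placeholders.

```shell
# Hypothetical helper: build the ip= kernel argument in the order
# IP, (empty peer field), gateway, netmask, hostname, interface, none, DNS.
# $1 = IP, $2 = default gateway, $3 = netmask, $4 = hostname,
# $5 = interface, $6 = DNS server.
static_ip_arg() {
    printf 'ip=%s::%s:%s:%s:%s:none:%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

static_ip_arg 192.0.2.20 192.0.2.1 255.255.255.0 bootstrap enp1s0 192.0.2.2
```

Note the double colon after the IP address: the second field is left empty on purpose.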

Finishing Up The Install

Once the bootstrap server is up and running, the install is actually already in progress. First the masters "check in" to the bootstrap server for their configuration. After the masters are done being configured, the bootstrap server "hands off" responsibility to the masters. You can track the bootstrap process with the following command.

openshift-install wait-for bootstrap-complete --log-level debug

Once the bootstrap process is finished, you'll see the following message.

DEBUG OpenShift Installer v4.1.0-201905212232-dirty
DEBUG Built from commit 71d8978039726046929729ad15302973e3da18ce
INFO Waiting up to 30m0s for the Kubernetes API at
INFO API v1.13.4+838b4fa up
INFO Waiting up to 30m0s for bootstrapping to complete...
DEBUG Bootstrap status: complete
INFO It is now safe to remove the bootstrap resources

At this point you can remove the bootstrap server from the load balancer. If you're using VMs, you can safely delete the bootstrap node. If you're using bare metal, you can safely repurpose this machine.

Basic functionality of the cluster is now available; however, the cluster is not ready for applications. You can now log in and take a look at what's finishing up.

cd ~/ocp4
export KUBECONFIG=auth/kubeconfig
oc get nodes

You can take a look to see if any node CSRs are pending.

oc get csr

You can accept the CSRs by running oc adm certificate approve <csr_name>. Alternatively, you can run the following to approve them all (requires the jq command).

oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
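If jq is not available, a similar filter can be sketched with awk against the plain table output. This assumes the condition is the last column of oc get csr and that unapproved requests show Pending there. The helper name pending_csrs is made up.

```shell
# Hypothetical helper: read `oc get csr` table output on stdin and print
# the names of requests whose last column is Pending.
pending_csrs() {
    # Skip the header row, then match on the final field.
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Usage (needs a logged-in oc client; -r is the GNU xargs flag that
# skips the command when there is no input):
#   oc get csr | pending_csrs | xargs -r oc adm certificate approve
```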

The install won't complete without you setting up some storage for the image registry. The below command sets up an "emptyDir" (temp storage). If you'd like to use a more permanent solution, please see this.

oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'

Once that's set, finish up the installation by running the following command

openshift-install wait-for install-complete

You'll see the following information about your cluster, including information about the kubeadmin account. This is meant to be a temporary administrative account. Please see this doc to configure identity providers.

INFO Waiting up to 30m0s for the cluster at to initialize...
INFO Waiting up to 10m0s for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/root/ocp4/auth/kubeconfig'
INFO Access the OpenShift web-console here:
INFO Login to the console with user: kubeadmin, password: PftLM-P6i6B-SEZ2R-QLICJ

Upgrade Cluster

If you've installed an earlier Z release, you can upgrade it to the latest release from the command line. First check what version you have.

# oc get clusterversion
NAME      VERSION        AVAILABLE           PROGRESSING             SINCE         STATUS
version   4.1.6          True                False                   21m           Cluster version is 4.1.6

Initiate an upgrade with the following command

oc adm upgrade --to-latest=true

Check the status with the following

# oc get clusterversion
NAME          VERSION             AVAILABLE       PROGRESSING                  SINCE     STATUS
version       4.1.6               True            True                         45s       Working towards 4.1.7: 13% complete


In this blog, I went through how to install an OpenShift 4 cluster on pre-existing infrastructure on bare metal. This method can also be used on other environments that don’t yet have the ability to do an ignition pre-boot.

The new install and deploy process used by OpenShift 4 for bare metal can be a bit confusing and intimidating at first. However, this guide and the documentation aim to explain the requirements, and our goal is to help you be successful. The prerequisites, especially DNS and the load balancers, are critical to success and often the most complex part, so it’s important to read ahead of time to avoid deployment issues.

If you encounter issues, you can connect to the nodes using the SSH key you provided in the install-config.yaml to check the status and look for errors. Once the cluster has been instantiated, you can pull logs and diagnostic information from the nodes using standard oc CLI commands or the administrator GUI. And, you can always open a support case for help with any aspect of your OpenShift cluster.

After your cluster is deployed, you may want to do some additional configuration tasks such as:

  • Configuring authentication and additional users
  • Adding additional routes and/or sharding network traffic
  • Migrating OpenShift services to specific nodes
  • Configuring persistent storage or adding a dynamic storage provisioner
  • Adding more nodes to the cluster

If you have any questions, please leave a comment below or reach out via the Customer Portal Discussions page.

Understanding Service Mesh: Operations Guide


The birth of Kubernetes has made the ability to go completely into the cloud a reality. It has provided the industry with a platform to finally build "cloud aware" applications and truly be "cloud-native".

This new push of cloud-native came with, naturally, a set of challenges that needed to be solved. While Kubernetes solved the problem of orchestrating the workloads, the challenges of security, policy, and management still exist. The introduction of serverless and Knative also brings more complexity into the mix.

One of the technologies that came to light is the idea of a "service mesh". A service mesh sets out to tackle the challenges of security, policy, and traffic management of microservices in a cloud-native platform. In this blog I will explore the idea behind service mesh, explore Istio, and try to explain how it all fits together from an Operations point of view.

Application communication design

Current Design

Whether you are doing monolithic or microservices, the design of how different applications communicate with each other is generally the same. When you take a look at it, it's generally two or more applications communicating with each other over the network via HTTP/HTTPS.

This should look pretty familiar. The challenge with this design (even taking away cloud-native and microservices for a moment) is that security, circuit breaking, and Layer 7 and Layer 4 rules have to be built into the application stack itself. This is for each application stack across your entire ecosystem. Maintaining rulesets for hundreds of applications can become a maintenance nightmare.

This is where a Service Mesh can be powerful!

Service Mesh Fundamentals

In order to manage the rulesets for an application, the management needs to be 1) application agnostic and 2) abstracted away from the application. To do this, a proxy (currently the most popular one is Envoy) is implemented that holds these rules/policies independent of the application (and vice versa). This is deployed in a "sidecar pattern" design. This way ANY application can be plugged in without it needing to know about the Service Mesh.

In this design, Application 1 is talking to Application 2 via a proxy. You can even further dissect it as saying: Application 1's proxy is talking to Application 2's proxy. Some of the advantages of this design are...

  • Rules and Policies (like routing and circuit breaking) no longer have to be built into the application.
  • Applications can be "plugged in" without them knowing they are being governed.
  • Having a single set of authority makes management easier.

One of the things you need to keep in mind is that once you have hundreds of applications with hundreds of instances, the management can get pretty out of hand. This is why the concept (now I'm teasing Istio) of a control plane (an idea likely borrowed from Kubernetes) fits nicely here.

With this design you can control/manage a fleet of Proxy systems (along with their rulesets/policies) from a central location. But this is only part of the solution.

ISTIO Service Mesh

There are a few technologies that take this design pattern and create a solution based on it. Some include Linkerd, Gloo, and Conduit. The one solution that has gained a lot of favor in the community is Istio.

Istio can get complex, and there have been other blogs that have gone in depth on how Istio works under the hood. I also suggest you give Christian Posta a follow. I will try to keep this at a high level, from an Operations understanding of Istio's implementation of a Service Mesh.

Istio is made up of two parts: a Control Plane and a Data Plane.

The Control Plane is made of the following components:

  • Pilot - This is where the traffic management is set and where the config data for the proxies is stored and pushed out from.
  • Mixer - This is the policy engine. It enforces access control and usage. It also collects telemetry from the mesh.
  • Citadel - This provides mTLS by way of handing out certificates to the proxies and managing them.

The Data Plane is simply the Envoy Proxies themselves that enforce the rulesets stored on the control plane.

With Istio you can do the following:

  • Circuit Breaking - This lets you stop sending requests to a slow instance, or cap the number of concurrent requests to an instance.
  • Pool Ejection - This removes failing instances from the pool.
  • Retries - This will forward the request to another instance in case we get a failure (open circuit breaker and/or pool ejection).
  • Mutual TLS - This allows you to encrypt all traffic automatically (sometimes called "zero trust" architecture).
  • Telemetry/Tracing - This gives you observability into your microservices and lets you trace failures and calls into (and out of) your services.
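Pool ejection and retries are easier to picture with a small example. The sketch below is a toy model in Python, not Envoy's actual algorithm: the `Instance` and `Pool` classes, the `max_failures` threshold, and the `send` callback are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One backend instance in a load-balancing pool (hypothetical model)."""
    name: str
    consecutive_failures: int = 0
    ejected: bool = False


@dataclass
class Pool:
    """Tracks failures per instance and ejects the ones that keep failing."""
    instances: list
    max_failures: int = 3  # assumed threshold, not an Istio default

    def healthy(self):
        # Only instances that have not been ejected receive traffic.
        return [i for i in self.instances if not i.ejected]

    def record(self, instance, ok):
        # A success resets the counter; repeated failures trigger ejection.
        if ok:
            instance.consecutive_failures = 0
            return
        instance.consecutive_failures += 1
        if instance.consecutive_failures >= self.max_failures:
            instance.ejected = True

    def call_with_retry(self, request, send):
        # Retry: try each healthy instance in turn until one succeeds.
        for instance in self.healthy():
            ok = send(instance, request)
            self.record(instance, ok)
            if ok:
                return instance.name
        return None
```

A caller would pass in a `send` function that returns True on success. After `max_failures` consecutive failures an instance stops receiving traffic, while requests keep flowing to the remaining instances.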


Service Mesh is still a new technology and is ever evolving. Some of these technologies (like Istio and Linkerd) overlap in functionality and others (like Istio and Gloo) can complement each other. The important thing is to get familiar with these technologies before they start running in your environment, so that you can make an informed decision on which one to use!

I encourage everyone not familiar to go and check out the Katacoda Istio track to get hands on!

Getting Familiar With ClusterAPI


There are many tools around to get a Kubernetes cluster up and running. Some of these include kops, kubeadm, openshift-ansible, and kubicorn (just to name a few). There is even a way dubbed "The Hard Way", made famous by Kelsey Hightower.

Some of these tools (like kops and kubicorn) aim to manage your entire stack, from the infrastructure layer all the way to the Kubernetes layer. This is what I like to think of as a fully managed system/install. Other tools take the UPI (User Provided Infrastructure) approach. Tools like openshift-ansible and kubeadm let a user bring an already existing infrastructure and just layer Kubernetes on top of it.

ClusterAPI is a SIG project that is trying to bring a declarative approach to setting up Kubernetes clusters. The idea here is that you have a "wanted state" (your described cluster) and ClusterAPI will reconcile that for you. The SIG's goals are for ClusterAPI to 1) use declarative Kubernetes-style APIs and 2) be environment agnostic (while still being flexible).
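The "wanted state" idea can be sketched as a simple reconcile function. This is a minimal illustration in Python under stated assumptions: the `reconcile` and `apply_actions` helpers and the machine names are hypothetical, and the real ClusterAPI controllers are far more involved than a dictionary comparison.

```python
def reconcile(desired, actual):
    """Return the actions needed to move `actual` toward `desired`.

    Both arguments map a machine name to its instance type.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name, actual[name]))
    return actions


def apply_actions(actions, actual):
    """Apply actions to a copy of the observed state and return the result."""
    state = dict(actual)
    for verb, name, spec in actions:
        if verb == "delete":
            state.pop(name)
        else:
            state[name] = spec
    return state


# Hypothetical machines for illustration only.
desired = {"controlplane-0": "m4.xlarge", "node-0": "m4.xlarge"}
actual = {"node-0": "t2.medium", "old-node": "t2.medium"}
for action in reconcile(desired, actual):
    print(action)
```

Once the observed state matches the described state, `reconcile` returns an empty list, which is what makes the approach declarative: you describe the result, not the steps.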

This diagram, taken from their GitHub, shows the architecture.

In this blog I'm going to go through an example of installing Kubernetes on AWS using the ClusterAPI AWS provisioner.


I mostly followed the quickstart on the GitHub page. It lists some good tools to have (some as must-haves and others as nice-to-haves). To summarize, here are the MUST haves:

  • Linux or Mac (no Windows support at this time)
  • AWS Credentials
  • An IAM role to give to the k8s control-plane
  • KIND
    • KIND has its own dependencies, including Docker
  • The gettext package installed

Some of the optional nice-to-haves are:

Once you have those, you'll need to install the CLI tools. Below is what I installed as of 19-MAR-2019; please see here for the latest binaries.

# wget
# wget
# chmod +x clusterctl-linux-amd64
# chmod +x clusterawsadm-linux-amd64
# mv clusterctl-linux-amd64 /usr/local/bin/clusterctl
# mv clusterawsadm-linux-amd64 /usr/local/bin/clusterawsadm

I also downloaded the examples tarball to help generate some files I'll need later

# wget
# tar -xf cluster-api-provider-aws-examples.tar

Setting up environment variables

There is a helper script in the cluster-api-provider-aws-examples.tar tarball that generates a lot of the manifests for you. The doc explains some, but not all, of the environment variables that you need to export. I dug around the script and found that these are helpful to set.

export AWS_REGION="us-west-1"
export SSH_KEY_NAME="chernand-ec2"
export CLUSTER_NAME="pony-unicorns"
export NODE_MACHINE_TYPE="m4.xlarge"

When exporting SSH_KEY_NAME, you need to make sure this key exists in AWS already.
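Since the helper script depends on these variables, a quick pre-flight check can save a failed run. Below is a minimal sketch in Python; the `missing_vars` helper is a hypothetical name, and the list only covers the four variables exported above (the script may read others).

```python
import os

# The variables exported in the walkthrough above.
REQUIRED_VARS = ["AWS_REGION", "SSH_KEY_NAME", "CLUSTER_NAME", "NODE_MACHINE_TYPE"]


def missing_vars(env, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Prints an empty list when everything is exported in the current shell.
print(missing_vars(os.environ))
```

This only checks that the variables are set. It does not verify that the SSH key actually exists in AWS.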

I verified my credentials with the AWS CLI.

# aws sts get-caller-identity
{
    "Account": "123123123123",
    "UserId": "TH75ISMYR3F4RCHUS3R1D",
    "Arn": "arn:aws:iam::123123123123:user/clusterapiuser"
}

Generating manifests

When I untar-ed the cluster-api-provider-aws-examples.tar file it created an aws dir in my current working directory.

# tree ./aws
├── addons.yaml
├── cluster-network-spec.yaml.template
├── cluster.yaml.template
├── machines.yaml.template
└── provider-components-base.yaml

Running the script in this directory will generate the needed manifest files for the installer.

# cd ./aws
# ./ 
Done generating /root/aws/out/cluster.yaml
Done generating /root/aws/out/machines.yaml
Done copying /root/aws/out/addons.yaml
Generated credentials
Done writing /root/aws/out/provider-components.yaml
WARNING: /root/aws/out/provider-components.yaml includes credentials

Go into the out directory and examine these files, making sure they match what you set in your environment variables.

# cd out
# cat *

Once you're okay with these, you can move along to the installer!

Installing Kubernetes on AWS

Using the clusterctl command I created a cluster with the following command

# cd /root/aws/out
# clusterctl create cluster -v 3 \
--bootstrap-type kind \
--provider aws \
-m machines.yaml \
-c cluster.yaml \
-p provider-components.yaml \
-a addons.yaml

You should see the following output

I0319 19:11:27.808556   25430 createbootstrapcluster.go:27] Creating bootstrap cluster
I0319 19:11:27.808667   25430 kind.go:57] Running: kind [create cluster --name=clusterapi]
I0319 19:12:10.001664   25430 kind.go:60] Ran: kind [create cluster --name=clusterapi] Output: Creating cluster "clusterapi" ...
 • Ensuring node image (kindest/node:v1.13.3) 🖼  ...
 ✓ Ensuring node image (kindest/node:v1.13.3) 🖼
 • Preparing nodes 📦  ...
 ✓ Preparing nodes 📦
 • Creating kubeadm config 📜  ...
 ✓ Creating kubeadm config 📜
 • Starting control-plane 🕹️  ...
 ✓ Starting control-plane 🕹️
Cluster creation complete. You can now use the cluster with:

export KUBECONFIG="$(kind get kubeconfig-path --name="clusterapi")"
kubectl cluster-info
I0319 19:12:10.001735   25430 kind.go:57] Running: kind [get kubeconfig-path --name=clusterapi]
I0319 19:12:10.043264   25430 kind.go:60] Ran: kind [get kubeconfig-path --name=clusterapi] Output: /root/.kube/kind-config-clusterapi
I0319 19:12:10.046231   25430 clusterdeployer.go:78] Applying Cluster API stack to bootstrap cluster
I0319 19:12:10.046258   25430 applyclusterapicomponents.go:26] Applying Cluster API Provider Components
I0319 19:12:10.046273   25430 clusterclient.go:919] Waiting for kubectl apply...
I0319 19:12:11.757657   25430 clusterclient.go:948] Waiting for Cluster v1alpha resources to become available...
I0319 19:12:11.765143   25430 clusterclient.go:961] Waiting for Cluster v1alpha resources to be listable...
I0319 19:12:11.792776   25430 clusterdeployer.go:83] Provisioning target cluster via bootstrap cluster
I0319 19:12:11.852236   25430 applycluster.go:36] Creating cluster object pony-unicorns in namespace "default"
I0319 19:12:11.877091   25430 clusterdeployer.go:92] Creating control plane controlplane-0 in namespace "default"
I0319 19:12:11.897136   25430 applymachines.go:36] Creating machines in namespace "default"
I0319 19:12:11.915500   25430 clusterclient.go:972] Waiting for Machine controlplane-0 to become ready...

What's happening here is that the installer is creating a local Kubernetes cluster using kind. That local cluster then uses your creds to install a Kubernetes cluster on AWS. Open another terminal window to watch the following pods come up.

# kubectl  get pods  --all-namespaces 
NAMESPACE             NAME                                               READY   STATUS    RESTARTS   AGE
aws-provider-system   aws-provider-controller-manager-0                  1/1     Running   0          72s
cluster-api-system    cluster-api-controller-manager-0                   1/1     Running   0          72s
kube-system           coredns-86c58d9df4-4r2jx                           1/1     Running   0          73s
kube-system           coredns-86c58d9df4-lg2zd                           1/1     Running   0          73s
kube-system           etcd-clusterapi-control-plane                      1/1     Running   0          24s
kube-system           kube-apiserver-clusterapi-control-plane            1/1     Running   0          5s
kube-system           kube-controller-manager-clusterapi-control-plane   1/1     Running   0          16s
kube-system           kube-proxy-qj7qp                                   1/1     Running   0          73s
kube-system           kube-scheduler-clusterapi-control-plane            1/1     Running   0          17s
kube-system           weave-net-qpcq2                                    2/2     Running   0          73s

Once they are all running, tail the log of the aws-provider-controller-manager-0 pod to see what's happening (useful for debugging).

# kubectl logs -f -n aws-provider-system aws-provider-controller-manager-0

Once it's done you'll see an output that looks something like this (note that the KIND cluster is only temporary)

I0319 19:22:58.000769   25430 clusterdeployer.go:143] Done provisioning cluster. You can now access your cluster with kubectl --kubeconfig kubeconfig
I0319 19:22:58.000823   25430 createbootstrapcluster.go:36] Cleaning up bootstrap cluster.
I0319 19:22:58.000832   25430 kind.go:57] Running: kind [delete cluster --name=clusterapi]
I0319 19:22:58.882121   25430 kind.go:60] Ran: kind [delete cluster --name=clusterapi] Output: Deleting cluster "clusterapi" ...
$KUBECONFIG is still set to use /root/.kube/kind-config-clusterapi even though that file has been deleted, remember to unset it

Now the moment of truth...See if I can see my cluster...

# kubectl get nodes --kubeconfig=kubeconfig 
NAME                                      STATUS   ROLES    AGE   VERSION
                                          Ready    <none>   70m   v1.13.3
                                          Ready    master   72m   v1.13.3

It works! I have one controller and one worker node. Looks like they are also preparing for multimaster since I can see that an ELB was created for me.

# kubectl config view --kubeconfig=kubeconfig 
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
  name: pony-unicorns
contexts:
- context:
    cluster: pony-unicorns
    user: kubernetes-admin
  name: kubernetes-admin@pony-unicorns
current-context: kubernetes-admin@pony-unicorns
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED

Uninstalling and cleanup

To delete everything I created, I used the clusterctl command

# clusterctl delete cluster \
--bootstrap-type kind \
--kubeconfig kubeconfig -p provider-components.yaml 
I0319 20:37:14.156586    2852 clusterdeployer.go:149] Creating bootstrap cluster
I0319 20:37:14.156630    2852 createbootstrapcluster.go:27] Creating bootstrap cluster
I0319 20:37:56.466031    2852 clusterdeployer.go:157] Pivoting Cluster API stack to bootstrap cluster
I0319 20:37:56.466130    2852 pivot.go:67] Applying Cluster API Provider Components to Target Cluster
I0319 20:37:57.876975    2852 pivot.go:72] Pivoting Cluster API objects from bootstrap to target cluster.
I0319 20:38:33.196752    2852 clusterdeployer.go:167] Deleting objects from bootstrap cluster
I0319 20:38:33.196782    2852 clusterdeployer.go:214] Deleting MachineDeployments in all namespaces
I0319 20:38:33.198438    2852 clusterdeployer.go:219] Deleting MachineSets in all namespaces
I0319 20:38:33.200085    2852 clusterdeployer.go:224] Deleting Machines in all namespaces
I0319 20:38:43.227284    2852 clusterdeployer.go:229] Deleting MachineClasses in all namespaces
I0319 20:38:43.229738    2852 clusterdeployer.go:234] Deleting Clusters in all namespaces
I0319 20:41:13.253792    2852 clusterdeployer.go:172] Deletion of cluster complete
I0319 20:41:13.254168    2852 createbootstrapcluster.go:36] Cleaning up bootstrap cluster.

This was the easiest and most straightforward part of the whole process.


In this blog I took a look at ClusterAPI and tested the ClusterAPI AWS Provider. The ClusterAPI SIG aims to unify how we provide the infrastructure to/for Kubernetes clusters. It aims to rebuild what we currently have out there by applying the lessons learned from tools like kops, kubicorn, and Ansible.

The project is still in its infancy and is bound to change. I encourage you to try it out and provide feedback. There is also a channel on the Kubernetes Slack that you can join.

Exploring Kubernetes Storage


When you think about Kubernetes, the first thing that usually comes to mind is running stateless applications. The very nature of the design of Kubernetes lends itself to running stateless applications. However, since Kubernetes runs on Linux, you have always been able to attach storage systems to support stateful applications. But how do you face the challenge of adding support for new volume plugins?

This is where CSI comes in. CSI (or Container Storage Interface) provides a standard way to expose block/file storage to containers. This allows storage vendors to write storage plugins for Kubernetes without having to modify the Kubernetes core code. CSI was introduced in v1.9 and is now GA in v1.13.

This has enabled various storage vendors to integrate their storage systems into Kubernetes, and even cloud providers have provided integrated solutions (Like EBS on Amazon, for example).

In this blog I will be taking a look at GlusterFS and Rook and exploring some of the advantages and pitfalls.


For this blog I have installed/setup the following for my environment (although using minikube should work as well)

Rook installation

To install rook, I took a look at their latest documentation page. There I found a helpful quickstart page that provided an easy way to deploy rook with ceph using helm. Getting a rook system up and running consists of 3 parts: the operator, the rook ceph cluster, and the storageclass.

To install the rook operator I used helm. This is pretty straightforward and I was able to install it by following the documentation.

$ helm repo add rook-stable
$ helm install --namespace rook-ceph-system rook-stable/rook-ceph

Now that the operator is up and running, we need to deploy a rook cluster. More specifically, we want to deploy a rook ceph cluster. I will be deploying a copy of the yaml from my github page, but you should look at the quickstart page for an up-to-date yaml.

$ kubectl create -f

This creates (among other things) the rook CRDs for ceph. Please see the documentation if you need to customize any values.

Next is the storageClass. However, before we can create the storageClass, we have to create a CephBlockPool custom resource. This custom resource tells the operator to create a 3-way replicated pool to serve block storage. I included it in my yaml, along with my storageclass, but more information can be found in the docs.

$ kubectl create -f

After a bit, you should see rook-ceph-block as an available storageClass.

$ kubectl get sc
NAME                 PROVISIONER            AGE
rook-ceph-block     3m

Testing Rook/Ceph

In order to test this I created a namespace and then deployed a sample application (that accepts file uploads) to that namespace

$ kubectl create ns test
namespace/test created
$ kubectl create deployment upload -n test
deployment.apps/upload created

I also exposed this deployment and created an ingress as well in order for me to test the upload.

Now, when I created the pvc I specified that I wanted to use the block storage provided by rook/ceph by using the rook-ceph-block annotation in my yaml file.

apiVersion: v1
kind: PersistentVolumeClaim
 name: ceph-block-pvc0001
 annotations: rook-ceph-block
  - ReadWriteOnce
     storage: 1Gi

Now I just loaded this yaml to create my pvc

$ kubectl create -f -n test
persistentvolumeclaim/ceph-block-pvc0001 created

Here, rook will create my block volume on the fly for me, creating the pv that satisfies my claim. Checking the pvc status shows that I have it bound to a pv.

$ kubectl get pvc -n test
NAME                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
ceph-block-pvc0001   Bound    pvc-0419077f-4510-11e9-bd1f-42010a8e0033   1Gi        RWO            rook-ceph-block   5m

I edited my deployment using kubectl edit deployment upload -n test and added the volumeMounts and volumes sections shown below.

apiVersion: extensions/v1beta1
kind: Deployment
  annotations: "2"
  creationTimestamp: null
  generation: 1
    app: upload
  name: upload
  selfLink: /apis/extensions/v1beta1/namespaces/test/deployments/upload
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
      app: upload
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
      creationTimestamp: null
        app: upload
      - image:
        imagePullPolicy: Always
        name: upload
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        - mountPath: /opt/app-root/src/uploaded
          name: upload-storage
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      - name: upload-storage
          claimName: ceph-block-pvc0001

Taking a look in the container, you can see this appears as /dev/rbd0 and it's already formatted and mounted.

$ kubectl exec -it upload-7d9d6b987-fhq69 -n test -- df -h /opt/app-root/src/uploaded
Filesystem      Size  Used Avail Use% Mounted on
/dev/rbd0      1014M   33M  982M   4% /opt/app-root/src/uploaded

Issues and resolutions

When I went and tested the application; I got the following error: Permission denied in /opt/app-root/src/upload.php

Doing some digging around, I found that the permissions were wrong on my directory.

bash-4.2$ ls -ld /opt/app-root/src/uploaded/
drwxr-xr-x 2 root root 6 Mar 12 22:04 /opt/app-root/src/uploaded/

This is an issue since I am running this container as a non-root user so I can't just chmod the directory. A little "hacking" was in order. First I figured out where the pod was running.

$ kubectl get pod upload-7d9d6b987-fhq69 -n test -o jsonpath='{.spec.nodeName}{"\n"}'

Looks like this pod is running on node nodes-8z9n, so I logged into this node.

$ gcloud compute ssh nodes-8z9n --zone us-east1-d

I used docker commands to find out what docker ID the container had, then used nsenter to get into the namespace.

 $ nsenter --target $PID --mount --uts --ipc --net --pid 

Once inside I was able to chmod the directory

# chmod 777 /opt/app-root/src/uploaded/

After I did that, I was able to use my app to upload files in the ceph block storage system.
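To see why the original mode blocked my non-root user, it helps to walk through the classic Unix owner/group/other check. The sketch below is a simplified illustration in Python: the `can_write` helper and the uid/gid values are hypothetical, and it ignores ACLs, capabilities, and SELinux.

```python
import stat


def can_write(mode, owner_uid, owner_gid, uid, gids):
    """Apply the classic Unix owner/group/other check for write access.

    Root (uid 0) is treated as always allowed; ACLs and capabilities are ignored.
    """
    if uid == 0:
        return True
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# drwxr-xr-x owned by root:root, as shown by `ls -ld` earlier
before = stat.S_IFDIR | 0o755
# the same directory after `chmod 777`
after = stat.S_IFDIR | 0o777

# uid 1001 is an assumed non-root user, used here only for illustration
print(stat.filemode(before), can_write(before, 0, 0, 1001, [1001]))
print(stat.filemode(after), can_write(after, 0, 0, 1001, [1001]))
```

With mode 755 only the owner (root) has the write bit, so any other user is refused. Mode 777 opens the directory to everyone, which works for a test but is broader than you would want in production.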

Glusterfs Installation

In order to test gluster, I first needed to add some raw storage devices to the nodes. Gluster (specifically gluster-kubernetes) likes to work with raw devices. I added 100GB volumes to each of my 3 nodes.

Note that ceph can also work with raw disks and not just use directories to store data.

I mainly used the github page for installation. Also I went through and made sure all the prereqs were done on all servers. In short, I did the following (I also added iptables rules)

# for i in dm_snapshot dm_mirror dm_thin_pool; do modprobe $i; done
# apt -y install glusterfs-client glusterfs-common

After the prereqs are done, I cloned the git repo to use the installation script provided.

$ git clone

After you have that, take the sample topology file (provided in the repo) and create your own, being careful to make sure your settings are right. Mine looked like this.

  "clusters": [
      "nodes": [
          "node": {
            "hostnames": {
              "manage": [
              "storage": [
            "zone": 1
          "devices": [
          "node": {
            "hostnames": {
              "manage": [
              "storage": [
            "zone": 2
          "devices": [
          "node": {
            "hostnames": {
              "manage": [
              "storage": [
            "zone": 3
          "devices": [

I'll try and break this down a bit.

  • manage - This is the actual node name that you get from the kubectl get nodes command
  • storage - This is kind of misnamed. This is the IP address of the node itself (the actual IP, not an SDN IP)
  • zone - the way that glusterfs works, it'll pick a node from each zone to create a 3-way replicated volume. These are basically failure domains and you need at least 3 (if you're running 1 because of minikube, that's okay)
  • devices - this is an array of raw devices. Minimum is 1.
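If you have more than a handful of nodes, the topology file can be generated instead of hand-edited. The following is a minimal sketch in Python that builds the same nested structure from a flat list. The `build_topology` helper is a hypothetical name, and the hostnames, IPs, and device paths are placeholders, not the values from my environment.

```python
import json


def build_topology(nodes):
    """Build a topology dict from a list of node descriptions.

    Each entry needs: manage (node name), storage (node IP),
    zone (failure domain number) and devices (list of raw devices).
    """
    return {
        "clusters": [
            {
                "nodes": [
                    {
                        "node": {
                            "hostnames": {
                                "manage": [n["manage"]],
                                "storage": [n["storage"]],
                            },
                            "zone": n["zone"],
                        },
                        "devices": list(n["devices"]),
                    }
                    for n in nodes
                ]
            }
        ]
    }


# Placeholder values for illustration only.
example_nodes = [
    {"manage": "node-1.example.com", "storage": "192.0.2.11", "zone": 1, "devices": ["/dev/sdb"]},
    {"manage": "node-2.example.com", "storage": "192.0.2.12", "zone": 2, "devices": ["/dev/sdb"]},
    {"manage": "node-3.example.com", "storage": "192.0.2.13", "zone": 3, "devices": ["/dev/sdb"]},
]

topology = build_topology(example_nodes)
print(json.dumps(topology, indent=2))
```

The printed JSON can be redirected into a file and passed to the deploy script in place of a hand-written topology.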

NOTE: Please see the following bug about gluster-blockd. I had to edit the file deploy/kube-templates/glusterfs-daemonset.yaml and change GLUSTER_BLOCKD_STATUS_PROBE_ENABLE to 0

Now, using the gk-deploy command, I ran the following (NOTE: you may need to run it with --single-node if you're using minikube)

$ ./gluster-kubernetes/deploy/gk-deploy gfs.json -g \
-c kubectl  -n glusterfs -w 1200 --no-object -y

You will get a message that it's complete and you can verify that all the pods are running

$ kubectl get pods -n glusterfs
NAME                      READY   STATUS    RESTARTS   AGE
glusterfs-bfhqx           1/1     Running   0          12m
glusterfs-hwb98           1/1     Running   0          12m
glusterfs-xpc2r           1/1     Running   0          12m
heketi-7495cdc5fd-b6s82   1/1     Running   0          4m11s

Now you need to create the storageClass based on the service address. Using my example yaml as a template; I created the following spec. (Note that I got the resturl by running the kubectl get svc -n glusterfs command and looking at the heketi service address)

kind: StorageClass
  name: gluster-container
  resturl: ""
  restuser: "admin"
  volumetype: "replicate:3"

Now you should be able to see the storageclass

$ kubectl get sc  gluster-container
NAME                PROVISIONER               AGE
gluster-container   21s

Testing gluster

I will be using the same deployment as before, modified to reference the new storage. First I verified it was running.

$ kubectl get pods -n test
NAME                           READY   STATUS    RESTARTS   AGE
upload-bb9df669f-twmq6         1/1     Running   0          65s

Now using my pvc template for gluster; I created a pvc. And just like rook, gluster creates the pv on the fly to satisfy my pvc request.

$ kubectl create -f -n test
persistentvolumeclaim/gluster-pvc0001 created

Checking my pvc status, I see that I have storage bound

$ kubectl get pvc -n test
NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
gluster-pvc0001   Bound    pvc-8ed0bbfc-4538-11e9-8da3-001a4a16011b   1Gi        RWX            gluster-container   7m14s

Next, I used kubectl edit deploy/upload -n test to edit my deployment to specify the new gluster volume. In the end my deployment looked like this.

apiVersion: extensions/v1beta1
kind: Deployment
  annotations: "2"
  creationTimestamp: null
  generation: 1
    app: upload
  name: upload
  selfLink: /apis/extensions/v1beta1/namespaces/test/deployments/upload
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
      app: upload
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
      creationTimestamp: null
        app: upload
      - image:
        imagePullPolicy: Always
        name: upload
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        - mountPath: /opt/app-root/src/uploaded
          name: upload-storage
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      - name: upload-storage
          claimName: gluster-pvc0001

If you look inside the pod, you will see the network mount (since glusterfs is file-based storage, you won't see it as a block device)

$ kubectl exec -it upload-7cb79f89cb-pjhls -n test -- df -h uploaded
Filesystem                                         Size  Used Avail Use% Mounted on
                                                   1014M   43M  972M   5% /opt/app-root/src/uploaded

Issues/Resolutions for Gluster

As noted above; I ran into this bug and had to disable block. Since I wasn't using block it wasn't such a big deal. However you do need to watch out for it since your install won't work without disabling it.

Also I spent quite a bit of time getting the firewall rules right. This took some trial and error on my part. In the end I ran this on ALL servers in my kubernetes cluster

iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 24007 -j ACCEPT
iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 24008 -j ACCEPT
iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 2222 -j ACCEPT
iptables -A INPUT -p tcp -m state --state NEW -m multiport --dports 49152:49664 -j ACCEPT
iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 24010 -j ACCEPT
iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 3260 -j ACCEPT
iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 111 -j ACCEPT
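Since these rules need to be applied on every server, it can be handy to generate them from a single list of ports. Here is a small sketch in Python that reproduces the rules above; the `iptables_rule` helper is a hypothetical name, and it only prints the commands rather than running them.

```python
def iptables_rule(port):
    """Return an iptables command accepting new TCP connections on `port`.

    A "first:last" string is treated as a port range and uses the multiport match.
    """
    if isinstance(port, str) and ":" in port:
        match = f"-m multiport --dports {port}"
    else:
        match = f"-m tcp --dport {port}"
    return f"iptables -A INPUT -p tcp -m state --state NEW {match} -j ACCEPT"


# Same ports, in the same order, as the rules listed above.
PORTS = [24007, 24008, 2222, "49152:49664", 24010, 3260, 111]

for port in PORTS:
    print(iptables_rule(port))
```

Keeping the port list in one place makes it easier to review what is being opened and to keep all servers consistent.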

Also, since gluster requires raw devices, you need to check with your provider on how to attach them. You may run into challenges if you have instance-groups and the like.

I am happy to see that I was able to use the volume without the need to change permissions.


In this blog we took a brief look at CSI and how it will help storage vendors to write storage plugins for Kubernetes. We also explored glusterfs and rook, to test how block and file storage works in Kubernetes.

There are a plethora of other storage providers for k8s including OpenEBS, Trident, logDNA, and many more.

As Kubernetes becomes more and more of a standard; I expect to see a lot more storage vendors and storage projects providing solutions. This will provide a wide range of choice for many workloads.

Automating OpenShift Installs


I've been involved with OpenShift since its pre-Kubernetes days. I've also been through its rewrite when Kubernetes came on the scene about four years ago. I've been through the evolution of DevOps being built around Kubernetes and the birth of "Cloud Native" and the CNCF.

When I started working on automating installs on my github page the first thought that came to mind was people asking "Why?". I've done a lot of engagements with many customers and every one of them started with the "One Cluster to rule them all" frame of mind, to always end up with multiple clusters across multiple data centers.

If you're running Kubernetes/OpenShift in production, you will quickly learn that you can't keep doing what you've always done and simply add Kubernetes. (Kelsey Hightower has a great talk where he says "You can't rub Kubernetes over your situation to make it better")

In the end you are going to be running many clusters and automating that is going to save you lots of time.

Let's get started!

Technologies Used

I used the following technologies in my tests

  • OpenShift Container Platform v3.11
  • Red Hat Enterprise Virtualization 4.2
  • Red Hat Identity Manager 4.6
  • Ansible 2.7

Although I haven't tested it, this same setup should work with OKD, oVirt, and FreeIPA as well.

For those not familiar with RHEV/oVirt or RHIDM/FreeIPA; I'll give a short explanation on what these provide.

RHEV/oVirt provides a virtualization platform that is comparable to VMware ESXi with vCenter. RHEV has a rich API and templating system that I will be leveraging to create the VMs where I'll be installing OpenShift.

Red Hat IdM/FreeIPA provides a centrally managed Identity, Policy, and Audit server. It combines LDAP, MIT Kerberos, NTP, DNS, and a CA certificate system. For my testing I am mainly using the DNS system to dynamically create my DNS entries using the API.

Prework and Assumptions

I'm going to make a few assumptions, mostly having to do with infrastructure. These are things that are either out of the scope of this post or things that I'm assuming you already have in place in your environment.

  • I have created a VM Template based on the section in the OpenShift doc that describes how to prepare your hosts for OpenShift.
  • All these hosts have the correct SSH keys installed
  • I have an IPA domain/realm of cloud.chx that I also use for DNS.
  • I have DHCP set up with all my IPs in DNS (forward AND reverse are setup)

Automated Installs

I'm going to go through, at a high level, some of the sections of my playbook from my github repo. For a more detailed overview please see that repo itself. This is just to give you an idea of the thought process in automating installs.

RHEV/oVirt Auth

First and foremost, you'll need to set up how ansible authenticates to RHEVM/oVirt. This setting just sets up credentials to use for subsequent ovirt_vm module calls. The config looks like this.

  - name: Get RHEVM Auth
      username: admin@internal
      password: "{{ lookup('env','OVIRT_PASSWORD') }}"
      insecure: true
      state: present

Note the {{ lookup('env','OVIRT_PASSWORD') }} value. This says that ansible will be looking up the password in an environment variable (I'll be doing this a lot in my playbook).

VM Creation

In order to create a VM from my template; I will need to call the ovirt_vm module. This is where you specify the size and specs of the servers. You also specify the template (this is the one I created that I used the host preparation guide against).

  - name: Create VMs for OpenShift
      auth: "{{ ovirt_auth }}"
      name: "{{ item }}"
      comment: This is {{ item }} for the OCP cluster
      state: running
      cluster: Default
      template: rhel-7.6-template
      memory: 16GiB
      memory_max: 24GiB
      memory_guaranteed: 12GiB
      cpu_threads: 2
      cpu_cores: 2
      cpu_sockets: 1
      type: server
      operating_system: rhel_7x64
      - name: nic1
        profile_name: ovirtmgmt
      wait: true
      - master1
      - app1
      - app2
      - app3

Adding OCS Disk

Since I am using OpenShift Container Storage (OCS); I used ovirt_disk to attach an extra disk to my application servers

  - name:  Attach OCS disk to VM 
      auth: "{{ ovirt_auth }}"
      name: "{{ item }}_disk2"
      vm_name: "{{ item }}"
      state: attached
      size: 250GiB
      storage_domain: vmdata
      format: cow
      interface: virtio_scsi
      wait: true
      - app1
      - app2
      - app3

Creating an install hostfile

In order to install OpenShift v3.11 you will need to create an ansible host file (since OpenShift v3.x uses ansible to install). I use ansible templating to dynamically create this file to use for installation. Here I am getting the server information from what I created and using it to build my ansible host file for OpenShift

  - name: Obtain VM information
      auth: "{{ ovirt_auth }}"
      pattern: name=master* or name=app* and cluster=Default
      fetch_nested: true
      nested_attributes: ips

  - name: Write out a viable hosts file for OCP installation
      src: ../templates/poc-generated_inventory.j2
      dest: ../output_files/poc-generated_inventory.ini
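To illustrate what the templating step does conceptually, here is a minimal sketch in Python that turns a list of VMs into an INI-style inventory. This is not my actual Jinja2 template: the `render_inventory` helper, the `[masters]` and `[apps]` group names, and the example FQDNs are assumptions for illustration, and a real OpenShift inventory needs many more groups and variables.

```python
def render_inventory(vms):
    """Render a small INI-style inventory from a list of VM dicts.

    Each VM dict needs a `name` and an `fqdn`. Names starting with "master"
    go into [masters]; everything else goes into [apps].
    """
    masters = [vm["fqdn"] for vm in vms if vm["name"].startswith("master")]
    apps = [vm["fqdn"] for vm in vms if not vm["name"].startswith("master")]
    lines = ["[masters]", *masters, "", "[apps]", *apps]
    return "\n".join(lines) + "\n"


# Hypothetical VM facts, shaped loosely like what a facts lookup might return.
example_vms = [
    {"name": "master1", "fqdn": "master1.example.com"},
    {"name": "app1", "fqdn": "app1.example.com"},
    {"name": "app2", "fqdn": "app2.example.com"},
]

print(render_inventory(example_vms))
```

The idea is the same as in the playbook: gather facts about the VMs that were just created, then feed them into a template so the inventory never has to be written by hand.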

Now that I have that file, I add what will be the master to the in-memory inventory so that I can copy the file over to it.

  - name: Obtain Master1 VM information
    ovirt_vm_facts:
      auth: "{{ ovirt_auth }}"
      pattern: name=master1 and cluster=Default
      fetch_nested: true
      nested_attributes: ips

  - name: Set Master1 VM Fact
    set_fact:
      ocp_master: "{{ ovirt_vms.0.fqdn }}"

  - name: Add "{{ ocp_master }}" to in memory inventory
    add_host:
      name: "{{ ocp_master }}"

  - name: Copy inventory to "{{ ocp_master }}"
    copy:
      src: ../output_files/poc-generated_inventory.ini
      dest: /etc/ansible/hosts
      owner: root
      group: root
      mode: "0644"
    delegate_to: "{{ ocp_master }}"

Note that I'm using delegate_to in order to run the task against the master that just got provisioned.

Creating DNS entries

Since I'm using IPA for DNS, I will be tapping into the API in order to create entries. I set up some defaults (like domains and such) as variables.

- hosts: all
  vars:
    wildcard_domain: ""
    console_fqdn: ""
    zone_fwd: "cloud.chx"

Using these variables, I create DNS entries pointing the wildcard DNS and console DNS at the master (where these objects will be running).

  - name: Create DNS CNAME record
    ipa_dnsrecord:
      ipa_pass: "{{ lookup('env','IPA_PASSWORD') }}"
      ipa_user: admin
      ipa_timeout: 30
      validate_certs: false
      zone_name: "{{ zone_fwd }}"
      record_name: openshift
      record_type: 'CNAME'
      record_value: "{{ ocp_master }}."
      record_ttl: 3600
      state: present

  - name: IPA create Wildcard DNS record
    ipa_dnsrecord:
      ipa_pass: "{{ lookup('env','IPA_PASSWORD') }}"
      ipa_user: admin
      ipa_timeout: 30
      validate_certs: false
      zone_name: "{{ zone_fwd }}"
      record_name: '*.osa'
      record_type: 'CNAME'
      record_value: "{{ ocp_master }}."
      record_ttl: 3600
      state: present

Note: This overwrites the entries if they are already set. If nothing is set, it creates them.

Preparing the hosts

When I created the VM template I created it as generic as possible. Still, it's nice to make sure the servers are updated with the proper packages. This is kind of ugly and a work in progress.

  - name: Update packages via ansible from "{{ ocp_master }}"
    shell: |
      ansible all -m shell -a "subscription-manager register --username {{ lookup('env','OREG_AUTH_USER') }} --password {{ lookup('env','OREG_AUTH_PASSWORD') }}"
      ansible all -m shell -a "subscription-manager attach --pool {{ lookup('env','POOL_ID') }}"
      ansible all -m shell -a "subscription-manager repos --disable=*"
      ansible all -m shell -a "subscription-manager repos --enable=rhel-7-server-rpms --enable=rhel-7-server-extras-rpms --enable=rhel-7-server-ose-3.11-rpms --enable=rhel-7-server-ansible-2.6-rpms --enable=rh-gluster-3-client-for-rhel-7-server-rpms"
      #ansible all -m shell -a "yum -y update"
      # Temp fix because of
      ansible all -m shell -a "yum -y update --exclude java-1.8.0-openjdk*"
      ansible all -m shell -a "systemctl reboot"
    delegate_to: "{{ ocp_master }}"

  - name: Wait for servers to restart
    wait_for:
      host: "{{ ocp_master }}"
      port: 22
      delay: 30
      timeout: 300

In the above I'm using the shell module. Ideally you'd want to use the package and the redhat_subscription modules. For right now this should work fine.
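
As a rough sketch of what a module-based version could look like (it is not taken from the playbook above), the play below assumes an inventory that already lists the new VMs, Ansible 2.8 or later on the machine running it (the purge option of rhsm_repository and the reboot module are not available in older releases), and the same environment variables as the shell version:

```yaml
# Sketch only: module-based host preparation.
# Assumes the new VMs are already in the inventory this play runs against.
- hosts: all
  become: true
  tasks:
    - name: Register the host and attach the pool
      redhat_subscription:
        state: present
        username: "{{ lookup('env', 'OREG_AUTH_USER') }}"
        password: "{{ lookup('env', 'OREG_AUTH_PASSWORD') }}"
        pool_ids:
          - "{{ lookup('env', 'POOL_ID') }}"

    - name: Enable only the repositories the installer needs
      rhsm_repository:
        name:
          - rhel-7-server-rpms
          - rhel-7-server-extras-rpms
          - rhel-7-server-ose-3.11-rpms
          - rhel-7-server-ansible-2.6-rpms
          - rh-gluster-3-client-for-rhel-7-server-rpms
        purge: true   # disable every other enabled repository

    - name: Update all packages except the OpenJDK ones
      yum:
        name: '*'
        state: latest
        exclude: java-1.8.0-openjdk*

    - name: Reboot and wait for the host to come back
      reboot:
        reboot_timeout: 300
```

Since the reboot module waits for each host to come back, a separate wait task would not be needed with this approach.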

Note: I reference a bug where you have to add an exclude to your yum update command.

Running the installer

Now that I have my hosts file for the installer, my DNS in place, and my VMs ready to go, I can go ahead with the install. The installer playbooks run directly on the master, but I kick them off from my laptop through this playbook. Again, I'm using delegate_to to do this.

  - name: Running OCP prerequisites from "{{ ocp_master }}"
    shell: |
      ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml
    delegate_to: "{{ ocp_master }}"

  - name: Running OCP installer from "{{ ocp_master }}"
    shell: |
      ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
    delegate_to: "{{ ocp_master }}"

  - name: Display OCP artifacts from "{{ ocp_master }}"
    shell: oc get pods --all-namespaces
    register: ocpoutput
    delegate_to: "{{ ocp_master }}"

  - debug: var=ocpoutput.stdout_lines

Once the install is done, I run oc get pods --all-namespaces to check the status of everything and print the output.
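
If you would rather have the play fail than read the output by eye, a follow-up task could filter on pod phase. This is only a sketch: the register name pending_pods is made up for illustration, it reuses the ocp_master fact from earlier, and it checks the phase only, so a pod that is Running but not ready would still pass.

```yaml
# Sketch only: fail the play if any pod is in a phase other than
# Running or Succeeded. pending_pods is an illustrative name.
- name: Look for pods that are not Running or Succeeded
  shell: >
    oc get pods --all-namespaces --no-headers
    --field-selector=status.phase!=Running,status.phase!=Succeeded
  register: pending_pods
  changed_when: false
  failed_when: pending_pods.rc != 0 or pending_pods.stdout_lines | length > 0
  delegate_to: "{{ ocp_master }}"
```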


I took a "cattle" approach in building this. Since I'm creating everything dynamically, whenever something went wrong I simply destroyed everything, fixed the problem, and re-ran the playbook. This is the beauty of automating installs. Here is an example of my destroy/uninstall playbook.

  - name: Delete VM
    ovirt_vm:
      auth: "{{ ovirt_auth }}"
      name: "{{ item }}"
      state: absent
      cluster: Default
      wait: true
    with_items:
      - master1
      - master2
      - master3
      - infra1
      - infra2
      - infra3
      - app1
      - app2
      - app3
      - lb

As you can see, I just delete everything. Since my DNS gets updated on creation, there's no NEED to remove those entries either (although it's probably a good idea to do so).
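
If you do want to clean up DNS as well, a task along these lines could be added to the destroy playbook. It is a sketch that assumes the records were created with the ipa_dnsrecord module, and that ocp_master and zone_fwd are still defined when the destroy playbook runs (for example by looking up master1 before deleting it), which the playbook above does not show.

```yaml
# Sketch only: remove the console CNAME created during the install.
# Mirrors the create task with state set to absent. Assumes ocp_master
# and zone_fwd are defined in the destroy playbook.
- name: Remove DNS CNAME record
  ipa_dnsrecord:
    ipa_pass: "{{ lookup('env','IPA_PASSWORD') }}"
    ipa_user: admin
    validate_certs: false
    zone_name: "{{ zone_fwd }}"
    record_name: openshift
    record_type: 'CNAME'
    record_value: "{{ ocp_master }}."
    state: absent
```

The wildcard record could be removed the same way by changing record_name to '*.osa'.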


In this blog I took you through a high-level overview of how you can automate Kubernetes/OpenShift installations using open source tools. I used OpenShift, RHEV, Red Hat IdM, and Ansible specifically in my example.

You can also apply this to other tools like Dynamic DNS, Kubernetes, Puppet, Chef, VMware ESXi/vCenter, etc. The specific tools aren't what matters; getting to the point where you are automating installs is!

If you plan on running Kubernetes/OpenShift in production you will be running many clusters, and having a way to stamp these out will be paramount for your cloud native environment.

Use less YAML

I have been thinking about what my first blog post should be about. Since I just took the CKA (by the way, I passed!), I have Kubernetes shorthand commands on the brain, so I'll write about using less YAML when working with k8s.

When studying for the CKA, I came across a lot of blogs/howtos that show things like creating pods and deployments by writing a YAML file and running kubectl create -f ... or (what's worse) piping a heredoc with cat <<EOF | kubectl create -f - to create a resource.

Now don't get me wrong; I'm not bashing YAML. When working with Kubernetes, you'll inevitably have to use YAML at some point, and it's a completely valid way to do things. But when you're doing something like an exam, where time is precious, the shortcuts below can come in handy!

During the CKA exam you have, if you average it out, 7 minutes per question. So I learned how to generate resources with the kubectl command to save time.

So to create a deployment you can do the following

$ kubectl create deployment welcome-php
deployment.apps/welcome-php created

This creates all my resources
$ kubectl get deploy,rs,pod
NAME                                DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.extensions/welcome-php   1         1         1            1           7m

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.extensions/welcome-php-57db6cbb6   1         1         1       7m

NAME                              READY   STATUS    RESTARTS   AGE
pod/welcome-php-57db6cbb6-kjtcz   1/1     Running   0          7m

Now I can create a service by exposing the deployment
$ kubectl expose deploy welcome-php --port=8080 --target-port=8080
service/welcome-php exposed

Now I have all the resources I need for my application.
$ kubectl get deploy,svc,rs,pod
NAME                                DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.extensions/welcome-php   1         1         1            1           15m

NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/welcome-php   ClusterIP   <none>        8080/TCP   1m

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.extensions/welcome-php-57db6cbb6   1         1         1       15m

NAME                              READY   STATUS    RESTARTS   AGE
pod/welcome-php-57db6cbb6-kjtcz   1/1     Running   0          15m

If you look at kubectl create -h, it shows what you can create via the CLI. Here is a snippet.
Available Commands:
  clusterrole         Create a ClusterRole.
  clusterrolebinding  Create a ClusterRoleBinding for a particular ClusterRole
  configmap           Create a configmap from a local file, directory or literal value
  deployment          Create a deployment with the specified name.
  job                 Create a job with the specified name.
  namespace           Create a namespace with the specified name
  poddisruptionbudget Create a pod disruption budget with the specified name.
  priorityclass       Create a priorityclass with the specified name.
  quota               Create a quota with the specified name.
  role                Create a role with single rule.
  rolebinding         Create a RoleBinding for a particular Role or ClusterRole
  secret              Create a secret using specified subcommand
  service             Create a service using specified subcommand.
  serviceaccount      Create a service account with the specified name

So you can run the following to create and expose your application without using any YAML.
$ kubectl create deployment welcome-php
$ kubectl expose deploy welcome-php --port=8080 --target-port=8080

You can even do the same with a pod and a NodePort service. (Note that I named the NodePort service the same as the pod.)
$ kubectl run nginx --image=nginx --generator=run-pod/v1 -l app=nginx
pod/nginx created

$ kubectl create service nodeport nginx --node-port=32000 --tcp=80:80 
service/nginx created
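
For comparison, here is roughly the YAML those two commands save you from writing. It is a hand-written sketch rather than captured output, so the objects kubectl actually generates may carry extra defaulted fields. It also shows why the names match: kubectl create service sets the selector to app=<service name>, which lines up with the app=nginx label put on the pod.

```yaml
# Sketch only: approximately what the two commands above create.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx            # from -l app=nginx
spec:
  containers:
  - name: nginx
    image: nginx
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  selector:
    app: nginx            # kubectl create service selects on app=<service name>
  ports:
  - name: 80-80           # kubectl names the port after the --tcp pair
    protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 32000       # from --node-port=32000
```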

When working with Kubernetes, you will run into lots of YAML that you end up copying and pasting. You can save yourself some typing if you use kubectl to create these resources for you!