VMware Tanzu Basic: Installing TKG Extensions 1.2.0 – Part 1


Recently, I set up VMware Tanzu Basic on my home lab (4 Intel NUCs). This write-up documents my experiences with the setup and with testing the TKG Extensions 1.2.0. Installing the necessary infrastructure and tools is already very well documented. I have added links to the official documentation for installing the TKG Extensions as well, and mention only the pitfalls and lessons learned from my own installation.

vCenter with Tanzu Basic

Preparation Steps

If you want a very clean setup, you should get or create your own certificate authority. I created mine using KeyStore Explorer and followed the write-up by Eric Shanks for the first steps within vCenter. If you plan to use your own Docker registry, you will benefit from this investment later on. I will go into the details of how to use the CA within the workload clusters in a later blog post, but here is a link to the Tanzu documentation if you want to start right away. The workload clusters are highly customizable, but this comes at the price of higher complexity.

KeyStore Explorer

Networking (Home Lab)

I had quite a few issues with networking because the additional USB NICs were not running reliably. I followed the lead of Samuel Legrand and installed the USB Network Native Driver for ESXi (I took the /etc/rc.local.d/local.sh from here). This worked like a charm, and I got additional 1 Gbit links over USB.

To create a second L2 network, I installed pfSense on vSphere and used a second switch. This worked well initially, but then I realized that the second network over USB started to lose connectivity to certain IPs. This was a rather weird effect that I could overcome by creating a second VDS with Network I/O Control disabled.

2nd VDS for USB NICs

For my first tests I chose vSphere networking with HAProxy as the load balancer. I followed this quick start guide and also checked the official documentation for Tanzu on vSphere. For my installation I used a certificate signed by my CA and did a customized setup. I started from my home network 192.168.0.0/24, where my ESXi hosts and vCenter run directly, and carved out a VIP range for HAProxy (192.168.0.144/28). The workload networks are on a 10.0.0.0/20 network with pfSense as the routing gateway. This way I have enough IPs for multiple workload networks.

Test Cluster

After creating a namespace and successfully logging in on the CLI (see the quick start), I deployed my first cluster:

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  annotations:
    tkg/plan: dev
  labels:
    tkg.tanzu.vmware.com/cluster-name: s01
  name: s01
  namespace: niceneasy
spec:
  distribution:
    version: 1.18.5+vmware.1-tkg.1.c40d30d
  settings:
    network:
      cni:
        name: antrea
      pods:
        cidrBlocks:
        - 100.96.0.0/11
      serviceDomain: tph.local
      services:
        cidrBlocks:
        - 100.64.0.0/13
    storage:
      classes:
      - vsan-default-storage-policy
      defaultClass: vsan-default-storage-policy
  topology:
    controlPlane:
      class: best-effort-medium
      count: 1
      storageClass: vsan-default-storage-policy
    workers:
      class: best-effort-medium
      count: 3
      storageClass: vsan-default-storage-policy

and deployed it by calling

kubectl apply -f cluster.yaml
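Deploying the cluster takes a few minutes. While still in the supervisor cluster context, you can watch the progress; as a sketch, with the namespace and cluster name used above (the resource usually also has the short name tkc):

kubectl get tanzukubernetescluster -n niceneasy s01
kubectl describe tanzukubernetescluster -n niceneasy s01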

I have not yet found where to access the custom resource definition for TanzuKubernetesCluster directly, and I could not access the ones deployed in the vSphere management cluster. I did, however, find a ytt template in the ~/.tkg folder (.tkg/providers/infrastructure-tkg-service-vsphere/v1.0.0/ytt/base-template.yaml). Alternatively, you can use the OpenAPI capabilities of the Kubernetes API server by calling

curl -k https://<api-server>/openapi/v2 -H "Authorization: Bearer <token>" > swagger.json

You can find the API server URL and the bearer token by examining the output of

kubectl config view

This is a nice Kubernetes feature because it also enables generating client libraries for Kubernetes extensions. I will explain this in a later blog post.
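As a sketch, the server URL and token can also be pulled out of the current kubeconfig directly. This assumes the current context was created by kubectl vsphere login and uses a token-based user entry; adjust the jsonpath expressions otherwise:

# extract server URL and bearer token from the active kubeconfig context
SERVER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
TOKEN=$(kubectl config view --minify --raw -o jsonpath='{.users[0].user.token}')
curl -k "$SERVER/openapi/v2" -H "Authorization: Bearer $TOKEN" > swagger.json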

Important!
I changed the service domain for the cluster to “tph.local”; the default value is “cluster.local”.

The service domain is used in /etc/resolv.conf of each pod; for a pod in the namespace “tanzu-system-monitoring” it creates the following search entry:

search tanzu-system-monitoring.svc.tph.local svc.tph.local tph.local

This setting can have an impact on Helm charts or manifest templates where the service domain cannot be customized (as is the case in the TKG extensions).
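A quick way to verify that the custom service domain is in effect is a DNS lookup from a throwaway pod (pod name and busybox image are just examples):

kubectl run -it --rm dns-test --image=busybox --restart=Never -- nslookup kubernetes.default.svc.tph.local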

After the cluster has been successfully deployed, you have to log in again, this time with the cluster name added:

kubectl vsphere login --server=<your endpoint> --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify --tanzu-kubernetes-cluster-namespace niceneasy --tanzu-kubernetes-cluster-name s01

Now you can switch the Kubernetes context to your newly created cluster. Hint: take a look at the k8s utilities kubectx and kubens; bash-completion and k9s are further recommendations from my side.
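For example (in my setup the login above created a context named after the cluster; your context names may differ):

kubectl config get-contexts
kubectl config use-context s01
# or, with the helper tools
kubectx s01
kubens default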

Admission Controller – Pod Security Policies

Tanzu on vSphere has an admission controller enabled. I adapted the installation procedure to use the default pod security policies shipped with VMware Tanzu. Although the cluster admin should have the right to deploy workloads directly, it is best practice to explicitly define the PSP per application. As I did not want to check for each pod whether privileged access is really needed, I simply bound the privileged PSP to every service account below (time saving, but not suited for production).

It is possible to deactivate the admission controller, but I leave it enabled for now and just adapt the extensions to the out-of-the-box defaults of Tanzu Basic on vSphere.

TKG Extensions

Download the extensions and unzip them to your working directory.

Cert-Manager

Cert-Manager creates and renews certificates for Kubernetes components.

First I checked the file 03-cert-manager.yaml and found the service accounts “cert-manager”, “cert-manager-cainjector” and “cert-manager-webhook”. A ClusterRoleBinding is created with the following command (the namespace “cert-manager” does not have to exist yet):

kubectl create clusterrolebinding cert-manager --clusterrole=psp:vmware-system-privileged --serviceaccount=cert-manager:cert-manager --serviceaccount=cert-manager:cert-manager-cainjector --serviceaccount=cert-manager:cert-manager-webhook

Now you can simply cd into the cert-manager directory and deploy everything with

kubectl apply -f .
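To verify that cert-manager works, a self-signed Issuer plus a test Certificate is enough. The following manifest is only a sketch; the names, namespace and DNS name are my own, and the apiVersion depends on the cert-manager version bundled with the extensions:

apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: selfsigned-issuer
  namespace: default
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: selfsigned-test
  namespace: default
spec:
  secretName: selfsigned-test-tls
  dnsNames:
  - test.tph.local
  issuerRef:
    name: selfsigned-issuer

After applying it, kubectl get certificate -n default should show the test certificate as Ready after a few seconds.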

Contour Ingress Controller

Contour is a widely used ingress controller, comparable to NGINX. Again, prepare the PSPs for the service accounts involved and deploy the workload, this time with ytt.

kubectl create clusterrolebinding contour-privileged --clusterrole=psp:vmware-system-privileged --serviceaccount=tanzu-system-ingress:contour --serviceaccount=tanzu-system-ingress:envoy

ytt --ignore-unknown-comments -f common/ -f ingress/contour/ -v infrastructure_provider="vsphere" | kubectl apply -f-
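A quick check that Contour and Envoy came up (the pod names carry generated suffixes):

kubectl get pods -n tanzu-system-ingress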

Try to test it with the samples; they are well explained in the README.md. But do not forget to add a service account and create a ClusterRoleBinding. If you forget this, you will see an error like

Error creating: pods "<pod name>" is forbidden: unable to validate against any pod security policy: []

if you use

kubectl describe replicaset <service name>

So adapt accordingly:

kubectl create sa sample
kubectl create clusterrolebinding sample --clusterrole=psp:vmware-system-privileged --serviceaccount=test-ingress:sample

Then cd into ingress/examples/common and add the service account to 02-deployments.yaml (see the excerpt below).
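For illustration, a minimal Deployment of this shape could look as follows; the image and most names are placeholders rather than the exact content of the extensions bundle, the important line is serviceAccountName referencing the account created above:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloweb
  namespace: test-ingress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloweb
  template:
    metadata:
      labels:
        app: helloweb
    spec:
      serviceAccountName: sample
      containers:
      - name: helloweb
        image: gcr.io/google-samples/hello-app:1.0
        ports:
        - containerPort: 8080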

If the tests are successful, you have an ingress controller that lets you use a single load balancer IP and route traffic to the target Kubernetes services based on hostnames.

daniele@ubuntu-dt:~/dev/tkg-extensions$ kubectl get svc -n tanzu-system-ingress
NAME      TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE
contour   ClusterIP      100.68.64.67    <none>          8001/TCP                     3d4h
envoy     LoadBalancer   100.70.251.33   192.168.0.149   80:30761/TCP,443:30788/TCP   3d4h

With this command I found the external IP of my Contour ingress: 192.168.0.149. Because the traffic is routed by hostname, a DNS entry (CNAME) or an entry in /etc/hosts on the client is needed for each FQDN used via this ingress for further testing.
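For a quick test without touching DNS, an entry in /etc/hosts on the client plus curl is enough (foo.bar.com is a placeholder; use whatever host your test Ingress actually defines):

echo "192.168.0.149  foo.bar.com" | sudo tee -a /etc/hosts
curl http://foo.bar.com/

# or skip /etc/hosts and set the Host header explicitly
curl -H "Host: foo.bar.com" http://192.168.0.149/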
