
I have done some research on the customization possibilities of the TKG service (integrated into vSphere) because I hit some limits (see my previous blog post). Luckily, I have an installation in my home lab where I can go a few steps further than I would on a production system. The following write-up introduces actions that are not supported by VMware and should only be followed if you know exactly what you are doing and do not rely on official support. My PoC can be installed on either the management cluster or a workload cluster; it is up to you how far you want to go.
Implemented Functionality
After running into issues with custom CAs and insecure Docker registries, the first goal of my PoC was to provide a solution to this problem and, if possible, to enable further reconfiguration options that are not yet available out of the box. I had a discussion with one of the Tanzu PMs, and he promised that a solution integrated into the vCenter UI is coming.
The second part reuses some code I wrote for a Swisscom project: it provides automated DNS record management for Active Directory DNS servers over WinRM. I am using the watch functionality of the Kubernetes API and the ability to generate Java API stubs via Swagger.
Access to the Management Cluster (Danger Zone)
If you do not want to install anything on the Management Cluster (which is not recommended anyway!), you can skip this paragraph. The most important things can also be achieved by switching the Kube context to the namespace you have defined in vCenter.
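For completeness, the supported way in is the kubectl vSphere plugin; a minimal sketch, with the Supervisor address and user name as placeholders:

kubectl vsphere login --server=<supervisor-ip> --vsphere-username administrator@vsphere.local
kubectl config use-context niceneasy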
I was quite surprised that it is possible to access all the Tanzu clusters, even the vCenter Management Cluster. You have to ssh into vCenter and execute
/usr/lib/vmware-wcp/decryptK8Pwd.py
This gives you the SSH password and the cluster IP. You can also look up the IPs in vCenter and log in as root.
Copy the file
/etc/kubernetes/admin.conf
to your working environment and adapt the server address within this file to the cluster IP. Now you can access the management cluster as a cluster-admin by setting an environment variable:
export KUBECONFIG=<path-to-your-file>
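Putting these steps together, a small sketch (the cluster IP is a placeholder, and I assume the default API server port 6443):

# copy the admin kubeconfig from the control plane VM (IP and password from decryptK8Pwd.py)
scp root@<cluster-ip>:/etc/kubernetes/admin.conf ./mgmt-admin.conf
# point the server entry at the cluster IP (the copied file may reference a local address)
sed -i 's|server: .*|server: https://<cluster-ip>:6443|' ./mgmt-admin.conf
export KUBECONFIG=$(pwd)/mgmt-admin.conf
kubectl get nodes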
Now you are able to inspect what is under the hood of these managed components!
Access to the Workload Clusters
I have configured a namespace called “niceneasy” and a workload cluster called “workload-01”. First I switch the Kube context and list all secrets in this namespace; by the way, it is a namespace within the management cluster.
> kubens niceneasy
Context "kubernetes-admin@kubernetes" modified.
Active namespace is "niceneasy".
> kubectl get secrets
NAME                              TYPE                                  DATA   AGE
default-token-9bnkn               kubernetes.io/service-account-token   3      3d23h
workload-01-antrea                kubernetes.io/tls                     3      3d23h
workload-01-ca                    Opaque                                2      3d23h
workload-01-ccm-token-j59vx       kubernetes.io/service-account-token   3      3d23h
workload-01-control-plane-sb9rq   cluster.x-k8s.io/secret               1      3d23h
workload-01-encryption            Opaque                                1      3d23h
workload-01-etcd                  Opaque                                2      3d23h
workload-01-kubeconfig            Opaque                                1      3d23h
workload-01-proxy                 Opaque                                2      3d23h
workload-01-pvcsi-token-72tvv     kubernetes.io/service-account-token   3      3d23h
workload-01-sa                    Opaque                                2      3d23h
workload-01-ssh                   kubernetes.io/ssh-auth                1      3d23h
workload-01-ssh-password          Opaque                                1      3d23h
workload-01-workers-lk46m-65nq6   cluster.x-k8s.io/secret               1      3d23h
workload-01-workers-lk46m-fstz4   cluster.x-k8s.io/secret               1      3d23h
workload-01-workers-lk46m-tgsc5   cluster.x-k8s.io/secret               1      3d23h
The following secrets are particularly interesting:
- workload-01-kubeconfig contains the admin.conf for this cluster
- workload-01-ssh contains a private key for SSH access
- workload-01-ssh-password contains the password for SSH, user: vmware-system-user
These are the credentials needed for the approach of Eric Shanks and William Lam. I did similar tests, but I hit the following two problems:
- My first implementation tried to restart containerd on the same node the container was running on. This led to errors: systemctl came back with a message that some files were not accessible.
- I was looking for a very generic solution without messing with passwords in configurations.
That’s why I planned to install the component directly in the management cluster. Here you have full access to all necessary information and credentials and can solve the problem without exposing any secrets. In addition, the management cluster should be managing the nodes…
Now let’s extract the private key for the workload cluster VMs:
> kubectl get secret workload-01-ssh -o jsonpath='{.data.ssh-privatekey}' | base64 -d > ~/.ssh/workload-01-ssh
> chmod 600 ~/.ssh/workload-01-ssh
> kubectx workload-01
> kubectl get nodes -o wide
NAME                                         STATUS   ROLES    AGE    VERSION             INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                 KERNEL-VERSION       CONTAINER-RUNTIME
workload-01-control-plane-4cpqg              Ready    master   4d1h   v1.18.10+vmware.1   10.0.10.7     <none>        VMware Photon OS/Linux   4.19.154-3.ph3-esx   containerd://1.3.4
workload-01-workers-fs9dc-7c57f6d49b-hb2lz   Ready    <none>   4d1h   v1.18.10+vmware.1   10.0.10.14    <none>        VMware Photon OS/Linux   4.19.154-3.ph3-esx   containerd://1.3.4
workload-01-workers-fs9dc-7c57f6d49b-lxhqx   Ready    <none>   4d1h   v1.18.10+vmware.1   10.0.10.6     <none>        VMware Photon OS/Linux   4.19.154-3.ph3-esx   containerd://1.3.4
workload-01-workers-fs9dc-7c57f6d49b-ztjqh   Ready    <none>   4d1h   v1.18.10+vmware.1   10.0.10.9     <none>        VMware Photon OS/Linux   4.19.154-3.ph3-esx   containerd://1.3.4
> ssh vmware-system-user@10.0.10.7 -i .ssh/workload-01-ssh
The first line extracts and decodes the key into a file, and the second restricts its permissions so it can be used as an SSH identity. The next two lines switch to the workload context and list the nodes of the cluster; “-o wide” also shows the IPs of the VMs. The last line opens an SSH session to the control-plane VM.
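Once on a node, the kind of manual change I was originally after (trusting a custom CA so containerd accepts my registry) boils down to something like the following sketch; the certificate file name and the Photon OS rehash helper are assumptions you should verify on your node image:

# copy a custom CA to a node and make containerd pick it up (sketch, run from your workstation)
scp -i ~/.ssh/workload-01-ssh my-ca.crt vmware-system-user@10.0.10.7:/tmp/
ssh -i ~/.ssh/workload-01-ssh vmware-system-user@10.0.10.7 \
  'sudo cp /tmp/my-ca.crt /etc/ssl/certs/ && sudo rehash_ca_certificates.sh && sudo systemctl restart containerd'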
This extraction and decoding with jsonpath is quite handy. If you are not sure what the structure of a secret looks like, just use “-o yaml” to check the valid field names.
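The same pattern works for the workload-01-kubeconfig secret mentioned above; a small sketch, run against the management cluster in the niceneasy namespace, assuming the data key is called “value” (the usual layout for Cluster API kubeconfig secrets, verify it with “-o yaml”):

kubectl -n niceneasy get secret workload-01-kubeconfig -o jsonpath='{.data.value}' | base64 -d > workload-01.kubeconfig
export KUBECONFIG=$(pwd)/workload-01.kubeconfig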
Management Component
Check out the code from GitHub and adapt the config files to your environment (see the documentation). This time I used the built-in kustomize utility, so once you have added the necessary kubeconfig and account tokens, deploy it simply by issuing
kubectl apply -k k8s/base
from the root of the project. Kustomize will generate configmaps and secrets as needed.
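If you want to double-check what kustomize will generate (the configmaps and secrets mentioned above) before applying anything, you can render the base first:

kubectl kustomize k8s/base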
You can deploy it as-is on the management cluster, or you can roll it out on a workload cluster; in the latter case I would remove the host networking (lines 22 and 31 in deployment.yaml). I have installed Tanzu (very) Basic without NSX-T, and it looks to me as if the management cluster is running without an (accessible?) CNI. When I tried to deploy the component without host networking, an error about a missing NSX-T CNI popped up. I assume this would be solved if the management cluster were running on NSX-T. For a workload cluster you have to specify a CNI implementation, so you should not need host networking at all.
Limitations: currently only one DNS server and only one workload cluster can be configured.
Node Customizer Component
In the folder k8s/daemonset you will find the component that adds CAs to each node and demonstrates some other tricks. You need to add a service account with a clusterrolebinding to privileged access in order to permit file modifications on the underlying VM. Adjust the referenced endpoint for the management component: it creates a load balancer, and you can get the exposed IP by looking up the tanzu-customizer service (external IP).
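A hedged sketch of these two steps; the service account name and namespace are assumptions (check the manifests in k8s/daemonset), and psp:vmware-system-privileged is the ClusterRole that TKG workload clusters typically ship for privileged pods:

# allow the daemonset's service account to run privileged pods (names are assumptions)
kubectl create clusterrolebinding node-customizer-privileged \
  --clusterrole=psp:vmware-system-privileged \
  --serviceaccount=kube-system:node-customizer

# look up the external IP exposed by the management component's load balancer
kubectl get svc tanzu-customizer -o jsonpath='{.status.loadBalancer.ingress[0].ip}'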