This is part 3 out of 5 of the series [Building a Kubernetes cloud-controller-manager]({{< ref "." >}}).
Now that our side-quest is finished, we can finally go back to the reason you are all here: implementing the controllers of our cloud-controller-manager.
The `kubelet` is usually responsible for creating the `Node` object in Kubernetes on its first startup.
This does not work well for the removal of the node, as the `kubelet` might not get a chance to do this. Instead, the Node Lifecycle controller regularly checks whether the server was deleted in the Cloud API and, if so, also deletes the `Node` object in Kubernetes.
As luck would have it, `k8s.io/cloud-provider` already implements most of the logic, so we can focus on the integration with our Cloud Provider (Hetzner Cloud in my case).
Similar to the general `cloudprovider.Interface` from [Part 1]({{< ref "part-1.md#cloudproviderinterface" >}}), this post focuses on the [`cloudprovider.InstancesV2`](https://pkg.go.dev/k8s.io/cloud-provider@v0.28.2#InstancesV2) interface.
We also need to parse and save the ID of the network that our nodes use to communicate with each other. We will use this later to figure out the internal IP of the node.
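Parsing the network ID can be sketched like this. It is a minimal sketch that assumes the ID arrives as a string (for example from an environment variable such as `HCLOUD_NETWORK_ID`, a name I picked for illustration). The function name `parseNetworkID` is hypothetical too.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parseNetworkID converts the raw configuration value into a numeric network ID.
// Where the raw value comes from (an environment variable, a flag, a config file)
// is an assumption of this sketch and up to you.
func parseNetworkID(raw string) (int64, error) {
	if raw == "" {
		return 0, errors.New("network ID is not set")
	}

	id, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid network ID %q: %w", raw, err)
	}
	if id <= 0 {
		return 0, fmt.Errorf("invalid network ID %d: must be positive", id)
	}

	return id, nil
}
```

Failing early on a missing or malformed ID keeps the error close to the configuration mistake, instead of surfacing later as a node without an internal IP.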
When we start our program now, it should detect that the `InstancesV2` interface is supported and start the _Node_ & _NodeLifecycle_ controllers.
### Matching Up Nodes
For all methods in this interface, we need to ask the Cloud API about details of our server. For this, we need to somehow match the Kubernetes `Node` with its Cloud API server equivalent.
On the first run, we need to use some heuristic to make this match. All methods in the interface receive the `Node` as a parameter, so we could use any of its fields, like the Name, IP, Labels or Annotations. To keep this simple, we are going to use the Node name and expect that the server in the Hetzner Cloud API has the same name.
In `InstanceMetadata` we return a `ProviderID`, which is written to `Node.spec.providerID` by `k/cloud-provider`. Once we have initialized the node, we can therefore use the ID we previously saved and no longer need the name heuristic. We are going to use the format `ccm-from-scratch://$ID` for our `ProviderID`. By using a unique prefix, we can avoid issues where users accidentally install multiple cloud providers into their cluster.
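To illustrate the `ccm-from-scratch://$ID` format, here is a small sketch of building and parsing such a `ProviderID`. The helper names `formatProviderID` and `parseProviderID` are made up for this example, and I assume the server ID is numeric.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const providerName = "ccm-from-scratch"

// providerIDPrefix is the unique prefix that marks a ProviderID as ours.
const providerIDPrefix = providerName + "://"

// formatProviderID builds the ProviderID for a server ID.
func formatProviderID(serverID int64) string {
	return fmt.Sprintf("%s%d", providerIDPrefix, serverID)
}

// parseProviderID extracts the server ID from a ProviderID. IDs that were
// written by a different cloud provider (other prefix) are rejected.
func parseProviderID(providerID string) (int64, error) {
	if !strings.HasPrefix(providerID, providerIDPrefix) {
		return 0, fmt.Errorf("providerID %q does not start with %q", providerID, providerIDPrefix)
	}

	rawID := strings.TrimPrefix(providerID, providerIDPrefix)
	id, err := strconv.ParseInt(rawID, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("providerID %q has an invalid server ID: %w", providerID, err)
	}

	return id, nil
}
```

For example, `formatProviderID(42)` yields `ccm-from-scratch://42`, and parsing a value like `hcloud://42` fails because the prefix belongs to a different provider.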
The `InstanceMetadata` method is called by the _Node_ controller when the Node is initialized for the first time.
To get started, we need to get the Server from the API, using the method described in the [previous section](#matching-up-nodes). Matching up by name is pretty easy:
```go
package ccm

// ccm/instances_v2.go

import (
	"context"
	"fmt"

	"github.com/hetznercloud/hcloud-go/v2/hcloud"
	corev1 "k8s.io/api/core/v1"
	cloudprovider "k8s.io/cloud-provider"
)

const (
	// A constant for the ProviderID. If you want, you can also update `CloudProvider.ProviderName` to use it.
	providerName = "ccm-from-scratch"
)

func (i *InstancesV2) InstanceMetadata(ctx context.Context, node *corev1.Node) (*cloudprovider.InstanceMetadata, error) {
	var server *hcloud.Server
	var err error

	if node.Spec.ProviderID == "" {
		// If no ProviderID was set yet, we use the name to get the right server
		// ...
	}
	// ...
}
```
We already discussed `ProviderID`. The value for `InstanceType` is also quite easy: we are going to use the Server Type from the Hetzner Cloud API. `Zone` and `Region` are a bit more open-ended, and I recommend you read the docs for their respective labels ([Region][label-region], [Zone][label-zone]). In short, a Zone is the smaller unit and a Region spans multiple Zones. For the Hetzner Cloud API, we are going to use the Network Zone (e.g. `eu-central`) as the Region, and the Location (e.g. `fsn1`) as the Zone. For your own API, you should evaluate what makes the most sense based on the availability guarantees of your Zones/Regions/DCs/... and the needs of your customers.
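The mapping described above can be sketched as a plain function. This is a simplified illustration: `server` and `instanceMetadata` are stand-in types with only the fields discussed here (the real ones are `hcloud.Server` and `cloudprovider.InstanceMetadata`), and `metadataForServer` is a hypothetical helper name.

```go
package main

import "fmt"

// server is a simplified stand-in for the server object of the Cloud API.
type server struct {
	ID          int64
	ServerType  string
	Location    string // e.g. "fsn1"
	NetworkZone string // e.g. "eu-central"
}

// instanceMetadata is a stand-in for cloudprovider.InstanceMetadata,
// without the NodeAddresses field.
type instanceMetadata struct {
	ProviderID   string
	InstanceType string
	Zone         string
	Region       string
}

// metadataForServer maps the server fields to the metadata fields:
// Server Type -> InstanceType, Location -> Zone, Network Zone -> Region.
func metadataForServer(s server) instanceMetadata {
	return instanceMetadata{
		ProviderID:   fmt.Sprintf("ccm-from-scratch://%d", s.ID),
		InstanceType: s.ServerType,
		Zone:         s.Location,
		Region:       s.NetworkZone,
	}
}
```

Keeping the mapping in one small function makes it easy to revisit the Zone/Region decision later without touching the API calls.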
`NodeAddresses` is a bit more complicated. We need to list all the addresses of the server along with their respective type (`NodeHostName`, `NodeExternalIP`, `NodeInternalIP`). We will put this into a separate function to keep the `InstanceMetadata` method clean.
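A rough sketch of such a function, assuming a heavily simplified `server` type: one optional public IPv4 address and a map from network ID to the private IP in that network. These shapes, and the `nodeAddress` type, are stand-ins for illustration. The string values of the address types match the `corev1.NodeAddressType` constants.

```go
package main

// Address types, using the same string values as the corev1 constants.
const (
	NodeHostName   = "Hostname"
	NodeExternalIP = "ExternalIP"
	NodeInternalIP = "InternalIP"
)

// nodeAddress is a stand-in for corev1.NodeAddress.
type nodeAddress struct {
	Type    string
	Address string
}

// server is a simplified stand-in for the server object of the Cloud API.
type server struct {
	Name       string
	PublicIPv4 string           // empty if the server has no public IPv4
	PrivateIPs map[int64]string // network ID -> IP of the server in that network
}

// nodeAddresses lists the addresses of the server with their types. The
// internal IP is only added if the server is attached to the configured network.
func nodeAddresses(s server, networkID int64) []nodeAddress {
	addresses := []nodeAddress{
		{Type: NodeHostName, Address: s.Name},
	}

	if s.PublicIPv4 != "" {
		addresses = append(addresses, nodeAddress{Type: NodeExternalIP, Address: s.PublicIPv4})
	}

	if ip, ok := s.PrivateIPs[networkID]; ok {
		addresses = append(addresses, nodeAddress{Type: NodeInternalIP, Address: ip})
	}

	return addresses
}
```

This is where the network ID we saved earlier comes in: a server can be attached to several networks, and only the IP in the configured one should become the internal IP.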
Now that we have the basic functionality in, I would like to extract the code to parse the `ProviderID` and get the server into their own functions. This will make them reusable for the other two methods in the interface.
Now that is looking better! One last thing I want to do is to replace the inline error (`errors.New("server not found")`) with a predefined error, which makes implementing the next method way easier.
```go {hl_lines=[6]}
package ccm

// ccm/instances_v2.go

var (
	errServerNotFound = errors.New("server not found")
)

func (i *InstancesV2) getServerForNode(ctx context.Context, node *corev1.Node) (*hcloud.Server, error) {
	var server *hcloud.Server
	var err error

	if node.Spec.ProviderID == "" {
		// If no ProviderID was set yet, we use the name to get the right server
		// ...
	}
	// ...
}
```
`InstanceExists` is called by the _NodeLifecycle_ controller. If it returns `false`, the `Node` is deleted. It is probably the easiest of the three methods in this interface, as we only need to return `true` if a server matching the `Node` exists in the API, and `false` otherwise.
```go
package ccm

// ccm/instances_v2.go

func (i *InstancesV2) InstanceExists(ctx context.Context, node *corev1.Node) (bool, error) {
	// We get the server using the method from the previous section
	_, err := i.getServerForNode(ctx, node)
	if err != nil {
		// If we get back our predefined error, we know that
		// ...
	}
	// ...
}
```
`InstanceShutdown` is also called by the _NodeLifecycle_ controller. If we return `true`, the `Node` gets the taint [`node.cloudprovider.kubernetes.io/shutdown`][taint-shutdown]. Similar to `InstanceExists`, this is pretty easy to implement. For the Hetzner Cloud API, we just need to check if the Server is in status `"off"`.
```go
package ccm

// ccm/instances_v2.go

func (i *InstancesV2) InstanceShutdown(ctx context.Context, node *corev1.Node) (bool, error) {
	// ...
}
```