In this post, I will install Kubernetes with an external etcd. Most of the instructions come from here. I will follow them step by step as a beginner.

 


 

1. Prerequisites

 

At first I need to prepare the hosts for the Kubernetes components. In my case, I have five nodes: one master, one external etcd node, and three workers. After preparing these nodes, I need to update the "/etc/hostname" and "/etc/hosts" files properly. For example, I have to change the hostname of the master node to "m1".

# hostname m1

# vi /etc/hostname
m1

# vi /etc/hosts
127.0.0.1       localhost       m1
147.75.94.251 m1
147.75.92.69 e1
147.75.92.139 w1
147.75.92.71 w2

After this, I need to log out and log back in.
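As a quick sanity check (my own habit, not part of the referenced instruction), I can confirm on the master that the hostname and the /etc/hosts entries are in place before going on:

# run on the master (m1); the names e1, w1, w2 come from /etc/hosts above
hostnamectl --static        # should print m1
getent hosts e1 w1 w2       # each name should resolve to the IP written in /etc/hosts
ping -c 1 e1                # basic reachability check toward the etcd node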

Kubernetes uses the Docker engine, so I need to install Docker first. I will follow this instruction. In my case, I will run Kubernetes on Ubuntu 16.04. This is the overview:

sudo apt-get update
sudo apt-get install -y \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io

I need to check "docker info". It will look like the output below. Note that "Cgroup Driver: cgroupfs" is the default.

# docker info
Server Version: 18.09.6
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive

Following this instruction, I will change the driver from "cgroupfs" to "systemd". This change affects several things during the Kubernetes installation, so I need to keep it in mind.

cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
systemctl daemon-reload
systemctl restart docker

After this, the Cgroup Driver is changed:

# docker info
Cgroup Driver: systemd

 

2. Install kubelet, kubeadm and kubectl

 

kubelet (the daemon on each node that handles containers), kubeadm (the tool that bootstraps the master and workers) and kubectl (the command-line tool to manage the cluster) are necessary components.

apt-get update && apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl
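It does not hurt to confirm the installed versions and that the packages are held (a small check of my own, not from the instruction):

kubeadm version -o short          # v1.14.x at the time of writing
kubelet --version
kubectl version --client --short
apt-mark showhold                 # should list kubelet, kubeadm and kubectl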

After the installation, I check the directories that were created: "/etc/kubernetes/", "/var/lib/kubelet/pki/" and "/etc/systemd/system/kubelet.service.d/".

# ls /etc/kubernetes/ 
manifests 
# ls /etc/kubernetes/manifests/ 
# ls /var/lib/kubelet/ 
pki 
# ls /var/lib/kubelet/pki/ 
kubelet.crt  kubelet.key 
# ls /etc/systemd/system/kubelet.service.d/
10-kubeadm.conf 

For Kubernetes to work, the kubelet has to run normally first. I can check its status with "systemctl status kubelet". At this moment, it is not active yet.

# systemctl status kubelet

  kubelet.service - kubelet: The Kubernetes Node Agent

   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)

  Drop-In: /etc/systemd/system/kubelet.service.d

           └─10-kubeadm.conf

   Active: activating (auto-restart) (Result: exit-code) since Wed 2019-05-22 10:59:59 UTC; 2s ago

     Docs: https://kubernetes.io/docs/home/

  Process: 12883 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=

 Main PID: 12883 (code=exited, status=255)

May 22 10:59:59 e1 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a

May 22 10:59:59 e1 systemd[1]: kubelet.service: Unit entered failed state.

May 22 10:59:59 e1 systemd[1]: kubelet.service: Failed with result 'exit-code'.

To see more details, I will look at the logs in "/var/log/syslog". From the logs, I can see that the "config.yaml" file is missing. Keep this issue in mind and go to the next step; this file will be created by the "kubeadm init" command later.

# tail -f /var/log/syslog

systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.

systemd[1]: Stopped kubelet: The Kubernetes Node Agent.

systemd[1]: Started kubelet: The Kubernetes Node Agent.

kubelet[12865]: F0522 10:59:49.732907   12865 server.go:193] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory

systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a

systemd[1]: kubelet.service: Unit entered failed state.

systemd[1]: kubelet.service: Failed with result 'exit-code'.

I need to match the kubelet's cgroup driver to "systemd". The instruction has a comment about this.

However, neither "/etc/default/kubelet" nor "/var/lib/kubelet/kubeadm-flags.env" exists yet, so I cannot follow that part of the instruction. Looking at the output of "systemctl status kubelet", I can see "/etc/systemd/system/kubelet.service.d/10-kubeadm.conf". This is the drop-in file loaded when the kubelet starts.

# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

I will add the line Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd" to this file and restart the kubelet as shown below. However, the kubelet still does not become active, because config.yaml is missing.
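For reference, these are the commands I mean; the status check will keep showing the kubelet restarting until "kubeadm init" creates config.yaml:

systemctl daemon-reload
systemctl restart kubelet
systemctl status kubelet --no-pager   # still "activating (auto-restart)" for now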

 

 

3. Install external etcd cluster

 

I have already posted about etcd here. Creating an etcd cluster by hand is not easy, but with this instruction I can do it easily. This etcd will run as a static-pod container managed by the kubelet. On the external etcd node, I have to run the commands below.

cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
[Service]
ExecStart=
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true
Restart=always
EOF

systemctl daemon-reload
systemctl restart kubelet

Please note that this drop-in has no option for the cgroup driver, so I need to change it a little:

Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd"
ExecStart=
ExecStart=/usr/bin/kubelet --address=0.0.0.0 --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true $KUBELET_EXTRA_ARGS
Restart=always

Because the file name starts with "20", 20-etcd-service-manager.conf takes priority over 10-kubeadm.conf. After applying this, I check "systemctl status kubelet" and "/var/log/syslog" again. The kubelet still does not work, but the error is different from the one above: the first error happened with 10-kubeadm.conf loaded, while this one happens with 20-etcd-service-manager.conf loaded. So I have to solve this error first.

tail -f /var/log/syslog

systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.

systemd[1]: Stopped kubelet: The Kubernetes Node Agent.

systemd[1]: Started kubelet: The Kubernetes Node Agent.

kubelet[28517]: Flag --address has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.

kubelet[28517]: Flag --pod-manifest-path has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.

kubelet[28517]: Flag --allow-privileged has been deprecated, will be removed in a future version

kubelet[28517]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.

systemd[1]: Started Kubernetes systemd probe.

kubelet[28517]: I0522 14:05:31.497808   28517 server.go:417] Version: v1.14.2

kubelet[28517]: I0522 14:05:31.498194   28517 plugins.go:103] No cloud provider specified.

kubelet[28517]: W0522 14:05:31.498229   28517 server.go:556] standalone mode, no API client

kubelet[28517]: W0522 14:05:31.598549   28517 server.go:474] No api server defined - no events will be sent to API server.

kubelet[28517]: I0522 14:05:31.598601   28517 server.go:625] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /

kubelet[28517]: F0522 14:05:31.599231   28517 server.go:265] failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: [Filename#011#011#011#011Type#011#011Size#011Used#011Priority /dev/sda2  partition#0111996796#0110#011-1]

systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a

systemd[1]: kubelet.service: Unit entered failed state.

systemd[1]: kubelet.service: Failed with result 'exit-code'.

There are some error and warning messages. To solve them, I have to create a "kubeadmcfg.yaml" following this instruction. In my case, I will use a different file name, as below.

# cat external-etcd-cfg.yaml
apiVersion: "kubeadm.k8s.io/v1beta1"
kind: ClusterConfiguration
controlPlaneEndpoint: 147.75.94.251:6443
etcd:
    local:
        serverCertSANs:
        - "147.75.92.69"
        peerCertSANs:
        - "147.75.92.69"
        extraArgs:
            initial-cluster: e1=https://147.75.92.69:2380
            initial-cluster-state: new
            name: e1
            listen-peer-urls: https://147.75.92.69:2380
            listen-client-urls: https://147.75.92.69:2379
            advertise-client-urls: https://147.75.92.69:2379
            initial-advertise-peer-urls: https://147.75.92.69:2380

"controlPlaneEndpoint: 147.75.94.251:6443" is master IP address and Port for API server. This parameter solve this warning "No api server defined - no events will be sent to API server.". 147.75.92.69 ip address is external etcd node interface IP addresss which is announced for outside. In this yaml file, some options are included. Therefore, I have to revise 20-etcd-service-manager.conf file like below. 

Even after adding more options in the YAML file, I have not handled the root cause of the kubelet failure. In the syslog, "failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false" is the main reason. To solve this, there are three options.

# swapoff -a

# vi /etc/fstab
UUID=b7edcf12-2397-489a-bed1-4b5a3c4b1df3       /       ext4    errors=remount-ro       0       1
# UUID=a3babb9d-5aee-4a1d-a894-02aa816ef6e7       none    swap    none    0       0

"swapoff -a" is temporary option. Comment "swap" in "/etc/fstab" is permenant option. Adding " --fail-swap-on=false" in 20-etcd-service-manager.conf is another permenant option. After then, the status of kubelet will be like this.

Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd" 
ExecStart= 
ExecStart=/usr/bin/kubelet --address=0.0.0.0 --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --fail-swap-on=false $KUBELET_EXTRA_ARGS 
Restart=always
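Before restarting the kubelet, it is worth confirming that swap is really off (a small check of my own, not from the instruction):

swapoff -a
swapon --show            # no output means no active swap device
free -h | grep -i swap   # should show 0B total when swap is fully disabled
systemctl daemon-reload
systemctl restart kubelet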

Now, I am ready to deploy the etcd container. I need certificates and private keys for the etcd cluster to work. They live in "/etc/kubernetes/pki/etcd", but that directory does not exist yet. Following the instruction, the files are generated with the commands below.

# kubeadm init phase certs etcd-ca
[certs] Generating "etcd/ca" certificate and key
# ls /etc/kubernetes/pki/etcd/
ca.crt  ca.key

If I wanted to make a cluster of multiple etcd nodes, I would have to copy these CA files to each etcd node, as sketched below. However, the CA alone is not enough; I need more certificates and keys.
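In my setup there is only a single etcd node, so I skip the copy, but for a multi-node etcd cluster it would look roughly like this sketch (e2 and e3 are hypothetical additional etcd nodes, and root SSH access between them is assumed):

# copy the etcd CA to every additional etcd node
for host in e2 e3; do
  ssh root@${host} "mkdir -p /etc/kubernetes/pki/etcd"
  scp /etc/kubernetes/pki/etcd/ca.crt /etc/kubernetes/pki/etcd/ca.key root@${host}:/etc/kubernetes/pki/etcd/
done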

kubeadm init phase certs etcd-server --config=external-etcd-cfg.yaml
kubeadm init phase certs etcd-peer --config=external-etcd-cfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=external-etcd-cfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=external-etcd-cfg.yaml
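If everything went well, the generated files should look roughly like this on my node (shown for orientation; the exact set may differ slightly by kubeadm version):

# ls /etc/kubernetes/pki/etcd/
ca.crt  ca.key  healthcheck-client.crt  healthcheck-client.key  peer.crt  peer.key  server.crt  server.key
# ls /etc/kubernetes/pki/ | grep apiserver-etcd-client
apiserver-etcd-client.crt  apiserver-etcd-client.key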

After that, several certificates and keys are generated. In fact, some of them are not necessary if you run a single external etcd node. Now, I can run the etcd container with the command below.

# kubeadm init phase etcd local --config=external-etcd-cfg.yaml

# docker ps
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS               NAMES
f7f427f30a04        2c4adeb21b4f           "etcd --advertise-cl…"   16 minutes ago      Up 16 minutes                           k8s_etcd_etcd-e1_kube-system_9a37a797efa5968588eac7b51458eecc_0
5686bb62b14f        k8s.gcr.io/pause:3.1   "/pause"                 16 minutes ago      Up 16 minutes                           k8s_POD_etcd-e1_kube-system_9a37a797efa5968588eac7b51458eecc_0

 

# docker exec f7f427f30a04 etcdctl --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --ca-file /etc/kubernetes/pki/etcd/ca.crt --endpoints https://147.75.92.69:2379 cluster-health
member d3b00bc687dc09ae is healthy: got healthy result from https://147.75.92.69:2379
cluster is healthy

I can also check the health status.

 

 

4. Install master nodes

 

There are two instructions for the master node: single master and clustered masters. To set up the master node, I need a configuration file. I will refer to this instruction. This is the sample configuration YAML file that I have:

cat <<EOF > ./kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 147.75.94.251
  bindPort: 6443
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.1
clusterName: crenet-cluster
controlPlaneEndpoint: 147.75.94.251:6443
networking:
  podSubnet: 10.244.0.0/16
controllerManager:
  extraArgs:
    deployment-controller-sync-period: "50s"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
EOF

With the configuration above, I can run "kubeadm init". Please note that "mode: ipvs" changes the kube-proxy mode to use the IPVS kernel modules (managed with "ipvsadm"). The post I wrote earlier helps to understand how kube-proxy works with this. This is the overview to install the tool and load the kernel modules:

apt-get install ipvsadm
echo "net.ipv4.conf.all.arp_ignore=1" >> /etc/sysctl.conf
echo "net.ipv4.conf.all.arp_announce=2" >> /etc/sysctl.conf
sysctl -p /etc/sysctl.conf

modprobe ip_vs_rr  
modprobe ip_vs_wrr  
modprobe ip_vs_sh 
modprobe ip_vs

Sometimes some modules are not loaded automatically, so I need to load them manually with "modprobe". If these modules are not loaded when "kubeadm init" starts, I will see the message "kernel modules are not loaded: [ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh]".
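To keep these modules loaded across reboots, they can also be listed for systemd-modules-load; this is my own addition on top of the instruction:

cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
EOF

lsmod | grep ip_vs   # verify the modules are currently loaded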

Now, it will load the file "/etc/systemd/system/kubelet.service.d/10-kubeadm.conf".

# cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

With this configuration I can run "kubeadm init" 

swapoff -a

sudo kubeadm init --config=kubeadm-config.yaml --experimental-upload-certs

Before running "kubeadm init", I need to run "swapoff -a"; without it, I get this error message:

[init] Using Kubernetes version: v1.14.1

[preflight] Running pre-flight checks

error execution phase preflight: [preflight] Some fatal errors occurred:

        [ERROR Swap]: running with swap on is not supported. Please disable swap

[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

Anyway, this is the result when the master node is created successfully:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube

  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.

Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:

  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 147.75.94.251:6443 --token 9nbo8n.y8oke84pmjgaj3px \

    --discovery-token-ca-cert-hash sha256:ac3693f08617ad045bf2e18a1ec0c3240e0651057dcd29e52daa21176a02b9f6 \

    --experimental-control-plane --certificate-key 1eae6218e326a045b866b38e4c0b0135d49203073bfe4f06f602e415dcd9b7a6

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!

As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use

"kubeadm init phase upload-certs --experimental-upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 147.75.94.251:6443 --token 9nbo8n.y8oke84pmjgaj3px \

    --discovery-token-ca-cert-hash sha256:ac3693f08617ad045bf2e18a1ec0c3240e0651057dcd29e52daa21176a02b9f6

I need a few more steps. First, I configure kubectl access:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

After this, I can use the "kubectl" command. The master status can be "NotReady" at first. In that case, I should check the status of the pods with "kubectl get pods --all-namespaces".

# kubectl get nodes

NAME   STATUS     ROLES    AGE     VERSION

m1     NotReady   master   4m43s   v1.14.2

 

 

From this instruction, I need a pod network to solve this issue. To install a pod network, look at this instruction. In my case, I will use Flannel.

echo "net.bridge.bridge-nf-call-iptables=1" >> /etc/sysctl.conf
sysctl -p /etc/sysctl.conf
wget https://raw.githubusercontent.com/coreos/flannel/62e44c867a2846fefb68bd5f178daf4da3095ccb/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml

And then the master status becomes normal:

# kubectl get pods --all-namespaces

NAMESPACE     NAME                          READY   STATUS    RESTARTS   AGE

kube-system   coredns-fb8b8dccf-627rr       1/1     Running   0          16m

kube-system   coredns-fb8b8dccf-kwvb4       1/1     Running   0          16m

kube-system   etcd-m1                       1/1     Running   0          15m

kube-system   kube-apiserver-m1             1/1     Running   0          15m

kube-system   kube-controller-manager-m1    1/1     Running   0          15m

kube-system   kube-flannel-ds-amd64-6s7xh   1/1     Running   0          54s

kube-system   kube-proxy-gm6bl              1/1     Running   0          16m

kube-system   kube-scheduler-m1             1/1     Running   0          15m

 

# kubectl get nodes

NAME   STATUS   ROLES    AGE   VERSION

m1     Ready    master   17m   v1.14.2

Oh, it works now. Let me see the status in more detail:

# systemctl status kubelet
  kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Wed 2019-05-22 15:44:37 UTC; 1h 29min ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 1490 (kubelet)
    Tasks: 21
   Memory: 35.9M
      CPU: 4min 37.546s
   CGroup: /system.slice/kubelet.service
           └─1490 /usr/bin/kubelet --fail-swap-on=false --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib

# cat /var/lib/kubelet/config.yaml | grep -i cgroup
cgroupDriver: cgroupfs
cgroupsPerQOS: true

There is something strange: "cgroupDriver: cgroupfs". It should be "systemd", because I set Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd" in "/etc/systemd/system/kubelet.service.d/10-kubeadm.conf". It seems that edit did not take effect as expected, so I will set the driver through the kubeadm configuration instead:

# cat kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 147.75.94.251
  bindPort: 6443
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.1
clusterName: crenet-cluster
controlPlaneEndpoint: 147.75.94.251:6443
networking:
  podSubnet: 10.244.0.0/16
controllerManager:
  extraArgs:
    deployment-controller-sync-period: "50s"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs

I will run "kubeadm reset" and "kubeadm init" again. Now

# cat /var/lib/kubelet/config.yaml | grep -i cgroup
cgroupDriver: systemd

However, this is not everything. I have already built the external etcd, but I have not configured the cluster to use it yet. By default, an etcd container runs on each master node, and those instances are not clustered.

# kubectl get pods --all-namespaces

NAMESPACE     NAME                          READY   STATUS    RESTARTS   AGE

kube-system   coredns-fb8b8dccf-627rr       1/1     Running   0          73m

kube-system   coredns-fb8b8dccf-kwvb4       1/1     Running   0          73m

kube-system   etcd-m1                       1/1     Running   0          72m

kube-system   kube-apiserver-m1             1/1     Running   0          72m

kube-system   kube-controller-manager-m1    1/1     Running   0          72m

kube-system   kube-flannel-ds-amd64-6s7xh   1/1     Running   0          57m

kube-system   kube-proxy-gm6bl              1/1     Running   0          73m

kube-system   kube-scheduler-m1             1/1     Running   0          72m

From this instruction, there are several steps for this. First, I need to copy the certificate and key from the etcd node to the master.
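A copy step could look roughly like the sketch below, run on the etcd node e1 (root SSH access to the master is assumed; the destination paths match the caFile/certFile/keyFile values in the configuration that follows):

# run on e1; 147.75.94.251 is the master m1
ssh root@147.75.94.251 "mkdir -p /etc/kubernetes/pki/etcd"
scp /etc/kubernetes/pki/etcd/ca.crt root@147.75.94.251:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/apiserver-etcd-client.crt root@147.75.94.251:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/apiserver-etcd-client.key root@147.75.94.251:/etc/kubernetes/pki/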

Now, I need to update my configuration file

# cat kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 147.75.94.251
  bindPort: 6443
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.1
clusterName: crenet-cluster
controlPlaneEndpoint: 147.75.94.251:6443
networking:
  podSubnet: 10.244.0.0/16
controllerManager:
  extraArgs:
    deployment-controller-sync-period: "50s"
etcd:
    external:
        endpoints:
        - https://147.75.92.69:2379
        caFile: /etc/kubernetes/pki/etcd/ca.crt
        certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
        keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs

Now, I will run "kubeadm reset" and "kubeadm init" again. Please note that the certificate and key will be remove when I run kubeadm reset. Therefore, I need to copy again from etcd node

error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR ExternalEtcdVersion]: couldn't load external etcd's server certificate /etc/kubernetes/pki/etcd/ca.crt: open /etc/kubernetes/pki/etcd/ca.crt: no such file or directory
        [ERROR ExternalEtcdClientCertificates]: /etc/kubernetes/pki/etcd/ca.crt doesn't exist
        [ERROR ExternalEtcdClientCertificates]: /etc/kubernetes/pki/apiserver-etcd-client.crt doesn't exist
        [ERROR ExternalEtcdClientCertificates]: /etc/kubernetes/pki/apiserver-etcd-client.key doesn't exist
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

After copying the certificates and keys and re-running "kubeadm init", I can check the status again:

# kubectl get pods --all-namespaces

NAMESPACE     NAME                          READY   STATUS    RESTARTS   AGE

kube-system   coredns-fb8b8dccf-nzv2s       0/1     Running   0          2m30s

kube-system   coredns-fb8b8dccf-v74jz       0/1     Running   0          2m30s

kube-system   kube-apiserver-m1             1/1     Running   0          113s

kube-system   kube-controller-manager-m1    1/1     Running   0          92s

kube-system   kube-flannel-ds-amd64-6sqzv   1/1     Running   0          18s

kube-system   kube-proxy-s6sj7              1/1     Running   0          2m29s

kube-system   kube-scheduler-m1             1/1     Running   0          94s

The etcd pod no longer exists on the master node; the cluster now uses the external etcd.

 

5. Setup the Worker nodes

 

It is almost the same. I need the Docker engine plus kubeadm, kubectl and kubelet, which I have already explained above. After creating the master node, I can see the messages below:

kubeadm join 147.75.94.251:6443 --token 9nbo8n.y8oke84pmjgaj3px \
    --discovery-token-ca-cert-hash sha256:ac3693f08617ad045bf2e18a1ec0c3240e0651057dcd29e52daa21176a02b9f6 \
    --experimental-control-plane --certificate-key 1eae6218e326a045b866b38e4c0b0135d49203073bfe4f06f602e415dcd9b7a6

kubeadm join 147.75.94.251:6443 --token 9nbo8n.y8oke84pmjgaj3px \
    --discovery-token-ca-cert-hash sha256:ac3693f08617ad045bf2e18a1ec0c3240e0651057dcd29e52daa21176a02b9f6

The first is for adding a master node, the second for adding a worker node. I need some information such as the token and the CA cert hash, which I can retrieve like this:

# kubeadm token list

TOKEN                     TTL       EXPIRES                USAGES                   DESCRIPTION                                           EXTRA GROUPS

seiuxa.fsaa2107s0vl10kg   1h        2019-05-22T19:27:32Z                      Proxy for managing TTL for the kubeadm-certs secret   

yo0mwz.5mc41r1gn90akio2   23h       2019-05-23T17:27:33Z   authentication,signing                                                   system:bootstrappers:kubeadm:default-node-token

 

# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
>    openssl dgst -sha256 -hex | sed 's/^.* //'
aeebcc777388b5ecde64801a51d2e37cf7ff2410e8e64f1832694fc3946d7c27
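If the original join command has been lost, kubeadm can also print a fresh one with a new token; this is a convenience command, not part of the output above:

kubeadm token create --print-join-command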

With these values, I can run the join command on each worker node. Then I check the status of the cluster.

# kubectl get nodes
NAME   STATUS   ROLES    AGE   VERSION
m1     Ready    master   24m   v1.14.2
w1     Ready    <none>   39s   v1.14.2
w2     Ready    <none>   36s   v1.14.2

Now I have built the Kubernetes cluster.

 

Reference

[ 1 ] https://kubernetes.io/docs/setup/independent/install-kubeadm/#installing-kubeadm-kubelet-and-kubectl

[ 2 ] https://docs.docker.com/install/linux/docker-ce/ubuntu/

[ 3 ] https://kubernetes.io/docs/setup/cri/

[ 4 ] https://kubernetes.io/docs/setup/independent/setup-ha-etcd-with-kubeadm/ 

[ 5 ] https://createnetech.tistory.com/17?category=679927

[ 6 ] https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/

[ 7 ] https://kubernetes.io/docs/setup/independent/high-availability/

[ 8 ] https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta1

[ 9 ] https://kubernetes.io/docs/setup/independent/troubleshooting-kubeadm/#coredns-or-kube-dns-is-stuck-in-the-pending-state

[ 10 ] https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd-nodes

[ 11 ] https://createnetech.tistory.com/36?category=672585

[ 12 ] https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/#is-kube-proxy-writing-iptables-rules


How does the flannel work?


Recently, I have been studying Kubernetes. While studying, I learned about flannel. There is an instruction to reproduce it simply, and I will follow it.


1. Pre-requisite


There is some preparation: Docker and etcd should be installed first. Here is the instruction for installing Docker. I will install the community engine on two nodes running Ubuntu 16.04.


sudo apt-get remove docker docker-engine docker.io
sudo apt-get update
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
sudo apt-get update
sudo apt-get install docker-ce


Then I will install etcd. I have already written how to install and use etcd in this post, but this instruction is simpler. In my case, I will run the commands in root's home directory.


wget https://github.com/coreos/etcd/releases/download/v3.0.12/etcd-v3.0.12-linux-amd64.tar.gz
tar zxvf etcd-v3.0.12-linux-amd64.tar.gz
cd etcd-v3.0.12-linux-amd64


After the installation, I will edit the "/etc/hosts" file before running etcd. Please note that each hostname should map to only a single IP address in this file.


vi /etc/hosts
147.75.65.69    node1
147.75.65.63    node2


The nodes communicate with each other using these IP addresses.


# At node 1
nohup ./etcd --name docker-node1 --initial-advertise-peer-urls http://node1:2380 \
--listen-peer-urls http://node1:2380 \
--listen-client-urls http://node1:2379,http://localhost:2379 \
--advertise-client-urls http://node1:2379 \
--initial-cluster-token etcd-cluster \
--initial-cluster docker-node1=http://node1:2380,docker-node2=http://node2:2380 \
--initial-cluster-state new&

# At node 2
nohup ./etcd --name docker-node2 --initial-advertise-peer-urls http://node2:2380 \
--listen-peer-urls http://node2:2380 \
--listen-client-urls http://node2:2379,http://localhost:2379 \
--advertise-client-urls http://node2:2379 \
--initial-cluster-token etcd-cluster \
--initial-cluster docker-node1=http://node1:2380,docker-node2=http://node2:2380 \
--initial-cluster-state new&


Now, I am ready to start the flannel configuration.


cd etcd-v3.0.12-linux-amd64

./etcdctl cluster-health

member 43bb846a7344a01f is healthy: got healthy result from http://node2:2379

member a9aee06e6a14d468 is healthy: got healthy result from http://node1:2379

cluster is healthy


2. Flannel installation.


Download and install the flannel binary on each node. In my case, I will run the command in root's home directory.


wget https://github.com/coreos/flannel/releases/download/v0.6.2/flanneld-amd64 -O flanneld && chmod 755 flanneld


I will make a configuration file that defines the network topology.


vi flannel-network-config.json
{
    "Network": "172.16.0.0/12",
    "SubnetLen": 24,
    "SubnetMin": "172.16.16.0",
    "SubnetMax": "172.31.247.0",
    "Backend": {
        "Type": "vxlan",
        "VNI": 172,
        "Port": 8889
    }
}


This documentation describes the meaning of the parameters above. In particular, "SubnetLen" is the size of the subnet allocated to each host. Flannel reads its configuration from the etcd key /coreos.com/network/config. I will set the configuration on node 1.


# At node 1
cd etcd-v3.0.12-linux-amd64/
~/etcd-v3.0.12-linux-amd64$ ./etcdctl set /coreos.com/network/config < ../flannel-network-config.json


I can check if the configuration is set or not on Node 2.


# At node 2
cd etcd-v3.0.12-linux-amd64/
~/etcd-v3.0.12-linux-amd64$ ./etcdctl get /coreos.com/network/config | jq .
{
  "Network": "172.16.0.0/12",
  "SubnetLen": 24,
  "SubnetMin": "172.16.16.0",
  "SubnetMax": "172.31.247.0",
  "Backend": {
    "Type": "vxlan",
    "VNI": 172,
    "Port": 8889
  }
}


Now, I am ready to start flannel. Before starting it, I have to check my network interface names, since flannel needs the correct interface for its -iface option.


# At node 1

nohup sudo ./flanneld -iface=bond0 &


# At node 2

nohup sudo ./flanneld -iface=bond0 &


After starting flannel, I can see a new interface named "flannel.<VNI>". In this case, it is flannel.172.


flannel.172 Link encap:Ethernet  HWaddr 82:41:de:d4:77:d3

          inet addr:172.16.75.0  Bcast:0.0.0.0  Mask:255.240.0.0

          inet6 addr: fe80::8041:deff:fed4:77d3/64 Scope:Link

          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:8 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)


Also, I can get some information from Etcd.


cd etcd-v3.0.12-linux-amd64/

~/etcd-v3.0.12-linux-amd64# ./etcdctl ls /coreos.com/network/subnets

/coreos.com/network/subnets/172.16.68.0-24

/coreos.com/network/subnets/172.16.75.0-24


This means that each host has been assigned one of these subnets. I can see more detail:


cd etcd-v3.0.12-linux-amd64/

~/etcd-v3.0.12-linux-amd64# ./etcdctl get /coreos.com/network/subnets/172.16.68.0-24 | jq .

{

  "PublicIP": "147.75.65.63",

  "BackendType": "vxlan",

  "BackendData": {

    "VtepMAC": "7a:ac:15:15:2b:61"

  }

}


This is flannel's configuration. I can also see which flannel network is assigned to this host in "/var/run/flannel/subnet.env". Please note that this file will be used in the next step, the Docker daemon configuration.


cat /var/run/flannel/subnet.env
FLANNEL_NETWORK=172.16.0.0/12
FLANNEL_SUBNET=172.16.68.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false


3. Docker daemon configuration.


Docker does not use flannel by default; it has swarm mode to build overlay networks. Therefore, it is necessary to reconfigure the daemon so that flannel provides the default bridge network. First, I have to stop the Docker daemon on each host, node1 and node2.


# Node 1 and Node 2

sudo service docker stop


I will restart Docker daemon with this flannel configuration.


# Node 1 and Node 2

source /run/flannel/subnet.env
sudo ifconfig docker0 ${FLANNEL_SUBNET}
sudo dockerd --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU} &


Before restarting the Docker daemon, the default "docker0" interface has the IP address "172.17.0.1". After the restart it changes to the network I defined.


# Before restart with flannel configuration

ifconfig docker0

docker0   Link encap:Ethernet  HWaddr 02:42:f6:10:ac:49

          inet addr:172.17.0.1  Bcast:172.17.255.255  Mask:255.255.0.0

          UP BROADCAST MULTICAST  MTU:1500  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)


# After restart with flannel configuration

ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:f6:10:ac:49
          inet addr:172.16.75.1  Bcast:172.16.75.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)


4. Create the Test container and Start it.


Now, I will create two containers for the test.


# At the node 1

sudo docker run -d --name test1  busybox sh -c "while true; do sleep 3600; done"


# At the node 2

sudo docker run -d --name test2  busybox sh -c "while true; do sleep 3600; done"


Depending on its command, a container can stop as soon as the command exits. The "sh -c "while true; do sleep 3600; done"" command keeps the container alive indefinitely, sleeping an hour at a time, which is enough for the test.
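As a quick cross-check (my own addition, not part of the referenced lab), the IP address each container received can be read directly from Docker before digging into the namespaces:

# At node 1 (use test2 on node 2)
sudo docker inspect -f '{{.NetworkSettings.IPAddress}}' test1
sudo docker exec test1 ip addr show eth0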


5. Analyze the container networks.


In this post, I explained how this works in Docker swarm mode; the same approach is useful for analyzing Docker's network topology here. First, go to "/var/run/docker": the "netns" directory there holds the network namespaces of the containers.


# At node 1 and node 2 

cd /var/run/

ln -s docker/netns/ netns


After creating this symbolic link, I can see the network namespace list as below. "ip netns list" shows the namespace IDs.


ip netns list

faf9928b897f (id: 0)


I can see more detailed information with this ID. "ip netns exec" shows the same result as "docker exec"; that is, the output below matches "docker exec test1 ip -d addr show".


# At the Node 1

ip netns exec b5380e6b336a ip -d addr show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0

    inet 127.0.0.1/8 scope host lo

       valid_lft forever preferred_lft forever

11: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default

    link/ether 02:42:ac:10:4b:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0

    veth

    inet 172.16.75.2/24 brd 172.16.75.255 scope global eth0

       valid_lft forever preferred_lft forever


# At the Node 2

ip netns exec faf9928b897f ip -d addr show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0

    inet 127.0.0.1/8 scope host lo

       valid_lft forever preferred_lft forever

15: eth0@if16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default

    link/ether 02:42:ac:10:44:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0

    veth

    inet 172.16.68.2/24 brd 172.16.68.255 scope global eth0

       valid_lft forever preferred_lft forever


6. Test and Troubleshooting


Now I know that "172.16.75.2" is the container IP address on node 1 and "172.16.68.2" is the container IP address on node 2. Flannel provides the overlay network between the hosts, so I will send ICMP (ping) from node 1 to node 2.


ip netns exec b5380e6b336a ping 172.16.68.2

PING 172.16.68.2 (172.16.68.2) 56(84) bytes of data.

^C

--- 172.16.68.2 ping statistics ---

6 packets transmitted, 0 received, 100% packet loss, time 5040ms


Hmm, it does not work. Since Docker 1.13, the default iptables policy for the FORWARD chain is DROP, so I have to allow forwarding:


# At Node 1 and Node 2

sudo iptables -P FORWARD ACCEPT


Wow, now the containers can ping each other.


ip netns exec b5380e6b336a ping 172.16.68.2

ip netns exec b5380e6b336a ping 172.16.68.2 -c 4

PING 172.16.68.2 (172.16.68.2) 56(84) bytes of data.

64 bytes from 172.16.68.2: icmp_seq=1 ttl=62 time=0.364 ms

64 bytes from 172.16.68.2: icmp_seq=2 ttl=62 time=0.310 ms

64 bytes from 172.16.68.2: icmp_seq=3 ttl=62 time=0.319 ms

64 bytes from 172.16.68.2: icmp_seq=4 ttl=62 time=0.308 ms


--- 172.16.68.2 ping statistics ---

4 packets transmitted, 4 received, 0% packet loss, time 2998ms

rtt min/avg/max/mdev = 0.308/0.325/0.364/0.026 ms


This is the flannel.


Reference


[ 1 ] https://docs.docker.com/install/linux/docker-ce/ubuntu/

[ 2 ] https://docker-k8s-lab.readthedocs.io/en/latest/docker/docker-etcd.html

[ 3 ] https://docker-k8s-lab.readthedocs.io/en/latest/docker/docker-flannel.html

[ 4 ] https://github.com/coreos/flannel/blob/master/Documentation/configuration.md
