Craig Shea:
So, I’m working on the Multiple Schedulers exercise. I had actually completed this successfully about 2 months ago. I’m coming back through the course, and now Kubernetes 1.19 is in use. NBD. I’ve been playing with it. BUT, whenever I copy kube-scheduler.yaml to my-scheduler.yaml and make the appropriate changes, the pod never starts and goes into a crash loop. Looking at the logs, the error is failed to listen on 127.0.0.1:10259: ...address already in use. BUT, I explicitly used containerPort: 10269 and changed the healthz ports to 10269 as well. Oh, and I made sure the image command specified --port=10269. All to no avail. Am I missing something?
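(For anyone hitting the same symptom, a quick way to see which process already owns that port on the controlplane node — this is standard Linux tooling, nothing exercise-specific:)

```shell
# List listening TCP sockets and show which process holds the scheduler's
# default secure port (10259). Run on the controlplane node.
sudo ss -ltnp | grep 10259
```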

Craig Shea:

apiVersion: v1
kind: Pod
metadata:
  labels:
    component: kube-scheduler
    tier: control-plane
  name: my-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=false
    - --scheduler-name=my-scheduler
    - --port=10269
    image: k8s.gcr.io/kube-scheduler:v1.19.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10269
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    ports:
      - containerPort: 10269
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10269
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status: {}

Rodolfo Siqueira:
Remove --port=10269 from the command section.
Remove the ports: / - containerPort: 10269 entry.
Change the httpGet port back to 10259 in both the livenessProbe and startupProbe.

Rodolfo Siqueira:
Remove the pod and recreate it using the definition file…

Craig Shea:
Yes, I’ve done all that before, but, I will try it again :slightly_smiling_face:

Craig Shea:
According to your instructions, here is my YAML file:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: my-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=false
    - --scheduler-name=my-scheduler
    - --port=0
    image: k8s.gcr.io/kube-scheduler:v1.19.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: my-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status: {}

After putting this into /etc/kubernetes/manifests, here is the output from the kubectl logs -n kube-system my-scheduler-controlplane command:

controlplane $ kubectl logs -n kube-system my-scheduler-controlplane 
I0221 02:31:27.637253       1 registry.go:173] Registering SelectorSpread plugin
I0221 02:31:27.637329       1 registry.go:173] Registering SelectorSpread plugin
I0221 02:31:28.603613       1 serving.go:331] Generated self-signed cert in-memory
failed to create listener: failed to listen on 127.0.0.1:10259: listen tcp 127.0.0.1:10259: bind: address already in use
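(The likely root cause here: in kube-scheduler v1.19, --port controls only the deprecated insecure HTTP endpoint, while HTTPS serving is governed by --secure-port, which still defaults to 10259. Because the pod runs with hostNetwork: true, it collides with the default kube-scheduler already bound to 127.0.0.1:10259. The flag defaults can be confirmed from the binary's own help text:)

```shell
# Print the port-related flags and their defaults straight from the binary
# (run inside the scheduler container, or against a same-version binary):
kube-scheduler --help 2>&1 | grep -E -- '--(secure-)?port'
```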

Craig Shea:
This is no different from the error I had with the previous YAML file where I changed ports.

Craig Shea:
Thank you for your suggestion.

Kunal Kandoi:
I guess the custom scheduler is not being created as a static pod. Can you try moving your my-scheduler YAML to some other location and creating it using the kubectl create command?

Craig Shea:
Sure, I could do that, but the exercise calls for creating as a static pod. Besides, it shouldn’t really matter whether it’s a static pod vs. a pod created via the API. A pod is a pod.
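(For reference, a static pod is simply whatever manifest the kubelet finds in its staticPodPath — /etc/kubernetes/manifests by default on kubeadm clusters — which can be checked directly:)

```shell
# Where the kubelet looks for static pod manifests (kubeadm default shown):
grep staticPodPath /var/lib/kubelet/config.yaml
# Note: static pods get the node name appended to the pod name
# (e.g. my-scheduler-controlplane), while API-created pods keep the
# name from the manifest as-is.
```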

Felipe:
There was also a parameter, something like --secure-port (can’t remember exactly right now), that has to be changed.

Rahul Narula:
Were you able to solve it?

Hi. I was able to define my custom scheduler with the following modifications:

  • Renamed the scheduler
  • Disabled leader election
  • Defined a custom secure port (--secure-port)
  • Used the same port for the probes

Here is the my-scheduler YAML file:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: my-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=false
    - --scheduler-name=my-scheduler
    - --secure-port=10261
    - --port=0
    image: k8s.gcr.io/kube-scheduler:v1.19.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10261
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: my-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10261
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status: {}

With this, the pod started:

controlplane $ kubectl get pods -n kube-system my-scheduler-controlplane 
NAME                        READY   STATUS    RESTARTS   AGE
my-scheduler-controlplane   1/1     Running   0          28m
controlplane $

But when I clicked the ‘Check’ button, the check failed.

For testing purposes, I created a simple pod:

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  schedulerName: my-scheduler
  containers:
  - name: pause
    image: k8s.gcr.io/pause:2.0

Which was scheduled successfully by ‘my-scheduler’:

controlplane $ kubectl get events -o wide | grep Scheduled
25m         Normal   Scheduled                 pod/test-pod                                 my-scheduler, my-scheduler-controlplane   Successfully assigned default/test-pod to node01                      25m          1       test-pod.16720119f6cfa881
controlplane $

Is there an issue with this solution that I’m not aware of, since the check didn’t pass?
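(One thing worth checking when the grader fails despite a working scheduler: these labs typically validate specific values — the pod name, the --scheduler-name value, or the exact ports — as stated in the exercise text. The running configuration can be compared against those expectations like this; resource names below match this thread:)

```shell
# Dump the effective command line of the custom scheduler pod:
kubectl get pod my-scheduler-controlplane -n kube-system \
  -o jsonpath='{.spec.containers[0].command}'

# Confirm which scheduler actually placed the test pod:
kubectl get events --field-selector reason=Scheduled -o wide | grep test-pod
```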