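Introduction to Job and CronJob
In day-to-day work we often need to run batch data processing and analysis, and we also have work that has to run on a schedule. Kubernetes provides two resource objects for these needs: Job and CronJob.
A Job handles run-to-completion tasks, that is, tasks that execute only once; it ensures that one or more Pods of the batch task terminate successfully. A CronJob is a Job with time-based scheduling added on top.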
Job
First, define a Job resource object
    [root@pool1 Job_CronJob]# vi Job.yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: job-demo
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: counter
            image: centos:7
            imagePullPolicy: IfNotPresent
            command:
            - "bin/sh"
            - "-c"
            - "for i in 9 8 7 6 5 4 3 2 1; do echo $i; done && ls /"
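For a Job, restartPolicy only supports Never and OnFailure; Always is not supported. A Job is a run-to-completion resource: once the Pod has finished its task, nothing else should happen. With Always, the container would be restarted every time it finished, and the task would loop forever.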
Create the Job object
    [root@pool1 Job_CronJob]
    job.batch/job-demo created
    [root@pool1 Job_CronJob]
    NAME             READY   STATUS      RESTARTS   AGE
    job-demo-nzjfk   0/1     Completed   0          4s
    [root@pool1 Job_CronJob]
    Name:         job-demo-nzjfk
    Namespace:    default
    Priority:     0
    Node:         pool3/10.99.2.162
    Start Time:   Thu, 16 Dec 2021 17:18:58 +0800
    Labels:       controller-uid=8a008103-d977-49ec-a3d0-6e90c2b044d0
                  job-name=job-demo
    Annotations:  cni.projectcalico.org/containerID: cb5a471e2f45c49fe325b2c26eeb6c237e104abe5c56d987d90a31f28a1c6e02
                  cni.projectcalico.org/podIP:
                  cni.projectcalico.org/podIPs:
    Status:       Succeeded
    IP:           10.244.206.27
    IPs:
      IP:           10.244.206.27
    Controlled By:  Job/job-demo
    Containers:
      counter:
        Container ID:  docker://46f3453f2ca5a188b3e6b14a9a34bb87b8b3a1499fa811c9f6698f9ebd207827
        Image:         centos:7
        Image ID:      docker://sha256:5e35e350aded98340bc8fcb0ba392d809c807bc3eb5c618d4a0674d98d88bccd
        Port:          <none>
        Host Port:     <none>
        Command:
          bin/sh
          -c
          for i in 9 8 7 6 5 4 3 2 1; do echo $i; done && ls /
        State:          Terminated
          Reason:       Completed
          Exit Code:    0
          Started:      Thu, 16 Dec 2021 17:18:59 +0800
          Finished:     Thu, 16 Dec 2021 17:18:59 +0800
        Ready:          False
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-nzxhw (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      default-token-nzxhw:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-nzxhw
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  <none>
    Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                     node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type    Reason     Age   From               Message
      ----    ------     ----  ----               -------
      Normal  Scheduled  16s   default-scheduler  Successfully assigned default/job-demo-nzjfk to pool3
      Normal  Pulled     15s   kubelet            Container image "centos:7" already present on machine
      Normal  Created    15s   kubelet            Created container counter
      Normal  Started    15s   kubelet            Started container counter
The Pod's status is Completed because it finished its task and exited normally.
View the Job's logs
    [root@pool1 Job_CronJob]
    9
    8
    7
    6
    5
    4
    3
    2
    1
    anaconda-post.log
    bin
    dev
    etc
    home
    lib
    lib64
    media
    mnt
    opt
    proc
    root
    run
    sbin
    srv
    sys
    tmp
    usr
    var
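If the task fails and restartPolicy is Never, as defined here, the Job controller keeps creating new Pods to retry it. These retries cannot go on forever, so the Job's spec.backoffLimit field sets the number of retries (it defaults to 6). Note also that the delay before the Job controller creates each new Pod grows exponentially: the next attempts happen after 10s, 20s, 40s, and so on.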
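To make the retry behaviour concrete, here is a minimal sketch of a Job that always fails, with the retry count capped by spec.backoffLimit. The name job-fail-demo, the value 3, and the failing command are assumptions made up for illustration; they do not come from the original setup.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-fail-demo          # hypothetical name, for illustration only
spec:
  backoffLimit: 3              # assumed value: give up after about 3 retries
  template:
    spec:
      restartPolicy: Never     # a failed attempt is retried in a brand-new Pod
      containers:
      - name: fail
        image: centos:7
        imagePullPolicy: IfNotPresent
        command:
        - "/bin/sh"
        - "-c"
        - "echo 'simulating a failure'; exit 1"   # non-zero exit marks the Pod as failed
```

With a manifest along these lines, listing the Pods should show several job-fail-demo-xxxxx Pods in Error status, created at increasing intervals, until the Job is marked as failed.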
If restartPolicy is OnFailure instead, the Job controller does not create new Pods after a failure; the containers in the existing Pod are restarted again and again.
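As a rough sketch of the OnFailure case, the manifest below runs a command that always fails, so the same Pod is kept and its container is restarted in place. The name job-onfailure-demo, the backoffLimit value, and the command are hypothetical and chosen only for illustration.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-onfailure-demo       # hypothetical name, for illustration only
spec:
  backoffLimit: 3                # assumed value: upper bound on retries
  template:
    spec:
      restartPolicy: OnFailure   # restart the container inside the same Pod
      containers:
      - name: fail
        image: centos:7
        imagePullPolicy: IfNotPresent
        command:
        - "/bin/sh"
        - "-c"
        - "echo 'simulating a failure'; exit 1"   # non-zero exit triggers a restart
```

In this setup you would expect a single Pod whose RESTARTS column keeps increasing, rather than a growing list of failed Pods.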