- Kubernetes
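- Basic Concepts
- Automated deployment, scaling, and management of containerized applications
- Desired State Management
- Kubernetes API objects (declare the desired state)
- Kubernetes Control Plane (ensures the cluster's current state matches the desired state)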
- Kubernetes Master
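- kube-apiserver (API Server)
- Exposes CRUD REST interfaces for all object types
- Exposes a Watch mechanism that notifies clients of object changes
- Stores objects in etcd
- kube-controller-manager (daemon)
- Function: watches the cluster state through the apiserver and makes changes that move the current state toward the desired state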
- controllers
- replication controller
- endpoints controller
- namespace controller
- serviceaccounts controller
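- …
- kube-scheduler (scheduler)
- Function: schedules Pods onto suitable worker nodes
- Scheduling considerations
- Resource requirements
- Service governance requirements
- Hardware/software/policy constraints
- Affinity and anti-affinity requirements
- Data locality
- Inter-workload interference
- …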
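- Worker Node
- kubelet (node agent)
- Accepts a set of PodSpecs provided through various mechanisms (primarily the apiserver)
- Ensures that the containers described in those PodSpecs are running and healthy
- kube-proxy (node network proxy)
- Implements on each node the Services defined in the Kubernetes API
- Sets up the iptables rules for each Service
- Forwards traffic (userspace mode)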
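The desired-state idea described above (declare a target, let a controller close the gap) can be illustrated with a minimal Python sketch. The `Cluster` class and the `reconcile` function are hypothetical names invented for this illustration and are not part of any Kubernetes API; a real controller watches the apiserver rather than mutating a local list.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy in-memory stand-in for cluster state (hypothetical, not a Kubernetes API)."""
    running: list = field(default_factory=list)
    next_id: int = 0

    def start_pod(self) -> None:
        # Pretend to start a pod and record it as running.
        self.running.append(f"pod-{self.next_id}")
        self.next_id += 1

    def stop_pod(self) -> None:
        # Pretend to stop the most recently started pod.
        self.running.pop()


def reconcile(cluster: Cluster, desired_replicas: int) -> int:
    """Act until the current state matches the desired state.

    Returns the number of actions taken (0 means already converged).
    """
    actions = 0
    while len(cluster.running) < desired_replicas:
        cluster.start_pod()
        actions += 1
    while len(cluster.running) > desired_replicas:
        cluster.stop_pod()
        actions += 1
    return actions


if __name__ == "__main__":
    cluster = Cluster()
    reconcile(cluster, 3)   # scale up to the declared replica count
    reconcile(cluster, 1)   # scale down after the declaration changes
    print(cluster.running)
```

Calling `reconcile` again with an unchanged target takes no action, which is the property that lets a controller run the same loop repeatedly.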
- Deployment Modes
- Single node
- Single head node, multiple workers
- API Server, Scheduler, and Controller Manager run on a single node
- Single etcd, HA head nodes, multiple workers
- Multiple API Server instances fronted by a load balancer
- Multiple Scheduler and Controller Manager instances with leader election
- Single etcd node
- HA etcd, HA head nodes, multiple workers
- Multiple API Server instances fronted by a load balancer
- Multiple Scheduler and Controller Manager instances with leader election
- etcd cluster runs on nodes separate from the Kubernetes head nodes
- Kubernetes Federation
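- Business Model
- Cloud service users: avoid the vendor lock-in that comes with a single cloud provider, reducing technical and cost risks
- Cloud service vendors: use Kubernetes to break AWS's first-mover dominance and capture market share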
- Workload
- Pod
- Smallest deployable computing unit
- Consists of one or more containers
- All containers in a pod share storage, network namespace, and cgroup
- Workload resources (Controllers)
- Deployment & ReplicaSet
- Deployment is used to deploy stateless applications.
- ReplicaSet ensures that a specified number of pod replicas are running at any given time.
- Deployment is used to roll out, update, and roll back ReplicaSets.
- ReplicaSet is not supposed to be used directly; it should be managed by a Deployment.
- StatefulSet
- StatefulSet is used to deploy stateful applications.
- StatefulSet requires a Headless Service to provide network identity for its pods.
- DaemonSet
- DaemonSet ensures that all (or some) Nodes run a copy of a Pod.
- Use cases: cluster storage daemon, log collection daemon, node monitoring daemon.
- Job & CronJob
- Job runs pods until a specified number of them have completed successfully.
- CronJob runs a job periodically on a given schedule.
- Storage
- Volume
- Purpose
- Persist data across the life span of a Pod
- Data is not lost when a container is restarted
- Share data between containers running together in a Pod
- A Volume can be mounted into multiple containers inside a Pod
- Type
- configMap
- emptyDir
- hostPath
- local
- persistentVolumeClaim
- Policies
- ResourceQuota
- Purpose
- Limit the aggregated resource consumption of a Namespace
- Scope
- Namespaced: ResourceQuota is enforced per Namespace, so different Namespaces can have different resource limits
- Type
- Compute Resource Quota
- CPU (limits.cpu, requests.cpu)
- Memory (limits.memory, requests.memory)
- Storage Resource Quota
- Persistent Storage (requests.storage)
- Ephemeral Storage (requests.ephemeral-storage, limits.ephemeral-storage)
- Object Count Quota
- Limit on the total number of Namespaced resources of a given type (for example count/services)
- Request and Limit
- Request: the amount of a resource that a container is guaranteed to get
- Limit: the maximum amount of a resource that a container can use
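The aggregate check that a ResourceQuota implies can be sketched in a few lines of Python. This is a simplified illustration, not the actual admission logic: the `admits` function is a hypothetical name, quantities are plain integers (millicores for CPU), and the quota keys simply reuse the names listed above.

```python
def admits(quota: dict, used: dict, new_object: dict) -> bool:
    """Return True if adding new_object keeps every tracked key within its hard limit.

    quota:      hard limits per key, e.g. {"requests.cpu": 1000}
    used:       current aggregate usage in the namespace
    new_object: what the object being created would add
    """
    for key, hard in quota.items():
        if used.get(key, 0) + new_object.get(key, 0) > hard:
            return False
    return True


if __name__ == "__main__":
    # Assumed example values: CPU in millicores, one object-count key.
    quota = {"requests.cpu": 1000, "limits.cpu": 2000, "count/services": 2}
    used = {"requests.cpu": 600, "limits.cpu": 1200, "count/services": 2}
    print(admits(quota, used, {"requests.cpu": 400, "limits.cpu": 800}))
    print(admits(quota, used, {"count/services": 1}))
```

Because the check sums usage across the whole namespace, a single pod can be rejected even when it is small, if earlier objects have already consumed the budget.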
- Network
- Linux Network Virtualization
- Network Namespace
- Veth Pair
- Linux bridge
- VLAN
- VXLAN
- Routing Protocol
- Path-Vector Protocol (a distance-vector variant)
- BGP
- Link-State Protocol
- OSPF
- K8s Network
- Service
- ClusterIP
- Provides access from inside the cluster only
- The ClusterIP range is defined by the API server startup option --service-cluster-ip-range
- The Service port is defined in the Service manifest
- NodePort
- Provides access at the node level
- The NodePort range is defined by the API server startup option --service-node-port-range
- LoadBalancer
- Provides an external IP to allow access from outside of the cluster
- ExternalName
- An alias to an external service
- DNS redirection
- Headless
- Define a Headless Service by setting the cluster IP (.spec.clusterIP) to "None"
- No cluster IP is allocated to a Headless Service
- No load balancing or proxying is done for a Headless Service
- Cluster DNS returns the IPs of the pods backing the service
- kubectl proxy
- Provides a proxy server or application-level gateway between localhost and the K8s API server
- Handles locating the apiserver and authenticating (uses the cluster configuration and user credentials in .kube/config)
- Can send requests to the API server, for example to list the services in the default namespace: localhost:proxy-port/api/v1/namespaces/default/services
- Can send requests to services via URL: localhost:proxy-port/api/v1/namespaces/namespace_name/services/service_name[:port_name]/proxy/[application url]
- kubectl port-forward
- Forward local ports to a pod
- kubectl port-forward deployment/mydeployment localport:port
- kubectl port-forward service/myservice localport:port
- kubectl port-forward pod/mypod localport:port
- Ingress
- DNS
- Service
- Normal Service
- A/AAAA record which resolves the name to the Cluster IP
- Name: my-svc.my-namespace.svc.cluster-domain.example
- Example: kubernetes.default.svc.cluster.local. 30 IN A 172.20.252.11
- SRV record for each named service port
- Name: _my-port-name._my-port-protocol.my-svc.my-namespace.svc.cluster-domain.example
- Example: _https._tcp.kubernetes.default.svc.cluster.local. 5 IN SRV 0 100 443 kubernetes.default.svc.cluster.local.
- A PTR record which resolves the Cluster IP to the domain name
- Example: 1.252.20.172.in-addr.arpa. 5 IN PTR kubernetes.default.svc.cluster.local.
- Headless Service
- A/AAAA record which resolves to the set of IPs of the pods selected by the service
- N*M SRV records (N pods, M named ports in service)
- A PTR record which resolves pod IP to domain name of each pod
- ExternalName
- A CNAME pointing to the domain name of the external service
- Pod
- A/AAAA record which resolves the name to the pod IP
- General name
- Name: pod-ip-address.my-namespace.pod.cluster-domain.example
- Example: 172-20-0-57.default.pod.cluster.local. 3 IN A 172.20.0.57
- Pod created by a Deployment or DaemonSet and exposed by a Service
- Name: pod-ip-address.deployment-name.my-namespace.svc.cluster-domain.example
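The naming patterns listed above are mechanical, so they can be reproduced with a short Python sketch. The function names are hypothetical helpers written for this illustration, `cluster.local` is assumed as the cluster domain (as in the examples above), and only IPv4 pod addresses are handled.

```python
import ipaddress


def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """A/AAAA record name of a normal Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def srv_name(port_name: str, protocol: str, service: str, namespace: str,
             cluster_domain: str = "cluster.local") -> str:
    """SRV record name of a named Service port."""
    return f"_{port_name}._{protocol}.{service_fqdn(service, namespace, cluster_domain)}"


def pod_fqdn(pod_ip: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """General A record name of a pod: dots in the IPv4 address become dashes."""
    return f"{pod_ip.replace('.', '-')}.{namespace}.pod.{cluster_domain}"


def ptr_name(ip: str) -> str:
    """Reverse-lookup (PTR) name of an IP address, using the standard library."""
    return ipaddress.ip_address(ip).reverse_pointer


if __name__ == "__main__":
    print(service_fqdn("kubernetes", "default"))
    print(srv_name("https", "tcp", "kubernetes", "default"))
    print(pod_fqdn("172.20.0.57", "default"))
    print(ptr_name("172.20.0.57"))
```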
- CoreDNS
- Plugins
- errors: Errors are logged to stdout
- prometheus: Metrics of CoreDNS are available at http://localhost:9153/metrics in Prometheus format
- kubernetes: CoreDNS replies to DNS queries based on the IPs of Kubernetes services and pods
- Tencent Cloud
- API Gateway + Service Mesh
- Kubernetes CNI plugin
- Scheduling
- Algorithm: predicates find the set of feasible nodes -> priorities select the most suitable node
- Predicates: filter the nodes by a set of conditions (check memory, CPU, disk, etc.)
- Priorities: select a node to run the scheduled pod; by default, the node with the fewest pods
- Policy: specifies a set of predicates and priorities
- Run a custom scheduler
- Policy: --policy-config-file
- Name: --scheduler-name
- Pod Specification: hints for pod scheduling
- NodeName: assign pods to the named node
- NodeSelector: assign pods to a group of nodes with particular labels
- Affinity and anti-affinity:
- Node
- Node affinity: like NodeSelector, constrains pods to particular nodes, but is more expressive and powerful
- Node anti-affinity: use the NotIn and DoesNotExist operators to achieve node anti-affinity
- Inter-Pod
- Inter-Pod affinity: co-locate some pods in the same nodes
- Inter-Pod anti-affinity: distribute some pods in different nodes
- Taints and tolerations
- Taints allow a node to repel a set of pods
- Tolerations allow pods to be scheduled onto nodes with matching taints
- SchedulerName: choose a specific scheduler to schedule a pod
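The two-phase flow described above (filter feasible nodes, then rank them) can be sketched in Python. This is a toy model under stated assumptions, not the kube-scheduler implementation: `Node`, `Pod`, `feasible`, and `schedule` are hypothetical names, taints and tolerations are modeled as plain strings that must match exactly, and the only ranking rule is "fewest pods", as mentioned in the outline.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu: int                      # millicores still available
    free_mem: int                      # MiB still available
    pod_count: int = 0
    labels: dict = field(default_factory=dict)
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    cpu: int
    mem: int
    node_selector: dict = field(default_factory=dict)
    tolerations: set = field(default_factory=set)


def feasible(pod: Pod, node: Node) -> bool:
    """Predicate phase: resources fit, labels match, and every taint is tolerated."""
    fits = node.free_cpu >= pod.cpu and node.free_mem >= pod.mem
    selected = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    tolerated = node.taints <= pod.tolerations
    return fits and selected and tolerated


def schedule(pod: Pod, nodes: list) -> Optional[str]:
    """Priority phase: among feasible nodes pick the one with the fewest pods."""
    candidates = [n for n in nodes if feasible(pod, n)]
    if not candidates:
        return None  # the pod stays unscheduled
    # Ties are broken by node name so the result is deterministic.
    return min(candidates, key=lambda n: (n.pod_count, n.name)).name


if __name__ == "__main__":
    nodes = [
        Node("a", 1000, 1024, pod_count=5),
        Node("b", 2000, 2048, pod_count=2),
        Node("c", 4000, 4096, taints={"dedicated=gpu:NoSchedule"}),
    ]
    print(schedule(Pod(500, 512), nodes))
    print(schedule(Pod(500, 512, tolerations={"dedicated=gpu:NoSchedule"}), nodes))
```

Node "c" is the emptiest node, but an ordinary pod never lands there because the taint filters it out before ranking; only a pod that tolerates the taint can be placed on it.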
- Security
- Background Knowledge
- User Type
- Service Account
- Managed by Kubernetes
- Represent workloads in the cluster
- Bound to a specific namespace
- Normal User
- Managed outside of Kubernetes
- Authenticated with a valid certificate signed by the cluster’s CA
- User name: Certificate subject Common Name field
- Group: Certificate subject Organization field
- Authentication
- Service account tokens for service accounts
- Client certificates for normal users
- Certificates for communication between control plane components
- Bootstrap tokens for bootstrapping clusters and nodes
- Authorization
- RBAC
- Namespace Scope
- Role
- RoleBinding (associates users identified by the authentication process with Roles)
- Cluster Scope
- ClusterRole
- ClusterRoleBinding (associates users identified by the authentication process with ClusterRoles)
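How roles and bindings combine into an allow/deny decision can be sketched in Python. This is a simplified model, not the Kubernetes authorizer: `Rule`, `Binding`, and `is_allowed` are hypothetical names, a binding with `namespace=None` stands in for a ClusterRoleBinding, and Roles and ClusterRoles share one lookup table for brevity. RBAC permissions are purely additive, so the sketch denies by default and has no deny rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    verbs: frozenset
    resources: frozenset


@dataclass(frozen=True)
class Binding:
    user: str
    role: str
    namespace: Optional[str]  # None models a ClusterRoleBinding (all namespaces)


def is_allowed(user: str, verb: str, resource: str, namespace: str,
               roles: dict, bindings: list) -> bool:
    """Allow the request if any rule of any role bound to the user matches it."""
    for binding in bindings:
        if binding.user != user:
            continue
        if binding.namespace is not None and binding.namespace != namespace:
            continue
        for rule in roles.get(binding.role, []):
            if verb in rule.verbs and resource in rule.resources:
                return True
    return False  # nothing matched: denied by default


if __name__ == "__main__":
    roles = {
        "pod-reader": [Rule(frozenset({"get", "list"}), frozenset({"pods"}))],
        "node-viewer": [Rule(frozenset({"get"}), frozenset({"nodes"}))],
    }
    bindings = [
        Binding("alice", "pod-reader", "default"),
        Binding("bob", "node-viewer", None),
    ]
    print(is_allowed("alice", "list", "pods", "default", roles, bindings))
    print(is_allowed("alice", "list", "pods", "kube-system", roles, bindings))
```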
- Helm: package management tool for K8s applications
- Chart: package all k8s manifests as a single tarball
- Chart.yaml: this file contains metadata about the chart: name, version, keywords
- templates: this directory contains the resource manifests that make up the application
- deployment
- services
- secrets
- …
- values.yaml: this file contains keys and values that are used to generate the release. These values are substituted into the resource manifests using the Go template syntax
- Repository: an HTTP server that hosts charts
- Helm commands
- helm search hub redis: find redis chart and its repository in helm hub
- helm search repo redis: find redis chart in repositories
- helm install redis bitnami/redis: install redis chart
- Extending the Kubernetes API
- Custom Resource
- CRD: Define custom resources
- Custom Resources/Objects: declare the desired spec of a custom resource
- Custom Controllers: run a watch loop to make sure the actual state meets the declared spec
- Aggregated API Server
- Deploy an extension API server
- Register APIService objects
- Group: API groups this extension API server hosts
- Version: API version this extension API server hosts
- kube-apiserver proxies client requests to the extension API server