Markmap

Example: 1.252.20.172.in-addr.arpa. 5 IN PTR kubernetes.default.svc.cluster.local.

Example: _https._tcp.kubernetes.default.svc.cluster.local. 5 IN SRV 0 100 443 kubernetes.default.svc.cluster.local.

Name: _my-port-name._my-port-protocol.my-svc.my-namespace.svc.cluster-domain.example

Example: kubernetes.default.svc.cluster.local. 30 IN A 172.20.252.11

Name: my-svc.my-namespace.svc.cluster-domain.example
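
The Service record names above follow a fixed pattern, so they can be derived mechanically. A minimal Python sketch, assuming the default `cluster.local` cluster domain used in the examples; the helper names `service_fqdn`, `srv_name`, and `ptr_name` are hypothetical and not part of any Kubernetes client:

```python
import ipaddress

# Assumption: the cluster domain is "cluster.local", as in the examples above.
CLUSTER_DOMAIN = "cluster.local"


def service_fqdn(service, namespace, cluster_domain=CLUSTER_DOMAIN):
    """A/AAAA record name: my-svc.my-namespace.svc.<cluster-domain>."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def srv_name(port_name, protocol, service, namespace, cluster_domain=CLUSTER_DOMAIN):
    """SRV record name: _port-name._protocol.<service FQDN>."""
    return f"_{port_name}._{protocol}.{service_fqdn(service, namespace, cluster_domain)}"


def ptr_name(cluster_ip):
    """PTR record name: the reversed IP under in-addr.arpa (or ip6.arpa for IPv6)."""
    return ipaddress.ip_address(cluster_ip).reverse_pointer


print(service_fqdn("kubernetes", "default"))
print(srv_name("https", "tcp", "kubernetes", "default"))
print(ptr_name("172.20.252.1"))
```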

......

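Interference between workloads

Data locality

Affinity and anti-affinity requirements

Hardware/software/policy constraints

Service governance requirements

Resource requirements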

......

serviceaccounts controller

namespace controller

endpoints controller

replication controller

kubernetes: CoreDNS will reply to DNS queries based on IP of the services and pods of Kubernetes.

prometheus: Metrics of CoreDNS are available at http://localhost:9153/metrics in Prometheus format

errors: Errors are logged to stdout

pod-ip-address.deployment-name.my-namespace.svc.cluster-domain.example

Example: 172-20-0-57.default.pod.cluster.local. 3 IN A 172.20.0.57

Name: pod-ip-address.my-namespace.pod.cluster-domain.example
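
The general Pod record name is the Pod IP with dots replaced by dashes, followed by the namespace and the `pod` subdomain. A small Python sketch of that mapping for IPv4 addresses; `pod_fqdn` is a hypothetical helper and the `cluster.local` default is an assumption taken from the example:

```python
def pod_fqdn(pod_ip, namespace, cluster_domain="cluster.local"):
    """Build pod-ip-address.my-namespace.pod.<cluster-domain> from an IPv4 address."""
    dashed_ip = pod_ip.replace(".", "-")
    return f"{dashed_ip}.{namespace}.pod.{cluster_domain}"


print(pod_fqdn("172.20.0.57", "default"))
```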

A CNAME pointing to the domain name of the external service

A PTR record which resolves pod IP to domain name of each pod

N*M SRV records (N pods, M named ports in service)

A/AAAA record which resolves to the set of IPs of the pods selected by the service

A PTR record which resolves Cluster IP to domain name

SRV record for each named service port

A/AAAA record which resolves name to the Cluster IP

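Forwards traffic (userspace mode)

Sets up the iptables rules for each Service

Implements the Services defined in the Kubernetes API on each node

Ensures that the containers described in the PodSpecs are running and healthy

Takes a set of PodSpecs provided through various mechanisms (primarily through the apiserver)

Scheduling considerations

Function: schedules Pods onto suitable worker nodes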

controllers

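Function: watches the cluster state through the apiserver and makes changes to move the current state toward the desired state

Stores objects in etcd

Provides a watch mechanism that notifies clients of object changes

Exposes CRUD REST interfaces for all object types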

Version: API version this extension API server hosts

Group: API groups this extension API server hosts

ClusterRoleBinding (associates users retrieved from the authentication process with ClusterRoles)

ClusterRole

RoleBinding (associates users retrieved from the authentication process with Roles)

Role

Group: Certificate subject Organization field

User name: Certificate subject Common Name field

Inter-Pod anti-affinity: spread some pods across different nodes

Inter-Pod affinity: co-locate some pods on the same nodes

Node anti-affinity: use NotIn and DoesNotExist to achieve node anti-affinity

Node affinity: constrains pods to particular nodes like NodeSelector, but is more expressive and powerful
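
To make the affinity operators concrete, here is a simplified Python sketch of how a single match expression could be evaluated against a node's labels. It covers only `In`, `NotIn`, `Exists`, and `DoesNotExist`; the function name `matches` and the dict layout are assumptions for illustration, not the scheduler's actual code:

```python
def matches(expr, node_labels):
    """Evaluate one match expression (key, operator, values) against node labels."""
    key = expr["key"]
    operator = expr["operator"]
    values = expr.get("values", [])
    if operator == "In":
        return key in node_labels and node_labels[key] in values
    if operator == "NotIn":
        # A node without the key also satisfies NotIn.
        return key not in node_labels or node_labels[key] not in values
    if operator == "Exists":
        return key in node_labels
    if operator == "DoesNotExist":
        return key not in node_labels
    raise ValueError(f"unsupported operator: {operator}")


ssd_node = {"disktype": "ssd", "zone": "a"}
hdd_node = {"disktype": "hdd"}

# Node affinity: require an SSD node.
affinity = {"key": "disktype", "operator": "In", "values": ["ssd"]}
# Node anti-affinity: keep away from zone "a".
anti_affinity = {"key": "zone", "operator": "NotIn", "values": ["a"]}

print(matches(affinity, ssd_node), matches(affinity, hdd_node))
print(matches(anti_affinity, ssd_node), matches(anti_affinity, hdd_node))
```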

Plugins

Pod created by Deployment or DaemonSet exposed by a Service

General name

A/AAAA record which resolves name to the pod IP

ExternalName

Headless Service

Normal Service

kube-dns returns the IPs of the pods backing the service

No load balancing and proxying for Headless service

No cluster IP allocated to Headless services

Define a Headless service: specify "None" for the cluster IP (.spec.clusterIP)

DNS redirection

An alias to an external service

Provides an external IP to allow access from outside of the cluster

The NodePort range is defined in API server startup option --service-node-port-range

Provides access at the node level

Service port is defined in the Service Manifest

The ClusterIP range is defined in API server startup option --service-cluster-ip-range

Provides access in the cluster internally

Limit of total number of Namespaced resources (count/services)

Ephemeral Storage (ephemeral-storage)

Persistent Storage (storage)

Memory (limits.memory requests.memory)

CPU (limits.cpu requests.cpu)

A Volume can be mounted into multiple containers inside a Pod

Data isn't lost when a container is restarted

kube-proxy (node network proxy)

kubelet (node agent)

kube-scheduler (scheduler)

kube-controller-manager (daemon)

kube-apiserver (API Server)

kube-apiserver proxies client requests to the extension API server

Deploy an extension API server

Custom Controllers: watch-loop to make sure the actual state meets the declared spec

Custom Resources/Objects: Declare the desired spec of a custom resource

CRD: Define custom resources
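
The controller watch-loop reduces to comparing declared state with observed state and emitting the actions that close the gap. A minimal Python sketch of one reconcile step, assuming a hypothetical custom resource whose spec is just a replica count; `reconcile` and the action tuples are illustrative, not a real controller framework API:

```python
def reconcile(desired_replicas, running):
    """Return the actions that move the observed state toward the declared spec.

    `running` is the list of instance names currently observed.
    """
    diff = desired_replicas - len(running)
    if diff > 0:
        # Too few instances: create the missing ones.
        return [("create", None)] * diff
    if diff < 0:
        # Too many instances: delete the surplus (oldest-first is an arbitrary choice here).
        return [("delete", name) for name in running[:-diff]]
    # Actual state already matches the declared spec.
    return []


print(reconcile(3, ["web-0"]))
print(reconcile(1, ["web-0", "web-1", "web-2"]))
print(reconcile(2, ["web-0", "web-1"]))
```

A real controller would run this step every time a watch event arrives, rather than once.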

...

secrets

services

deployment

Cluster Scope

Namespace Scope

Authenticated with a valid certificate signed by the cluster's CA

Managed outside of Kubernetes

Bound to a specific namespace

Represent workloads in the cluster

Managed by Kubernetes

allow pods to be scheduled onto nodes with matching taints

allow a node to repel a set of pods
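
The interplay of taints and tolerations can be sketched as a matching rule: a pod may land on a node only if every hard taint on that node is tolerated. The Python below is a simplified illustration under that assumption; `tolerates` and `can_schedule` are hypothetical names, and the soft `PreferNoSchedule` effect is deliberately ignored:

```python
def tolerates(toleration, taint):
    """True if a single toleration matches a single taint."""
    # An empty effect in the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # Exists with an empty key matches every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    # Equal: both key and value must match.
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def can_schedule(tolerations, taints):
    """True if every hard taint on the node is tolerated by the pod."""
    hard = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in hard)


gpu_taint = {"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}
gpu_toleration = {"key": "dedicated", "operator": "Equal", "value": "gpu",
                  "effect": "NoSchedule"}

print(can_schedule([], [gpu_taint]))
print(can_schedule([gpu_toleration], [gpu_taint]))
```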

Inter-Pod

Node

Pod

Istio Ingress Gateway

K8s Ingress

kubectl port-forward pod/mypod localport:port

kubectl port-forward service/myservice localport:port

kubectl port-forward deployment/mydeployment localport:port

Forward local ports to a pod

Can send requests to services via the URL localhost:proxy-port/api/v1/namespaces/namespace_name/services/service_name[:port_name]/proxy/[application url]

Can send requests to the API server (for example, list the services in the default namespace: localhost:proxy-port/api/v1/namespaces/default/services)

Handles locating the apiserver and authenticating (uses the cluster configuration and user credentials in .kube/config)

Provides a proxy server or application-level gateway between localhost and the K8s API server
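
The service proxy URL pattern above can be assembled programmatically. A minimal Python sketch, assuming the proxy listens on plain HTTP on localhost; `service_proxy_url` is a hypothetical helper, and port 8001 in the usage line is only an example value:

```python
def service_proxy_url(proxy_port, namespace, service, port_name=None, app_path=""):
    """Build the URL that reaches a Service through a local API server proxy."""
    # The optional port name is appended to the service name as service_name:port_name.
    target = f"{service}:{port_name}" if port_name else service
    path = app_path.lstrip("/")
    return (f"http://localhost:{proxy_port}/api/v1/namespaces/{namespace}"
            f"/services/{target}/proxy/{path}")


print(service_proxy_url(8001, "default", "myservice", "http", "/healthz"))
```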

OSPF

BGP

Limit: The maximum amount of resources that one can get

Request: The amount of resources that one is guaranteed to get

Object Count Quota

Storage Resource Quota

Compute Resource Quota

Namespaced: ResourceQuota is enforced at Namespace scope; different Namespaces can have different resource limits

Limit the aggregated resource consumption of a Namespace
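
Because a ResourceQuota caps the aggregate for a Namespace, admitting a new object amounts to checking "already used + newly requested <= hard limit" for each tracked resource. A simplified Python sketch of that arithmetic; `admits` is a hypothetical function, CPU is expressed in integer millicores, and the numbers are invented for illustration:

```python
def admits(hard, used, request):
    """True if adding `request` keeps every quota-tracked resource within `hard`."""
    return all(
        used.get(name, 0) + amount <= hard[name]
        for name, amount in request.items()
        if name in hard  # resources without a quota entry are not limited
    )


# Illustrative values only.
hard = {"requests.cpu": 4000, "limits.cpu": 8000, "count/services": 10}
used = {"requests.cpu": 3500, "limits.cpu": 5000, "count/services": 10}

print(admits(hard, used, {"requests.cpu": 500, "limits.cpu": 1000}))
print(admits(hard, used, {"requests.cpu": 600, "limits.cpu": 1000}))
print(admits(hard, used, {"count/services": 1}))
```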

persistentVolumeClaim

local

hostPath

emptyDir

configMap

Share data between containers running together in a Pod

Persist data across the life span of a Pod

CronJob runs a job periodically on a given schedule.

Job runs pods until a specified number of them have successfully completed.

Use cases: cluster storage daemon, logs collection daemon, node monitoring daemon.

DaemonSet ensures that all (or some) Nodes run a copy of a Pod.

StatefulSet requires a Headless Service to provide network identity for the pods.

StatefulSet is used to deploy stateful applications.

ReplicaSet is not supposed to be used directly; it should be managed by Deployments.

Deployment is used to roll out, update, and roll back ReplicaSets.

ReplicaSet ensures that a specified number of pod replicas are running at any given time.

Deployment is used to deploy stateless applications.

Worker Node

Kubernetes Master

Aggregated API Server

Custom Resource

helm install redis bitnami/redis: install redis chart

helm search repo redis: find redis chart in repositories

helm search hub redis: find redis chart and its repository in helm hub

values.yaml: this file contains the keys and values used to generate the release. These values are substituted into the resource manifests using the Go template syntax

templates: this directory contains the resource manifests that make up this application

Chart.yaml: this file contains metadata about this Chart: name, version, keywords

RBAC

Bootstrap Token for clusters and nodes bootstrapping

Certificates for communication between control plane components

Client certificates for normal users

Service account tokens for service accounts

Normal User

Service Account

Certificates used in Kubernetes

Certificate and PKI

SchedulerName: choose a specific scheduler to schedule a pod

taints and tolerations

Affinity and anti-affinity:

NodeSelector: assign pods to a group of nodes with particular labels

NodeName: assign pods to the named node

Name: --scheduler-name

Policy: --policy-config-file

Policy: specify a number of predicates and priorities

Priorities: select a node to run the scheduled pod; by default, the node with the fewest pods is selected

Predicates: find available nodes through some conditions: check memory, cpu, disk, etc.

Kubernetes CNI plugins

API Gateway+Service Mesh

Tencent Cloud

DNS

Ingress

Kubectl port-forward

Kube Proxy

Service

Link-State Protocol

Distance Vector Protocol

Implementing a VXLAN network on Linux

How VXLAN works

Linux tun/tap

Request and Limit

Type

Scope

purpose

type

purpose

Job & CronJob

DaemonSet

StatefulSet

Deployment & ReplicaSet

All containers in a pod share storage, network namespace, and cgroup

Consists of one or more containers

Smallest deployable computing unit

etcd cluster runs on nodes separate from the Kubernetes head nodes

Multiple Scheduler and Controller Manager instances with leader election

Multiple API Server instances fronted by a load balancer

Single etcd node

Multiple Scheduler and Controller Manager instances with leader election

Multiple API Server instances fronted by a load balancer

API Server, Scheduler, and Controller Manager run on a single node

Kubernetes Control Plane (ensures the cluster's current state matches the desired state)

Kubernetes API objects (declare the desired state)

Extending the Kubernetes API

Helm commands

Repository: an HTTP server that hosts charts

Chart: packages all K8s manifests as a single tarball

Authorization

Authentication

User Type

Background Knowledge

Pod Specification: hints for pod scheduling

Run a custom scheduler

Algorithm: predicates find the set of available nodes -> priorities select the most suitable node
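
The two-phase algorithm (filter with predicates, then rank with priorities) can be sketched in a few lines. This Python example is a toy model, not kube-scheduler's implementation: the only predicate is "enough free CPU and memory", the only priority is "fewest pods", and all field names and node data are invented for illustration:

```python
def schedule(pod, nodes):
    """Return the name of the chosen node, or None if no node is feasible."""
    # Predicates: keep only nodes with enough free CPU and memory.
    feasible = [
        n for n in nodes
        if n["free_cpu_m"] >= pod["cpu_m"] and n["free_mem_mi"] >= pod["mem_mi"]
    ]
    if not feasible:
        return None
    # Priority: prefer the feasible node running the fewest pods.
    return min(feasible, key=lambda n: n["pod_count"])["name"]


nodes = [
    {"name": "node-a", "free_cpu_m": 500, "free_mem_mi": 1024, "pod_count": 3},
    {"name": "node-b", "free_cpu_m": 2000, "free_mem_mi": 4096, "pod_count": 7},
    {"name": "node-c", "free_cpu_m": 1500, "free_mem_mi": 2048, "pod_count": 5},
]

# node-a fails the CPU predicate; node-c wins on pod count.
print(schedule({"cpu_m": 1000, "mem_mi": 512}, nodes))
```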

K8s Network

Routing Protocol

Vxlan

Vlan

Linux bridge

Veth Pair

Network Namespace

Linux Network Virtualization

ResourceQuota

Volume

Workload resources (Controllers)

Pod

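Cloud providers: use Kubernetes to break AWS's first-mover monopoly and win market share

Cloud users: avoid the vendor lock-in that comes with a single cloud provider, reducing technical and cost risk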

Kubernetes Federation

HA etcd, HA head nodes, multiple workers

Single etcd, HA head nodes, multiple workers

Single head node, multiple workers

Single node

Desired State Management

Automates deployment, scaling, and management of containerized applications

Helm: package management tool for K8s applications

Security

Scheduling

Network

Policies

Storage

Workload

Business model

Deployment models

Core concepts

Kubernetes