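Introduction

After the previous five parts, you have a working grasp of Kubernetes' core concepts. This part combines them to deploy a complete microservice application:
Design the application architecture
Simplify deployment with Helm
Manage configuration for multiple environments
Integrate logging and monitoring
Troubleshoot common problems
Let's get started with the real hands-on work!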
Application Architecture Design

Example Application: E-commerce Platform

We will deploy a simplified e-commerce microservice system:
graph TD
USER[User] -->|HTTPS| ING[Ingress]
ING --> GW[API Gateway]
GW --> USVC[User Service]
GW --> PSVC[Product Service]
GW --> OSVC[Order Service]
USVC --> REDIS[(Redis)]
PSVC --> PDB[(PostgreSQL<br/>Products)]
OSVC --> ODB[(PostgreSQL<br/>Orders)]
OSVC --> MQ[RabbitMQ]
MQ --> NSVC[Notification Service]
NSVC --> EMAIL[Email]
subgraph "Observability"
    PROM[Prometheus]
    GRAF[Grafana]
    LOKI[Loki]
end
Directory Structure

```
k8s-ecommerce/
├── helm/
│   └── ecommerce/
│       ├── Chart.yaml
│       ├── values.yaml
│       ├── values-dev.yaml
│       ├── values-prod.yaml
│       └── templates/
│           ├── _helpers.tpl
│           ├── namespace.yaml
│           ├── api-gateway/
│           ├── user-service/
│           ├── product-service/
│           ├── order-service/
│           ├── databases/
│           ├── redis/
│           ├── rabbitmq/
│           └── ingress.yaml
├── manifests/          # Raw K8s manifests (fallback)
│   └── ...
└── README.md
```
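Introduction to Helm

Why Helm?

Managing many YAML files directly with kubectl runs into the following problems:

| Problem | Helm's solution |
| --- | --- |
| Multi-environment configuration is hard | Values files |
| Duplicated configuration | Templating |
| Versions are hard to manage | Chart versioning |
| Install/upgrade/rollback is tedious | helm install/upgrade/rollback |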
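Helm Core Concepts

graph LR
CHART[Chart<br/>Application package] --> RELEASE[Release<br/>Installed instance]
VALUES[Values<br/>Configuration parameters] --> RELEASE
REPO[Repository<br/>Chart repository] --> CHART

Chart: a Helm package containing all the K8s resource definitions
Values: the configurable parameters
Release: one installed instance of a Chart
Repository: a place where Charts are stored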
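Helm Basic Commands

```bash
# Install Helm (macOS, via Homebrew)
brew install helm

# Add a chart repository and refresh the local index
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Search the added repositories, or search Artifact Hub
helm search repo nginx
helm search hub nginx

# Inspect a chart's metadata and its default values
helm show chart bitnami/nginx
helm show values bitnami/nginx

# Install a release, optionally overriding values
helm install my-nginx bitnami/nginx
helm install my-nginx bitnami/nginx -f values.yaml
helm install my-nginx bitnami/nginx --set replicaCount=3

# Upgrade an existing release
helm upgrade my-nginx bitnami/nginx -f values.yaml

# Roll back to revision 1
helm rollback my-nginx 1

# List releases
helm list

# Uninstall a release
helm uninstall my-nginx
```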
Creating a Helm Chart

Initializing the Chart

Scaffold a new chart with `helm create ecommerce`. This generates the standard Chart structure:
```
ecommerce/
├── Chart.yaml          # Chart metadata
├── values.yaml         # Default configuration
├── charts/             # Dependent sub-charts
├── templates/          # K8s resource templates
│   ├── _helpers.tpl    # Template helpers
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   └── ...
└── .helmignore
```
Chart.yaml

```yaml
apiVersion: v2
name: ecommerce
description: E-commerce microservices application
type: application
version: 0.1.0
appVersion: "1.0.0"

dependencies:
  - name: postgresql
    version: 12.x.x
    repository: https://charts.bitnami.com/bitnami
    alias: productdb
    condition: productdb.enabled

  - name: postgresql
    version: 12.x.x
    repository: https://charts.bitnami.com/bitnami
    alias: orderdb
    condition: orderdb.enabled

  - name: redis
    version: 17.x.x
    repository: https://charts.bitnami.com/bitnami
    condition: redis.enabled

  - name: rabbitmq
    version: 11.x.x
    repository: https://charts.bitnami.com/bitnami
    condition: rabbitmq.enabled
```
values.yaml (defaults)

```yaml
# Global settings
global:
  imageRegistry: ""
  imagePullSecrets: []
  environment: production

# Target namespace (the templates read it as .Values.namespace)
namespace: ecommerce

# API Gateway
apiGateway:
  replicaCount: 2
  image:
    repository: ecommerce/api-gateway
    tag: "1.0.0"
    pullPolicy: IfNotPresent
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "200m"
      memory: "256Mi"
  service:
    type: ClusterIP
    port: 8080

# User Service
userService:
  replicaCount: 2
  image:
    repository: ecommerce/user-service
    tag: "1.0.0"
    pullPolicy: IfNotPresent
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "200m"
      memory: "256Mi"

# Product Service
productService:
  replicaCount: 2
  image:
    repository: ecommerce/product-service
    tag: "1.0.0"
    pullPolicy: IfNotPresent

# Order Service
orderService:
  replicaCount: 2
  image:
    repository: ecommerce/order-service
    tag: "1.0.0"
    pullPolicy: IfNotPresent

# Ingress
ingress:
  enabled: true
  className: nginx
  host: ecommerce.example.com
  tls:
    enabled: false
    secretName: ecommerce-tls

# PostgreSQL for products (sub-chart alias: productdb)
productdb:
  enabled: true
  auth:
    database: products
    username: product_user

# PostgreSQL for orders (sub-chart alias: orderdb)
orderdb:
  enabled: true
  auth:
    database: orders
    username: order_user

# Redis
redis:
  enabled: true
  architecture: standalone
  auth:
    enabled: false

# RabbitMQ
rabbitmq:
  enabled: true
  auth:
    username: admin

# Autoscaling
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilization: 70
```
Template Examples

templates/_helpers.tpl

```yaml
{{/*
Common labels
*/}}
{{- define "ecommerce.labels" -}}
helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "ecommerce.selectorLabels" -}}
app.kubernetes.io/name: {{ .name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Full name
*/}}
{{- define "ecommerce.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" }}
{{- end }}
```
templates/api-gateway/deployment.yaml

```yaml
{{- if .Values.apiGateway }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "ecommerce.fullname" . }}-api-gateway
  namespace: {{ .Values.namespace }}
  labels:
    {{- include "ecommerce.labels" . | nindent 4 }}
    app.kubernetes.io/component: api-gateway
spec:
  replicas: {{ .Values.apiGateway.replicaCount }}
  selector:
    matchLabels:
      {{- include "ecommerce.selectorLabels" (dict "name" "api-gateway" "Release" .Release) | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "ecommerce.selectorLabels" (dict "name" "api-gateway" "Release" .Release) | nindent 8 }}
    spec:
      containers:
      - name: api-gateway
        image: "{{ .Values.apiGateway.image.repository }}:{{ .Values.apiGateway.image.tag }}"
        imagePullPolicy: {{ .Values.apiGateway.image.pullPolicy }}
        ports:
        - containerPort: 8080
        env:
        - name: USER_SERVICE_URL
          value: "http://{{ include "ecommerce.fullname" . }}-user-service:8080"
        - name: PRODUCT_SERVICE_URL
          value: "http://{{ include "ecommerce.fullname" . }}-product-service:8080"
        - name: ORDER_SERVICE_URL
          value: "http://{{ include "ecommerce.fullname" . }}-order-service:8080"
        resources:
          {{- toYaml .Values.apiGateway.resources | nindent 10 }}
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
{{- end }}
```
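The Ingress template routes `/` to a Service named `<fullname>-api-gateway` on port 8080, but no Service template is shown for it. A minimal sketch of what one could look like follows. The file name is an assumption, and it relies only on the `apiGateway.service.type` and `apiGateway.service.port` keys from values.yaml plus the helpers in `_helpers.tpl`; treat it as an illustration rather than the chart's actual template.

```yaml
{{- if .Values.apiGateway }}
# templates/api-gateway/service.yaml (hypothetical file name)
apiVersion: v1
kind: Service
metadata:
  name: {{ include "ecommerce.fullname" . }}-api-gateway
  namespace: {{ .Values.namespace }}
  labels:
    {{- include "ecommerce.labels" . | nindent 4 }}
    app.kubernetes.io/component: api-gateway
spec:
  type: {{ .Values.apiGateway.service.type }}
  # Must render to the same labels as the Deployment's Pod template,
  # otherwise the Service ends up with no endpoints.
  selector:
    {{- include "ecommerce.selectorLabels" (dict "name" "api-gateway" "Release" .Release) | nindent 4 }}
  ports:
  - name: http
    protocol: TCP
    port: {{ .Values.apiGateway.service.port }}
    targetPort: 8080   # the containerPort declared in the Deployment
{{- end }}
```

Reusing the `ecommerce.selectorLabels` helper for both the Pod template and the Service selector keeps the two from drifting apart, which is the most common reason a Service shows no endpoints.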
templates/ingress.yaml

```yaml
{{- if .Values.ingress.enabled }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "ecommerce.fullname" . }}-ingress
  namespace: {{ .Values.namespace }}
  labels:
    {{- include "ecommerce.labels" . | nindent 4 }}
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "{{ .Values.ingress.tls.enabled }}"
spec:
  ingressClassName: {{ .Values.ingress.className }}
  {{- if .Values.ingress.tls.enabled }}
  tls:
  - hosts:
    - {{ .Values.ingress.host }}
    secretName: {{ .Values.ingress.tls.secretName }}
  {{- end }}
  rules:
  - host: {{ .Values.ingress.host }}
    http:
      paths:
      - path: /api/users
        pathType: Prefix
        backend:
          service:
            name: {{ include "ecommerce.fullname" . }}-user-service
            port:
              number: 8080
      - path: /api/products
        pathType: Prefix
        backend:
          service:
            name: {{ include "ecommerce.fullname" . }}-product-service
            port:
              number: 8080
      - path: /api/orders
        pathType: Prefix
        backend:
          service:
            name: {{ include "ecommerce.fullname" . }}-order-service
            port:
              number: 8080
      - path: /
        pathType: Prefix
        backend:
          service:
            name: {{ include "ecommerce.fullname" . }}-api-gateway
            port:
              number: 8080
{{- end }}
```
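values.yaml defines an `autoscaling` block (`enabled`, `minReplicas`, `maxReplicas`, `targetCPUUtilization`), but no template that consumes it is shown. One way to wire it up for the API Gateway is sketched below. The file name is an assumption, and the same pattern would have to be repeated (or looped) for the other services.

```yaml
{{- if .Values.autoscaling.enabled }}
# templates/api-gateway/hpa.yaml (hypothetical file name)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "ecommerce.fullname" . }}-api-gateway
  namespace: {{ .Values.namespace }}
  labels:
    {{- include "ecommerce.labels" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "ecommerce.fullname" . }}-api-gateway
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: {{ .Values.autoscaling.targetCPUUtilization }}
{{- end }}
```

Because the whole manifest sits inside the `if`, setting `autoscaling.enabled: false` (as values-dev.yaml does) renders nothing. CPU utilization is measured against the container's CPU request, so the target Pods need `resources.requests.cpu` set for this HPA to work.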
Multi-environment Configuration

values-dev.yaml

```yaml
global:
  environment: development

namespace: ecommerce-dev

apiGateway:
  replicaCount: 1
  resources:
    requests:
      cpu: "50m"
      memory: "64Mi"
    limits:
      cpu: "100m"
      memory: "128Mi"

userService:
  replicaCount: 1

productService:
  replicaCount: 1

orderService:
  replicaCount: 1

ingress:
  enabled: true
  host: dev.ecommerce.local
  tls:
    enabled: false

autoscaling:
  enabled: false

productdb:
  primary:
    persistence:
      size: 1Gi

orderdb:
  primary:
    persistence:
      size: 1Gi
```
values-prod.yaml

```yaml
global:
  environment: production

namespace: ecommerce-prod

apiGateway:
  replicaCount: 3
  resources:
    requests:
      cpu: "200m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"

userService:
  replicaCount: 3

productService:
  replicaCount: 3

orderService:
  replicaCount: 3

ingress:
  enabled: true
  host: ecommerce.example.com
  tls:
    enabled: true
    secretName: ecommerce-prod-tls

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilization: 70

# Note: in the Bitnami PostgreSQL chart, readReplicas only takes effect
# when architecture is set to replication; verify against the chart version in use.
productdb:
  primary:
    persistence:
      size: 50Gi
  readReplicas:
    replicaCount: 2

orderdb:
  primary:
    persistence:
      size: 100Gi
  readReplicas:
    replicaCount: 2

redis:
  architecture: replication
  replica:
    replicaCount: 2

rabbitmq:
  replicaCount: 3
```
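Deploying to Different Environments

```bash
# Fetch the sub-charts declared under dependencies in Chart.yaml
helm dependency update ./ecommerce

# Deploy to the development environment
helm install ecommerce-dev ./ecommerce -f values-dev.yaml -n ecommerce-dev --create-namespace

# Deploy to the production environment
helm install ecommerce-prod ./ecommerce -f values-prod.yaml -n ecommerce-prod --create-namespace

# Upgrade the production release
helm upgrade ecommerce-prod ./ecommerce -f values-prod.yaml -n ecommerce-prod
```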
Logging and Monitoring Integration

Prometheus Monitoring

```yaml
serverFiles:
  prometheus.yml:
    scrape_configs:
      - job_name: "kubernetes-pods"
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          # Scrape only Pods annotated with prometheus.io/scrape: "true"
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          # Use the prometheus.io/path annotation as the metrics path
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
```
Add the annotations to the Pod:

```yaml
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
```
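The scrape configuration above reads the `prometheus.io/scrape` and `prometheus.io/path` annotations, but nothing in it consumes `prometheus.io/port`. The rule below is the one commonly used for this in Prometheus' Kubernetes example configuration, shown here as a fragment to append under `relabel_configs` of the `kubernetes-pods` job. Whether your Prometheus chart already ships an equivalent rule is something to verify before adding it.

```yaml
# Rewrite the scrape address to use the port from the prometheus.io/port annotation.
# The source labels are joined with ";", e.g. "10.0.0.5:9090;8080" becomes "10.0.0.5:8080".
# Pods without the annotation do not match the regex and keep their original address.
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  action: replace
  regex: ([^:]+)(?::\d+)?;(\d+)
  replacement: $1:$2
  target_label: __address__
```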
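Exposing Metrics from the Application

Using Go as an example:

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// httpRequestsTotal counts HTTP requests, labeled by method, endpoint, and status.
var (
	httpRequestsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_requests_total",
			Help: "Total number of HTTP requests",
		},
		[]string{"method", "endpoint", "status"},
	)
)

func init() {
	// Register with the default registry so promhttp.Handler() exposes the counter.
	prometheus.MustRegister(httpRequestsTotal)
}

func main() {
	// Serve metrics on /metrics; port 8080 is assumed here to match the
	// prometheus.io/port annotation.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```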
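Grafana Dashboard

Deploy it with Helm (this assumes the `grafana` chart repository has already been added with `helm repo add`):

```bash
# adminPassword=admin is only suitable for local testing
helm install grafana grafana/grafana \
  --set adminPassword=admin \
  --set persistence.enabled=true \
  --set persistence.size=5Gi
```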
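The `--set` flags can also be moved into a values file, which additionally makes it easy to provision Prometheus as a data source at install time. The sketch below uses the `datasources` key of the Grafana chart. The file name and the Prometheus URL are assumptions: the URL depends on the release name and namespace of your Prometheus installation.

```yaml
# grafana-values.yaml (hypothetical file name)
adminPassword: admin        # same value as the --set flag; change it outside local testing
persistence:
  enabled: true
  size: 5Gi
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      access: proxy
      url: http://prometheus-server   # assumed Service name of the Prometheus server
      isDefault: true
```

It would then be applied with `helm install grafana grafana/grafana -f grafana-values.yaml`.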
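Troubleshooting Guide

Diagnostic Flow for Common Problems

graph TD
A[Problem occurs] --> B{Pod status?}
B -->|Pending| C[Check scheduling]
B -->|CrashLoopBackOff| D[Check container logs]
B -->|ImagePullBackOff| E[Check image name/permissions]
B -->|Running but not working| F[Check service connectivity]
C --> C1[Insufficient resources?]
C --> C2[Node Selector/Affinity?]
C --> C3[Taints/Tolerations?]
D --> D1[Application error]
D --> D2[Configuration error]
D --> D3[Out of memory: OOMKilled]
E --> E1[Image does not exist]
E --> E2[Registry authentication]
F --> F1[Service selector correct?]
F --> F2[Port configuration correct?]
F --> F3[Network policy blocking?]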
Common Diagnostic Commands

```bash
# Check Pod status and node placement
kubectl get pods -o wide
kubectl describe pod <pod-name>

# View logs: current container, a specific container, or the previous (crashed) instance
kubectl logs <pod-name>
kubectl logs <pod-name> -c <container-name>
kubectl logs <pod-name> --previous

# Open a shell inside a container
kubectl exec -it <pod-name> -- /bin/sh
kubectl exec -it <pod-name> -c <container-name> -- /bin/sh

# View events
kubectl get events --sort-by='.lastTimestamp'
kubectl get events -n <namespace> --field-selector type=Warning

# Check resource usage (requires metrics-server)
kubectl top nodes
kubectl top pods

# Debug networking from a temporary Pod
kubectl run debug --image=busybox --rm -it -- /bin/sh
# Then, inside the debug Pod:
nslookup <service-name>
wget -qO- <service-name>:<port>

# Check Service endpoints and the Ingress
kubectl get endpoints <service-name>
kubectl describe ingress <ingress-name>
```
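Solving Common Problems

1. Pod stuck in Pending

```bash
# The Events section states why scheduling failed
kubectl describe pod <pod-name>

# Check how much of each node's capacity is already allocated
kubectl describe nodes | grep -A 5 "Allocated resources"

# Check whether a PersistentVolumeClaim is still unbound
kubectl get pvc
kubectl describe pvc <pvc-name>
```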
2. CrashLoopBackOff

```bash
# View the logs of the previous (crashed) container instance
kubectl logs <pod-name> --previous
```
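3. Service unreachable

```bash
# Verify that the Service has endpoints
kubectl get endpoints <service-name>

# Compare the Pod labels with the Service selector
kubectl get pods --show-labels

# Test connectivity from inside the cluster
kubectl run debug --image=busybox --rm -it -- /bin/sh
# Then, inside the debug Pod:
wget -qO- <service-name>:<port>
```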
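An empty endpoints list almost always means the Service selector does not match the Pod labels. The manifests below are a minimal sketch of the two places that have to agree. The `app: user-service` label and the resource names are illustrative and not taken from the chart, which uses the `app.kubernetes.io/*` labels instead.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  selector:
    app: user-service          # must match the Pod template labels below exactly
  ports:
  - port: 80                   # port that clients connect to
    targetPort: 8080           # port the container actually listens on
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service      # the Service selector is compared against these labels
    spec:
      containers:
      - name: user-service
        image: ecommerce/user-service:1.0.0
        ports:
        - containerPort: 8080
```

If the labels match but requests still fail, the next suspects are a `targetPort` that differs from the port the application listens on, or a Pod that is not passing its readiness probe, since unready Pods are left out of the endpoints.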
4. Ingress not working

```bash
# Check that the Ingress controller is running
kubectl get pods -n ingress-nginx

# Inspect the Ingress rules, backends, and events
kubectl describe ingress <ingress-name>
```
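Production Checklist

Pre-deployment Checklist

```markdown
## Resource configuration

- [ ] All Pods set requests/limits
- [ ] An appropriate QoS class is set (Guaranteed for critical services)
- [ ] LimitRange and ResourceQuota are configured

## High availability

- [ ] replicas >= 2
- [ ] PodDisruptionBudget is set
- [ ] Deployed across AZs (podAntiAffinity)

## Health checks

- [ ] Every service has a livenessProbe
- [ ] Every service has a readinessProbe
- [ ] Slow-starting services set a startupProbe

## Scaling

- [ ] HPA is configured
- [ ] min/max replicas are set to reasonable values

## Security

- [ ] Sensitive data is managed with Secrets
- [ ] Network Policy is configured
- [ ] Containers run as a non-root user
- [ ] SecurityContext is set

## Observability

- [ ] Prometheus/Grafana are configured
- [ ] Log collection is set up
- [ ] Critical services have alerting rules

## Backup and recovery

- [ ] Databases are backed up regularly
- [ ] The restore procedure has been tested
```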
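Two checklist items, Network Policy and LimitRange/ResourceQuota, have no manifest elsewhere in this part. The sketch below shows one possible shape for each. The resource names, the `ecommerce-prod` namespace, and all quota numbers are assumptions to adapt, and the Pod labels assume the `app.kubernetes.io/name` label produced by the chart's selector helper.

```yaml
# Allow traffic to user-service Pods only from api-gateway Pods on port 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: user-service-allow-gateway
  namespace: ecommerce-prod
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: user-service
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: api-gateway
    ports:
    - protocol: TCP
      port: 8080
---
# Default requests/limits for containers that do not declare their own
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: ecommerce-prod
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 100m
      memory: 128Mi
    default:
      cpu: 200m
      memory: 256Mi
---
# Cap the total resources the namespace may claim (illustrative numbers)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota
  namespace: ecommerce-prod
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"
```

Note that a NetworkPolicy is only enforced when the cluster's network plugin supports it; on a plugin without support, the object is accepted but has no effect.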
Final Deployment Template

```yaml
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .name }}
  labels:
    app: {{ .name }}
    version: {{ .version }}
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: {{ .name }}
  template:
    metadata:
      labels:
        app: {{ .name }}
        version: {{ .version }}
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
    spec:
      # Prefer spreading replicas across availability zones
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: {{ .name }}
              topologyKey: topology.kubernetes.io/zone
      # Pod-level security settings
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
      containers:
      - name: {{ .name }}
        image: {{ .image }}:{{ .tag }}
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
        # Allows up to 30 x 10s = 300s for a slow start
        startupProbe:
          httpGet:
            path: /healthz
            port: 8080
          failureThreshold: 30
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 10
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          periodSeconds: 5
          failureThreshold: 3
        # Container-level security settings
        securityContext:
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        envFrom:
        - configMapRef:
            name: {{ .name }}-config
        - secretRef:
            name: {{ .name }}-secrets
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: {{ .name }}
spec:
  selector:
    app: {{ .name }}
  ports:
  - port: 80
    targetPort: 8080
---
# HorizontalPodAutoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ .name }}-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .name }}
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
---
# PodDisruptionBudget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ .name }}-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: {{ .name }}
```
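Key Takeaways

Helm
Simplified deployment: templating plus version control
Multi-environment support: values-dev.yaml / values-prod.yaml
Dependency management: Chart dependencies
Multi-environment configuration
Environment isolation: a separate Namespace per environment
Differentiated resources: small for dev, large for prod
Feature toggles: enable features conditionally
Troubleshooting
Diagnostic flow: status → events → logs → debugging
Common commands: describe, logs, exec, events
Network debugging: use a debug Pod
Production readiness
Resource management: requests/limits
High availability: replicas, anti-affinity, PDB
Health checks: the three probe types
Security: SecurityContext, Network Policy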
Series Summary

Congratulations on completing the entire Kubernetes tutorial series! Let's review the learning path:
graph LR
P1[Introductory concepts] --> P2[Workloads]
P2 --> P3[Networking]
P3 --> P4[Configuration and storage]
P4 --> P5[Resources and scaling]
P5 --> P6[Hands-on deployment]
style P1 fill:#4ecdc4
style P2 fill:#95e1d3
style P3 fill:#f38181
style P4 fill:#fce38a
style P5 fill:#eaffd0
style P6 fill:#aa96da
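Suggested Next Steps
Keep practicing: apply what you have learned in your own projects
Go deeper in specific areas:
StatefulSet for stateful applications
Operator Pattern
Service Mesh (Istio, Linkerd)
Get certified:
CKA (Certified Kubernetes Administrator)
CKAD (Certified Kubernetes Application Developer)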
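Series Navigation
Part 1: Getting Started - Meeting K8s and its core architecture
Part 2: Basics - Pod, Deployment, and Service
Part 3: Networking - Service discovery and traffic management
Part 4: Configuration and Storage - ConfigMap, Secret, and Volume
Part 5: Advanced - Resource management and autoscaling
Part 6: Hands-on - Deploying a complete microservice application (this part)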
References