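Kubernetes Series (Part 6): Hands-On - Deploying a Complete Microservices Application

Introduction

Over the previous five parts you learned the core concepts of Kubernetes. This part puts them together to deploy a complete microservices application:

Designing the application architecture
Simplifying deployment with Helm
Managing configuration across environments
Integrating logging and monitoring
Troubleshooting common problems

Let's get hands-on!

Application architecture design

Sample application: an e-commerce platform

We will deploy a simplified e-commerce microservices system: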

graph TD
    USER[User] -->|HTTPS| ING[Ingress]

    ING --> GW[API Gateway]

    GW --> USVC[User Service]
    GW --> PSVC[Product Service]
    GW --> OSVC[Order Service]

    USVC --> REDIS[(Redis)]
    PSVC --> PDB[(PostgreSQL<br/>Products)]
    OSVC --> ODB[(PostgreSQL<br/>Orders)]
    OSVC --> MQ[RabbitMQ]

    MQ --> NSVC[Notification Service]
    NSVC --> EMAIL[Email]

    subgraph "Observability"
        PROM[Prometheus]
        GRAF[Grafana]
        LOKI[Loki]
    end

Directory structure

k8s-ecommerce/
├── helm/
│   └── ecommerce/
│       ├── Chart.yaml
│       ├── values.yaml
│       ├── values-dev.yaml
│       ├── values-prod.yaml
│       └── templates/
│           ├── _helpers.tpl
│           ├── namespace.yaml
│           ├── api-gateway/
│           ├── user-service/
│           ├── product-service/
│           ├── order-service/
│           ├── databases/
│           ├── redis/
│           ├── rabbitmq/
│           └── ingress.yaml
├── manifests/            # raw K8s manifests (fallback)
│   └── ...
└── README.md

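Introduction to Helm

Why Helm?

Managing many YAML files directly with kubectl runs into the following problems:

Problem	Helm solution
Hard to configure multiple environments	Values files
Duplicated configuration	Templating
Hard to manage versions	Chart versioning
Tedious install/upgrade/rollback	helm install/upgrade/rollback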

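Helm core concepts

graph LR
    CHART[Chart<br/>application package] --> RELEASE[Release<br/>installed instance]
    VALUES[Values<br/>configuration parameters] --> RELEASE
    REPO[Repository<br/>chart repository] --> CHART

Chart: a Helm package containing all the K8s resource definitions
Values: the configurable parameters
Release: one installed instance of a Chart
Repository: a place where Charts are stored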
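
Helm templates are written in Go's text/template syntax, so the Chart + Values → Release relationship can be sketched with the standard library alone. The snippet below is a minimal illustration, not Helm's actual rendering engine: `renderContext`, `manifestTmpl`, and `render` are hypothetical names, and only `.Values` and `.Release.Name` are modeled.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderContext mimics the top-level object a Helm template sees.
// Only the two fields used below are modeled; real Helm exposes more.
type renderContext struct {
	Values  map[string]interface{}
	Release struct{ Name string }
}

// manifestTmpl is a hypothetical, heavily reduced template (the "Chart").
const manifestTmpl = `name: {{ .Release.Name }}-api-gateway
replicas: {{ .Values.replicaCount }}
`

// render combines the template with Values under a given Release name.
func render(releaseName string, values map[string]interface{}) (string, error) {
	tmpl, err := template.New("manifest").Parse(manifestTmpl)
	if err != nil {
		return "", err
	}
	ctx := renderContext{Values: values}
	ctx.Release.Name = releaseName

	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, ctx); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// The same template installed twice, with different values per release.
	releases := []struct {
		name     string
		replicas int
	}{{"ecommerce-dev", 1}, {"ecommerce-prod", 3}}

	for _, r := range releases {
		out, err := render(r.name, map[string]interface{}{"replicaCount": r.replicas})
		if err != nil {
			panic(err)
		}
		fmt.Print(out)
	}
}
```

Each call to `render` plays the role of one Release: the template stays fixed while the values and the release name change.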

Basic Helm commands

# Install Helm (macOS)
brew install helm

# Add a repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Search for charts
helm search repo nginx
helm search hub nginx

# Show chart information
helm show chart bitnami/nginx
helm show values bitnami/nginx

# Install a release
helm install my-nginx bitnami/nginx
helm install my-nginx bitnami/nginx -f values.yaml
helm install my-nginx bitnami/nginx --set replicaCount=3

# Upgrade
helm upgrade my-nginx bitnami/nginx -f values.yaml

# Roll back to revision 1
helm rollback my-nginx 1

# List releases
helm list

# Uninstall
helm uninstall my-nginx

Creating a Helm Chart

Initializing the Chart

helm create ecommerce

This generates the standard chart structure:

ecommerce/
├── Chart.yaml          # Chart metadata
├── values.yaml         # default values
├── charts/             # dependent sub-charts
├── templates/          # K8s resource templates
│   ├── _helpers.tpl    # template helpers
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   └── ...
└── .helmignore

Chart.yaml

apiVersion: v2
name: ecommerce
description: E-commerce microservices application
type: application
version: 0.1.0 # Chart version
appVersion: "1.0.0" # application version

dependencies:
  - name: postgresql
    version: 12.x.x
    repository: https://charts.bitnami.com/bitnami
    alias: productdb
    condition: productdb.enabled

  - name: postgresql
    version: 12.x.x
    repository: https://charts.bitnami.com/bitnami
    alias: orderdb
    condition: orderdb.enabled

  - name: redis
    version: 17.x.x
    repository: https://charts.bitnami.com/bitnami
    condition: redis.enabled

  - name: rabbitmq
    version: 11.x.x
    repository: https://charts.bitnami.com/bitnami
    condition: rabbitmq.enabled

values.yaml (defaults)

# Global settings
global:
  imageRegistry: ""
  imagePullSecrets: []
  environment: production

# Namespace
namespace: ecommerce

# API Gateway
apiGateway:
  replicaCount: 2
  image:
    repository: ecommerce/api-gateway
    tag: "1.0.0"
    pullPolicy: IfNotPresent
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "200m"
      memory: "256Mi"
  service:
    type: ClusterIP
    port: 8080

# User Service
userService:
  replicaCount: 2
  image:
    repository: ecommerce/user-service
    tag: "1.0.0"
    pullPolicy: IfNotPresent
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "200m"
      memory: "256Mi"

# Product Service
productService:
  replicaCount: 2
  image:
    repository: ecommerce/product-service
    tag: "1.0.0"
    pullPolicy: IfNotPresent

# Order Service
orderService:
  replicaCount: 2
  image:
    repository: ecommerce/order-service
    tag: "1.0.0"
    pullPolicy: IfNotPresent

# Ingress
ingress:
  enabled: true
  className: nginx
  host: ecommerce.example.com
  tls:
    enabled: false
    secretName: ecommerce-tls

# Databases
productdb:
  enabled: true
  auth:
    database: products
    username: product_user

orderdb:
  enabled: true
  auth:
    database: orders
    username: order_user

# Redis
redis:
  enabled: true
  architecture: standalone
  auth:
    enabled: false

# RabbitMQ
rabbitmq:
  enabled: true
  auth:
    username: admin

# HPA
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilization: 70

Template examples

templates/_helpers.tpl

{{/*
Common labels
*/}}
{{- define "ecommerce.labels" -}}
helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "ecommerce.selectorLabels" -}}
app.kubernetes.io/name: {{ .name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Full name
*/}}
{{- define "ecommerce.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" }}
{{- end }}

templates/api-gateway/deployment.yaml

{{- if .Values.apiGateway }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "ecommerce.fullname" . }}-api-gateway
  namespace: {{ .Values.namespace }}
  labels:
    {{- include "ecommerce.labels" . | nindent 4 }}
    app.kubernetes.io/component: api-gateway
spec:
  replicas: {{ .Values.apiGateway.replicaCount }}
  selector:
    matchLabels:
      {{- include "ecommerce.selectorLabels" (dict "name" "api-gateway" "Release" .Release) | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "ecommerce.selectorLabels" (dict "name" "api-gateway" "Release" .Release) | nindent 8 }}
    spec:
      containers:
      - name: api-gateway
        image: "{{ .Values.apiGateway.image.repository }}:{{ .Values.apiGateway.image.tag }}"
        imagePullPolicy: {{ .Values.apiGateway.image.pullPolicy }}
        ports:
        - containerPort: 8080
        env:
        - name: USER_SERVICE_URL
          value: "http://{{ include "ecommerce.fullname" . }}-user-service:8080"
        - name: PRODUCT_SERVICE_URL
          value: "http://{{ include "ecommerce.fullname" . }}-product-service:8080"
        - name: ORDER_SERVICE_URL
          value: "http://{{ include "ecommerce.fullname" . }}-order-service:8080"
        resources:
          {{- toYaml .Values.apiGateway.resources | nindent 10 }}
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
{{- end }}

templates/ingress.yaml

{{- if .Values.ingress.enabled }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "ecommerce.fullname" . }}-ingress
  namespace: {{ .Values.namespace }}
  labels:
    {{- include "ecommerce.labels" . | nindent 4 }}
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "{{ .Values.ingress.tls.enabled }}"
spec:
  ingressClassName: {{ .Values.ingress.className }}
  {{- if .Values.ingress.tls.enabled }}
  tls:
  - hosts:
    - {{ .Values.ingress.host }}
    secretName: {{ .Values.ingress.tls.secretName }}
  {{- end }}
  rules:
  - host: {{ .Values.ingress.host }}
    http:
      paths:
      - path: /api/users
        pathType: Prefix
        backend:
          service:
            name: {{ include "ecommerce.fullname" . }}-user-service
            port:
              number: 8080
      - path: /api/products
        pathType: Prefix
        backend:
          service:
            name: {{ include "ecommerce.fullname" . }}-product-service
            port:
              number: 8080
      - path: /api/orders
        pathType: Prefix
        backend:
          service:
            name: {{ include "ecommerce.fullname" . }}-order-service
            port:
              number: 8080
      - path: /
        pathType: Prefix
        backend:
          service:
            name: {{ include "ecommerce.fullname" . }}-api-gateway
            port:
              number: 8080
{{- end }}

Multi-environment configuration

values-dev.yaml

global:
  environment: development

namespace: ecommerce-dev

apiGateway:
  replicaCount: 1
  resources:
    requests:
      cpu: "50m"
      memory: "64Mi"
    limits:
      cpu: "100m"
      memory: "128Mi"

userService:
  replicaCount: 1

productService:
  replicaCount: 1

orderService:
  replicaCount: 1

ingress:
  enabled: true
  host: dev.ecommerce.local
  tls:
    enabled: false

autoscaling:
  enabled: false

# Development databases
productdb:
  primary:
    persistence:
      size: 1Gi

orderdb:
  primary:
    persistence:
      size: 1Gi

values-prod.yaml

global:
  environment: production

namespace: ecommerce-prod

apiGateway:
  replicaCount: 3
  resources:
    requests:
      cpu: "200m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"

userService:
  replicaCount: 3

productService:
  replicaCount: 3

orderService:
  replicaCount: 3

ingress:
  enabled: true
  host: ecommerce.example.com
  tls:
    enabled: true
    secretName: ecommerce-prod-tls

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilization: 70

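# Production databases
productdb:
  # The Bitnami chart only creates read replicas in replication mode
  architecture: replication
  primary:
    persistence:
      size: 50Gi
  readReplicas:
    replicaCount: 2

orderdb:
  architecture: replication
  primary:
    persistence:
      size: 100Gi
  readReplicas:
    replicaCount: 2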

redis:
  architecture: replication
  replica:
    replicaCount: 2

rabbitmq:
  replicaCount: 3

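Deploying to different environments

# Fetch the sub-charts declared in Chart.yaml (repeat after changing dependencies)
helm dependency update ./ecommerce

# Development
helm install ecommerce-dev ./ecommerce -f ./ecommerce/values-dev.yaml -n ecommerce-dev --create-namespace

# Production
helm install ecommerce-prod ./ecommerce -f ./ecommerce/values-prod.yaml -n ecommerce-prod --create-namespace

# Upgrade
helm upgrade ecommerce-prod ./ecommerce -f ./ecommerce/values-prod.yaml -n ecommerce-prod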
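
A file passed with `-f` is layered on top of the chart's values.yaml: nested maps are merged key by key, while scalars and lists are replaced outright. That is why values-dev.yaml only needs to list the keys that differ. The sketch below illustrates the layering rule in Go; it is not Helm's own code, `mergeValues` is a hypothetical name, and it leaves out Helm's special handling of `null`, which removes a key.

```go
package main

import "fmt"

// mergeValues overlays override onto base the way layered values files
// behave: nested maps merge key by key, anything else is replaced.
func mergeValues(base, override map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		baseMap, baseIsMap := out[k].(map[string]interface{})
		overMap, overIsMap := v.(map[string]interface{})
		if baseIsMap && overIsMap {
			out[k] = mergeValues(baseMap, overMap)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	// A reduced stand-in for values.yaml.
	defaults := map[string]interface{}{
		"apiGateway": map[string]interface{}{
			"replicaCount": 2,
			"image":        map[string]interface{}{"tag": "1.0.0"},
		},
		"autoscaling": map[string]interface{}{"enabled": true, "maxReplicas": 10},
	}
	// A reduced stand-in for values-dev.yaml.
	dev := map[string]interface{}{
		"apiGateway":  map[string]interface{}{"replicaCount": 1},
		"autoscaling": map[string]interface{}{"enabled": false},
	}

	merged := mergeValues(defaults, dev)
	gw := merged["apiGateway"].(map[string]interface{})
	fmt.Println(gw["replicaCount"], gw["image"], merged["autoscaling"])
}
```

The dev overlay changes `replicaCount` and `autoscaling.enabled`, while the image tag and `maxReplicas` carry over from the defaults.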

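Logging and monitoring integration

Prometheus monitoring

# prometheus-values.yaml
serverFiles:
  prometheus.yml:
    scrape_configs:
      - job_name: "kubernetes-pods"
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          # Only scrape Pods annotated with prometheus.io/scrape: "true"
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          # Use the path from prometheus.io/path when it is set
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          # Use the port from prometheus.io/port when it is set
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__

Add annotations to the Pod: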

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
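
The `prometheus.io/port` annotation only takes effect when the scrape configuration rewrites the target address from it. A widely used relabel rule does this with the regex `([^:]+)(?::\d+)?;(\d+)` and the replacement `$1:$2`. The sketch below reproduces that rewrite in Go to show what the rule does; `rewriteAddress` is a hypothetical helper, and it does not handle IPv6 addresses.

```go
package main

import (
	"fmt"
	"regexp"
)

// portRelabel is the pattern commonly used to combine a Pod address with the
// prometheus.io/port annotation. Prometheus anchors relabel regexes at both
// ends, so the anchors are written out explicitly here.
var portRelabel = regexp.MustCompile(`^([^:]+)(?::\d+)?;(\d+)$`)

// rewriteAddress mimics a "replace" relabel step whose source labels are the
// target address and the port annotation. Prometheus joins the source label
// values with ";" before matching.
func rewriteAddress(address, portAnnotation string) string {
	joined := address + ";" + portAnnotation
	if !portRelabel.MatchString(joined) {
		return address // no match: the label is left untouched
	}
	return portRelabel.ReplaceAllString(joined, "${1}:${2}")
}

func main() {
	fmt.Println(rewriteAddress("10.0.0.5:9090", "8080")) // discovered port replaced
	fmt.Println(rewriteAddress("10.0.0.5", "8080"))      // port appended
	fmt.Println(rewriteAddress("10.0.0.5:9090", ""))     // no annotation: unchanged
}
```

With the annotation set to "8080", both a bare Pod IP and an IP with a different discovered port end up as `10.0.0.5:8080`.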

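Exposing metrics from the service application

Using Go as an example:

package main

import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// httpRequestsTotal counts handled requests by method, endpoint, and status.
var httpRequestsTotal = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "http_requests_total",
        Help: "Total number of HTTP requests",
    },
    []string{"method", "endpoint", "status"},
)

func init() {
    // Register with the default registry that promhttp.Handler() serves.
    prometheus.MustRegister(httpRequestsTotal)
}

func main() {
    // Expose metrics on the port and path named in the Pod annotations.
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":8080", nil))
}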

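Grafana Dashboard

Deploy with Helm:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

# adminPassword=admin is for local testing only; use a strong password elsewhere
helm install grafana grafana/grafana \
  --set adminPassword=admin \
  --set persistence.enabled=true \
  --set persistence.size=5Gi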

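Troubleshooting guide

Diagnostic flow for common problems

graph TD
    A[Problem occurs] --> B{Pod status?}

    B -->|Pending| C[Check scheduling]
    B -->|CrashLoopBackOff| D[Check container logs]
    B -->|ImagePullBackOff| E[Check image name/permissions]
    B -->|Running but not working| F[Check service connectivity]

    C --> C1[Insufficient resources?]
    C --> C2[Node Selector/Affinity?]
    C --> C3[Taints/Tolerations?]

    D --> D1[Application error]
    D --> D2[Configuration error]
    D --> D3[Out of memory: OOMKilled]

    E --> E1[Image does not exist]
    E --> E2[Registry authentication]

    F --> F1[Service selector correct?]
    F --> F2[Port configuration correct?]
    F --> F3[Blocked by a network policy?]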

Common diagnostic commands

# Check Pod status
kubectl get pods -o wide
kubectl describe pod <pod-name>

# View Pod logs
kubectl logs <pod-name>
kubectl logs <pod-name> -c <container-name>  # multi-container Pod
kubectl logs <pod-name> --previous           # logs from the previous, crashed container

# Open a shell in a Pod to debug
kubectl exec -it <pod-name> -- /bin/sh
kubectl exec -it <pod-name> -c <container-name> -- /bin/sh

# View events
kubectl get events --sort-by='.lastTimestamp'
kubectl get events -n <namespace> --field-selector type=Warning

# View resource usage (requires metrics-server)
kubectl top nodes
kubectl top pods

# Network debugging
kubectl run debug --image=busybox --rm -it -- /bin/sh
# Inside the debug Pod
nslookup <service-name>
wget -qO- <service-name>:<port>

# View Endpoints
kubectl get endpoints <service-name>

# View Ingress
kubectl describe ingress <ingress-name>

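Solving common problems

1. Pod stuck in Pending

kubectl describe pod <pod-name>
# Look at the Events section

# Possible causes:
# - Insufficient resources
kubectl describe nodes | grep -A 5 "Allocated resources"

# - PVC not bound
kubectl get pvc
kubectl describe pvc <pvc-name>

2. CrashLoopBackOff

# View the logs of the crashed container
kubectl logs <pod-name> --previous

# Common causes:
# - The application fails to start
# - Configuration error
# - Missing environment variable
# - Cannot connect to the database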
3. Service unreachable

# Check Endpoints
kubectl get endpoints <service-name>
# If the list is empty, check that the Service selector matches the Pod labels

# Check Pod labels
kubectl get pods --show-labels

# Test from inside the cluster
kubectl run debug --image=busybox --rm -it -- /bin/sh
wget -qO- <service-name>:<port>

4. Ingress not working

# Confirm the Ingress Controller is running
kubectl get pods -n ingress-nginx

# View Ingress details
kubectl describe ingress <ingress-name>

# Confirm the DNS/hosts settings are correct
# For local testing, add to /etc/hosts:
# 127.0.0.1 myapp.local

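Production checklist

Pre-deployment checklist

## Resource configuration

- [ ] Every Pod sets requests/limits
- [ ] An appropriate QoS class is set (Guaranteed for critical services)
- [ ] LimitRange and ResourceQuota are configured

## High availability

- [ ] replicas >= 2
- [ ] A PodDisruptionBudget is set
- [ ] Pods are spread across AZs (podAntiAffinity)

## Health checks

- [ ] Every service has a livenessProbe
- [ ] Every service has a readinessProbe
- [ ] Slow-starting services set a startupProbe

## Scaling

- [ ] HPA is configured
- [ ] min/max replicas are set to sensible values

## Security

- [ ] Sensitive data is managed with Secrets
- [ ] Network Policy is configured
- [ ] Containers run as a non-root user
- [ ] SecurityContext is set

## Observability

- [ ] Prometheus/Grafana is configured
- [ ] Log collection is set up
- [ ] Critical services have alerting rules

## Backup and recovery

- [ ] Databases are backed up regularly
- [ ] The restore procedure has been tested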

Final deployment configuration template

# production-deployment-template.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .name }}
  labels:
    app: {{ .name }}
    version: {{ .version }}
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: {{ .name }}
  template:
    metadata:
      labels:
        app: {{ .name }}
        version: {{ .version }}
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: {{ .name }}
              topologyKey: topology.kubernetes.io/zone

      securityContext:
        runAsNonRoot: true
        runAsUser: 1000

      containers:
      - name: {{ .name }}
        image: {{ .image }}:{{ .tag }}
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080

        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"

        startupProbe:
          httpGet:
            path: /healthz
            port: 8080
          failureThreshold: 30
          periodSeconds: 10

        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 10
          failureThreshold: 3

        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          periodSeconds: 5
          failureThreshold: 3

        securityContext:
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL

        envFrom:
        - configMapRef:
            name: {{ .name }}-config
        - secretRef:
            name: {{ .name }}-secrets

---
apiVersion: v1
kind: Service
metadata:
  name: {{ .name }}
spec:
  selector:
    app: {{ .name }}
  ports:
  - port: 80
    targetPort: 8080

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ .name }}-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .name }}
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ .name }}-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: {{ .name }}

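Key takeaways

Helm

Simplified deployment: templating + version control
Multi-environment support: values-dev.yaml / values-prod.yaml
Dependency management: Chart dependencies

Multi-environment configuration

Environment isolation: a separate Namespace per environment
Differentiated resources: small for dev, large for prod
Feature toggles: enable features conditionally

Troubleshooting

Diagnostic flow: status → events → logs → debugging
Common commands: describe, logs, exec, events
Network debugging: use a debug Pod

Production readiness

Resource management: requests/limits
High availability: replicas, anti-affinity, PDB
Health checks: the three probe types
Security: SecurityContext, Network Policy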

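Series summary

Congratulations on completing the whole Kubernetes series! Let's review the learning path:

graph LR
    P1[Getting started] --> P2[Workloads]
    P2 --> P3[Networking]
    P3 --> P4[Configuration and storage]
    P4 --> P5[Resources and scaling]
    P5 --> P6[Hands-on deployment]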

    style P1 fill:#4ecdc4
    style P2 fill:#95e1d3
    style P3 fill:#f38181
    style P4 fill:#fce38a
    style P5 fill:#eaffd0
    style P6 fill:#aa96da

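Suggested next steps

Keep practicing: apply what you have learned in your own projects
Go deeper into specific areas:
- StatefulSet for stateful applications
- Operator Pattern
- Service Mesh (Istio, Linkerd)
Earn a certification:
- CKA (Certified Kubernetes Administrator)
- CKAD (Certified Kubernetes Application Developer)

Series navigation

Part 1: Getting Started - Kubernetes and its core architecture
Part 2: Basics - Pod, Deployment, and Service
Part 3: Networking - Service discovery and traffic management
Part 4: Configuration and Storage - ConfigMap, Secret, and Volume
Part 5: Advanced - Resource management and autoscaling
Part 6: Hands-On - Deploying a complete microservices application (this part)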
