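AWS Services Series (Part 3): Storage and Database Services - A Practical Guide to S3, RDS, DynamoDB, and ElastiCache

Introduction

Data sits at the core of every application, and the choice of storage and database services directly affects a system's performance, scalability, and cost. AWS offers a broad range of options, from object storage to relational databases to NoSQL and caching services.

As backend engineers, we need to understand each service's characteristics, the scenarios it fits, and its best practices. This article covers:

S3 object storage: storage classes, lifecycle management, security configuration
RDS relational databases: engine selection, Multi-AZ, read replicas
DynamoDB NoSQL: data modeling, index design, capacity planning
ElastiCache caching: Redis vs Memcached, caching strategies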

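Storage and Database Services at a Glance

graph TB
    subgraph "Object Storage"
        A[S3]
    end
    
    subgraph "Block Storage"
        B[EBS]
    end
    
    subgraph "File Storage"
        C[EFS]
        D[FSx]
    end
    
    subgraph "Relational Databases"
        E[RDS]
        F[Aurora]
    end
    
    subgraph "NoSQL Databases"
        G[DynamoDB]
        H[DocumentDB]
    end
    
    subgraph "Caching"
        I[ElastiCache]
    end
    
    subgraph "Data Warehousing"
        J[Redshift]
    end

Service	Type	Typical use
S3	Object storage	Static assets, backups, data lakes
EBS	Block storage	EC2 disks, database storage
RDS/Aurora	Relational database	Transactional systems, structured data
DynamoDB	NoSQL	High-throughput, low-latency access
ElastiCache	In-memory cache	Sessions, hot-data caching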

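S3: Simple Storage Service

S3 Core Concepts

S3 is one of AWS's signature services, offering virtually unlimited object storage capacity.

graph LR
    subgraph "S3 Structure"
        A[Bucket] --> B[Object]
        B --> C[Key / path]
        B --> D[Value / data]
        B --> E[Metadata]
        B --> F[Version ID]
    end

Core characteristics:

Maximum size of a single object: 5 TB
Eleven nines of durability (99.999999999%)
Automatic replication across AZs (except for the One Zone classes)
Versioning and Object Lock support

Storage Classes and Cost Optimization

S3 offers several storage classes; choosing one that matches the access frequency can cut costs substantially: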

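graph TD
    A[S3 Standard] -->|Not accessed for 30 days| B[S3 Standard-IA]
    B -->|Not accessed for 90 days| C[S3 Glacier Instant]
    C -->|Not accessed for 180 days| D[S3 Glacier Flexible]
    D -->|Not accessed for 365 days| E[S3 Glacier Deep Archive]

Storage class	Access frequency	Minimum storage duration	Retrieval time	Relative storage cost
Standard	Frequent	None	Immediate	100%
Standard-IA	Infrequent	30 days	Immediate	~45%
One Zone-IA	Infrequent	30 days	Immediate	~36%
Glacier Instant	Rare	90 days	Milliseconds	~20%
Glacier Flexible	Archive	90 days	Minutes to hours	~10%
Glacier Deep Archive	Long-term archive	180 days	12-48 hours	~3%
Intelligent-Tiering	Unknown or changing	None	Immediate	Optimized automatically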

Lifecycle Policies in Practice

Lifecycle rules automate storage-class transitions and deletion of objects:

{
    "Rules": [
        {
            "ID": "ArchiveOldLogs",
            "Status": "Enabled",
            "Filter": {
                "Prefix": "logs/"
            },
            "Transitions": [
                {
                    "Days": 30,
                    "StorageClass": "STANDARD_IA"
                },
                {
                    "Days": 90,
                    "StorageClass": "GLACIER"
                }
            ],
            "Expiration": {
                "Days": 365
            }
        },
        {
            "ID": "CleanupIncompleteUploads",
            "Status": "Enabled",
            "Filter": {},
            "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": 7
            }
        }
    ]
}

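Practical recommendations:

Log files: move to IA after 30 days, to Glacier after 90 days, delete after 1 year
User uploads: automatically clean up incomplete Multipart Uploads
Backups: set retention periods according to compliance requirements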
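
To make the log-file rule concrete, the sketch below maps an object's age to the storage class the ArchiveOldLogs rule would leave it in. The function name `storageClassAt` is made up for illustration, and the model is simplified: S3 evaluates lifecycle rules asynchronously, so real transitions can lag the thresholds.

```go
package main

import "fmt"

// storageClassAt reports the storage class an object under the "logs/" rule
// is expected to be in once it is ageDays old, or "EXPIRED" once the rule
// deletes it. Thresholds mirror the ArchiveOldLogs rule:
// 30 days -> STANDARD_IA, 90 days -> GLACIER, 365 days -> expiration.
func storageClassAt(ageDays int) string {
    switch {
    case ageDays >= 365:
        return "EXPIRED"
    case ageDays >= 90:
        return "GLACIER"
    case ageDays >= 30:
        return "STANDARD_IA"
    default:
        return "STANDARD"
    }
}

func main() {
    for _, d := range []int{0, 29, 30, 89, 90, 364, 365} {
        fmt.Printf("day %3d: %s\n", d, storageClassAt(d))
    }
}
```

Walking a few sample ages through a helper like this is a cheap way to sanity-check a rule before applying it to a bucket.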

S3 Security Configuration

Bucket Policy vs ACL

Bucket Policy (recommended):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCloudFrontAccess",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Condition": {
                "StringEquals": {
                    "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/EXAMPLE"
                }
            }
        }
    ]
}

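Security best practices:

Block public access: keep Block Public Access enabled (the default)
Encryption: enable SSE-S3 or SSE-KMS
Versioning: guards against accidental deletion; pair it with MFA Delete
Access logs: enable Server Access Logging

Integrating S3 with Go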

package main

import (
    "context"
    "fmt"
    "io"
    "time"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
    // fmt and uuid are used by generateKey in the performance section
    "github.com/google/uuid"
)

type S3Client struct {
    client *s3.Client
    bucket string
}

func NewS3Client(bucket string) (*S3Client, error) {
    cfg, err := config.LoadDefaultConfig(context.TODO())
    if err != nil {
        return nil, err
    }
    
    return &S3Client{
        client: s3.NewFromConfig(cfg),
        bucket: bucket,
    }, nil
}

// Upload stores body under key in the configured bucket.
func (s *S3Client) Upload(ctx context.Context, key string, body io.Reader) error {
    _, err := s.client.PutObject(ctx, &s3.PutObjectInput{
        Bucket: aws.String(s.bucket),
        Key:    aws.String(key),
        Body:   body,
    })
    return err
}

// GetPresignedURL returns a presigned GET URL for temporary access.
func (s *S3Client) GetPresignedURL(ctx context.Context, key string, expiry time.Duration) (string, error) {
    presignClient := s3.NewPresignClient(s.client)
    
    request, err := presignClient.PresignGetObject(ctx, &s3.GetObjectInput{
        Bucket: aws.String(s.bucket),
        Key:    aws.String(key),
    }, s3.WithPresignExpires(expiry))
    
    if err != nil {
        return "", err
    }
    
    return request.URL, nil
}

// Download streams an object to dest, which suits large files.
func (s *S3Client) Download(ctx context.Context, key string, dest io.Writer) error {
    result, err := s.client.GetObject(ctx, &s3.GetObjectInput{
        Bucket: aws.String(s.bucket),
        Key:    aws.String(key),
    })
    if err != nil {
        return err
    }
    defer result.Body.Close()
    
    _, err = io.Copy(dest, result.Body)
    return err
}

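S3 Performance Optimization

Request rates:

Each prefix sustains at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second
Spread load across multiple prefixes (for example, randomized ones) to scale beyond that

// Spread requests across prefixes: random prefix first, then date and hour.
func generateKey(filename string) string {
    now := time.Now()
    randomPrefix := uuid.New().String()[:8]
    return fmt.Sprintf("%s/%s/%s/%s", 
        randomPrefix,
        now.Format("2006/01/02"),
        now.Format("15"),
        filename,
    )
}

Multipart Upload:

Recommended for files larger than 100 MB
Parts upload in parallel, and a failed part can be retried on its own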
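
As a rough illustration of how a file might be split for Multipart Upload, the sketch below picks a part size and part count. It relies on two S3 limits: every part except the last must be at least 5 MiB, and an upload can have at most 10,000 parts. The function name `planParts` and the preferred part size are assumptions for this example; in practice the SDK's upload manager (`feature/s3/manager`) handles the splitting for you.

```go
package main

import (
    "errors"
    "fmt"
)

const (
    minPartSize = 5 << 20 // 5 MiB: S3 minimum for every part except the last
    maxParts    = 10000   // S3 maximum number of parts per upload
)

// planParts returns a part size and part count for a file of fileSize bytes.
// It starts from the preferred part size, raises it to the S3 minimum, and
// grows it further if the file would otherwise need more than maxParts parts.
func planParts(fileSize, preferred int64) (partSize, parts int64, err error) {
    if fileSize <= 0 {
        return 0, 0, errors.New("fileSize must be positive")
    }
    partSize = preferred
    if partSize < minPartSize {
        partSize = minPartSize
    }
    if least := (fileSize + maxParts - 1) / maxParts; partSize < least {
        partSize = least
    }
    parts = (fileSize + partSize - 1) / partSize // ceiling division
    return partSize, parts, nil
}

func main() {
    partSize, parts, err := planParts(100<<20, 8<<20) // 100 MiB file, 8 MiB parts
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    fmt.Printf("part size: %d bytes, parts: %d\n", partSize, parts)
}
```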

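RDS: Relational Database Service

RDS Overview

RDS (Relational Database Service) provides managed relational databases and supports multiple database engines.

graph TB
    subgraph "Engines supported by RDS"
        A[MySQL]
        B[PostgreSQL]
        C[MariaDB]
        D[Oracle]
        E[SQL Server]
        F[Aurora MySQL]
        G[Aurora PostgreSQL]
    end

What RDS manages for you:

Automated backups and snapshots
Software patching and updates
Failure detection and recovery
Monitoring and metrics collection

Choosing a Database Engine

Engine	Strengths	Best for
PostgreSQL	Advanced features, JSONB, extensibility	Complex queries, geospatial data, analytics
MySQL	Mature and stable, broad support	Web applications, CMS
Aurora	Up to 5x the throughput of standard MySQL (per AWS), auto-scaling storage	High-performance needs, large-scale applications
MariaDB	Open-source MySQL alternative	MySQL-compatible workloads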

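Aurora vs standard RDS:

graph LR
    subgraph "Standard RDS"
        A[Primary] -->|Synchronous replication| B[Standby]
        A --> C[EBS Volume]
        B --> D[EBS Volume]
    end
    
    subgraph "Aurora"
        E[Primary] --> H[Shared storage layer]
        F[Replica 1] --> H
        G[Replica 2] --> H
        H -->|6 copies| I[Across 3 AZs]
    end

Aurora advantages:

Storage scales automatically (10 GB to 128 TB)
Six copies of the data across 3 AZs
Up to 15 read replicas (standard RDS historically allowed 5; the current limit depends on the engine)
Faster failover (typically under 30 seconds)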

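Multi-AZ and Read Replicas

graph TB
    subgraph "Multi-AZ (high availability)"
        A[Primary - AZ1] <-->|Synchronous replication| B[Standby - AZ2]
        C[Application] --> A
        A -.->|Automatic failover| B
    end
    
    subgraph "Read Replica (read scaling)"
        D[Primary] -->|Asynchronous replication| E[Replica 1]
        D -->|Asynchronous replication| F[Replica 2]
        G[Write requests] --> D
        H[Read requests] --> E
        H --> F
    end

Feature	Multi-AZ	Read Replica
Purpose	High availability	Read scaling
Replication	Synchronous	Asynchronous
Readable	No (the standby is not accessible)	Yes
Failover	Automatic	Manual promotion
Cross-Region	No	Yes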
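
On the application side, read scaling usually means routing writes to the primary and reads to the replicas. Below is a minimal round-robin sketch of that split; the `Router` type and the endpoint strings are hypothetical and not part of any AWS SDK. Because replication is asynchronous, a read that must see a write made a moment ago should still go to the primary.

```go
package main

import (
    "fmt"
    "sync/atomic"
)

// Router sends writes to the primary endpoint and spreads reads across replicas.
// The endpoint strings are placeholders, not real hosts.
type Router struct {
    primary  string
    replicas []string
    next     atomic.Uint64 // round-robin cursor, safe for concurrent use
}

func NewRouter(primary string, replicas ...string) *Router {
    return &Router{primary: primary, replicas: replicas}
}

// Writer always returns the primary, because replicas are read-only.
func (r *Router) Writer() string { return r.primary }

// Reader returns the next replica in round-robin order and falls back to the
// primary when no replica is configured.
func (r *Router) Reader() string {
    if len(r.replicas) == 0 {
        return r.primary
    }
    i := r.next.Add(1) - 1
    return r.replicas[i%uint64(len(r.replicas))]
}

func main() {
    r := NewRouter("primary.example", "replica-1.example", "replica-2.example")
    fmt.Println("write ->", r.Writer())
    for i := 0; i < 3; i++ {
        fmt.Println("read  ->", r.Reader())
    }
}
```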

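RDS Parameter Tuning

Parameter group configuration:

# PostgreSQL performance tuning parameters
# Sizes are written in readable units here; an RDS parameter group expects integers
# in each parameter's native unit (8 kB pages for shared_buffers, effective_cache_size
# and wal_buffers; kB for work_mem and maintenance_work_mem).
shared_buffers: "{DBInstanceClassMemory/32768}"  # 25% of memory, expressed in 8 kB pages
effective_cache_size: "{DBInstanceClassMemory*3/32768}"  # 75% of memory, in 8 kB pages
work_mem: "256MB"  # allocated per sort/hash operation, so size it against max_connections
maintenance_work_mem: "512MB"
random_page_cost: 1.1  # SSD storage
effective_io_concurrency: 200

# Connection management
max_connections: 200
idle_in_transaction_session_timeout: 60000  # 60 seconds

# WAL settings
wal_buffers: "64MB"
checkpoint_completion_target: 0.9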
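
Before applying values like these, it helps to check how they interact. The sketch below estimates memory use if every connection ran one work_mem-sized operation at once. The 16 GiB instance size is an assumption for illustration, and the helper `worstCaseBytes` is a deliberately simple model: a single query can use several work_mem allocations, so the real worst case can be higher.

```go
package main

import "fmt"

const MiB = 1 << 20

// worstCaseBytes estimates memory use when every connection runs one
// work_mem-sized sort or hash at the same time, on top of shared_buffers.
func worstCaseBytes(sharedBuffers, workMem int64, maxConnections int) int64 {
    return sharedBuffers + workMem*int64(maxConnections)
}

func main() {
    instanceMem := int64(16 * 1024 * MiB) // assumed 16 GiB instance
    shared := instanceMem / 4             // shared_buffers at 25% of memory
    total := worstCaseBytes(shared, 256*MiB, 200)

    fmt.Printf("shared_buffers: %d MiB\n", shared/MiB)
    fmt.Printf("worst case:     %d MiB on a %d MiB instance\n", total/MiB, instanceMem/MiB)
}
```

With work_mem at 256MB and max_connections at 200, the estimate comes to 55,296 MiB, far more than the assumed instance has. That suggests a smaller global work_mem, raised per session for the queries that need it.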

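RDS Proxy: Connection Pool Management

For Lambda or other high-concurrency workloads, RDS Proxy is usually worth adding:

sequenceDiagram
    participant L1 as Lambda 1
    participant L2 as Lambda 2
    participant L3 as Lambda N
    participant P as RDS Proxy
    participant DB as RDS Database
    
    L1->>P: Request connection
    L2->>P: Request connection
    L3->>P: Request connection
    P->>DB: Reuse pooled connections
    Note over P,DB: Connection pooling<br/>reduces database connection count

Why RDS Proxy is needed:

Every Lambda cold start opens a new connection
The database's connection limit is finite (it depends on instance size)
Establishing a connection has a cost (TCP + TLS handshakes)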

// Connecting through RDS Proxy
package main

import (
    "context"
    "database/sql"
    "fmt"
    
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/feature/rds/auth"
    _ "github.com/lib/pq"
)

func connectWithIAMAuth() (*sql.DB, error) {
    cfg, err := config.LoadDefaultConfig(context.TODO())
    if err != nil {
        return nil, err
    }
    
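    // Use IAM authentication to obtain a temporary password.
    // The token is valid for 15 minutes, so long-lived pools need a fresh one for new connections.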
    authToken, err := auth.BuildAuthToken(
        context.TODO(),
        "my-proxy.proxy-xxxxx.ap-northeast-1.rds.amazonaws.com:5432",
        "ap-northeast-1",
        "dbuser",
        cfg.Credentials,
    )
    if err != nil {
        return nil, err
    }
    
    dsn := fmt.Sprintf(
        "host=%s port=5432 user=%s password=%s dbname=%s sslmode=require",
        "my-proxy.proxy-xxxxx.ap-northeast-1.rds.amazonaws.com",
        "dbuser",
        authToken,
        "mydb",
    )
    
    return sql.Open("postgres", dsn)
}

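DynamoDB: NoSQL Database

DynamoDB Core Concepts

DynamoDB is AWS's fully managed NoSQL database, offering single-digit-millisecond latency and automatic scaling.

graph LR
    subgraph "DynamoDB Structure"
        A[Table] --> B[Item]
        B --> C[Partition Key]
        B --> D[Sort Key]
        B --> E[Attributes]
    end

Primary key design:

Type	Composition	When to use
Simple Primary Key	Partition Key	A single attribute uniquely identifies the item
Composite Primary Key	Partition Key + Sort Key	Range queries or one-to-many relationships are needed

Data Modeling in Practice

Example: an e-commerce order system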

graph TD
    subgraph "Single Table Design"
        A[Orders Table]
        A --> B["PK: USER#123<br/>SK: ORDER#2024-001"]
        A --> C["PK: USER#123<br/>SK: ORDER#2024-002"]
        A --> D["PK: ORDER#2024-001<br/>SK: ITEM#1"]
        A --> E["PK: ORDER#2024-001<br/>SK: ITEM#2"]
    end

// DynamoDB Single Table Design example
package main

import (
    "context"
    "fmt"
    "time"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/feature/dynamodb/attributevalue"
    "github.com/aws/aws-sdk-go-v2/service/dynamodb"
    "github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)

// Order is the order entity.
type Order struct {
    PK          string    `dynamodbav:"PK"`
    SK          string    `dynamodbav:"SK"`
    GSI1PK      string    `dynamodbav:"GSI1PK"`      // used for lookups through GSI1
    GSI1SK      string    `dynamodbav:"GSI1SK"`
    OrderID     string    `dynamodbav:"OrderID"`
    UserID      string    `dynamodbav:"UserID"`
    Status      string    `dynamodbav:"Status"`
    TotalAmount float64   `dynamodbav:"TotalAmount"`
    CreatedAt   time.Time `dynamodbav:"CreatedAt"`
}

func NewOrder(userID, orderID string, amount float64) Order {
    now := time.Now()
    return Order{
        PK:          fmt.Sprintf("USER#%s", userID),
        SK:          fmt.Sprintf("ORDER#%s", orderID),
        GSI1PK:      fmt.Sprintf("ORDER#%s", orderID),
        GSI1SK:      fmt.Sprintf("ORDER#%s", orderID),
        OrderID:     orderID,
        UserID:      userID,
        Status:      "PENDING",
        TotalAmount: amount,
        CreatedAt:   now,
    }
}

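// DynamoClient wraps the DynamoDB SDK client.
type DynamoClient struct {
    client *dynamodb.Client
}

// GetUserOrders returns all of a user's orders.
// Only the first page is read; follow LastEvaluatedKey for large result sets.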
func (c *DynamoClient) GetUserOrders(ctx context.Context, userID string) ([]Order, error) {
    result, err := c.client.Query(ctx, &dynamodb.QueryInput{
        TableName:              aws.String("Orders"),
        KeyConditionExpression: aws.String("PK = :pk AND begins_with(SK, :sk)"),
        ExpressionAttributeValues: map[string]types.AttributeValue{
            ":pk": &types.AttributeValueMemberS{Value: fmt.Sprintf("USER#%s", userID)},
            ":sk": &types.AttributeValueMemberS{Value: "ORDER#"},
        },
        ScanIndexForward: aws.Bool(false), // descending SK order (newest first when order IDs sort chronologically)
    })
    if err != nil {
        return nil, err
    }

    var orders []Order
    err = attributevalue.UnmarshalListOfMaps(result.Items, &orders)
    return orders, err
}

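GSI and LSI Design

graph TB
    subgraph "Base table"
        A[PK: UserID] --> B[SK: OrderID]
    end
    
    subgraph "GSI1 - query by status"
        C[GSI1PK: Status] --> D[GSI1SK: CreatedAt]
    end
    
    subgraph "GSI2 - query by date"
        E[GSI2PK: Date] --> F[GSI2SK: UserID]
    end

Index type	When it can be created	Key constraint	Capacity
LSI	Only at table creation	Must share the table's partition key	Shared with the base table
GSI	At any time	Any attributes can serve as keys	Provisioned independently

Choosing a Capacity Mode

On-Demand vs Provisioned:

Mode	Billing	Best for
On-Demand	Per request	Unpredictable traffic, new applications
Provisioned	Per provisioned capacity	Steady traffic, predictable load

graph TD
    A[Choose a capacity mode] --> B{Traffic pattern?}
    B -->|Steady and predictable| C[Provisioned + Auto Scaling]
    B -->|Highly variable / uncertain| D[On-Demand]
    B -->|Pronounced peaks| E[Provisioned + reserved capacity]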
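
For Provisioned mode, capacity is expressed in read and write capacity units. The sketch below works through the usual arithmetic: one read unit covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one write unit covers one write per second of an item up to 1 KB. The function names and the traffic figures are illustrative assumptions.

```go
package main

import (
    "fmt"
    "math"
)

// readCapacityUnits returns the RCUs needed for readsPerSec reads of items
// of itemBytes each. Item size rounds up to the next 4 KB; eventually
// consistent reads cost half as much as strongly consistent ones.
func readCapacityUnits(itemBytes int, readsPerSec float64, stronglyConsistent bool) int {
    perRead := math.Ceil(float64(itemBytes) / 4096)
    if !stronglyConsistent {
        perRead /= 2
    }
    return int(math.Ceil(perRead * readsPerSec))
}

// writeCapacityUnits returns the WCUs needed for writesPerSec writes of
// items of itemBytes each. Item size rounds up to the next 1 KB.
func writeCapacityUnits(itemBytes int, writesPerSec float64) int {
    perWrite := math.Ceil(float64(itemBytes) / 1024)
    return int(math.Ceil(perWrite * writesPerSec))
}

func main() {
    // Assumed workload: 6,000-byte items, 100 reads/s; 2,500-byte items, 50 writes/s.
    fmt.Println("strong reads:  ", readCapacityUnits(6000, 100, true))
    fmt.Println("eventual reads:", readCapacityUnits(6000, 100, false))
    fmt.Println("writes:        ", writeCapacityUnits(2500, 50))
}
```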

DynamoDB Performance Optimization

The hot partition problem:

graph LR
    subgraph "Poor design"
        A[PK: 2024-12-15] --> B[All requests concentrated]
    end
    
    subgraph "Better design"
        C["PK: 2024-12-15#0"] --> D[Requests spread out]
        E["PK: 2024-12-15#1"] --> D
        F["PK: 2024-12-15#2"] --> D
    end

Write sharding technique:

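// Requires: context, errors, fmt, math/rand, sync.
// Item and queryByPK are assumed to be defined elsewhere.

// Spread writes across partitions by appending a random shard suffix.
func getShardedPK(baseKey string, shardCount int) string {
    shard := rand.Intn(shardCount)
    return fmt.Sprintf("%s#%d", baseKey, shard)
}

// Reads must query every shard and merge the results.
func queryAllShards(ctx context.Context, baseKey string, shardCount int) ([]Item, error) {
    results := make([][]Item, shardCount) // one slot per shard, so no locking is needed
    errs := make([]error, shardCount)
    var wg sync.WaitGroup

    for i := 0; i < shardCount; i++ {
        wg.Add(1)
        go func(shard int) {
            defer wg.Done()
            pk := fmt.Sprintf("%s#%d", baseKey, shard)
            results[shard], errs[shard] = queryByPK(ctx, pk)
        }(i)
    }
    wg.Wait()

    // Surface shard failures instead of silently returning partial data.
    if err := errors.Join(errs...); err != nil {
        return nil, err
    }

    var allItems []Item
    for _, items := range results {
        allItems = append(allItems, items...)
    }
    return allItems, nil
}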

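ElastiCache: In-Memory Caching

Redis vs Memcached

graph TB
    subgraph "ElastiCache"
        A[Redis] --> B[Rich data structures]
        A --> C[Persistence]
        A --> D[Replication / clustering]
        A --> E[Pub/Sub]
        
        F[Memcached] --> G[Simple key-value]
        F --> H[Multithreaded]
        F --> I[No persistence]
    end

Feature	Redis	Memcached
Data structures	String, List, Set, Hash, ZSet	String only
Persistence	Supported (RDB, AOF)	Not supported
Replication	Supported	Not supported
Cluster mode	Supported	Supported (simple sharding)
Transactions	Supported	Not supported
Pub/Sub	Supported	Not supported
Lua scripting	Supported	Not supported

Recommendations:

Choose Redis when you need complex data structures, persistence, or Pub/Sub
Choose Memcached for pure caching, maximum raw performance, and simple key-value access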
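Designing a Caching Strategy

graph TD
    subgraph "Cache-Aside Pattern"
        A[Application] -->|1. Query cache| B[ElastiCache]
        B -->|2. Cache miss| A
        A -->|3. Query database| C[RDS]
        C -->|4. Return data| A
        A -->|5. Write to cache| B
    end

Common caching patterns:

Pattern	Description	Best for
Cache-Aside	The application manages the cache	Read-heavy workloads that tolerate brief inconsistency
Read-Through	The cache layer loads data automatically	Simplifying application logic
Write-Through	Every write updates the cache synchronously	Cache must stay consistent with the database
Write-Behind	Writes go to the cache and reach the DB asynchronously	Write-heavy workloads that tolerate delay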
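
To contrast Write-Through with Cache-Aside, here is a minimal sketch that uses in-memory maps in place of Redis and the database. The `Store` interface, `memStore`, and `WriteThrough` types are hypothetical names for this example only. The point is the ordering: the database write happens first, and the cache is updated in the same call, so later reads see the new value.

```go
package main

import (
    "errors"
    "fmt"
)

// Store is a hypothetical minimal key-value interface used for both tiers.
type Store interface {
    Get(key string) (string, bool)
    Set(key, value string) error
}

// memStore is an in-memory stand-in for Redis or a database.
type memStore map[string]string

func (m memStore) Get(k string) (string, bool) { v, ok := m[k]; return v, ok }
func (m memStore) Set(k, v string) error       { m[k] = v; return nil }

// WriteThrough updates the database and the cache in the same write path.
type WriteThrough struct {
    cache, db Store
}

// Put writes to the database first, then to the cache, and fails if either does.
func (w *WriteThrough) Put(key, value string) error {
    if err := w.db.Set(key, value); err != nil {
        return fmt.Errorf("db write: %w", err)
    }
    if err := w.cache.Set(key, value); err != nil {
        return fmt.Errorf("cache write: %w", err)
    }
    return nil
}

// Get serves from the cache and falls back to the database on a miss.
func (w *WriteThrough) Get(key string) (string, error) {
    if v, ok := w.cache.Get(key); ok {
        return v, nil
    }
    v, ok := w.db.Get(key)
    if !ok {
        return "", errors.New("not found")
    }
    _ = w.cache.Set(key, v) // best-effort repopulation
    return v, nil
}

func main() {
    wt := &WriteThrough{cache: memStore{}, db: memStore{}}
    if err := wt.Put("user:1", "alice"); err != nil {
        fmt.Println("error:", err)
        return
    }
    v, _ := wt.Get("user:1")
    fmt.Println(v)
}
```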

Redis Integration in Go

package cache

import (
    "context"
    "encoding/json"
    "time"

    "github.com/redis/go-redis/v9"
)

type CacheClient struct {
    client *redis.Client
}

func NewCacheClient(addr string) *CacheClient {
    client := redis.NewClient(&redis.Options{
        Addr:         addr,
        PoolSize:     100,
        MinIdleConns: 10,
        DialTimeout:  5 * time.Second,
        ReadTimeout:  3 * time.Second,
        WriteTimeout: 3 * time.Second,
    })
    
    return &CacheClient{client: client}
}

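// GetOrSet implements the Cache-Aside pattern.
// Note: a cache hit returns generically decoded JSON (for example map[string]interface{}),
// while a miss returns whatever type loader produced.
func (c *CacheClient) GetOrSet(ctx context.Context, key string, ttl time.Duration, loader func() (interface{}, error)) (interface{}, error) {
    // 1. Try the cache first
    val, err := c.client.Get(ctx, key).Result()
    if err == nil {
        var result interface{}
        if uerr := json.Unmarshal([]byte(val), &result); uerr == nil {
            return result, nil
        }
        // Corrupt entry: fall through and reload from the source
    } else if err != redis.Nil {
        return nil, err
    }
    
    // 2. Cache miss: load from the data source
    data, err := loader()
    if err != nil {
        return nil, err
    }
    
    // 3. Populate the cache; a failure here is not fatal because the data is already loaded
    if jsonData, merr := json.Marshal(data); merr == nil {
        _ = c.client.Set(ctx, key, jsonData, ttl).Err()
    }
    
    return data, nil
}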

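// AcquireLock implements a simple distributed lock with a Lua script.
// The fixed value "locked" means any client can release it; use a unique token per owner if safe release matters.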
func (c *CacheClient) AcquireLock(ctx context.Context, key string, ttl time.Duration) (bool, error) {
    script := redis.NewScript(`
        if redis.call("SET", KEYS[1], ARGV[1], "NX", "PX", ARGV[2]) then
            return 1
        else
            return 0
        end
    `)
    
    result, err := script.Run(ctx, c.client, []string{key}, "locked", ttl.Milliseconds()).Int()
    return result == 1, err
}

// WarmUp preloads the given keys into the cache through one pipeline.
// Keys whose loader or JSON encoding fails are skipped.
func (c *CacheClient) WarmUp(ctx context.Context, keys []string, loader func(key string) (interface{}, error)) error {
    pipe := c.client.Pipeline()
    
    for _, key := range keys {
        data, err := loader(key)
        if err != nil {
            continue
        }
        jsonData, err := json.Marshal(data)
        if err != nil {
            continue
        }
        pipe.Set(ctx, key, jsonData, 1*time.Hour)
    }
    
    _, err := pipe.Exec(ctx)
    return err
}

ElastiCache Best Practices

Cluster mode configuration:

graph TB
    subgraph "Redis Cluster Mode"
        A[Shard 1] --> A1[Primary]
        A --> A2[Replica]
        
        B[Shard 2] --> B1[Primary]
        B --> B2[Replica]
        
        C[Shard 3] --> C1[Primary]
        C --> C2[Replica]
    end
    
    D[Client] --> A1
    D --> B1
    D --> C1

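Key settings:

Cluster mode: enable it for large datasets to get automatic sharding
Multi-AZ: should be enabled in production
Node type: choose by data volume (the cache.r6g family is memory optimized)
Reserved nodes: can save roughly 30-50% on steady workloads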

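Storage and Database Selection Guide

flowchart TD
    A[Choose a data store] --> B{Data type?}
    
    B -->|Files / objects| C[S3]
    B -->|Structured / relational| D{Complex queries needed?}
    B -->|Key-value / document| E{Latency requirement?}
    
    D -->|Yes| F[RDS/Aurora]
    D -->|No| G{Scale?}
    G -->|Large| H[Aurora]
    G -->|Small to medium| F
    
    E -->|Milliseconds| I[DynamoDB]
    E -->|Higher is acceptable| J{Flexible schema needed?}
    J -->|Yes| I
    J -->|No| F
    
    K[Caching needed?] -->|Yes| L[ElastiCache]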

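Chapter Recap

Key Takeaways

S3
- Use storage classes and lifecycle rules to cut costs
- Keep public access blocked by default and manage permissions with Bucket Policies
- Use Multipart Upload for large files
RDS/Aurora
- Aurora performs better but costs more
- Multi-AZ is for high availability; Read Replicas are for read scaling
- Use RDS Proxy to manage connections in Lambda workloads
DynamoDB
- Design Partition Keys carefully to avoid hot partitions
- Single Table Design reduces the number of requests
- On-Demand suits unpredictable traffic
ElastiCache
- Redis is feature-rich; Memcached targets raw performance
- Cache-Aside is the most commonly used caching pattern
- Watch for cache penetration, avalanche, and breakdown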