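Kong - Automation: Deep Dive
Published: 2026-04-30 06:01
Kong Operations Automation: A Hands-On Guide to Batch Management Scripts and Configuration Sync
I. Introduction
Anyone who has run Kong in production knows that once there are more than a few gateway instances, driving the Admin API by hand becomes painful. Changing one service route means logging in to three machines, tracking down inconsistent configuration takes half a day, and scaling up or down by hand is exhausting. This article walks through using scripts to automate day-to-day Kong operations: batch service management, configuration backup and restore, and automated health checks.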
II. Steps
Step 1: Set up the automation environment
First prepare the environment the scripts will run in and confirm that the Kong Admin API is reachable.
# Check connectivity to the Kong Admin API
curl -s http://kong-admin-server:8001 | jq '{tagline, version, node_id}'
# Expected output (values differ per node):
# {
#   "tagline": "Welcome to kong",
#   "version": "3.4.0",
#   "node_id": "a1b2c3d4-xxxx"
# }
Create the directory layout for the automation scripts:
mkdir -p /opt/kong-automation/{scripts,config,backup,logs}
cd /opt/kong-automation
# Create the host inventory file
cat > config/kong-hosts.ini << 'EOF'
[production]
kong_admin_url = http://kong-prod-01:8001
kong_admin_url2 = http://kong-prod-02:8001
[staging]
kong_admin_url = http://kong-staging:8001
EOF
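The scripts in the following steps read the Admin URL from the KONG_ADMIN_URL environment variable and never open this INI file themselves. One way to bridge the two is a small reader like the sketch below. The function name list_admin_urls is hypothetical, and the sketch assumes the simple "key = value" layout shown above.

```shell
#!/bin/bash
# Sketch: print every Admin URL defined under one [section] of the INI file.
# list_admin_urls is a hypothetical helper, not part of the scripts in this guide.
list_admin_urls() {
    local section="$1" ini_file="$2"
    awk -v want="[$section]" '
        /^\[/ { in_section = ($0 == want); next }    # a new [section] header starts
        in_section && /^kong_admin_url[0-9]*[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")                 # strip the "key = " part
            print
        }
    ' "$ini_file"
}

# Possible usage (not executed here):
#   for url in $(list_admin_urls production config/kong-hosts.ini); do
#       KONG_ADMIN_URL="$url" ./scripts/batch-create-services.sh config/services.json
#   done
```

Keeping the node inventory in one file and passing each URL through KONG_ADMIN_URL lets the same script run unchanged against every node.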
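Step 2: Write the batch service management script
This script removes the tedium of creating services one at a time: a single command imports them in bulk.
cat > scripts/batch-create-services.sh << 'EOF'
#!/bin/bash
set -e

KONG_ADMIN="${KONG_ADMIN_URL:-http://localhost:8001}"
SERVICES_FILE="${1:-config/services.json}"

echo "=== Starting batch service creation ==="
echo "Admin URL: ${KONG_ADMIN}"
echo "Services file: ${SERVICES_FILE}"

# Read the JSON array of services, one compact object per line
while IFS= read -r service; do
    SERVICE_NAME=$(echo "$service" | jq -r '.name')
    SERVICE_URL=$(echo "$service" | jq -r '.url')
    echo ">>> Creating service: ${SERVICE_NAME} -> ${SERVICE_URL}"

    # "|| true" keeps set -e from aborting the whole batch on one network error
    RESPONSE=$(curl -s -X POST "${KONG_ADMIN}/services" \
        -H "Content-Type: application/json" \
        -d "$service" || true)

    if echo "$RESPONSE" | jq -e '.id' > /dev/null 2>&1; then
        echo "✓ Service ${SERVICE_NAME} created, ID: $(echo "$RESPONSE" | jq -r '.id')"
    else
        ERROR_NAME=$(echo "$RESPONSE" | jq -r '.name // empty' 2>/dev/null || true)
        ERROR_MSG=$(echo "$RESPONSE" | jq -r '.message // empty' 2>/dev/null || true)
        # Kong reports a duplicate name as a "unique constraint violation"
        if [[ "$ERROR_NAME" == "unique constraint violation" ]]; then
            echo "⚠ Service ${SERVICE_NAME} already exists, skipping"
        else
            echo "✗ Failed: ${ERROR_MSG:-no valid response from the Admin API}"
        fi
    fi
done < <(jq -c '.[]' "$SERVICES_FILE")

echo "=== Batch creation finished ==="
EOF
chmod +x scripts/batch-create-services.sh
Service definition file format (config/services.json). Kong's Service entity has no generic timeout field; timeouts are set with connect_timeout, read_timeout, and write_timeout, all in milliseconds:
cat > config/services.json << 'EOF'
[
  {
    "name": "user-service",
    "url": "http://user-service.internal:8080",
    "retries": 3,
    "connect_timeout": 5000,
    "read_timeout": 5000
  },
  {
    "name": "order-service",
    "url": "http://order-service.internal:8080",
    "retries": 3,
    "connect_timeout": 5000,
    "read_timeout": 5000
  }
]
EOF
Run the script to test it:
./scripts/batch-create-services.sh config/services.json
# Expected output:
# === Starting batch service creation ===
# Admin URL: http://localhost:8001
# Services file: config/services.json
# >>> Creating service: user-service -> http://user-service.internal:8080
# ✓ Service user-service created, ID: a1b2c3d4-e5f6-7890-abcd-ef1234567890
# >>> Creating service: order-service -> http://order-service.internal:8080
# ✓ Service order-service created, ID: b2c3d4e5-f6a7-8901-bcde-f23456789012
# === Batch creation finished ===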
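A malformed services file only shows up as failures halfway through a batch run. A pre-flight check could catch that earlier. The sketch below is one possible form, assuming every entry needs a non-empty name and url as in the file above; validate_services_file is a hypothetical helper, not part of the original script.

```shell
#!/bin/bash
# Sketch: validate a services file before any POST is sent.
# Returns 0 only if the file is a non-empty JSON array whose entries all
# carry a non-empty string "name" and "url".
validate_services_file() {
    local file="$1"
    jq -e '
        type == "array" and length > 0 and
        all(.[]; type == "object"
                 and (.name | type == "string" and length > 0)
                 and (.url  | type == "string" and length > 0))
    ' "$file" > /dev/null 2>&1
}

# Possible usage (not executed here):
#   validate_services_file config/services.json || { echo "invalid services file" >&2; exit 1; }
```

Because jq -e maps a false result and a parse error alike to a non-zero exit code, the caller does not need to distinguish invalid JSON from missing fields.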
Step 3: Write the batch route binding script
With the services in place, the next step is to bind routes to them in bulk.
cat > scripts/batch-create-routes.sh << 'EOF'
#!/bin/bash
set -e

KONG_ADMIN="${KONG_ADMIN_URL:-http://localhost:8001}"

# Look up a service ID by name (prints "null" if the service does not exist)
get_service_id() {
    local svc_name="$1"
    curl -s "${KONG_ADMIN}/services/${svc_name}" | jq -r '.id'
}

# Create one route; methods and paths are pre-quoted, comma-separated JSON strings.
# Prints the new route ID, or Kong's error message if the request was rejected.
create_route() {
    local service_id="$1"
    local route_name="$2"
    local methods="$3"
    local paths="$4"
    curl -s -X POST "${KONG_ADMIN}/services/${service_id}/routes" \
        -H "Content-Type: application/json" \
        -d "{
            \"name\": \"${route_name}\",
            \"methods\": [${methods}],
            \"paths\": [${paths}],
            \"strip_path\": true
        }" | jq -r '.id // .message'
}

echo "=== Batch route creation ==="
# Example: bind several routes to user-service
SERVICE_ID=$(get_service_id "user-service")
if [[ "$SERVICE_ID" != "null" ]]; then
    ROUTE_ID=$(create_route "$SERVICE_ID" "user-api-v1" '"GET","POST"' '"/api/v1/users"')
    echo "✓ user-api-v1 route created: $ROUTE_ID"
    ROUTE_ID=$(create_route "$SERVICE_ID" "user-api-v2" '"GET","PUT","DELETE"' '"/api/v2/users"')
    echo "✓ user-api-v2 route created: $ROUTE_ID"
else
    echo "✗ Service user-service does not exist"
fi
EOF
chmod +x scripts/batch-create-routes.sh
Result:
./scripts/batch-create-routes.sh
# Expected output:
# === Batch route creation ===
# ✓ user-api-v1 route created: c3d4e5f6-a7b8-9012-cdef-345678901234
# ✓ user-api-v2 route created: d4e5f6a7-b8c9-0123-def0-456789012345
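Assembling the route JSON by string interpolation works for simple values but breaks as soon as a name or path contains a quote or backslash. A possible alternative is to let jq build the payload. In the sketch below, build_route_payload is a hypothetical helper, and it takes plain comma-separated lists instead of the pre-quoted strings used in the script above.

```shell
#!/bin/bash
# Sketch: build the route payload with jq so JSON quoting is handled for us.
# build_route_payload is a hypothetical helper; methods and paths are plain
# comma-separated lists such as "GET,POST".
build_route_payload() {
    local route_name="$1" methods_csv="$2" paths_csv="$3"
    jq -cn \
        --arg name "$route_name" \
        --arg methods "$methods_csv" \
        --arg paths "$paths_csv" \
        '{name: $name,
          methods: ($methods | split(",")),
          paths: ($paths | split(",")),
          strip_path: true}'
}

# Possible usage (not executed here):
#   curl -s -X POST "${KONG_ADMIN}/services/${SERVICE_ID}/routes" \
#       -H "Content-Type: application/json" \
#       -d "$(build_route_payload user-api-v1 'GET,POST' '/api/v1/users')"
```

Passing values through --arg means they are always treated as strings, never as jq code or raw JSON.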
Step 4: Configuration backup and restore scripts
⚠️ Important: always take a backup before changing configuration in production. This script exports services, routes, plugins, consumers, and upstreams to JSON files in one go.
cat > scripts/backup-config.sh << 'EOF'
#!/bin/bash
KONG_ADMIN="${KONG_ADMIN_URL:-http://localhost:8001}"
BACKUP_DIR="/opt/kong-automation/backup"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
mkdir -p "${BACKUP_DIR}"

backup_endpoint() {
    local endpoint="$1"
    local filename="$2"
    # List endpoints are paginated (100 entries by default); size=1000 is the
    # largest page Kong allows, so bigger datasets must also follow "next"
    local full_url="${KONG_ADMIN}${endpoint}?size=1000"
    echo "Backing up: ${endpoint} -> ${filename}"
    # -f makes curl exit non-zero on HTTP errors so the check below can see them
    curl -sf "${full_url}" | jq '.' > "${BACKUP_DIR}/${TIMESTAMP}_${filename}"
    if [[ ${PIPESTATUS[0]} -eq 0 ]]; then
        echo "✓ ${filename} backed up ($(wc -c < "${BACKUP_DIR}/${TIMESTAMP}_${filename}") bytes)"
    else
        echo "✗ ${filename} backup failed"
    fi
}

echo "=== Starting Kong configuration backup ==="
echo "Timestamp: ${TIMESTAMP}"
echo "Backup directory: ${BACKUP_DIR}"
backup_endpoint "/services" "services.json"
backup_endpoint "/routes" "routes.json"
backup_endpoint "/plugins" "plugins.json"
backup_endpoint "/consumers" "consumers.json"
backup_endpoint "/upstreams" "upstreams.json"

# Write a backup manifest
cat > "${BACKUP_DIR}/${TIMESTAMP}_manifest.txt" << MANIFEST
Kong Configuration Backup Manifest
===================================
Timestamp: ${TIMESTAMP}
Admin URL: ${KONG_ADMIN}
Hostname: $(hostname)
Services: $(jq '.data | length' "${BACKUP_DIR}/${TIMESTAMP}_services.json" 2>/dev/null || echo "N/A")
Routes: $(jq '.data | length' "${BACKUP_DIR}/${TIMESTAMP}_routes.json" 2>/dev/null || echo "N/A")
MANIFEST

echo "=== Backup finished ==="
ls -lh "${BACKUP_DIR}/${TIMESTAMP}"*
EOF
chmod +x scripts/backup-config.sh
Run the backup:
./scripts/backup-config.sh
# Expected output:
# === Starting Kong configuration backup ===
# Timestamp: 20240115_143022
# Backup directory: /opt/kong-automation/backup
# Backing up: /services -> services.json
# ✓ services.json backed up (2048 bytes)
# Backing up: /routes -> routes.json
# ✓ routes.json backed up (4096 bytes)
# Backing up: /plugins -> plugins.json
# ✓ plugins.json backed up (1024 bytes)
# Backing up: /consumers -> consumers.json
# ✓ consumers.json backed up (512 bytes)
# Backing up: /upstreams -> upstreams.json
# ✓ upstreams.json backed up (768 bytes)
# === Backup finished ===
# -rw-r--r-- 1 root root 2048 Jan 15 14:30 /opt/kong-automation/backup/20240115_143022_services.json
# -rw-r--r-- 1 root root 4096 Jan 15 14:30 /opt/kong-automation/backup/20240115_143022_routes.json
# ...
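The step title mentions restore, but only the backup half is shown. The sketch below outlines what the restore half for services could look like. It is an assumption-laden illustration, not the author's script: prepare_restore_entries and restore_services are hypothetical names, and the list of fields to drop (id, created_at, updated_at) is a guess at what the target should regenerate.

```shell
#!/bin/bash
# Sketch: restore services from a backup file written by the backup script.
# Both function names and the dropped-field list are assumptions.

# Print one restorable service object per line.
prepare_restore_entries() {
    local backup_file="$1"
    # Unwrap the paginated {"data": [...]} envelope and drop server-generated fields
    jq -c '.data[] | del(.id, .created_at, .updated_at)' "$backup_file"
}

# Replay each entry with PUT-by-name so the restore can be rerun safely.
restore_services() {
    local backup_file="$1" admin_url="${2:-http://localhost:8001}"
    local entry name
    prepare_restore_entries "$backup_file" | while IFS= read -r entry; do
        name=$(printf '%s' "$entry" | jq -r '.name')
        curl -s -X PUT "${admin_url}/services/${name}" \
            -H "Content-Type: application/json" \
            -d "$entry" | jq -r '.id // .message'
    done
}

# Possible usage (not executed here):
#   restore_services /opt/kong-automation/backup/20240115_143022_services.json
```

Dropping the IDs has a cost: routes in the backup refer to their service by ID, so restoring routes with the same approach would first need those references mapped to the newly created services.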
Step 5: Health check and alerting script
Automate monitoring of the Kong node and of upstream target health. Kong exposes target health per upstream at /upstreams/{name}/health; Services themselves have no health endpoint.
cat > scripts/health-check.sh << 'EOF'
#!/bin/bash
KONG_ADMIN="${KONG_ADMIN_URL:-http://localhost:8001}"
LOG_FILE="/opt/kong-automation/logs/health-$(date +%Y%m%d).log"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

check_node_health() {
    # Append the HTTP status code on its own line after the body
    local response=$(curl -s -w "\n%{http_code}" "${KONG_ADMIN}/status")
    local body=$(echo "$response" | head -n -1)
    local code=$(echo "$response" | tail -n 1)
    if [[ "$code" == "200" ]]; then
        local db_status=$(echo "$body" | jq -r '.database.reachable')
        # /status lists one Lua VM entry per worker under memory.workers_lua_vms
        local workers=$(echo "$body" | jq -r '.memory.workers_lua_vms | length')
        local connections=$(echo "$body" | jq -r '.server.connections_accepted')
        if [[ "$db_status" == "true" ]]; then
            log "✓ Kong node healthy [workers:${workers}] [connections:${connections}]"
            return 0
        else
            log "✗ Database unreachable!"
            return 1
        fi
    else
        log "✗ Kong Admin API not responding (HTTP ${code})"
        return 2
    fi
}

check_upstreams_health() {
    local unhealthy=0
    local upstreams=$(curl -s "${KONG_ADMIN}/upstreams" | jq -r '.data[].name')
    for up in $upstreams; do
        # Count targets that Kong currently reports as UNHEALTHY or DNS_ERROR
        local bad=$(curl -s "${KONG_ADMIN}/upstreams/${up}/health" 2>/dev/null \
            | jq -r '[.data[]? | select(.health == "UNHEALTHY" or .health == "DNS_ERROR")] | length')
        if [[ "${bad:-0}" -gt 0 ]]; then
            log "⚠ Upstream ${up} has ${bad} unhealthy target(s)"
            unhealthy=$((unhealthy + 1))
        fi
    done
    if [[ $unhealthy -eq 0 ]]; then
        log "✓ All upstream health checks passed"
    fi
}

# Main flow
log "========== Health check started =========="
check_node_health
check_upstreams_health
log "========== Health check finished =========="
EOF
chmod +x scripts/health-check.sh
Schedule it to run automatically:
# Add to crontab, running every 5 minutes (any existing entry is replaced)
(crontab -l 2>/dev/null | grep -v health-check.sh; echo "*/5 * * * * /opt/kong-automation/scripts/health-check.sh >> /var/log/kong-health.log 2>&1") | crontab -
# Verify the crontab
crontab -l | grep health
# Output: */5 * * * * /opt/kong-automation/scripts/health-check.sh >> /var/log/kong-health.log 2>&1
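The health check script writes to a log, but nothing notifies anyone when a check fails. A minimal alert hook could look like the sketch below. ALERT_WEBHOOK_URL, the payload fields, and both function names are assumptions for illustration; adapt them to whatever your alerting system accepts.

```shell
#!/bin/bash
# Sketch: turn a failed check into an alert.
# ALERT_WEBHOOK_URL and the payload shape are assumptions, not a real API.

# Build the JSON body with jq so quotes in the message are escaped correctly.
build_alert_payload() {
    local node="$1" message="$2"
    jq -cn --arg node "$node" --arg text "$message" \
        '{source: "kong-health-check", node: $node, text: $text}'
}

send_alert() {
    local payload
    payload=$(build_alert_payload "$1" "$2")
    if [ -z "${ALERT_WEBHOOK_URL:-}" ]; then
        # No webhook configured: print the payload so the cron log still captures it
        printf 'ALERT (dry run): %s\n' "$payload"
        return 0
    fi
    curl -s -m 10 -X POST "$ALERT_WEBHOOK_URL" \
        -H "Content-Type: application/json" -d "$payload" > /dev/null
}

# Possible usage inside health-check.sh (not executed here):
#   check_node_health || send_alert "$(hostname)" "Kong node check failed"
```

The dry-run branch keeps the hook safe to deploy before a webhook exists, since a missing URL degrades to a log line instead of an error.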
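Step 6: Multi-node configuration sync script
Production usually runs several Kong nodes, and their configuration has to stay in sync. Nodes that share one database, or that sit behind a hybrid-mode control plane, already see the same configuration; the script below is for nodes with separate datastores, where each Admin API must be updated explicitly.
cat > scripts/sync-config.sh << 'EOF'
#!/bin/bash
set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
NODES=(
    "http://kong-prod-01:8001"
    "http://kong-prod-02:8001"
)

# Load configuration; pass a node URL as the first argument to pick the target
CONFIG_FILE="${SCRIPT_DIR}/../config/services.json"
TARGET_NODE="${1:-${NODES[0]}}"

sync_to_node() {
    local node="$1"
    local node_name=$(echo "$node" | grep -oP '(?<=http://)[^:]+')
    echo ">>> Syncing configuration to ${node_name}"
    # Sync Services; PUT by name creates or updates, so reruns are idempotent
    while IFS= read -r service; do
        svc_name=$(echo "$service" | jq -r '.name')
        curl -s -X PUT "${node}/services/${svc_name}" \
            -H "Content-Type: application/json" \
            -d "$service" | jq -r '.id // .message' | xargs -I{} echo "  Service ${svc_name}: {}"
    done < <(jq -c '.[]' "$CONFIG_FILE")
    echo "✓ ${node_name} sync finished"
}

echo "=== Kong multi-node configuration sync ==="
echo "Target node: ${TARGET_NODE}"
echo ""
sync_to_node "$TARGET_NODE"

# Verify the result: every service in the config file must exist on the target.
# (Comparing service counts against NODES[0] proves nothing when that node is the target.)
echo ""
echo "=== Verifying sync ==="
MISSING=0
while IFS= read -r name; do
    code=$(curl -s -o /dev/null -w "%{http_code}" "${TARGET_NODE}/services/${name}" || true)
    if [[ "$code" != "200" ]]; then
        echo "✗ Missing on target: ${name}"
        MISSING=$((MISSING + 1))
    fi
done < <(jq -r '.[].name' "$CONFIG_FILE")
if [[ "$MISSING" -eq 0 ]]; then
    echo "✓ Sync verified: every service in the config file exists on the target"
else
    echo "✗ Sync incomplete: ${MISSING} service(s) missing on ${TARGET_NODE}"
fi
EOF
chmod +x scripts/sync-config.sh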
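Comparing two nodes by service count can hide drift, because two nodes with the same count may still hold different services. A name-level comparison is one way to tighten this. The sketch below works on saved /services responses so it can run offline; diff_service_names is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: report service names present in one /services listing but not the other.
# Inputs are files holding saved responses; diff_service_names is hypothetical.
diff_service_names() {
    local source_json="$1" target_json="$2"
    local a b
    a=$(mktemp) || return 1
    b=$(mktemp) || return 1
    jq -r '.data[].name' "$source_json" | sort > "$a"
    jq -r '.data[].name' "$target_json" | sort > "$b"
    # comm -3 hides names common to both files:
    # unindented lines exist only in the source, tab-indented lines only in the target
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Possible usage (not executed here):
#   curl -s "http://kong-prod-01:8001/services?size=1000" > /tmp/node1.json
#   curl -s "http://kong-prod-02:8001/services?size=1000" > /tmp/node2.json
#   diff_service_names /tmp/node1.json /tmp/node2.json
```

Empty output means both nodes hold the same set of service names. It does not prove that the service definitions themselves are identical.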
Step 7: The complete automated deployment workflow
Chain the previous scripts together into one end-to-end deployment flow. The workflow expects per-environment files such as config/staging-services.json (same format as config/services.json) and, optionally, config/staging-plugins.json.
cat > scripts/deploy.sh << 'EOF'
#!/bin/bash
set -e

DEPLOY_ENV="${1:-staging}"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/utils.sh"

echo "╔════════════════════════════════════════╗"
echo "║   Kong Automated Deploy Workflow v1.0  ║"
echo "╚════════════════════════════════════════╝"
echo "Deployment environment: ${DEPLOY_ENV}"
echo ""

# 1. Back up before deploying
log_info "Step 1: Backing up current configuration..."
"${SCRIPT_DIR}/backup-config.sh" > /dev/null 2>&1

# 2. Verify connectivity
log_info "Step 2: Verifying Kong Admin connectivity..."
KONG_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "${KONG_ADMIN_URL}/status")
if [[ "$KONG_STATUS" != "200" ]]; then
    log_error "Kong Admin is not reachable (HTTP ${KONG_STATUS})"
    exit 1
fi
log_success "Kong Admin connection OK"

# 3. Deploy services
log_info "Step 3: Deploying service configuration..."
"${SCRIPT_DIR}/batch-create-services.sh" "${SCRIPT_DIR}/../config/${DEPLOY_ENV}-services.json"

# 4. Deploy routes
log_info "Step 4: Deploying route configuration..."
"${SCRIPT_DIR}/batch-create-routes.sh"

# 5. Deploy plugins (skipped when no plugin file exists for this environment)
log_info "Step 5: Deploying global plugins..."
if [[ -f "${SCRIPT_DIR}/../config/${DEPLOY_ENV}-plugins.json" ]]; then
    jq -c '.[]' "${SCRIPT_DIR}/../config/${DEPLOY_ENV}-plugins.json" | while IFS= read -r plugin; do
        curl -s -X POST "${KONG_ADMIN_URL}/plugins" -H "Content-Type: application/json" -d "$plugin" | jq -r '.id // .message'
    done
fi

# 6. Verify the deployment
log_info "Step 6: Verifying deployment result..."
SERVICE_COUNT=$(curl -s "${KONG_ADMIN_URL}/services" | jq '.data | length')
ROUTE_COUNT=$(curl -s "${KONG_ADMIN_URL}/routes" | jq '.data | length')
log_success "Deployment finished!"
echo ""
echo "Deployment summary:"
echo "  - Services: ${SERVICE_COUNT}"
echo "  - Routes: ${ROUTE_COUNT}"
echo ""
log_info "To roll back, run: ${SCRIPT_DIR}/restore-config.sh <backup timestamp>"
EOF
chmod +x scripts/deploy.sh
The rollback hint assumes a restore-config.sh that replays a backup taken in Step 4.
Helper script (utils.sh):
cat > scripts/utils.sh << 'EOF'
#!/bin/bash
# Exported so the scripts launched by deploy.sh talk to the same Admin API
export KONG_ADMIN_URL="${KONG_ADMIN_URL:-http://localhost:8001}"
log_info() { echo "ℹ $1"; }
log_success() { echo "✓ $1"; }
log_error() { echo "✗ $1" >&2; }
log_warn() { echo "⚠ $1"; }
EOF
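Every script in this guide depends on curl and jq, and a missing tool otherwise surfaces as a confusing failure in the middle of a deployment. A dependency check at the top of the workflow is one way to fail fast. The sketch below uses a hypothetical require_cmds helper that could sit in utils.sh next to the log functions.

```shell
#!/bin/bash
# Sketch: fail fast when a required tool is missing.
# require_cmds is a hypothetical helper, not part of the original utils.sh.
require_cmds() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "✗ Missing required command: ${cmd}" >&2
            missing=1
        fi
    done
    # 0 when everything was found, 1 when at least one command is missing
    return "$missing"
}

# Possible usage near the top of deploy.sh (not executed here):
#   require_cmds curl jq || exit 1
```

Checking all commands before returning reports every missing tool in one pass instead of one per run.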
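III. FAQ
Q: The script reports "Kong Admin API not responding". How do I troubleshoot it?
Test connectivity by hand first: curl http://kong-admin:8001/status. If the connection is refused or times out, check whether admin_listen in the Kong configuration restricts which addresses can connect (by default it listens only on 127.0.0.1), and check the firewall rules. A 401/403 response usually comes from an authentication layer or proxy placed in front of the Admin API. Also confirm that the Kong process is alive: ps aux | grep kong. In production, put authentication in front of the Admin API and never leave it exposed on the public internet.
Q: Batch service creation fails with "schema violation (name: cannot start with system reserved prefix 'kong_')". What is going on?
Kong reserves the kong_ prefix for internal use. Your service name collides with it, so pick another name such as user-api-prod or order-service-v2. A suggested naming convention is business name, environment, version, for example payment-service-prod or auth-service-staging.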
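A naming rule is easier to keep when the batch script enforces it before calling the Admin API. The sketch below is one possible check. The exact pattern (lowercase kebab-case with at least two segments) is an assumption chosen to accept the example names above, and is_valid_service_name is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: reject service names that break the naming convention.
# The kebab-case pattern is an assumption for illustration, not a Kong rule.
is_valid_service_name() {
    local name="$1"
    case "$name" in
        kong_*) return 1 ;;    # the prefix this FAQ entry says to avoid
    esac
    # Lowercase letters and digits in hyphen-separated segments, e.g. payment-service-prod
    printf '%s\n' "$name" | grep -Eq '^[a-z][a-z0-9]*(-[a-z0-9]+)+$'
}

# Possible usage inside the batch loop (not executed here):
#   is_valid_service_name "$SERVICE_NAME" || { echo "✗ Bad name: $SERVICE_NAME"; continue; }
```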
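Q: The multi-node sync script finished, but real requests still hit the old routes. Why?
Changes made through the Admin API take effect immediately on the node that received them, but other layers may still be serving cached data: each Kong node keeps an in-memory entity cache, and any load balancer or ingress in front of Kong has its own view of the backends. To purge a node's cache by hand: curl -s -X DELETE http://kong-admin:8001/cache. Also check the load balancer configuration to make sure requests really reach the new node. In one real case the sync succeeded but users still got 404s, and it took a long investigation to discover that the ELB health check had taken the new node out of rotation.
Q: Importing the JSON files exported by the backup script fails. How do I handle it?
Kong's list responses are paginated, with the entities inside a data array, so POSTing the whole file back fails. Extract the data array first with jq '.data' backup.json > import.json and submit the entries one at a time, or write a small parser that flattens the paginated structure. One more pitfall: the exported entities carry their id fields. If the target environment already has configuration with the same IDs, either clear it first (⚠️ dangerous, confirm you have a backup) or strip the IDs before importing so that new ones are generated.
Q: The health-check log files keep growing. What should I do?
# Configure logrotate to rotate them automatically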
cat > /etc/logrotate.d/kong-health << 'EOF'
/opt/kong-automation/logs/health-*.log {
daily
rotate 7
compress
missingok
notifempty
create 0644 root root
}
EOF
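IV. Summary
Key takeaways:
- Automation scripts are standard equipment for Kong operations; with three or more nodes, manual changes will eventually go wrong
- Always back up before any change; with a script, backups cost almost nothing
- Configuration sync is a hard requirement in multi-node environments; using PUT instead of POST makes updates idempotent
- Health checks must run continuously; detecting a problem early matters more than fixing it quickly
- Scripts need log output so that a failure can be traced to the exact step that broke
Further reading:
- Kong Admin API official documentation: https://docs.konghq.com/gateway/latest/admin-api/
- decK: Kong's official configuration sync tool, with declarative configuration and drift detection
- decK on GitHub: https://github.com/kong/deck
- Ansible Kong Collection: for large environments, managing Kong configuration with Ansible playbooks is recommended
- Kong Ingress Controller: recommended on Kubernetes, where KIC gives you configuration as code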