1. NGINX Worker进程概述

NGINX的worker进程是处理客户端请求的核心组件,其数量和配置直接影响系统的并发处理能力和性能表现。合理配置worker进程数量是NGINX性能优化的关键因素之一。本文将详细介绍NGINX worker进程的原理、配置策略以及最佳实践。

1.1 Worker进程核心作用

  1. 请求处理: 处理客户端HTTP请求
  2. 事件循环: 执行事件驱动的I/O操作
  3. 负载均衡: 在多进程间分配请求负载
  4. 资源管理: 管理连接、内存等系统资源
  5. 进程隔离: 提供进程级别的故障隔离
  6. 性能优化: 通过多进程提升并发处理能力

1.2 Worker进程架构特点

  • 主进程: 负责配置管理、信号处理和worker进程管理
  • Worker进程: 负责实际的请求处理工作
  • 事件模型: 基于epoll/kqueue的异步非阻塞I/O
  • 进程通信: 通过共享内存和信号进行进程间通信
  • 负载均衡: 操作系统级别的负载均衡

1.3 Worker进程数量影响因素

  • CPU核心数: 主要决定因素
  • 工作负载类型: CPU密集型vs I/O密集型
  • 内存限制: 每个进程的内存占用
  • 连接数: 预期的并发连接数
  • 系统架构: 单机vs集群部署

2. Worker进程配置详解

2.1 基础配置参数

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
# NGINX主配置文件 - Worker进程配置
# /etc/nginx/nginx.conf

# 运行用户
user nginx;

# Worker进程数量配置
# 1. auto - 自动设置为CPU核心数(推荐)
# 2. 数字 - 手动指定进程数量
# 3. 1 - 单进程模式(调试用)
worker_processes auto;

# 每个worker进程的最大文件描述符数
# 影响:每个进程能处理的最大连接数
# 建议:设置为系统ulimit -n的值
worker_rlimit_nofile 65535;

# Worker进程CPU绑定(可选)
# 作用:将worker进程绑定到特定CPU核心
# 格式:worker_cpu_affinity 0001 0010 0100 1000;
# 说明:每个位代表一个CPU核心,1表示绑定
worker_cpu_affinity auto;

# 错误日志配置
# 级别:debug, info, notice, warn, error, crit, alert, emerg
error_log /var/log/nginx/error.log warn;

# 进程ID文件
pid /var/run/nginx.pid;

# 事件模块配置
events {
# 事件模型选择
# epoll - Linux系统推荐(高效)
# kqueue - FreeBSD/MacOS推荐
# select - 通用但效率较低
use epoll;

# 每个worker进程的最大连接数
# 总连接数 = worker_processes × worker_connections
# 建议:根据系统内存和预期连接数设置
worker_connections 65535;

# 允许一个worker进程同时接受多个新连接
# 作用:提高连接接受效率
multi_accept on;

# 优化accept()系统调用
# on - 使用互斥锁(低并发时)
# off - 不使用互斥锁(高并发时推荐)
accept_mutex off;
}

# HTTP模块配置
http {
# 包含MIME类型定义
include /etc/nginx/mime.types;
default_type application/octet-stream;

# 日志格式定义
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'rt=$request_time uct="$upstream_connect_time" '
'uht="$upstream_header_time" urt="$upstream_response_time"';

# 访问日志
access_log /var/log/nginx/access.log main;

# 性能优化配置
sendfile on; # 开启高效文件传输
tcp_nopush on; # 优化sendfile()系统调用
tcp_nodelay on; # 优化tcp_nodelay

# 连接保持配置
keepalive_timeout 65; # 保持连接超时时间
keepalive_requests 1000; # 保持连接的最大请求数

# 客户端请求配置
client_max_body_size 100m; # 客户端请求体大小限制
client_header_buffer_size 4k; # 客户端请求头缓冲区大小
large_client_header_buffers 8 16k; # 大请求头缓冲区大小
client_body_buffer_size 128k; # 客户端请求体缓冲区大小

# 临时文件路径配置
client_body_temp_path /var/cache/nginx/client_temp;
proxy_temp_path /var/cache/nginx/proxy_temp;
fastcgi_temp_path /var/cache/nginx/fastcgi_temp;

# 压缩配置
gzip on; # 开启gzip压缩
gzip_vary on; # 添加Vary头
gzip_min_length 1024; # 最小压缩文件大小
gzip_comp_level 6; # 压缩级别(1-9)
gzip_types # 压缩文件类型
text/plain
text/css
text/xml
text/javascript
application/json
application/javascript
application/xml+rss
application/atom+xml
image/svg+xml;

# 文件缓存配置
open_file_cache max=10000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;

# 限制配置
limit_conn_zone $binary_remote_addr zone=conn_limit_per_ip:10m;
limit_conn conn_limit_per_ip 20;

limit_req_zone $binary_remote_addr zone=req_limit_per_ip:10m rate=10r/s;
limit_req zone=req_limit_per_ip burst=20 nodelay;

# 上游服务器组
upstream backend {
least_conn; # 负载均衡算法:最少连接
server 192.168.1.10:8080 weight=3 max_fails=3 fail_timeout=30s;
server 192.168.1.11:8080 weight=3 max_fails=3 fail_timeout=30s;
server 192.168.1.12:8080 weight=2 max_fails=3 fail_timeout=30s;

# 保持连接配置
keepalive 32;
keepalive_requests 1000;
keepalive_timeout 60s;
}

# 代理缓存配置
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=1g inactive=60m use_temp_path=off;

# 主服务器配置
server {
listen 80;
server_name example.com;

# 安全头配置
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;
add_header X-XSS-Protection "1; mode=block";

# 静态文件处理
location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
expires 1y; # 设置过期时间
add_header Cache-Control "public, immutable";
access_log off; # 关闭访问日志
}

# API接口代理
location /api/ {
# 请求方法限制
limit_except GET POST {
deny all;
}

# 代理配置
proxy_pass http://backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;

# 超时配置
proxy_connect_timeout 5s; # 连接超时
proxy_send_timeout 10s; # 发送超时
proxy_read_timeout 10s; # 读取超时

# 缓冲配置
proxy_buffering on; # 开启代理缓冲
proxy_buffer_size 4k; # 代理缓冲区大小
proxy_buffers 8 4k; # 代理缓冲区数量和大小
proxy_busy_buffers_size 8k; # 忙缓冲区大小

# 缓存配置
proxy_cache my_cache; # 使用代理缓存
proxy_cache_valid 200 302 10m; # 缓存有效时间
proxy_cache_valid 404 1m;
proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
proxy_cache_lock on; # 缓存锁

# 健康检查
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_next_upstream_tries 3;
proxy_next_upstream_timeout 10s;
}

# 健康检查端点
location /health {
access_log off;
return 200 "healthy\n";
add_header Content-Type text/plain;
}

# 状态监控端点
location /nginx_status {
stub_status on; # 开启状态监控
access_log off; # 关闭访问日志
allow 127.0.0.1; # 允许本地访问
allow 192.168.1.0/24; # 允许内网访问
deny all; # 拒绝其他访问
}
}
}

2.2 Worker进程数量计算

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
#!/bin/bash
# NGINX Worker进程数量计算脚本
# /opt/scripts/nginx_worker_calculator.sh

# 日志函数
log_message() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1"
}

# 获取CPU核心数
get_cpu_cores() {
# 方法1:从/proc/cpuinfo获取
local cores=$(grep -c ^processor /proc/cpuinfo)
echo $cores
}

# 获取CPU型号信息
get_cpu_info() {
local model=$(grep "model name" /proc/ccpuinfo | head -1 | cut -d: -f2 | sed 's/^[ \t]*//')
echo "$model"
}

# 获取系统内存信息
get_memory_info() {
local total_mem=$(free -m | grep "Mem:" | awk '{print $2}')
local available_mem=$(free -m | grep "Mem:" | awk '{print $7}')
echo "$total_mem $available_mem"
}

# 获取当前NGINX配置
get_current_config() {
local config_file="/etc/nginx/nginx.conf"
if [ -f "$config_file" ]; then
local worker_processes=$(grep "worker_processes" "$config_file" | grep -v "^#" | awk '{print $2}' | sed 's/;//')
local worker_connections=$(grep "worker_connections" "$config_file" | grep -v "^#" | awk '{print $2}' | sed 's/;//')
echo "$worker_processes $worker_connections"
else
echo "auto 1024"
fi
}

# 计算推荐的Worker进程数
calculate_worker_processes() {
local cpu_cores=$1
local workload_type=$2
local memory_limit=$3

local recommended_processes

case "$workload_type" in
"cpu_intensive")
# CPU密集型:进程数 = CPU核心数
recommended_processes=$cpu_cores
;;
"io_intensive")
# I/O密集型:进程数 = CPU核心数 × 2
recommended_processes=$((cpu_cores * 2))
;;
"mixed")
# 混合型:进程数 = CPU核心数 × 1.5
recommended_processes=$(echo "$cpu_cores * 1.5" | bc | cut -d. -f1)
;;
"high_concurrency")
# 高并发:进程数 = CPU核心数 × 2-4
recommended_processes=$((cpu_cores * 3))
;;
*)
# 默认:进程数 = CPU核心数
recommended_processes=$cpu_cores
;;
esac

# 内存限制检查
if [ $memory_limit -gt 0 ]; then
local max_processes_by_memory=$((memory_limit / 100)) # 假设每个进程100MB
if [ $recommended_processes -gt $max_processes_by_memory ]; then
recommended_processes=$max_processes_by_memory
fi
fi

echo $recommended_processes
}

# 计算推荐的连接数
calculate_worker_connections() {
local worker_processes=$1
local expected_connections=$2
local memory_per_connection=$3

# 默认每个连接占用内存(字节)
if [ -z "$memory_per_connection" ]; then
memory_per_connection=8192 # 8KB
fi

# 计算每个进程的连接数
local connections_per_process=$((expected_connections / worker_processes))

# 内存限制检查
local available_memory=$(free -b | grep "Mem:" | awk '{print $7}')
local max_connections_by_memory=$((available_memory / memory_per_connection / worker_processes))

if [ $connections_per_process -gt $max_connections_by_memory ]; then
connections_per_process=$max_connections_by_memory
fi

# 系统限制检查
local ulimit_n=$(ulimit -n)
local max_connections_by_ulimit=$((ulimit_n / worker_processes))

if [ $connections_per_process -gt $max_connections_by_ulimit ]; then
connections_per_process=$max_connections_by_ulimit
fi

echo $connections_per_process
}

# 生成配置建议
generate_recommendations() {
local cpu_cores=$1
local workload_type=$2
local expected_connections=$3
local memory_limit=$4

log_message "=== NGINX Worker进程配置建议 ==="
log_message "CPU核心数: $cpu_cores"
log_message "工作负载类型: $workload_type"
log_message "预期连接数: $expected_connections"
log_message "内存限制: ${memory_limit}MB"

# 计算推荐的Worker进程数
local recommended_processes=$(calculate_worker_processes $cpu_cores $workload_type $memory_limit)
log_message "推荐Worker进程数: $recommended_processes"

# 计算推荐的连接数
local recommended_connections=$(calculate_worker_connections $recommended_processes $expected_connections)
log_message "推荐每个Worker连接数: $recommended_connections"

# 计算总连接数
local total_connections=$((recommended_processes * recommended_connections))
log_message "总连接数: $total_connections"

# 生成配置文件建议
cat << EOF

=== 配置文件建议 ===

# Worker进程配置
worker_processes $recommended_processes;
worker_rlimit_nofile $((recommended_connections * 2));

# 事件模块配置
events {
use epoll;
worker_connections $recommended_connections;
multi_accept on;
accept_mutex off;
}

=== 系统优化建议 ===

# 1. 文件描述符限制
echo "* soft nofile $((recommended_connections * 2))" >> /etc/security/limits.conf
echo "* hard nofile $((recommended_connections * 2))" >> /etc/security/limits.conf

# 2. 内核参数优化
echo "net.core.somaxconn = $recommended_connections" >> /etc/sysctl.conf
echo "net.core.netdev_max_backlog = 5000" >> /etc/sysctl.conf
echo "net.ipv4.tcp_max_syn_backlog = $recommended_connections" >> /etc/sysctl.conf

# 3. 应用配置
sysctl -p

EOF
}

# 性能测试建议
performance_test_recommendations() {
local worker_processes=$1
local worker_connections=$2

log_message "=== 性能测试建议 ==="

cat << EOF

# 1. 压力测试命令
ab -n 10000 -c 100 http://localhost/

# 2. WRK测试命令
wrk -t$worker_processes -c$worker_connections -d30s http://localhost/

# 3. 监控命令
watch -n 1 'ps aux | grep nginx'

# 4. 状态监控
curl http://localhost/nginx_status

# 5. 系统资源监控
top -p \$(pgrep nginx | tr '\n' ',' | sed 's/,$//')

EOF
}

# 主函数
main() {
case "$1" in
"calculate")
local cpu_cores=$(get_cpu_cores)
local workload_type="${2:-mixed}"
local expected_connections="${3:-1000}"
local memory_limit="${4:-0}"

generate_recommendations $cpu_cores $workload_type $expected_connections $memory_limit
;;
"info")
local cpu_cores=$(get_cpu_cores)
local cpu_info=$(get_cpu_info)
local memory_info=$(get_memory_info)
local current_config=$(get_current_config)

log_message "=== 系统信息 ==="
log_message "CPU核心数: $cpu_cores"
log_message "CPU型号: $cpu_info"
log_message "内存信息: ${memory_info}MB"
log_message "当前配置: $current_config"
;;
"test")
local worker_processes="${2:-auto}"
local worker_connections="${3:-1024}"

performance_test_recommendations $worker_processes $worker_connections
;;
*)
echo "Usage: $0 {calculate|info|test}"
echo " calculate [workload_type] [expected_connections] [memory_limit] - 计算推荐配置"
echo " info - 显示系统信息"
echo " test [worker_processes] [worker_connections] - 性能测试建议"
echo ""
echo "Workload types: cpu_intensive, io_intensive, mixed, high_concurrency"
exit 1
;;
esac
}

# 执行主函数
main "$@"

2.3 CPU绑定配置

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
# NGINX CPU绑定配置示例
# /etc/nginx/nginx.conf

# 运行用户
user nginx;

# Worker进程数量
worker_processes 4;

# CPU绑定配置
# 说明:将worker进程绑定到特定CPU核心
# 格式:每个位代表一个CPU核心,1表示绑定到该核心
# 示例:4个进程绑定到4个CPU核心
worker_cpu_affinity 0001 0010 0100 1000;

# 或者使用auto自动绑定
# worker_cpu_affinity auto;

# 每个worker进程的最大文件描述符数
worker_rlimit_nofile 65535;

# 错误日志
error_log /var/log/nginx/error.log warn;

# 进程ID文件
pid /var/run/nginx.pid;

# 事件模块配置
events {
use epoll;
worker_connections 16384;
multi_accept on;
accept_mutex off;
}

# HTTP模块配置
http {
# 基础配置
include /etc/nginx/mime.types;
default_type application/octet-stream;

# 日志格式
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

# 性能优化
sendfile on;
tcp_nopush on;
tcp_nodelay on;

# 连接保持
keepalive_timeout 65;
keepalive_requests 1000;

# 客户端配置
client_max_body_size 100m;
client_header_buffer_size 4k;
large_client_header_buffers 8 16k;
client_body_buffer_size 128k;

# 压缩配置
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_comp_level 6;
gzip_types
text/plain
text/css
text/xml
text/javascript
application/json
application/javascript
application/xml+rss
application/atom+xml
image/svg+xml;

# 文件缓存
open_file_cache max=10000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;

# 限制配置
limit_conn_zone $binary_remote_addr zone=conn_limit_per_ip:10m;
limit_conn conn_limit_per_ip 20;

limit_req_zone $binary_remote_addr zone=req_limit_per_ip:10m rate=10r/s;
limit_req zone=req_limit_per_ip burst=20 nodelay;

# 上游服务器
upstream backend {
least_conn;
server 192.168.1.10:8080 weight=3 max_fails=3 fail_timeout=30s;
server 192.168.1.11:8080 weight=3 max_fails=3 fail_timeout=30s;
server 192.168.1.12:8080 weight=2 max_fails=3 fail_timeout=30s;

keepalive 32;
keepalive_requests 1000;
keepalive_timeout 60s;
}

# 代理缓存
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=1g inactive=60m use_temp_path=off;

# 主服务器
server {
listen 80;
server_name example.com;

# 安全头
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;
add_header X-XSS-Protection "1; mode=block";

# 静态文件
location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
expires 1y;
add_header Cache-Control "public, immutable";
access_log off;
}

# API代理
location /api/ {
limit_except GET POST {
deny all;
}

proxy_pass http://backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;

proxy_connect_timeout 5s;
proxy_send_timeout 10s;
proxy_read_timeout 10s;

proxy_buffering on;
proxy_buffer_size 4k;
proxy_buffers 8 4k;
proxy_busy_buffers_size 8k;

proxy_cache my_cache;
proxy_cache_valid 200 302 10m;
proxy_cache_valid 404 1m;
proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
proxy_cache_lock on;

proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_next_upstream_tries 3;
proxy_next_upstream_timeout 10s;
}

# 健康检查
location /health {
access_log off;
return 200 "healthy\n";
add_header Content-Type text/plain;
}

# 状态监控
location /nginx_status {
stub_status on;
access_log off;
allow 127.0.0.1;
allow 192.168.1.0/24;
deny all;
}
}
}

3. Worker进程监控和调优

3.1 Worker进程监控脚本

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
#!/bin/bash
# NGINX Worker进程监控脚本
# /opt/scripts/nginx_worker_monitor.sh

# 配置变量
LOG_FILE="/var/log/nginx_worker_monitor.log"
ALERT_EMAIL="admin@example.com"
CPU_THRESHOLD=80
MEMORY_THRESHOLD=80
LOAD_THRESHOLD=5.0

# 日志函数
log_message() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> $LOG_FILE
}

# 获取Worker进程信息
get_worker_info() {
# 获取所有NGINX进程
local nginx_processes=$(ps aux | grep nginx | grep -v grep)

# 统计进程数量
local total_processes=$(echo "$nginx_processes" | wc -l)
local worker_processes=$(echo "$nginx_processes" | grep -c "worker process")
local master_processes=$(echo "$nginx_processes" | grep -c "master process")

echo "$total_processes $worker_processes $master_processes"
}

# 获取Worker进程CPU使用率
get_worker_cpu_usage() {
local cpu_usage=$(ps aux | grep "nginx.*worker process" | grep -v grep | awk '{sum+=$3} END {print sum}')
echo "${cpu_usage:-0}"
}

# 获取Worker进程内存使用率
get_worker_memory_usage() {
local memory_usage=$(ps aux | grep "nginx.*worker process" | grep -v grep | awk '{sum+=$4} END {print sum}')
echo "${memory_usage:-0}"
}

# 获取Worker进程内存占用(MB)
get_worker_memory_mb() {
local memory_mb=$(ps aux | grep "nginx.*worker process" | grep -v grep | awk '{sum+=$6} END {print sum/1024}')
echo "${memory_mb:-0}"
}

# 获取系统负载
get_system_load() {
local load=$(uptime | awk -F'load average:' '{print $2}' | awk '{print $1}' | sed 's/,//')
echo "$load"
}

# 获取系统CPU使用率
get_system_cpu_usage() {
local cpu_usage=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | awk -F'%' '{print $1}')
echo "$cpu_usage"
}

# 获取系统内存使用率
get_system_memory_usage() {
local memory_usage=$(free | grep Mem | awk '{printf "%.2f", $3/$2 * 100.0}')
echo "$memory_usage"
}

# 获取NGINX状态
get_nginx_status() {
local status=$(curl -s http://localhost/nginx_status 2>/dev/null)
if [ $? -eq 0 ]; then
echo "$status"
else
echo "ERROR"
fi
}

# 解析NGINX状态
parse_nginx_status() {
local status="$1"
if [ "$status" = "ERROR" ]; then
echo "0 0 0 0 0 0 0"
return
fi

local active=$(echo "$status" | grep "Active connections" | awk '{print $3}')
local accepts=$(echo "$status" | awk 'NR==3 {print $1}')
local handled=$(echo "$status" | awk 'NR==3 {print $2}')
local requests=$(echo "$status" | awk 'NR==3 {print $3}')
local reading=$(echo "$status" | awk 'NR==4 {print $2}')
local writing=$(echo "$status" | awk 'NR==4 {print $4}')
local waiting=$(echo "$status" | awk 'NR==4 {print $6}')

echo "$active $accepts $handled $requests $reading $writing $waiting"
}

# 检查Worker进程健康状态
check_worker_health() {
local worker_info=$(get_worker_info)
local total_processes=$(echo $worker_info | awk '{print $1}')
local worker_processes=$(echo $worker_info | awk '{print $2}')
local master_processes=$(echo $worker_info | awk '{print $3}')

# 检查进程数量
if [ $total_processes -eq 0 ]; then
echo "CRITICAL: No NGINX processes running"
return 1
fi

if [ $master_processes -eq 0 ]; then
echo "CRITICAL: No NGINX master process running"
return 1
fi

if [ $worker_processes -eq 0 ]; then
echo "CRITICAL: No NGINX worker processes running"
return 1
fi

# 检查进程比例
local expected_workers=$(grep "worker_processes" /etc/nginx/nginx.conf | grep -v "^#" | awk '{print $2}' | sed 's/;//')
if [ "$expected_workers" = "auto" ]; then
local cpu_cores=$(grep -c ^processor /proc/cpuinfo)
expected_workers=$cpu_cores
fi

if [ $worker_processes -lt $expected_workers ]; then
echo "WARNING: Worker processes count ($worker_processes) is less than expected ($expected_workers)"
return 1
fi

echo "HEALTHY: Worker processes running normally"
return 0
}

# 检查资源使用情况
check_resource_usage() {
local worker_cpu=$(get_worker_cpu_usage)
local worker_memory=$(get_worker_memory_usage)
local system_load=$(get_system_load)
local system_cpu=$(get_system_cpu_usage)
local system_memory=$(get_system_memory_usage)

local issues=()

# 检查Worker进程CPU使用率
if (( $(echo "$worker_cpu > $CPU_THRESHOLD" | bc -l) )); then
issues+=("Worker CPU usage is high: ${worker_cpu}%")
fi

# 检查Worker进程内存使用率
if (( $(echo "$worker_memory > $MEMORY_THRESHOLD" | bc -l) )); then
issues+=("Worker memory usage is high: ${worker_memory}%")
fi

# 检查系统负载
if (( $(echo "$system_load > $LOAD_THRESHOLD" | bc -l) )); then
issues+=("System load is high: $system_load")
fi

# 检查系统CPU使用率
if (( $(echo "$system_cpu > $CPU_THRESHOLD" | bc -l) )); then
issues+=("System CPU usage is high: ${system_cpu}%")
fi

# 检查系统内存使用率
if (( $(echo "$system_memory > $MEMORY_THRESHOLD" | bc -l) )); then
issues+=("System memory usage is high: ${system_memory}%")
fi

if [ ${#issues[@]} -gt 0 ]; then
echo "WARNING: ${issues[*]}"
return 1
fi

echo "HEALTHY: Resource usage is normal"
return 0
}

# 发送告警邮件
send_alert() {
local message="$1"
echo "$message" | mail -s "NGINX Worker Alert" $ALERT_EMAIL
log_message "ALERT: $message"
}

# 生成监控报告
generate_report() {
local worker_info=$(get_worker_info)
local total_processes=$(echo $worker_info | awk '{print $1}')
local worker_processes=$(echo $worker_info | awk '{print $2}')
local master_processes=$(echo $worker_info | awk '{print $3}')

local worker_cpu=$(get_worker_cpu_usage)
local worker_memory=$(get_worker_memory_usage)
local worker_memory_mb=$(get_worker_memory_mb)
local system_load=$(get_system_load)
local system_cpu=$(get_system_cpu_usage)
local system_memory=$(get_system_memory_usage)

local nginx_status=$(get_nginx_status)
local status_info=$(parse_nginx_status "$nginx_status")
local active=$(echo $status_info | awk '{print $1}')
local accepts=$(echo $status_info | awk '{print $2}')
local handled=$(echo $status_info | awk '{print $3}')
local requests=$(echo $status_info | awk '{print $4}')
local reading=$(echo $status_info | awk '{print $5}')
local writing=$(echo $status_info | awk '{print $6}')
local waiting=$(echo $status_info | awk '{print $7}')

cat << EOF

=== NGINX Worker进程监控报告 ===
生成时间: $(date)

=== 进程信息 ===
总进程数: $total_processes
Worker进程数: $worker_processes
Master进程数: $master_processes

=== Worker进程资源使用 ===
CPU使用率: ${worker_cpu}%
内存使用率: ${worker_memory}%
内存占用: ${worker_memory_mb}MB

=== 系统资源使用 ===
系统负载: $system_load
系统CPU使用率: ${system_cpu}%
系统内存使用率: ${system_memory}%

=== NGINX状态 ===
活跃连接数: $active
总接受连接数: $accepts
总处理连接数: $handled
总请求数: $requests
读取连接数: $reading
写入连接数: $writing
等待连接数: $waiting

=== 健康检查 ===
进程健康状态: $(check_worker_health)
资源使用状态: $(check_resource_usage)

EOF
}

# 主监控函数
monitor_workers() {
log_message "Starting NGINX worker process monitoring"

# 检查Worker进程健康状态
local health_status=$(check_worker_health)
if [ $? -ne 0 ]; then
send_alert "$health_status"
fi

# 检查资源使用情况
local resource_status=$(check_resource_usage)
if [ $? -ne 0 ]; then
send_alert "$resource_status"
fi

# 生成监控报告
generate_report >> $LOG_FILE

log_message "NGINX worker process monitoring completed"
}

# 主函数
main() {
case "$1" in
"monitor")
monitor_workers
;;
"report")
generate_report
;;
"health")
check_worker_health
;;
"resources")
check_resource_usage
;;
"info")
get_worker_info
;;
*)
echo "Usage: $0 {monitor|report|health|resources|info}"
exit 1
;;
esac
}

# 执行主函数
main "$@"

3.2 Worker进程调优脚本

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
#!/bin/bash
# NGINX Worker进程调优脚本
# /opt/scripts/nginx_worker_tuning.sh

# 配置变量
NGINX_CONFIG="/etc/nginx/nginx.conf"
BACKUP_DIR="/etc/nginx/backup"
LOG_FILE="/var/log/nginx_worker_tuning.log"

# 日志函数
log_message() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a $LOG_FILE
}

# 备份配置文件
backup_config() {
local timestamp=$(date +%Y%m%d_%H%M%S)
local backup_file="$BACKUP_DIR/nginx.conf.backup.$timestamp"

mkdir -p "$BACKUP_DIR"
cp "$NGINX_CONFIG" "$backup_file"

log_message "Configuration backed up to: $backup_file"
}

# 获取当前配置
get_current_config() {
local worker_processes=$(grep "worker_processes" "$NGINX_CONFIG" | grep -v "^#" | awk '{print $2}' | sed 's/;//')
local worker_connections=$(grep "worker_connections" "$NGINX_CONFIG" | grep -v "^#" | awk '{print $2}' | sed 's/;//')
local worker_cpu_affinity=$(grep "worker_cpu_affinity" "$NGINX_CONFIG" | grep -v "^#" | awk '{print $2}' | sed 's/;//')

echo "$worker_processes $worker_connections $worker_cpu_affinity"
}

# 获取系统信息
get_system_info() {
local cpu_cores=$(grep -c ^processor /proc/cpuinfo)
local memory_gb=$(free -g | grep "Mem:" | awk '{print $2}')
local load_avg=$(uptime | awk -F'load average:' '{print $2}' | awk '{print $1}' | sed 's/,//')

echo "$cpu_cores $memory_gb $load_avg"
}

# 分析当前性能
analyze_performance() {
local current_config=$(get_current_config)
local worker_processes=$(echo $current_config | awk '{print $1}')
local worker_connections=$(echo $current_config | awk '{print $2}')

local system_info=$(get_system_info)
local cpu_cores=$(echo $system_info | awk '{print $1}')
local memory_gb=$(echo $system_info | awk '{print $2}')
local load_avg=$(echo $system_info | awk '{print $3}')

log_message "=== 当前配置分析 ==="
log_message "Worker进程数: $worker_processes"
log_message "每个Worker连接数: $worker_connections"
log_message "CPU核心数: $cpu_cores"
log_message "内存大小: ${memory_gb}GB"
log_message "系统负载: $load_avg"

# 分析配置合理性
local total_connections=$((worker_processes * worker_connections))
log_message "总连接数: $total_connections"

# 检查进程数配置
if [ "$worker_processes" = "auto" ]; then
log_message "Worker进程数设置为auto,将使用CPU核心数: $cpu_cores"
elif [ $worker_processes -gt $cpu_cores ]; then
log_message "WARNING: Worker进程数($worker_processes)大于CPU核心数($cpu_cores)"
elif [ $worker_processes -lt $cpu_cores ]; then
log_message "INFO: Worker进程数($worker_processes)小于CPU核心数($cpu_cores),可能未充分利用CPU"
fi

# 检查连接数配置
local max_connections_by_memory=$((memory_gb * 1024 * 1024 / 8192)) # 假设每个连接8KB
if [ $total_connections -gt $max_connections_by_memory ]; then
log_message "WARNING: 总连接数($total_connections)可能超过内存限制"
fi

local ulimit_n=$(ulimit -n)
if [ $total_connections -gt $ulimit_n ]; then
log_message "WARNING: 总连接数($total_connections)超过系统文件描述符限制($ulimit_n)"
fi
}

# 优化Worker进程数
optimize_worker_processes() {
local workload_type="$1"
local system_info=$(get_system_info)
local cpu_cores=$(echo $system_info | awk '{print $1}')

local optimized_processes

case "$workload_type" in
"cpu_intensive")
optimized_processes=$cpu_cores
;;
"io_intensive")
optimized_processes=$((cpu_cores * 2))
;;
"mixed")
optimized_processes=$(echo "$cpu_cores * 1.5" | bc | cut -d. -f1)
;;
"high_concurrency")
optimized_processes=$((cpu_cores * 3))
;;
*)
optimized_processes=$cpu_cores
;;
esac

log_message "优化后的Worker进程数: $optimized_processes"

# 更新配置文件
sed -i "s/worker_processes.*/worker_processes $optimized_processes;/" "$NGINX_CONFIG"

return $optimized_processes
}

# 优化Worker连接数
optimize_worker_connections() {
local worker_processes="$1"
local expected_connections="$2"

local optimized_connections=$((expected_connections / worker_processes))

# 内存限制检查
local system_info=$(get_system_info)
local memory_gb=$(echo $system_info | awk '{print $2}')
local max_connections_by_memory=$((memory_gb * 1024 * 1024 / 8192 / worker_processes))

if [ $optimized_connections -gt $max_connections_by_memory ]; then
optimized_connections=$max_connections_by_memory
fi

# 系统限制检查
local ulimit_n=$(ulimit -n)
local max_connections_by_ulimit=$((ulimit_n / worker_processes))

if [ $optimized_connections -gt $max_connections_by_ulimit ]; then
optimized_connections=$max_connections_by_ulimit
fi

log_message "优化后的每个Worker连接数: $optimized_connections"

# 更新配置文件
sed -i "s/worker_connections.*/worker_connections $optimized_connections;/" "$NGINX_CONFIG"

return $optimized_connections
}

# 优化CPU绑定
optimize_cpu_affinity() {
local worker_processes="$1"
local system_info=$(get_system_info)
local cpu_cores=$(echo $system_info | awk '{print $1}')

if [ $worker_processes -le $cpu_cores ]; then
# 生成CPU绑定掩码
local affinity_mask=""
for ((i=0; i<worker_processes; i++)); do
local mask=$((1 << i))
local binary_mask=$(printf "%0${cpu_cores}d" $(echo "obase=2; $mask" | bc))
affinity_mask="$affinity_mask $binary_mask"
done

log_message "优化后的CPU绑定: $affinity_mask"

# 更新配置文件
sed -i "s/worker_cpu_affinity.*/worker_cpu_affinity$affinity_mask;/" "$NGINX_CONFIG"
else
log_message "Worker进程数大于CPU核心数,跳过CPU绑定优化"
fi
}

# 优化系统参数
optimize_system_params() {
local worker_processes="$1"
local worker_connections="$2"

log_message "优化系统参数"

# 优化文件描述符限制
local max_fd=$((worker_processes * worker_connections * 2))
echo "* soft nofile $max_fd" >> /etc/security/limits.conf
echo "* hard nofile $max_fd" >> /etc/security/limits.conf

# 优化内核参数
cat >> /etc/sysctl.conf << EOF

# NGINX Worker进程优化参数
net.core.somaxconn = $worker_connections
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_max_syn_backlog = $worker_connections
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
EOF

# 应用内核参数
sysctl -p

log_message "系统参数优化完成"
}

# 测试配置
test_config() {
log_message "测试NGINX配置"

nginx -t

if [ $? -eq 0 ]; then
log_message "NGINX配置测试通过"
return 0
else
log_message "ERROR: NGINX配置测试失败"
return 1
fi
}

# 重载配置
reload_config() {
log_message "重载NGINX配置"

nginx -s reload

if [ $? -eq 0 ]; then
log_message "NGINX配置重载成功"
return 0
else
log_message "ERROR: NGINX配置重载失败"
return 1
fi
}

# 主优化函数
optimize_workers() {
local workload_type="$1"
local expected_connections="$2"

log_message "开始NGINX Worker进程优化"

# 备份配置
backup_config

# 分析当前性能
analyze_performance

# 优化Worker进程数
local optimized_processes=$(optimize_worker_processes "$workload_type")

# 优化Worker连接数
local optimized_connections=$(optimize_worker_connections "$optimized_processes" "$expected_connections")

# 优化CPU绑定
optimize_cpu_affinity "$optimized_processes"

# 优化系统参数
optimize_system_params "$optimized_processes" "$optimized_connections"

# 测试配置
if test_config; then
# 重载配置
if reload_config; then
log_message "NGINX Worker进程优化完成"
log_message "优化后配置: $optimized_processes 进程, 每进程 $optimized_connections 连接"
return 0
else
log_message "ERROR: NGINX配置重载失败,请检查配置"
return 1
fi
else
log_message "ERROR: NGINX配置测试失败,请检查配置"
return 1
fi
}

# 主函数
main() {
case "$1" in
"optimize")
local workload_type="${2:-mixed}"
local expected_connections="${3:-1000}"
optimize_workers "$workload_type" "$expected_connections"
;;
"analyze")
analyze_performance
;;
"test")
test_config
;;
"reload")
reload_config
;;
*)
echo "Usage: $0 {optimize|analyze|test|reload}"
echo " optimize [workload_type] [expected_connections] - 优化Worker进程配置"
echo " analyze - 分析当前配置"
echo " test - 测试NGINX配置"
echo " reload - 重载NGINX配置"
echo ""
echo "Workload types: cpu_intensive, io_intensive, mixed, high_concurrency"
exit 1
;;
esac
}

# 执行主函数
main "$@"

4. 总结

4.1 NGINX Worker进程配置总结

  1. 进程数量: 通常设置为CPU核心数,可根据工作负载类型调整
  2. 连接数: 根据预期并发连接数和系统资源限制设置
  3. CPU绑定: 在高性能场景下可考虑CPU绑定
  4. 系统优化: 需要配合系统级参数优化
  5. 监控调优: 持续监控和调优是必要的
  6. 负载均衡: 操作系统级别的负载均衡

4.2 Worker进程数量选择原则

  • CPU密集型: 进程数 = CPU核心数
  • I/O密集型: 进程数 = CPU核心数 × 2
  • 混合型: 进程数 = CPU核心数 × 1.5
  • 高并发: 进程数 = CPU核心数 × 2-4

4.3 最佳实践建议

  • 监控系统: 实时监控Worker进程状态
  • 性能测试: 定期进行性能测试
  • 配置优化: 根据实际负载优化配置
  • 系统调优: 进行系统级性能调优
  • 故障处理: 建立完善的故障处理机制
  • 容量规划: 合理规划系统容量

通过本文的NGINX Worker进程配置指南,您可以掌握Worker进程的原理、配置策略、监控调优以及在企业级应用中的最佳实践,构建高效、稳定、可扩展的NGINX系统!