Notes on getting the real client IP on a backend nginx behind an haproxy proxy

When haproxy load balances nginx, every request reaches the backend web servers from the proxy rather than from the client directly, so for the backend web servers to learn a user's real IP, the proxy has to add an X-Forwarded-For style header.
In haproxy this can be configured with
option forwardfor header Client-IP
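On the backend nginx itself, the usual way to consume such a header is the standard ngx_http_realip_module, which rewrites $remote_addr to the value carried in a chosen header. A minimal sketch, assuming nginx was built with --with-http_realip_module and using 10.13.20.1 as a hypothetical address for the haproxy box:

# trust the Client-IP header added by haproxy (10.13.20.1 is a made-up proxy address)
set_real_ip_from 10.13.20.1;
real_ip_header Client-IP;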
However, this setting alone is not enough, and the reason lies in haproxy's proxying mode. By default haproxy runs in "tunnel mode", which the documentation describes as follows:
By default HAProxy operates in a tunnel-like mode with regards to persistent
connections: for each connection it processes the first request and forwards
everything else (including additional requests) to selected server. Once
established, the connection is persisted both on the client and server
sides. Use "option http-server-close" to preserve client persistent connections
while handling every incoming request individually, dispatching them one after
another to servers, in HTTP close mode. Use "option httpclose" to switch both
sides to HTTP close mode. "option forceclose" and "option
http-pretend-keepalive" help working around servers misbehaving in HTTP close
mode.

[...]

It is important to note that by default, HAProxy works in tunnel mode and
only inspects the first request of a connection, meaning that only the first
request will have the header appended, which is certainly not what you want.
In order to fix this, ensure that any of the "httpclose", "forceclose" or
"http-server-close" options is set when using this option.
This passage is quite clear: by default haproxy keeps connections persistent on both the client side and the server side. If users A and B access the proxy at the same time, it may well be that only the first user's header gets sent to the server. The exchange can be modeled as:
[CON] [REQ1] [REQ2] ... [RESP1] [RESP2] [CLO] ...
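By contrast, with "option http-server-close" every request is dispatched to the server individually, so the server side of the exchange, sketched in the same notation, looks roughly like:
[CON] [REQ1] [RESP1] [CLO] [CON] [REQ2] [RESP2] [CLO] ...
while the client-side connection stays open, and the header is appended to every request.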
So if we want the backend web servers to get the user's IP on every request, haproxy has to be configured with either
A.
option forwardfor header Client-IP
option httpclose # client --short connections-- haproxy --short connections-- webserver
or
B.
option forwardfor header Client-IP
option http-server-close # client --persistent-- haproxy --short connections-- webserver
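For reference, a minimal haproxy snippet putting option B together; the listen name and server address below are made up:

listen web_proxy
    bind *:80
    mode http
    balance roundrobin
    option http-server-close           # keep client connections persistent, close server-side ones
    option forwardfor header Client-IP # append the real client IP to every request
    server web1 10.13.20.3:80 check

Option A is identical except that "option httpclose" replaces "option http-server-close".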

But then haproxy can only talk to the backend over short-lived connections, which is not really ideal: we usually do not want the proxy to open a brand-new connection to the server for every new request it handles for a user. So far haproxy does not support keepalive to the backend servers, and it is unclear whether 1.5 will actually implement it.
People often ask for SSL and Keep-Alive support. Both features will complicate the code and render it fragile for several releases. By the way, both features have a negative impact on performance:

Having SSL in the load balancer itself means that it becomes the bottleneck. When the load balancer's CPU is saturated, the overall response times will increase and the only solution will be to multiply the load balancer with another load balancer in front of them. The only scalable solution is to have an SSL/Cache layer between the clients and the load balancer. Anyway for small sites it still makes sense to embed SSL, and it's currently being studied. There has been some work on the CyaSSL library to ease integration with HAProxy, as it appears to be the only one out there to let you manage your memory yourself.

Keep-alive was invented to reduce CPU usage on servers when CPUs were 100 times slower. But what is not said is that persistent connections consume a lot of memory while not being usable by anybody except the client who opened them. Today in 2009, CPUs are very cheap and memory is still limited to a few gigabytes by the architecture or the price. If a site needs keep-alive, there is a real problem. Highly loaded sites often disable keep-alive to support the maximum number of simultaneous clients. The real downside of not having keep-alive is a slightly increased latency to fetch objects. Browsers double the number of concurrent connections on non-keepalive sites to compensate for this. With version 1.4, keep-alive with the client was introduced. It resulted in lower access times to load pages composed of many objects, without the cost of maintaining an idle connection to the server. It is a good trade-off. 1.5 will bring keep-alive to the server, but it will probably make sense only with static servers.

However, I'm planning on implementing both features in future versions, because it appears that there are users who mostly need availability above performance, and for them, it's understandable that having both features will not impact their performance, and will reduce the number of components.
Previously nginx's upstream module only spoke HTTP/1.0 and did not support keepalive to backend servers. The good news is that the upstream module in the current development version of nginx does support backend keepalive; I gave it a try and it works well:

worker_processes 1;

events {
    use epoll;
    worker_connections 1024;
}

http {
    include mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log logs/access.log main;

    sendfile on;
    keepalive_timeout 65;
    gzip on;

    server {
        listen 80;
        server_name localhost;

        location / {
            root html;
            index index.html index.htm;
            proxy_http_version 1.1;         # upstream keepalive requires HTTP/1.1
            proxy_set_header Connection ""; # clear the Connection header towards the upstream
            proxy_pass http://http;
            proxy_set_header Host $host;
            proxy_set_header Client-IP $remote_addr;
            proxy_set_header X-Forwarded-By $server_addr:$server_port;
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }

    upstream http {
        server 10.13.20.3:80;
        keepalive 200 single;  # "single" is specific to the development module tested here
    }
}
During testing you can see that nginx does not close the connection to the backend server after each request, and new incoming requests do not open additional backend connections.

This question came up on the haproxy mailing list last year and Willy eventually replied to it, so haproxy will presumably support backend keepalive too at some point.

http://www.serverphorums.com/read.php?10,301582
