Whitelisting IPs When Rate-Limiting Requests in Nginx and Varnish

Sometimes we want to exclude certain IP blocks from a web server's request rate limiting. Here is how to do it in Nginx and in Varnish. The Nginx way needs some crappy hacks; Varnish, on the other hand, handles it really elegantly.

The Nginx way:

# in the http block, we define a zone, and use the geo
# module to map IP addresses to a variable
http {
    ...
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
    # geo directive maps $remote_addr to $block variable.
    # limited as default
    geo $block {
        default          limited;
        10.10.10.0/24    blacklist;
        192.168.1.0/24   whitelist;
        include more_geoip_map.conf;
    }
}

# server block
server {
    ...
    location /wherever {
        if ($block = whitelist) { return 598; }
        if ($block = limited)   { return 599; }
        if ($block = blacklist) { return 403; }
        # 598 and 599 are arbitrary unused codes (error_page only accepts
        # 300-599); each one is routed to its own internal location.
        error_page 598 = @whitelist;
        error_page 599 = @limited;
    }

    # @whitelist has no limiting; it just passes
    # requests to the backend.
    location @whitelist {
        proxy_pass http://backend;
        # feel free to log to a separate file.
        #access_log /var/log/nginx/${host}_whitelist.access.log;
    }

    # insert limit_req here.
    location @limited {
        limit_req zone=one burst=1 nodelay;
        proxy_pass http://backend;
        # feel free to log to a separate file.
        #access_log /var/log/nginx/${host}_limited.access.log;
    }
    ...
}
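For comparison, here is a sketch of a variant that avoids the error_page trick. It relies on the documented behaviour that limit_req_zone does not account requests whose key is empty. The variable names $limited and $limit_key are made up for this example, and blacklist handling is left out to keep it short.

```nginx
http {
    # 0 = whitelisted, 1 = rate limited (the values are arbitrary labels)
    geo $limited {
        default          1;
        192.168.1.0/24   0;
    }

    # Whitelisted clients get an empty key, which limit_req_zone ignores;
    # everyone else is keyed by client address.
    map $limited $limit_key {
        0    "";
        1    $binary_remote_addr;
    }

    limit_req_zone $limit_key zone=one:10m rate=1r/s;

    server {
        location /wherever {
            limit_req zone=one burst=1 nodelay;
            proxy_pass http://backend;
        }
    }
}
```

This keeps a single location, at the cost of an extra map lookup per request.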

The Varnish way:

vcl 4.0;

import vsthrottle;

acl blacklist {
    "10.10.10.0"/24;
}

acl whitelist {
    "192.168.1.0"/24;
}

sub vcl_recv {
    # use client.ip as the identity that distinguishes clients.
    set client.identity = client.ip;
    if (client.ip ~ blacklist) {
        return (synth(403, "Forbidden"));
    }
    if ((client.ip !~ whitelist) && vsthrottle.is_denied(client.identity, 25, 10s)) {
        return (synth(429, "Too Many Requests"));
    }
}

As you can see, unlike Nginx, Varnish has a powerful if statement that works just like you'd expect.

Advanced Limiting Request with Nginx (or Openresty)

Nginx has the ngx_http_limit_req_module, which can be used to limit the request processing rate, but most people seem to use only its basic feature: limiting the request rate by remote address, like this:

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
    ...
    server {
        ...
        location /search/ {
            limit_req zone=one burst=5;
        }
    }
}

This is the example configuration taken from Nginx's official documentation. The limit_req_zone directive takes the variable $binary_remote_addr as the key used to limit incoming requests. The state for each key is stored in a shared memory zone named one, defined by the zone parameter, which can use up to 10 megabytes of memory. The rate parameter says that the maximum request rate for each $binary_remote_addr is 1 request per second. In the /search/ location block, the limit_req directive refers to zone one and allows bursts of up to 5 requests.
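To make the roles of burst and the rejection status more concrete, here is the same example again with comments. The limit_req_status line is an addition for illustration and is not part of the official example.

```nginx
http {
    # 10m of shared memory; per the nginx docs, a 1 MB zone holds roughly
    # 16 thousand 64-byte states.
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;

    server {
        location /search/ {
            # Up to 5 excess requests are queued and released at 1r/s;
            # adding "nodelay" would serve them immediately instead.
            limit_req zone=one burst=5;

            # Requests beyond the burst are rejected; answer with 429
            # rather than the default 503.
            limit_req_status 429;
        }
    }
}
```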

Everything seems great; we have configured Nginx against rogue bots and spiders, right?

No! That configuration won't work in real life, and it should never be used in your production environment! Take the following situations as examples:

When users access your website from behind a NAT, they share the same public IP, so Nginx sees only one $binary_remote_addr to limit. Hundreds of users together would be able to access your website only once per second!
When a botnet is used to crawl your website, it uses a different IP address each time. Again, in this situation, limiting by $binary_remote_addr is totally useless.

So what configuration should we use then? We need a different variable as the key, or even several variables combined (since version 1.7.6, the key of limit_req_zone can contain multiple variables). Instead of the remote address, it is better to tell users apart by HTTP request headers such as User-Agent, Referer or Cookie. These headers are easy to access in Nginx: they are exposed as built-in variables like $http_user_agent, $http_referer and $cookie_name.

For example, this is a better way to define a zone:

http {
    limit_req_zone $binary_remote_addr$http_user_agent zone=two:10m rate=90r/m;
}

It combines $binary_remote_addr and $http_user_agent, so different user agents behind the same NATed network can be distinguished. But it is still not perfect: multiple users could be running the same browser in the same version, and thus send the same User-Agent header! Another problem is that the length of the $http_user_agent variable is not fixed (unlike $binary_remote_addr), so long headers use a lot of the zone's memory and may even exhaust it.

To solve the first problem, we can add more variables to the key. Cookies are a great choice, since each user sends their own unique cookie, such as $cookie_userid. That still leaves us with the second problem. The answer is to use hashes of the variables instead.

There is a third-party module called set-misc-nginx-module that can generate hashes from variables. If you are using OpenResty, this module is already included. The configuration then looks like this:

http {
    ...
    limit_req_zone $binary_remote_addr$cookie_hash$ua_hash zone=three:10m rate=90r/m;
    ...
    server {
        ...
        set_md5 $cookie_hash $cookie_userid;
        set_md5 $ua_hash $http_user_agent;
        ...
    }
}

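It is fine to reference $cookie_hash and $ua_hash in the http block before they are set in the server block: the key is only evaluated at request time, in the preaccess phase, after the set_md5 directives have run in the rewrite phase. This configuration works well now.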

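Let's continue with the distributed botnet problem. We need to take $binary_remote_addr out of the key, and since those bots usually don't send a Referer header (if yours do, you will have to find out for yourself what is unique about them), we can take advantage of that: bots that send the same User-Agent with no cookie and no Referer all produce the same key and therefore share one limit, no matter how many addresses they come from. This configuration should take care of it: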

http {
    ...
    limit_req_zone $cookie_hash$referer_hash$ua_hash zone=three:10m rate=90r/m;
    ...
    server {
        ...
        set_md5 $cookie_hash $cookie_userid;
        set_md5 $referer_hash $http_referer;
        set_md5 $ua_hash $http_user_agent;
        ...
    }
}
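The zone definitions above do not show where the limit is actually enforced. Here is a minimal sketch, assuming a location / that proxies to the backend upstream used earlier; the burst value of 10 is an arbitrary choice for illustration.

```nginx
http {
    limit_req_zone $cookie_hash$referer_hash$ua_hash zone=three:10m rate=90r/m;

    server {
        # set_md5 comes from set-misc-nginx-module (bundled with OpenResty).
        set_md5 $cookie_hash  $cookie_userid;
        set_md5 $referer_hash $http_referer;
        set_md5 $ua_hash      $http_user_agent;

        location / {
            # burst=10 is an assumed value, not taken from the original post.
            limit_req zone=three burst=10;
            proxy_pass http://backend;
        }
    }
}
```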

Sending JSON-Formatted Logs from Nginx Directly to Logstash

With nginx's log_format directive it is easy to generate (almost) JSON-formatted messages right in nginx and send them to logstash:

log_format  logstash '{"@timestamp":"$time_iso8601",'
                     '"@version":"1",'
                     '"host":"$server_addr",'
                     '"client":"$remote_addr",'
                     '"size":$body_bytes_sent,'
                     '"domain":"$host",'
                     '"method":"$request_method",'
                     '"url":"$uri",'
                     '"status":"$status",' # status may start with 0 (e.g. "009"), so it cannot be stored as a number; quote it and store it as a string
                     '"referer":"$http_referer",'
                     '"user_agent":"$http_user_agent",'
                     '"real_ip":"$http_x_real_ip",'
                     '"forwarded_for":"$http_x_forwarded_for",'
                     '"responsetime":$request_time,'
                     '"upstream":"$upstream_addr",'
                     '"upstream_response_time":"$upstream_response_time",'
                     '"cache_fetch":"$srcache_fetch_status",' # for measuring the srcache hit rate
                     '"cache_store":"$srcache_store_status"}';

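There is still a problem, though. Because of how non-ASCII characters (Chinese, for instance) are encoded, the url or the Referer header can sometimes contain "\x" escape sequences, which make JSON parsing fail. In that case, add the following to the logstash filter configuration: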

filter {
  if [type] == "nginx-access-syslog" {
    mutate {
      gsub => [
        # replace '\x' with '\\x', or json parser will fail
        "message", "\\x", "\\\\x"
      ]
    }
  }
}

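This replaces "\x" with "\\x". (The method is not general: it only handles "\x", but that is usually the only sequence that shows up.)
After that we can draw charts in kibana. The fields collected above are fairly rich and make for quite nice dashboards.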
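On nginx 1.11.8 and later, log_format accepts an escape=json parameter that escapes variable values as JSON strings, which may make the gsub workaround unnecessary. Below is a trimmed-down sketch. The format name logstash_json and the syslog endpoint logstash.example.com:5140 are placeholders, not values from the setup described above.

```nginx
# escape=json requires nginx 1.11.8 or later.
log_format logstash_json escape=json
    '{"@timestamp":"$time_iso8601",'
    '"client":"$remote_addr",'
    '"domain":"$host",'
    '"url":"$uri",'
    '"status":"$status",'
    '"referer":"$http_referer",'
    '"user_agent":"$http_user_agent"}';

# Ship each entry over syslog (supported since nginx 1.7.1).
# The host and port are hypothetical.
access_log syslog:server=logstash.example.com:5140 logstash_json;
```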

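Use try_files More and if … rewrite Less

A while ago I ran into a problem while configuring Nginx: a resource exposed internally by an Nginx module could not be accessed. Specifically, a module exports a resource (for example /channels-stats) that provides statistics, as follows: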

location /channels-stats {
    # activate channels statistics mode for this location
    push_stream_channels_statistics;

    # query string based channel id
    push_stream_channels_path $arg_id;
}

The configuration file also contained the following, which redirects requests for files that do not exist to index.php:

if (!-e $request_filename) {
    rewrite ^/(.*)$ /index.php/?s=$1 last;
    break;
}

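With this in place, /channels-stats cannot be reached. The reason is probably that a module-exported resource like /channels-stats does not correspond to any file on disk, so the $request_filename test treats it as missing. What actually gets requested is /index.php/?s=channels-stats.

Swapping $request_filename for $request_uri, $uri or similar variables does not help either, because all that this if can test is whether a file exists.

After some thought I realized that I had lately forgotten about Nginx's try_files directive, as well as the Nginx maxim "if is evil". The following line can replace the block above while keeping /channels-stats accessible: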

try_files $uri $uri/ /index.php?s=$request_uri;

These pages explain where if should not be used:
http://wiki.nginx.org/Pitfalls#Check_IF_File_Exists
http://wiki.nginx.org/IfIsEvil
Update: the problem here could actually have been solved simply by putting the "if (!-e $request_filename)" test inside location /. I don't know why I blanked on that at the time.
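To show how the pieces could fit together, here is a sketch of a server block with the fallback scoped to location /. The document root and the FastCGI address are assumptions made for this example. Because /channels-stats is a longer prefix match than /, requests for it never reach the try_files fallback.

```nginx
server {
    root /var/www/app;   # hypothetical document root

    # Module-provided resource: matched before the generic "/" prefix.
    location /channels-stats {
        push_stream_channels_statistics;
        push_stream_channels_path $arg_id;
    }

    # Everything else: serve the file or directory if it exists,
    # otherwise hand the original request URI to index.php.
    location / {
        try_files $uri $uri/ /index.php?s=$request_uri;
    }

    location = /index.php {
        include       fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass  127.0.0.1:9000;   # hypothetical PHP-FPM address
    }
}
```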

Nginx with SSL

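Now that there are certificate authorities like StartSSL where you can get SSL certificates for free, I think it is worth putting some of my own sites behind https. (At the very least, there will always be some people who feel uneasy about a login page that is not SSL-encrypted.)

SSL performance

SSL improves security, but it has some problems. It makes the server consume more resources than plain http does (though for typical machines today this no longer seems to be an issue). In many cases people still have one impression of https connections: they are slow. More precisely, the first connection to the server feels slow, while once the connection is established the speed feels about the same as plaintext http. This is mainly because the first handshake between the local browser and the server requires an (asymmetric) key exchange, and the key exchange protocol as well as the key's algorithm and length all affect how long that takes. Once the connection is established, however, content is transferred using symmetric encryption with a much shorter key, and modern browsers can also cache content served over https, so no real difference is noticeable.

Because it is slow, it needs some tuning. Once the cause is known, that is easy enough.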

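Choosing the key exchange protocol. Adding "!kEDH" to the ssl_ciphers directive in the Nginx configuration file disables the relatively slow DH key exchange so that RSA is used instead [1]. Since version 1.0.5, Nginx's default SSL ciphers are "HIGH:!aNULL:!MD5"; we can change this to "HIGH:!aNULL:!MD5:!kEDH". You should then see the following change: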

Before the change: key exchange using DHE_RSA

After the change: key exchange using RSA

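Choosing the server public key length. Generally, the longer the server's public key, the more secure it is, but the longer it takes to establish a connection, so there is a trade-off to make. Google and Facebook, two very large companies that nevertheless offer many fully encrypted web services, use relatively "weak" 1024-bit public keys (check with openssl s_client -connect domain.com:443). This may lower the security of connection setup, but more importantly it lowers the connection setup time. (BTW, Google also uses the SPDY protocol, which is faster than HTTP, when talking to the Chrome browser, and its key exchange mechanism is RSA as well.)

Other things to pay attention to include the ssl_session_cache, ssl_session_timeout and keepalive_timeout directives.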
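Put together, the tuning described above might look like the following sketch. The server name and certificate paths are placeholders, and the cache size and timeouts follow the example in the official nginx HTTPS guide rather than measured values.

```nginx
server {
    listen 443 ssl;
    server_name example.com;                              # hypothetical

    ssl_certificate     /etc/nginx/ssl/example.com.crt;   # hypothetical path
    ssl_certificate_key /etc/nginx/ssl/example.com.key;   # hypothetical path

    # Default cipher list plus !kEDH to skip the slower DHE key exchange.
    ssl_ciphers HIGH:!aNULL:!MD5:!kEDH;

    # Reuse negotiated sessions so returning clients skip the full handshake.
    ssl_session_cache   shared:SSL:10m;
    ssl_session_timeout 10m;

    # Keep connections open so one handshake serves several requests.
    keepalive_timeout   70;
}
```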

——

Some useful links:

[1] http://matt.io/technobabble/hivemind_devops_alert:_nginx_does_not_suck_at_ssl/ur

http://nginx.org/en/docs/http/configuring_https_servers.html

http://www.imperialviolet.org/2010/06/25/overclocking-ssl.html

http://blog.httpwatch.com/2011/01/28/top-7-myths-about-https/

http://blog.httpwatch.com/2009/01/15/https-performance-tuning/