Nginx

What is nginx?

A lightweight, high-performance web server designed for high-traffic use cases.

**Serving Static Files: **One of NGINX’s strongest features is the ability to efficiently serve static content such as HTML and media files (musch faster than aiohttp or express static file support). NGINX uses an asynchronous event-driven model which provides predictable performance under load. Example

Reverse Proxy: a reverse proxy is a service that takes a client request, sends the request to one or more proxied servers, fetches the response, and delivers the server’s response to the client. Using Nginx as a reverse proxy gives you several additional benefits:

nginx can provide features on top of a basic Layer 4 (TCP) load balancer. Many applications require additional features, like these, to name just a few:


Nginx Internals

Nginx has one master process and several worker processes. The main purpose of the master process is to read and evaluate configuration, and maintain worker processes. Worker processes do actual processing of requests. nginx employs event-based model and OS-dependent mechanisms to efficiently distribute requests among worker processes.


Understand nginx configuration language

The nginx configuration language is designed with these uses in mind:

We need some global parameters that any server process needs to start: which port to listen to, which user and group to run as, the process id, where to store logs etc. (user, , error_log)

We then need some global parameters specific to nginx internal architecturer: how many worker processes it should have and how many connections does each worker accept (worker_processes, worker_connections). The workers handle connections from clients, and as a rule for CPU-bound loads the number of workers should equal number of processors and for I/O bound loads multiplied by? by 1.5 or 2

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
       . . .
}

http {
       . . .
}

We then want some instructions to be applied to requests. Often it is the case we want the same set of instructions to apply to requests requesting to the same location in their urls, going to a particular server or following a particular protocol (HTTP). So we allow these instructions (called directives) to be grouped together into blocks (or context).

<block> {

  <directive> <parameters>; 

}

For example, the http block contains directives for handling web directive. What might we want to put in instructions that apply to all web traffic? The location of special logs for this, the format of the log, whether we want the connections to be persistent for any time, etc….


http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

What might we want to put in a server block? We want to be able to serve static files. We want to be able to host multiple websites on a single address We want to be able to proxy the request. We might want to modify url, add headers,

How do we specify instructions? Similar to unix or ansible, nginx makes use of modules which take in arguments (parameters). A simple directive consists of the name and parameters separated by spaces and ends with a semicolon (;).

Example of modules:

include files can be used anywhere in your configuration file include /opt/local/etc/nginx/mime.types; A wildcard may appear in the path to match multiple files: include /opt/local/etc/nginx/vhost/*.conf; The location of your access.log file should be defined in your /etc/nginx/nginx.conf file or your site configuration file within your server block. By default, this would be in /etc/nginx/sites-available.

error_log- locations of error_logs. Can be overwritten within directives. A second param indicates the level of log debug (only available when debugging mod configured at compilation), info, notice, warn, error, crit, alert and emerg.pid - the file where the process ID of the main process is written, overwriting the compiled-in default. error_page code ... [=[response]] uri;

try_files So add a catch-all fallback route to your nginx location directive: If the URL doesn’t match any static assets, it should serve the same index.html:

location / {
  try_files $uri $uri/ /index.htm
}

nginx also allows us to set variables, and provides some default environmental variables. Variables can be accessed by a $. Some of the variables are $host, $remote_addr, $upstream_http_x_forwarded_ (X-Forwarded-Something headers), $http_cookie etc. If you feel you need some piece of data that should be available, search for the appropriate variable.


How nginx processes a request?

  1. Filter Servers based on listen directive (port and IP address)
  2. Filter server_names based on host header (longest specific prefix, then first matching regular expression if not exact match in previous step)
  3. After selecting server, select location based on $uri part of the request.
  4. Once location match is found, modify the request path with root or alias directives, and apply any rewrite rules

Tutorial 1 The order in which server and location blocks are selected

Alias vs Rewrite The alias is a directive which is similar to the URL re-write directive, which is used to rewrite the location. The only difference between them is that the URL displayed in the browser remains the same with alias but it changes with the rewrite directive. The rewrite directive works on the browser end, ie- the browser requests twice, once for the actual URL and another time with the modified(re-written) URL. While the alias directive works on the Nginx server end only, it internally modifies the URL, without the browser’s knowledge.

Depending of node app architecture it may be needed to replace proxy_pass http://localhost:3000; with proxy_pass http://localhost:3000/ to rewrite request URIs from /api/route/path/ to /route/path/

Serving on SubPath, Rewriting URLs


Examples

Remove particular cookies This snippet should remove some_cookie cookie from the request before passing
it to the backend:

set $new_cookie $http_cookie;

if ($http_cookie ~ "(.*)(?:^|;)\s*some_cookie=[^;]+(.*)") {
    set $new_cookie $1$2;
}
proxy_set_header Cookie $new_cookie

Location match based on regular expression

location ~* \.(?:ico|css|js|gif|jpe?g|png)$ {
    # Some basic cache-control for static files to be sent to the browser
    expires max;
    add_header Pragma public;
    add_header Cache-Control "public, must-revalidate, proxy-revalidate";
  }

Nginx as Docker Container

NGINX can be controlled by signals and Docker provides kill command for sending signals to a container. For example, to reload the NGINX configuration run the command:

docker kill -s HUP <container name>`

If you want to restart the NGINX process, restart the container by running the command:

docker restart <container name>`

I thought docker-compose would keep the same IP between runs of the same container but it changes with each restart causing the nginx.conf to become stale. I can fix it by sending a SIGHUP to nginx which will make it reload the conf file and read the updated IP from the hosts file. I suppose there isn’t a way to tell docker or compose to keep the same IP when restarting a container? Also described here: https://medium.com/@joatmon08/using-containers-to-learn-nginx-reverse-proxy-6be8ac75a757


Frequently Used Commands

nginx -t test your settings 
nginx -t -c
nginx -s reload
service nginx restart

Useful Reads

14 February 2020