Consul Service Discovery: Automate Nginx Upstream updates for dynamic load balancing

In this blog, I want to provide a practical example of using HashiCorp Consul service discovery to automate updating the Nginx upstream (load balancer) configuration file.

Imagine that you have three backend apps (running on either containers or virtual machines), and these apps are listed in the Nginx upstream configuration so that every time you access the website, Nginx forwards requests to these apps for processing. Whenever one of your app nodes breaks (for any reason), you have to replace it with a new one and also update the Nginx upstream config file to point to the newly launched node. Consul automates this process for you, so you never have to touch the config file by hand.

I’ve also written about another Consul use case in the Consul Service Discovery: Automate HAProxy Configuration with Consul for MySQL blog post. Feel free to check it out if you’re interested. Now, let’s continue with our topic.

Prerequisites

  • Consul version 1.21.2
  • We need 4 Ubuntu servers. You can set them up yourself, or, if you use Vagrant/VirtualBox, I’ve created this project to speed up the launch process. The servers are:
Hostname                   IP address       Role
web-frontend.srv.local     192.168.68.50    Web server with Nginx installed
consul-server.srv.local    192.168.68.51    A single Consul server
web-backend-01.srv.local   192.168.68.52    Runs a demo app to process requests
web-backend-02.srv.local   192.168.68.53    Runs a demo app to process requests

NOTE: In production, it’s recommended to run a Consul cluster of at least 3 nodes. In this demo, we use a single one to simplify the setup and start Consul with the -dev option to enable development server mode. The main idea is to show a real-life use case of Consul. I’ve also written Puppet Series – Automate Consul Cluster Setup if you’re interested in using Puppet to automate a cluster setup for production.

Set up a Consul server

The installation guide is here, but feel free to follow along: log in to the consul-server server and run the following commands:

# Update APT
sudo apt-get update -y

# Download GPG key
wget -O - https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg

# Register APT source in order to install "consul" through APT
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(grep -oP '(?<=UBUNTU_CODENAME=).*' /etc/os-release || lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list

# Update APT again and install consul
sudo apt update && sudo apt install consul

To check that Consul was installed successfully, run:

consul version

The output should show the installed Consul version (1.21.2 in this demo).

Consul reads all configuration files in the /etc/consul.d folder, and a default consul.hcl config file already appears there. Let’s rename it to consul.hcl.back so that we can create a new consul.hcl file with our own configuration.

sudo mv /etc/consul.d/consul.hcl /etc/consul.d/consul.hcl.back

Run sudo nano /etc/consul.d/consul.hcl and add the following content:

node_name = "consul-server"
bind_addr = "{{ GetAllInterfaces | include \"network\" \"192.168.68.0/24\" | sort \"size,address\" | attr \"address\" }}"
advertise_addr = "192.168.68.51"
client_addr = "0.0.0.0"
data_dir = "/opt/consul"
encrypt = "zXrwndHlj2mFve09qkl7oq6H4ZbPwKB4c1jOhhHtxys="
datacenter = "dc1"
ui_config {
  enabled = true
}
server = true
log_level = "INFO"

Press Ctrl + O and press Enter to save the content. Then press Ctrl + X to exit.

Where:

  • node_name: the name of the node; it must be unique within the cluster. Defaults to the hostname.
  • bind_addr: the network address Consul binds to for internal cluster communication; it must be reachable by all other nodes.
  • advertise_addr: the IP address advertised to other nodes in the cluster. Defaults to bind_addr.
  • client_addr: the address the Consul client interfaces (HTTP API, DNS) listen on. "0.0.0.0" allows us to query from other hosts; if we set it to "127.0.0.1", we can only query from the consul-server itself.
  • data_dir: the directory where the agent stores its state; required for all agents.
  • encrypt: the secret key (32 bytes, Base64 encoded) used to encrypt Consul network traffic. Run consul keygen to generate a key and replace the highlighted key above with the output.
  • datacenter: the datacenter in which the agent is running. Defaults to "dc1".
  • ui_config: enables the Consul UI, which we can open in a browser.
  • server: specifies whether this Consul agent runs in "server" or "client" mode. We use server mode.
  • log_level: the log level used after the agent starts. Available levels are "trace", "debug", "info", "warn", and "error".

You can visit the Agent Configuration docs to learn more about these and other parameters.

Validate your configuration by running:

consul validate /etc/consul.d/consul.hcl

Output:

Configuration is valid!

NOTE: If you set bind_addr to 0.0.0.0, or don’t set it at all (it defaults to 0.0.0.0), everything works when your server has only one private network interface. If your machine has more than one network interface, e.g. eth0 and eth1, you may get this error when validating the config or starting the Consul service:

Config validation failed: Multiple private IPv4 addresses found. Please configure one with 'bind' and/or 'advertise'.

That’s because with 0.0.0.0, Consul binds to all addresses on the local machine and uses advertise_addr (which defaults to the bind address) to advertise the server’s private IPv4 address to the rest of the cluster. This works if you have only one network interface, but with more than one you have multiple private IPv4 addresses and Consul exits with an error; check out bind_addr for more details.

In this case, we need to explicitly set the desired network address in bind_addr and the real IP address of the network interface in advertise_addr. For example:

bind_addr = "192.168.68.0/24"
advertise_addr = "192.168.68.51"

Or set it like this, using a go-sockaddr template:

bind_addr = "{{ GetAllInterfaces | include \"network\" \"192.168.68.0/24\" | sort \"size,address\" | attr \"address\" }}"
advertise_addr = "192.168.68.51"

Now, start Consul with this command to run it in development server mode (for development purposes only):

sudo nohup consul agent -dev -config-dir /etc/consul.d/ &

A nohup.out file is created in your current directory. To follow the logs, run:

sudo tail -100f nohup.out

Output:

2025-07-23T07:40:54.137Z [INFO]  agent.server: member joined, marking health alive: member=consul-server partition=default
2025-07-23T07:40:54.140Z [INFO]  agent.server: New leader elected: payload=consul-server
2025-07-23T07:40:54.141Z [INFO]  agent.server: federation state anti-entropy synced
2025-07-23T07:40:54.143Z [INFO]  agent.leader: stopping routine: routine="virtual IP version check"
2025-07-23T07:40:54.143Z [INFO]  agent.leader: stopped routine: routine="virtual IP version check"
2025-07-23T07:40:54.357Z [INFO]  agent: Synced node info

If you see a message similar to “Synced node info”, you have successfully started Consul. You can access http://<your-consul-server-ip>:8500 to check the UI.
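
If you prefer the command line, you can also do a quick sanity check against the HTTP API (assuming the default port 8500):

# Should return the current leader's server address, e.g. "192.168.68.51:8300"
curl http://192.168.68.51:8500/v1/status/leader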

NOTE: DO NOT start Consul with systemctl start consul.service; that’s meant for running in production mode and it won’t start without quite a lot of additional setup. If you’re interested in configuring Consul for production, please consult the docs.

Set up Consul agents on the other servers

Once we have the Consul server up and running, and before we configure Nginx on the web-frontend server and our own Python application on the web-backend servers, we need to set up Consul agents on these servers. A Consul agent (running in client mode) registers the node’s details with the Consul server, and the node can then register any service details with the Consul server as well.

Similar to installing the Consul server, on all the other servers, we run:

# Update APT
sudo apt-get update -y

# Download GPG key
wget -O - https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg

# Register APT source in order to install "consul" through APT
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(grep -oP '(?<=UBUNTU_CODENAME=).*' /etc/os-release || lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list

# Update APT again and install consul
sudo apt update && sudo apt install consul

Remember to run consul version to check that the installation succeeded.

Again, we back up the default consul.hcl config and create a new file:

sudo mv /etc/consul.d/consul.hcl /etc/consul.d/consul.hcl.back

Run sudo nano /etc/consul.d/consul.hcl and add the following content for agent nodes:

# Change <changed-to-your-node-name> to e.g. web-backend-01
node_name = "<changed-to-your-node-name>"
server    = false
datacenter = "dc1"
data_dir   = "/opt/consul"
encrypt    = "zXrwndHlj2mFve09qkl7oq6H4ZbPwKB4c1jOhhHtxys="
log_level  = "INFO"

# We must enable this parameter to opt-in health checks that run scripts locally
# Used for registering service that we configure later
enable_local_script_checks = true

# used for internal RPC and Serf
bind_addr = "{{ GetAllInterfaces | include \"network\" \"192.168.68.0/24\" | sort \"size,address\" | attr \"address\" }}"

# Used for HTTP, HTTPS, DNS, and gRPC addresses.
# loopback is not included in GetPrivateInterfaces because it is not routable.
client_addr = "{{ GetPrivateInterfaces | exclude \"type\" \"ipv6\" | join \"address\" \" \" }} {{ GetAllInterfaces | include \"flags\" \"loopback\" | join \"address\" \" \" }}"

# GetInterfaceIP returns the IP address on "eth1", which is e.g. 192.168.68.52 for web-backend-01
advertise_addr = "{{ GetInterfaceIP \"eth1\" }}"

# Join this Consul agent to the cluster via the Consul server.
# Change "consul-server.srv.local" to your Consul server's IP.
# I'm using a hostname because I've added all servers to /etc/hosts
retry_join = ["consul-server.srv.local"]

Press Ctrl + O and press Enter to save the content. Then press Ctrl + X to exit.
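
For reference, name resolution in this demo is assumed to come from identical /etc/hosts entries on every server, matching the table in the prerequisites:

192.168.68.50  web-frontend.srv.local   web-frontend
192.168.68.51  consul-server.srv.local  consul-server
192.168.68.52  web-backend-01.srv.local web-backend-01
192.168.68.53  web-backend-02.srv.local web-backend-02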

Remember to run consul validate /etc/consul.d/consul.hcl and start consul by:

sudo nohup consul agent -dev -config-dir /etc/consul.d/ &

Output in nohup.out:

2025-07-23T08:25:36.730Z [INFO]  agent: (LAN) joined: number_of_nodes=1
2025-07-23T08:25:36.730Z [INFO]  agent: Join cluster completed. Synced with initial agents: cluster=LAN num_agents=1
2025-07-23T08:25:36.731Z [INFO]  agent.client: adding server: server="consul-server (Addr: tcp/192.168.68.51:8300) (DC: dc1)"
2025-07-23T08:25:39.550Z [INFO]  agent: Synced node info

If you’ve done everything correctly, run the following on any node:

consul members

Output:

All nodes appear there: consul-server with type “server” and the others with type “client”.
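
If you’d rather check via the HTTP API, listing the registered nodes against the Consul server gives the same picture (assuming the default port 8500):

# Returns a JSON list of all nodes registered in the catalog
curl http://192.168.68.51:8500/v1/catalog/nodes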

Register a Python application on the web-backend as a Consul service

To register our application with Consul as a service, we first need to set up the app on the web-backend servers and then configure Consul to register it as a service.

Set up a demo Python application

On both web-backend servers, in your home directory (mine is /home/vagrant), create an app/ folder where we store our application:

mkdir app && cd app/

Run nano demo-webserver.py and add the following content:

#!/usr/bin/env python3
from http.server import BaseHTTPRequestHandler, HTTPServer
import socket

PORT = 3000
HOSTNAME = socket.gethostname()

class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/plain")
        self.end_headers()
        message = f"Welcome to my server on {HOSTNAME}"
        self.wfile.write(message.encode())

def run(server_class=HTTPServer, handler_class=MyHandler):
    server_address = ("", PORT)
    httpd = server_class(server_address, handler_class)
    print(f"Serving on port {PORT}...")
    httpd.serve_forever()

if __name__ == "__main__":
    run()

Press Ctrl + O and press Enter to save the content. Then press Ctrl + X to exit.

Ubuntu 24.04 comes with python3 by default; you can check with python3 --version. If it’s missing for any reason, install the python3 package first.

Make demo-webserver.py executable so it can be run directly:

sudo chmod 755 demo-webserver.py

Still in the app/ directory, start the application:

nohup ./demo-webserver.py &

Note that in production we would use systemd or another proper process manager for this app rather than nohup.

To test the application, open a browser and access http://<web-backend-server-ip>:3000. You should see the welcome message containing the backend’s hostname.
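
You can also test from the command line with curl; given the handler above, each backend should answer with its own hostname (IP addresses from the table in the prerequisites):

curl http://192.168.68.52:3000
# e.g. Welcome to my server on web-backend-01
curl http://192.168.68.53:3000
# e.g. Welcome to my server on web-backend-02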

Register service to the Consul server

We need to create a service configuration file to register the demo-webserver.py application as a service to the Consul server. Run sudo nano /etc/consul.d/svc-python-app.hcl and add the following content:

## -----------------------------
## svc-python-app.hcl
## -----------------------------
service {
  name = "python-app"
  id = "python-app-web"
  tags = [ "web-backend-app" ]
  port = 3000
  
  check {
    args = ["curl", "localhost:3000"]
    interval = "5s"
  }
}

Press Ctrl + O and press Enter to save the content. Then press Ctrl + X to exit.

Reload the Consul agent to register the “python-app” service defined in the newly created configuration file.

sudo consul reload

To check whether you successfully registered the service, access http://<your-consul-server-ip>:8500/ui/dc1/services/<your-service-name>/instances. For example, mine is http://192.168.68.51:8500/ui/dc1/services/python-app/instances, and we can see two instances from the web-backend servers registered under the service name “python-app”.
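
You can also query the Consul HTTP API directly for healthy instances of the service (a quick check, assuming the default port 8500):

# Lists only the instances whose health checks are passing
curl "http://192.168.68.51:8500/v1/health/service/python-app?passing"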

Automate updates of Nginx Upstream configuration file on web-frontend server

Next, on the web-frontend server, we need to configure Consul to automatically update the Nginx upstream configuration file (/etc/nginx/conf.d/upstream-web-backend.conf) to point to our web-backend servers.

To update the upstream-web-backend.conf file, we use the consul-template binary to automate the process. This command-line tool makes API calls to the Consul server to get the backend servers’ details, fills in the placeholders in a template (which we create later), and renders the upstream-web-backend.conf file from it.

First, we need to install Nginx and consul-template:

sudo apt install nginx consul-template -y

Verify consul-template and nginx installation:

consul-template --version
nginx -v

Configure consul-template

In /etc/nginx/conf.d/ directory, we will create an upstream template file named upstream-web-backend.conf.ctmpl.

Run sudo nano /etc/nginx/conf.d/upstream-web-backend.conf.ctmpl and add the following content:

upstream web-backend {
 {{ range service "python-app" }} 
  server {{ .Address }}:{{ .Port }} weight=2 max_fails=3 fail_timeout=10s; # {{ .Node -}}
 {{ end }} 
}

Press Ctrl + O and press Enter to save the content. Then press Ctrl + X to exit.

To learn more about the Nginx upstream parameters, please refer to this link. I’ll just explain the basics here:

  • server: defines a backend as <ip>:<port> that Nginx forwards requests to.
  • weight: the server’s weight for the weighted round-robin balancing method; servers with a higher weight receive proportionally more requests (see the short example after this list).
  • max_fails: the number of failed attempts to communicate with the server, within fail_timeout, after which the server is marked unavailable.
  • fail_timeout: combined with max_fails, the window during which the failed attempts are counted, and also how long the server is then considered unavailable.
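
A quick illustration of weight (not part of this demo’s template, just an example): with the configuration below, Nginx sends roughly two out of every three requests to the first server and one out of three to the second.

upstream example-backend {
  server 192.168.68.52:3000 weight=2;
  server 192.168.68.53:3000 weight=1;
}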

Next, we create a /etc/consul-template directory to store the consul-template configuration. consul-template fetches the web-backend servers’ details from the Consul server and re-renders the upstream configuration file whenever they change. Run:

sudo mkdir /etc/consul-template

And run sudo nano /etc/consul-template/consul-template.hcl and add the following content:

consul {
  address = "consul-server.srv.local:8500"

  retry {
    enabled  = true
    attempts = 15
    backoff  = "250ms"
  }
}
template {
  source      = "/etc/nginx/conf.d/upstream-web-backend.conf.ctmpl"
  destination = "/etc/nginx/conf.d/upstream-web-backend.conf"
  perms       = 0644
  command     = ["nginx", "-s", "reload"]
}
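
Before running it in the background, a one-off dry run is a handy way to confirm the template renders as expected; -dry prints the rendered result to stdout instead of writing the file, and -once exits after the first render:

consul-template -config=/etc/consul-template/consul-template.hcl -once -dry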

Start consul-template so it monitors Consul and generates the upstream config file for us:

sudo nohup consul-template -config=/etc/consul-template/consul-template.hcl &

After consul-template has started successfully, check the generated configuration file:

cat /etc/nginx/conf.d/upstream-web-backend.conf

Output:
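
The exact addresses depend on your environment, but with both backend instances healthy it should look roughly like this:

upstream web-backend {
  server 192.168.68.52:3000 weight=2 max_fails=3 fail_timeout=10s; # web-backend-01
  server 192.168.68.53:3000 weight=2 max_fails=3 fail_timeout=10s; # web-backend-02
}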

Configure Nginx default site

We need to instruct the Nginx default site to use our upstream configuration named “web-backend”. Before doing that, we need to remove the default site, run:

sudo rm /etc/nginx/sites-enabled/default

Create our own default site configuration to test, run sudo nano /etc/nginx/sites-enabled/default-site and add the following content:

server {
   listen 80;
   
   location / {
      proxy_pass http://web-backend;
   }
}

Validate the Nginx configuration file and reload the Nginx service:

sudo nginx -t && sudo nginx -s reload

Testing Load Balancer

Once it’s done, access http://<ip-of-web-frontend-server>, in my case http://192.168.68.50/. We get a different result (the responding backend’s hostname) every time we reload the site.
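
The same behaviour is visible from the command line; a few repeated requests should alternate between the two backends:

for i in 1 2 3 4; do curl -s http://192.168.68.50/; echo; done
# e.g. Welcome to my server on web-backend-01
#      Welcome to my server on web-backend-02
#      ...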

It works! We can see requests being load-balanced between the two web-backend servers.

To test the failover, kill the demo-webserver.py app on the web-backend-01 server (one way is shown below), and then check the Consul Management UI on the consul-server: the python-app instance on web-backend-01 will show as failing its health check.
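
One simple way to stop the app on web-backend-01, assuming it was started with nohup as above:

# Find and kill the running demo web server process
pkill -f demo-webserver.py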

Go back to the web frontend server to check the /etc/nginx/conf.d/upstream-web-backend.conf file; we see that web-backend-01 has been removed!

Once you start the demo-webserver.py app on web-backend-01 again and re-check the upstream-web-backend.conf file, web-backend-01 will be added back.

OK, we’re done with the demo. Ideally, we could automate this manual setup using tools like Ansible, Puppet, or Docker, but that’s beyond what I want to cover here. The goal is to demonstrate how to automate updates to the Nginx upstream configuration file. I hope you gained some useful knowledge. Thanks for reading the blog.


By Binh
