Achieving high availability and security in backend services with Skipper swarm mode and rate limiting

06 April, 2023


High availability and security are essential for any modern web application. High availability refers to the ability of a service to remain accessible and operational even in the face of failures, errors, or heavy loads. In today's fast-paced, always-on world, users expect applications to be available around the clock, and any downtime can be costly both in terms of lost revenue and damage to reputation.

Rate limiting is a popular technique for controlling the amount of traffic that is allowed to access a particular service. By limiting the rate of incoming requests, we prevent service overloads, DDoS attacks, and other security threats. At the same time, rate limiting can also help to improve application performance and reliability, by ensuring that resources are allocated efficiently and that critical services are not overwhelmed by high volumes of traffic.

There are many different approaches to implementing rate limiting, ranging from simple time-based throttling to more sophisticated algorithms based on request patterns, user behavior, and other factors. Some rate limiting techniques rely on in-memory caching or distributed data stores to keep track of traffic volumes, while others use machine learning and other advanced techniques to adapt to changing traffic patterns and threats.

Skipper is a cloud-native application, developed initially at Zalando, the company I currently work at. We use Skipper everywhere, including the main Zalando edge proxy. Our Skipper pods can serve more than 250,000 rps at peak load, routing traffic to dozens of small services within the rich Zalando ecosystem.

As an ingress controller, Skipper provides a way to manage incoming traffic to a cluster of microservices, routing requests to the appropriate service based on configurable rules and policies. Skipper supports a wide range of protocols and formats, including HTTP, WebSockets, gRPC, and more. It also includes a powerful routing engine that can handle complex routing scenarios, including load balancing, circuit breaking, and rate limiting.

Let's see how quickly we can rate limit access to a generic service.

The code examples and containers here are not production-ready. They serve as blueprints so you can estimate faster how complex it would be to set up the whole flow in production. Be aware, and always think strategically before bringing any setup into your current systems.

Let's make a basic API service. I'll do it in Rust; you can do it in any language.

use axum::{routing::get, Router};
use std::net::SocketAddr;

async fn hello_world() -> &'static str {
    "Hello, world!"
}

#[tokio::main]
async fn main() {
    // Define the router
    let app = Router::new().route("/", get(hello_world));

    // Define the server address
    let addr = SocketAddr::from(([0, 0, 0, 0], 8080));

    // Start the server
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}
And make a Dockerfile for it.

# Set the base image
FROM rust:latest as build

# Create a new directory for the application code
WORKDIR /usr/src/skipper-rate-limiting

# Copy the application code into the container
COPY . .

# Build the application
RUN cargo build --release

# Create a new container for the application
FROM debian:buster-slim

# Install OpenSSL
RUN apt-get update \
    && apt-get install -y openssl curl \
    && rm -rf /var/lib/apt/lists/*

# Copy the binary from the build container into the new container
COPY --from=build /usr/src/skipper-rate-limiting/target/release/skipper-rate-limiting /usr/local/bin/skipper-rate-limiting

# Set the default command for the container
CMD ["skipper-rate-limiting"]

Here is our dependency list. It's very slim, as is our codebase. This is because we won't implement rate limiting at the service level; we'll do it at the ingress level. The service code doesn't matter much in our case.

[package]
name = "skipper-rate-limiting"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at

[dependencies]
axum = "0.6.12"
tokio = { version = "1.27.0", features = ["full"] }

Let's assemble a compose file for our desired setup.

version: '3'

services:
  # Skipper ingress
  skipper:
    image: # you can pick your release from there:
    ports:
      - "9090:9090"
    volumes:
      - ./skipper:/etc/skipper
    command: ["skipper", "-routes-file", "/etc/skipper/routes.eskip", "-enable-swarm", "-enable-ratelimits", "-swarm-redis-urls", "redis:6379"]
    depends_on:
      - redis

  # Generic Rust app
  app:
    build: .
    ports:
      - "8080:8080"

  # Redis for swarm state
  redis:
    image: redis:latest
    restart: always
    ports:
      - "6379:6379"

networks:
  default:
    external: true

As you can see, we bring Redis into our setup. That's because we'll run Skipper in swarm mode.

Skipper Swarm Mode

Skipper swarm mode is a highly scalable and distributed mode of operation for Skipper. In this mode, Skipper can run on a cluster of nodes, allowing for automatic load balancing, fault tolerance, and failover. There's a high chance you'll need swarm mode in production, but it depends on whether you are ready to bring state to your edge or not.

The state of your Skipper swarm is held in a Redis cluster. You can provide as many Redis URLs as you have nodes in your cluster when you launch Skipper in swarm mode.

This way, your rate limits are not stored locally per Skipper pod; you get a 'global' rate limit. It doesn't matter whether you have 3 Skipper pods or 50 – your rate limits will be applied consistently.

Let's make routes.

myAppRoute: *
    -> clusterRatelimit("authorization_rate_limit", 10, "1h")
    -> "http://app:8080/";

testRoute: Path("/test") && Method("GET")
    -> status(200)
    -> inlineContent("route valid")
    -> <shunt>;

For route definitions, Skipper uses the 'eskip' file format, a highly extensible DSL. Skipper has a lot of pre-built filters and predicates to make your routing shine. Here you can see the clusterRatelimit filter in the myAppRoute route.

Let's boot this setup and check how it works. I do it with podman compose. Don't forget to start your Podman virtual machine if, like me, you also use Podman.

Testing with Apache Bench

Our route promises that requests are limited at a rate of 10 per hour. Let's validate that.

ab -n 10 http://localhost:9090/

When we check logs in skipper, we see:

[06/Apr/2023:16:58:48 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 5 localhost:9090 - -
 - - [06/Apr/2023:16:59:32 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 2 localhost:9090 - -
 - - [06/Apr/2023:16:59:36 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 0 localhost:9090 - -
 - - [06/Apr/2023:16:59:37 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 1 localhost:9090 - -
 - - [06/Apr/2023:16:59:39 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 2 localhost:9090 - -
 - - [06/Apr/2023:16:59:40 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 0 localhost:9090 - -
 - - [06/Apr/2023:16:59:42 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 1 localhost:9090 - -
 - - [06/Apr/2023:16:59:43 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 0 localhost:9090 - -
 - - [06/Apr/2023:16:59:45 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 1 localhost:9090 - -
 - - [06/Apr/2023:16:59:46 +0000] "GET / HTTP/1.1" 200 13 "-" "ApacheBench/2.3" 1 localhost:9090 - -
 - - [06/Apr/2023:16:59:47 +0000] "GET / HTTP/1.1" 429 0 "-" "ApacheBench/2.3" 0 localhost:9090 - -

This means our rate limit was applied: we receive a 429 status code. We can verify with curl:

*   Trying
* Connected to localhost ( port 9090 (#0)
> GET / HTTP/1.1
> Host: localhost:9090
> User-Agent: curl/7.86.0
> Accept: */*
* Mark bundle as not supporting multiuse
< HTTP/1.1 429 Too Many Requests
< Retry-After: 3539
< Server: Skipper
< X-Rate-Limit: 10
< Date: Thu, 06 Apr 2023 16:59:50 GMT
< Transfer-Encoding: chunked
* Connection #0 to host localhost left intact

By implementing best practices for high availability and security, such as deploying your application on a distributed, scalable infrastructure with a load balancer and an API gateway like Skipper, you can make sure your backends are well protected and better equipped to absorb heavy traffic. Additionally, implementing rate limiting and other security measures in Skipper is easy and intuitive.

However, it may bring additional complexity into your systems, and that risk should be well calculated.

Proper tools make you and your engineers a lot happier. By investing in the right infrastructure and taking a proactive approach to backend security, you can help mitigate risk and ensure that your application and its users remain safe and your services accessible.

You can subscribe to my newsletter.

Let's see if we can become internet friends.


Troy Köhler