ASP.NET Core rate limiting middleware in .NET 7

September 26, 2022

Rate limiting is a way to control the amount of traffic that a web application or API receives, by limiting the number of requests that can be made in a given period of time. This can help to improve the performance of the site or application, and to prevent it from becoming unresponsive.

Starting with .NET 7, ASP.NET Core includes a built-in rate limiting middleware, which can be used to rate limit web applications and APIs. In this blog post, we’ll take a look at how to configure and use the rate limiting middleware in ASP.NET Core.

What is rate limiting?

Every application you build is sharing resources. The application runs on a server that shares its CPU, memory, and disk I/O, and it relies on a database that stores data for all your users.

Whether accidental or intentional, users may exhaust those resources in a way that impacts others. A script can make too many requests, or a new deployment of your mobile app has a regression that calls a specific API too many times and results in the database being slow. Ideally, all of your users get access to an equal amount of shared resources, within the boundary of what your application can support.

Let’s say the database used by your application can safely handle around 1000 queries per minute. In your application, you can set a limit to only allow 1000 requests per minute, to prevent the database from receiving more load than it can handle.

Instead of one global “1000 requests per minute” limit, you could look at your average application usage, and for example set a limit of “100 requests per user per minute”. Or chain those limits, and say “100 requests per user per minute, and 1000 requests per minute”.

Rate limits help to prevent the server from being overwhelmed by too many requests, and still make sure that all users have a fair chance of getting their requests processed.

Rate limiting in ASP.NET Core

If your application is using .NET 7 (or higher), a rate limiting middleware is available out of the box. It provides a way to apply rate limiting to your web application and API endpoints.

Note: Under the hood, the ASP.NET Core rate limiting middleware uses the System.Threading.RateLimiting subsystem. If you’re interested in rate limiting other resources, for example an HttpClient making requests, or access to other resources, check it out!
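
To get a feel for that subsystem outside of ASP.NET Core, here is a minimal sketch of a console program that uses a FixedWindowRateLimiter directly. The limit of 3 permits per 10 seconds is an arbitrary value picked for illustration, and outside of the ASP.NET Core shared framework you may need to reference the System.Threading.RateLimiting NuGet package.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.RateLimiting;

// Hypothetical limit for illustration: 3 permits per 10-second window, no queueing.
using var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    AutoReplenishment = true,
    PermitLimit = 3,
    QueueLimit = 0,
    Window = TimeSpan.FromSeconds(10)
});

var results = new List<bool>();
for (var attempt = 1; attempt <= 5; attempt++)
{
    // AttemptAcquire never waits: the returned lease was either acquired or not.
    using RateLimitLease lease = limiter.AttemptAcquire(permitCount: 1);
    results.Add(lease.IsAcquired);
    Console.WriteLine($"Attempt {attempt}: {(lease.IsAcquired ? "allowed" : "rejected")}");
}
```

As long as all five attempts happen within the same window, the first three should be allowed and the remaining two rejected. The ASP.NET Core middleware does essentially this for every incoming request, and turns a rejected lease into an HTTP response.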

Much like other middlewares, to enable the ASP.NET Core rate limiting middleware, you will have to add the required services to the service collection, and then enable the middleware for all request pipelines.

Let’s add a simple rate limiter that limits all requests to 10 requests per minute, per authenticated username (or hostname if not authenticated):

using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.User.Identity?.Name ?? httpContext.Request.Headers.Host.ToString(),
            factory: partition => new FixedWindowRateLimiterOptions
            {
                AutoReplenishment = true,
                PermitLimit = 10,
                QueueLimit = 0,
                Window = TimeSpan.FromMinutes(1)
            }));
});

// ...

var app = builder.Build();

// ...

app.UseRouting();
app.UseRateLimiter();

app.MapGet("/", () => "Hello World!");

app.Run();

Too much at once? I agree, so let’s try to break it down.

The call to builder.Services.AddRateLimiter(...) registers the ASP.NET Core middleware with the service collection, including its configuration options. There are many options that can be specified, such as the HTTP status code being returned, what should happen when rate limiting applies, and additional policies.

For now, let’s just assume we want to have one global rate limiter for all requests. The GlobalLimiter option can be set to any PartitionedRateLimiter. In this example, we’re adding a FixedWindowLimiter, and configuring it to apply “per authenticated username (or hostname if not authenticated)” - the partition. The FixedWindowLimiter is then configured to automatically replenish permitted requests, and permits “10 requests per minute”. Keep in mind that Request.Headers.Host holds the host name the request was sent to, not the client’s host name, so all unauthenticated requests to the same host end up in the same partition.

Further down the code, you’ll see a call to app.UseRateLimiter(). This enables the rate limiting middleware using the options specified earlier.

If you run the application and refresh quickly, you’ll see at some point a 503 Service Unavailable is returned, which is when the rate limiting middleware does its thing.

Configure what happens when being rate limited

Not happy with that 503 being returned when rate limiting is enforced? Let’s look at how to configure that!

Many services settled on the 429 Too Many Requests status code. In order to change the status code, you can set the RejectionStatusCode option:

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = 429;

    // ...
});

Additionally, there’s an OnRejected option you can set to customize the response that is sent when rate limiting is triggered for a request. It’s a good practice to communicate what happened, and why a rate limit applies. So instead of going with the default of returning “just a status code”, you can return some more meaningful information. The OnRejected delegate gives you access to the current rate limit context, including the HttpContext.

Here’s an example that sets the response status code to 429, and returns a meaningful response. The response mentions when to retry (if available from the rate limiting metadata), and provides a documentation link where users can find out more.

builder.Services.AddRateLimiter(options =>
{
    options.OnRejected = async (context, token) =>
    {
        context.HttpContext.Response.StatusCode = 429;
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            await context.HttpContext.Response.WriteAsync(
                $"Too many requests. Please try again after {retryAfter.TotalMinutes} minute(s). " +
                $"Read more about our rate limits at https://example.org/docs/ratelimiting.", cancellationToken: token);
        }
        else
        {
            await context.HttpContext.Response.WriteAsync(
                "Too many requests. Please try again later. " +
                "Read more about our rate limits at https://example.org/docs/ratelimiting.", cancellationToken: token);
        }
    };

    // ...
});

Given you have access to the current HttpContext, you also have access to the application’s services through its service provider. It’s a good practice to keep an eye on who is being rate limited, when, and why, and you could log that by grabbing an ILogger from context.HttpContext.RequestServices if needed.
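
As a rough sketch of that idea, the OnRejected callback below resolves a logger from RequestServices and writes a single warning per rejected request. The "RateLimiting" category name and the message template are assumptions made for this example, and the code relies on the implicit usings of an ASP.NET Core web project.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.OnRejected = (context, cancellationToken) =>
    {
        var httpContext = context.HttpContext;

        // Resolve a logger from the request's service provider.
        // The category name "RateLimiting" is an arbitrary choice for this sketch.
        var logger = httpContext.RequestServices
            .GetRequiredService<ILoggerFactory>()
            .CreateLogger("RateLimiting");

        // One cheap log entry: no database calls or other heavy work here.
        logger.LogWarning(
            "Rate limit rejected {Method} {Path} for {User}",
            httpContext.Request.Method,
            httpContext.Request.Path,
            httpContext.User.Identity?.Name ?? "anonymous");

        return ValueTask.CompletedTask;
    };

    // Same global limiter as in the first example of this post.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.User.Identity?.Name ?? httpContext.Request.Headers.Host.ToString(),
            factory: partition => new FixedWindowRateLimiterOptions
            {
                AutoReplenishment = true,
                PermitLimit = 10,
                QueueLimit = 0,
                Window = TimeSpan.FromMinutes(1)
            }));
});

var app = builder.Build();

app.UseRouting();
app.UseRateLimiter();

app.MapGet("/", () => "Hello World!");

app.Run();
```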

Note: Be careful with the logic you write in your OnRejected implementation. If you use your database context and run 5 queries, your rate limit isn’t actually helping reduce strain on your database. Communicate with the user and return a meaningful error (you could even use the Accept header and return either JSON or HTML depending on the client type), but don’t consume more resources than a normal response would require.

Speaking of communicating about what and why, the ASP.NET Core rate limiting middleware is a bit limited (pun not intended). The metadata you have access to is sparse (“retry after” is pretty much the only useful metadata returned).

Additionally, if you want to return statistics about your limits (e.g. like GitHub does), you’ll find the ASP.NET Core rate limiting middleware does not support this. You won’t have access to the “number of requests remaining” or other metadata: not in OnRejected, and definitely not if you want to return this data as headers on every request.

If this is something that matters to you, I recommend checking out Stefan Prodan’s AspNetCoreRateLimit, which has many (many!) more options available. Or chime in on this GitHub issue.

Types of rate limiters

In our example, we’ve used the FixedWindowLimiter to limit the number of requests in a time window.

There are more rate limiting algorithms available in .NET that you can use:

Concurrency limit is the simplest form of rate limiting. It doesn’t look at time, just at number of concurrent requests. “Allow 10 concurrent requests”.
Fixed window limit lets you apply limits such as “60 requests per minute”. Every minute, 60 requests can be made. One every second, but also 60 in one go.
Sliding window limit is similar to the fixed window limit, but divides the window into segments for more fine-grained limits. Think “60 requests per minute, tracked in 6 segments of 10 seconds”: requests made in a segment only become available again once that segment slides out of the window.
Token bucket limit lets you control flow rate, and allows for bursts. Think “you are given 100 requests every minute”. If you make all of them within 10 seconds, you’ll have to wait until the bucket is replenished before you are allowed more requests.
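
To make the differences between these limiters a bit more tangible, here is a small console sketch that creates a concurrency limiter, a sliding window limiter and a token bucket limiter directly from System.Threading.RateLimiting. All numbers are arbitrary values chosen for illustration.

```csharp
using System;
using System.Threading.RateLimiting;

// Concurrency: at most 1 request in flight. Disposing a lease returns its permit.
using var concurrency = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 1,
    QueueLimit = 0
});
RateLimitLease running = concurrency.AttemptAcquire();
bool runningAcquired = running.IsAcquired;
using RateLimitLease whileBusy = concurrency.AttemptAcquire();    // limit reached
bool whileBusyAcquired = whileBusy.IsAcquired;
running.Dispose();                                                 // the "request" finished
using RateLimitLease afterRelease = concurrency.AttemptAcquire(); // permit is free again
bool afterReleaseAcquired = afterRelease.IsAcquired;
Console.WriteLine($"Concurrency: {runningAcquired}, {whileBusyAcquired}, {afterReleaseAcquired}");

// Sliding window: 60 permits per minute, tracked in 6 segments of 10 seconds each.
using var sliding = new SlidingWindowRateLimiter(new SlidingWindowRateLimiterOptions
{
    AutoReplenishment = true,
    PermitLimit = 60,
    QueueLimit = 0,
    SegmentsPerWindow = 6,
    Window = TimeSpan.FromMinutes(1)
});
using RateLimitLease slidingLease = sliding.AttemptAcquire();
bool slidingAcquired = slidingLease.IsAcquired;
Console.WriteLine($"Sliding window: {slidingAcquired}");

// Token bucket: starts with 3 tokens, and 3 tokens are added every minute.
using var bucket = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    AutoReplenishment = true,
    QueueLimit = 0,
    ReplenishmentPeriod = TimeSpan.FromMinutes(1),
    TokenLimit = 3,
    TokensPerPeriod = 3
});
int granted = 0;
for (var attempt = 0; attempt < 5; attempt++)
{
    using RateLimitLease lease = bucket.AttemptAcquire();
    if (lease.IsAcquired)
    {
        granted++;
    }
}
Console.WriteLine($"Token bucket: {granted} of 5 attempts granted");
```

The concurrency limiter should reject the second attempt and allow the third one after the first lease is disposed, while the token bucket should grant only the 3 tokens it starts with. In the middleware, the matching RateLimitPartition.Get...Limiter() factory methods take these same options classes.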

In addition, you can “chain” rate limiters of one type or of various types, using the PartitionedRateLimiter.CreateChained() helper.

Maybe you want to have a limit where one can make 600 requests per minute, but only 6000 per hour. You could chain two FixedWindowLimiter instances with different options.

builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.CreateChained(
        PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
            RateLimitPartition.GetFixedWindowLimiter(httpContext.ResolveClientIpAddress(), partition =>
                new FixedWindowRateLimiterOptions
                {
                    AutoReplenishment = true,
                    PermitLimit = 600,
                    Window = TimeSpan.FromMinutes(1)
                })),
        PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
            RateLimitPartition.GetFixedWindowLimiter(httpContext.ResolveClientIpAddress(), partition =>
                new FixedWindowRateLimiterOptions
                {
                    AutoReplenishment = true,
                    PermitLimit = 6000,
                    Window = TimeSpan.FromHours(1)
                })));

    // ...
});

Note that the ResolveClientIpAddress() extension method I use here is just an example that checks different headers for the current client’s IP address. Use a partition key that makes sense for your application.
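
For reference, a helper along these lines could look like the sketch below. This is a hypothetical implementation, not the one used for this post: it only looks at the X-Forwarded-For header and falls back to the connection’s remote IP address. Only trust that header when your own reverse proxy or CDN sets it, since clients can send arbitrary values.

```csharp
using System;
using System.Linq;
using System.Net;
using Microsoft.AspNetCore.Http;

public static class HttpContextExtensions
{
    // Hypothetical sketch of a client IP resolver, to be used as a partition key.
    public static string ResolveClientIpAddress(this HttpContext httpContext)
    {
        // X-Forwarded-For can hold "client, proxy1, proxy2": the first entry is the original client.
        var forwardedFor = httpContext.Request.Headers["X-Forwarded-For"].ToString();
        var candidate = forwardedFor
            .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .FirstOrDefault();

        // Only accept values that parse as an IP address.
        if (candidate is not null && IPAddress.TryParse(candidate, out var parsed))
        {
            return parsed.ToString();
        }

        // Fall back to the address of the direct connection, if it is known.
        return httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
    }
}
```

With this in place, all requests without a usable address share the "unknown" partition, which may or may not be what you want.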

Queue requests instead of rejecting them: `QueueLimit`

On most of the rate limiters that ship with .NET, you can specify a QueueLimit next to the PermitLimit. The QueueLimit specifies how many incoming requests will be queued instead of rejected when the PermitLimit is reached.

Let’s look at an example:

PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
    RateLimitPartition.GetFixedWindowLimiter(httpContext.ResolveClientIpAddress(), partition =>
        new FixedWindowRateLimiterOptions
        {
            AutoReplenishment = true,
            PermitLimit = 10,
            QueueLimit = 6,
            QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
            Window = TimeSpan.FromSeconds(1)
        }));

In the above example, clients can make 10 requests per second. If they make more requests per second, up to 6 of those excess requests will be queued and will seemingly “hang” instead of being rejected. The next second, this queue will be processed.

If you expect small traffic bursts, setting QueueLimit may provide a nicer experience to your users. Instead of rejecting their requests, you’re delaying them a bit.

I’d personally not go with a large QueueLimit, and definitely not for long time windows. As a consumer of an API, I’d rather get a response back fast, even if it’s a failure, as those can be retried. A few seconds of being in a queue may make sense, but with anything longer the client will probably time out anyway, and your queue is being kept around for no use.
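
The queueing behavior can also be observed without a web application. The console sketch below uses a FixedWindowRateLimiter directly, with hypothetical values of 1 permit per second and room for 1 queued request, and shows one request that is served immediately, one that waits in the queue, and one that is rejected because the queue is full.

```csharp
using System;
using System.Threading.RateLimiting;
using System.Threading.Tasks;

// Hypothetical values: 1 permit per 1-second window, and room for 1 queued request.
using var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    AutoReplenishment = true,
    PermitLimit = 1,
    QueueLimit = 1,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
    Window = TimeSpan.FromSeconds(1)
});

// Uses the only permit of the current window.
using RateLimitLease immediate = limiter.AttemptAcquire();

// No permits left: this request is queued, and its task stays pending for now.
ValueTask<RateLimitLease> queuedTask = limiter.AcquireAsync();

// The queue is full, so with OldestFirst this request is rejected right away.
using RateLimitLease overflow = await limiter.AcquireAsync();

// Completes once the next window replenishes the permit.
using RateLimitLease queued = await queuedTask;

Console.WriteLine(
    $"immediate: {immediate.IsAcquired}, overflow: {overflow.IsAcquired}, queued: {queued.IsAcquired}");
```

The program should take about a second to finish, which is the time the queued request spends waiting. That wait is exactly what a client experiences as a “hanging” request.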

Create custom rate limiting policies

Next to the default rate limiters, you can build your own implementation of IRateLimiterPolicy<TPartitionKey>. This interface specifies 2 members: the GetPartition() method, which you’ll use to create a specific rate limiter for the current HttpContext, and the OnRejected property, if you want to have a custom response when this policy is rejecting a request.

Here’s an example where the rate limiter options are partitioned by either the current authenticated user, or their hostname. Authenticated users get higher limits, too:

using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

public class ExampleRateLimiterPolicy : IRateLimiterPolicy<string>
{
    public RateLimitPartition<string> GetPartition(HttpContext httpContext)
    {
        if (httpContext.User.Identity?.IsAuthenticated == true)
        {
            return RateLimitPartition.GetFixedWindowLimiter(httpContext.User.Identity.Name!,
                partition => new FixedWindowRateLimiterOptions
                {
                    AutoReplenishment = true,
                    PermitLimit = 1_000,
                    Window = TimeSpan.FromMinutes(1),
                });
        }

        return RateLimitPartition.GetFixedWindowLimiter(httpContext.Request.Headers.Host.ToString(),
            partition => new FixedWindowRateLimiterOptions
            {
                AutoReplenishment = true,
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1),
            });
    }

    public Func<OnRejectedContext, CancellationToken, ValueTask>? OnRejected { get; } =
        (context, _) =>
        {
            context.HttpContext.Response.StatusCode = 418; // I'm a 🫖
            return new ValueTask();
        };
}

And instead of rejecting requests with a well-known status code, this policy rejects requests with a 418 status code (“I’m a teapot”).
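
A custom policy still has to be registered under a name before it can be used. Here is a minimal sketch of how that could look, assuming the ExampleRateLimiterPolicy class above is part of the project. The policy name "Example" is a made-up name for this example.

```csharp
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // "Example" is a hypothetical policy name.
    // ExampleRateLimiterPolicy is the class from the listing above.
    options.AddPolicy<string, ExampleRateLimiterPolicy>("Example");
});

var app = builder.Build();

app.UseRouting();
app.UseRateLimiter();

// Only endpoints that ask for the policy by name are limited by it.
app.MapGet("/", () => "Hello World!").RequireRateLimiting("Example");

app.Run();
```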

Policies for rate limiting groups of endpoints

So far, we’ve covered global limits that apply to all requests. There’s a good chance you want to apply different limits to different groups of endpoints. You may have endpoints that you don’t want to rate limit at all.

This is where policies come in. In your configuration options, you can create different policies using the .Add{RateLimiter}() extension methods, and then apply them to specific endpoints or groups thereof.

Here’s an example configuration adding 2 fixed window limiters, each with its own policy name ("Api" and "Web") so that their settings can be tuned separately.

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("Api", limiterOptions =>
    {
        limiterOptions.AutoReplenishment = true;
        limiterOptions.PermitLimit = 10;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
    });

    options.AddFixedWindowLimiter("Web", limiterOptions =>
    {
        limiterOptions.AutoReplenishment = true;
        limiterOptions.PermitLimit = 10;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
    });

    // ...
});

Before we look at how to apply these policies, let’s first cover an important warning…

Warning: The .Add{RateLimiter}() extension methods partition rate limits based on the policy name. This is okay if you want to apply global limits per group of endpoints, but it’s not when you want to partition per user or per IP address or something along those lines.
If you want to add policies that are partitioned by policy name and any aspect of an incoming HTTP request, use the .AddPolicy(..) method instead:
options.AddPolicy("Api", httpContext =>
    RateLimitPartition.GetFixedWindowLimiter(httpContext.ResolveClientIpAddress(),
    partition => new FixedWindowRateLimiterOptions
    {
        AutoReplenishment = true,
        PermitLimit = 10,
        Window = TimeSpan.FromSeconds(1)
    }));

With that out of the way, let’s see how you can apply policies to certain endpoints.

Rate limiting policies with ASP.NET Core Minimal API

When using ASP.NET Core Minimal API, you can enable a specific policy per endpoint, or per group of endpoints:

// Endpoint
app.MapGet("/api/hello", () => "Hello World!").RequireRateLimiting("Api");

// Group
app.MapGroup("/api/orders").RequireRateLimiting("Api");

Similarly, you can disable rate limiting per endpoint or group:

// Endpoint
app.MapGet("/api/hello", () => "Hello World!").DisableRateLimiting();

// Group
app.MapGroup("/api/orders").DisableRateLimiting();
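
Putting the pieces together, the sketch below registers an "Api" policy, applies it to a route group, and opts a single endpoint out again. The routes and the "/health" endpoint are made up for this example.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("Api", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
    });
});

var app = builder.Build();

app.UseRouting();
app.UseRateLimiter();

// Every endpoint mapped on this group inherits the "Api" policy.
var orders = app.MapGroup("/api/orders").RequireRateLimiting("Api");
orders.MapGet("/", () => "All orders");
orders.MapGet("/{id:int}", (int id) => $"Order {id}");

// Hypothetical endpoint in the same group that opts out of rate limiting.
orders.MapGet("/health", () => "OK").DisableRateLimiting();

app.Run();
```

Keep in mind the warning above: a policy added with AddFixedWindowLimiter() is shared by all callers, so these 10 requests per minute apply to the group as a whole.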

Rate limiting policies with ASP.NET Core MVC

When using ASP.NET Core MVC, you can enable and disable policies per controller or action.

[EnableRateLimiting("Api")]
public class Orders : Controller
{
    [DisableRateLimiting]
    public IActionResult Index()
    {
        return View();
    }

    [EnableRateLimiting("ApiListing")]
    public IActionResult List()
    {
        return View();
    }
}

You’ll find this works similarly to authorization and authorization policies.

ASP.NET Core rate limiting with YARP proxy

In your application, you may be using YARP to build a reverse proxy gateway sitting in front of various backend applications. For example, you may run YARP to listen on example.org, and have it proxy all requests going to this domain while mapping /api and /docs to different web apps running on different servers.

In such a scenario, rate limiting will also be useful. You could rate limit each application separately, or apply rate limiting in the YARP proxy. Given both YARP and ASP.NET Core rate limiting are middlewares, they play well together.

As an example, here’s a YARP proxy that applies a global rate limit of 10 requests per minute, partitioned by host header:

using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = 429;
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Request.Headers.Host.ToString(),
            factory: partition => new FixedWindowRateLimiterOptions
            {
                AutoReplenishment = true,
                PermitLimit = 10,
                QueueLimit = 0,
                Window = TimeSpan.FromMinutes(1)
            }));
});

builder.Services.AddReverseProxy()
    .LoadFromConfig(builder.Configuration.GetSection("ReverseProxy"));

var app = builder.Build();
app.UseRateLimiter();
app.MapReverseProxy();
app.Run();

Just like with ASP.NET Core Minimal API and MVC apps, you can use the AddRateLimiter() extension method to configure rate limits, and AddReverseProxy() to register the YARP configuration.

To then register the configured middlewares in your application, use UseRateLimiter() and MapReverseProxy().

Wrapping up

By limiting the number of requests that can be made to your application, you can reduce the load on your server and get fairer usage of resources among your users. With the built-in middleware, ASP.NET Core provides an easy way to configure rate limiting for your application.

In this post, I wanted to give you some insights about how you can use the ASP.NET Core rate limiting middleware. It’s not as complete as Stefan Prodan’s AspNetCoreRateLimit, but there are enough options available to add rate limiting to your application.

In a future blog post, I’ll cover more concepts around rate limiting. Stay tuned!

14 responses

Anand Sowmithiran • September 28th, 2022
Very useful feature. Looking at the different classes involved, there is no way to know how many permits have been issued so far for a given partition key. That would be nice. Also is there a way to retain the count of leases issued between server restarts? It is right now stored in memory.
Maarten Balliauw • October 1st, 2022
For the statistics, do give this issue a +1 - https://github.com/dotnet/aspnetcore/issues/44140
Storage is indeed always in memory, the framework doesn’t have anything out of the box there (and not easily pluggable).
Silent Tremor • November 15th, 2022
This is beyond cool, any distributed rate limiting options?
Nuri Yilmaz • December 17th, 2022
I see that the engineer assumed it is always possible to resolve the client IP (ResolveClientIpAddress). What about services that sit behind a delivery network such as Cloudflare? Our service will get the same IP for many different clients. It’s not a useful solution; it should change, maybe to more detailed request types, so the middleware can decide whether to count a request or not…
Chris • April 26th, 2023
Thanks for your article. Most illuminating! Out of curiosity what does your httpContext.ResolveClientIpAddress() extension method look like?
Any reason why you don’t use httpContext.Connection.RemoteIpAddress? Thanks,
Maarten Balliauw • May 5th, 2023
@Chris it checks a bunch of X-Forwarded-For and other headers to make sure reverse proxy/CDN/… gets the correct IP of the client if possible.
-- • June 30th, 2023
The IP addresses of the clients are not static, and each IP address can make a maximum of 100 requests in 10 minutes. Additionally, all clients combined can make a maximum of 1000 requests in 10 minutes. When implementing such a rate limit, what do you think is the most appropriate way? Thank you.
-- • June 30th, 2023
When I think about it a bit more, I came up with the following idea, but I’m not sure if it would be correct. While I’m at it, let me write it down. We can achieve the rule ‘all clients can make a maximum of 1000 requests in 10 minutes’ by applying one of the methods shown in the video. For the rule ‘each client can make a maximum of 100 requests in 10 minutes’, we will need to write an additional middleware. In this middleware, we will retrieve the IP address of the request using HttpContext, and then store it in Redis. If the request count has not exceeded the limit, we will increment it by 1, call ‘next’ to proceed with other operations. If there is an issue, we will return the response without further processing. We can also create a background service using Hangfire to clean up Redis every 10 minutes. I’m not sure what the best practice is for this, but I wonder if this approach would work
Deivydas • November 30th, 2023
What if my application has multiple instances? How can I enable rate limiting when one request from same ip goes to one instance, the other to another?
Mike Towill • April 16th, 2024
Here’s an example where the rate limiter options are partitioned by either the current authenticated user, or their hostname
httpContext.Request.Headers.Host.ToString()
HTTP Host header is the server domain. Not the client’s hostname. So this method is inadequate for unauthenticated users on a login screen, for example
Gajo • May 9th, 2024
Indeed, as Mike Towill mentioned, the HTTP Host header is the server domain. I was considering the Origin header, but it can be null in certain situations: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Origin.
Robert te Kaat • July 11th, 2024
I tried to limit concurrency to a single request for a certain MVC controller method. However, it keeps accepting a few requests, which are then executed one-by-one. So it seems this WaitAsync uses a too long timeout for my preference. This is my policy:
services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = 429;
    options.OnRejected = (ctx, ct) =>
    {
        ctx.HttpContext.Response.Headers.Append("Retry-After", "60");
        return ValueTask.CompletedTask;
    };
    options.AddPolicy(policyName: Config.ConcurrencyPolicy_Limit_1, (ctx) =>
    {
        // One concurrency limiter per customer
        var customerCode = ctx.RequestServices.GetRequiredService().CustomerIdentifier ?? string.Empty;
        return new RateLimitPartition<string>(customerCode, s => new ConcurrencyLimiter(new ConcurrencyLimiterOptions()
        {
            PermitLimit = 1,
            QueueLimit = 0 // Immediate rejection
        }));
    });
});
Alexander • July 31st, 2024
Is there any way to define rate limit policies in the config file and load them dynamically at runtime, like we can do with YARP routes and clusters? What if I want to add a specific policy to a specific route at runtime (like in this old library https://github.com/stefanprodan/AspNetCoreRateLimit/wiki/ClientRateLimitMiddleware#update-rate-limits-at-runtime)?
Dishant • July 31st, 2024
My Application is .Net6 based. can I not use Rate Limiter in my application?
What if I add/create separate .csproj project pointed to .net7 framework and add only ratelimiter to it. try to reference it from my application?

Maarten Balliauw
