
Count number of pages scraped by bots over time

Alexandru Puiu
January 5, 2023

Tracking the number of pages scraped by web crawlers

To count the number of web pages scraped, we can use a simple middleware with a predefined list of known bots and increment a measurement with IMetricsService every time a request comes from a bot. A request counts as coming from a bot when its User-Agent header contains any entry from the list.

using Microsoft.AspNetCore.Http;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Toggly.FeatureManagement;

namespace Web.Helpers
{
    public class BotTrackerMiddleware
    {
        private readonly RequestDelegate _next;
        private readonly IMetricsService _metricsService;

        // Lowercase substrings that identify known crawlers in a User-Agent header
        private readonly List<string> _crawlers = new List<string>()
        {
            "bot","crawler","spider","80legs","baidu","yahoo! slurp","ia_archiver","mediapartners-google",
            "esther","felix ide","hamahakki","kit-fireball","fouineur","freecrawl","desertrealm",
            "htdig","ingrid","informant","inspectorwww","iron33","teoma","ask jeeves","jeeves",
            "searchprocess","senrigan","shagseeker","site valet","skymob","slurp","snooper","speedy",
            "urlck","valkyrie libwww-perl","verticrawl","victoria","webscout","voyager","crawlpaper",
        };

        public BotTrackerMiddleware(RequestDelegate next, IMetricsService metricsService)
        {
            _next = next;
            _metricsService = metricsService;
        }

        /// <summary>
        /// Increase measurement for BotScrape metric each time the user agent matches a bot
        /// </summary>
        /// <param name="context">The current HTTP request context</param>
        /// <returns>A task that completes when the rest of the pipeline has run</returns>
        public async Task InvokeAsync(HttpContext context)
        {
            // The User-Agent header may be missing, so guard against null before lowercasing
            string ua = context.Request.Headers.UserAgent.FirstOrDefault()?.ToLower() ?? string.Empty;
            if (_crawlers.Exists(x => ua.Contains(x)))
            {
                await _metricsService.MeasureAsync("BotScrape", 1);
            }

            // Always pass the request on, whether or not it came from a bot
            await _next(context);
        }
    }
}
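To see how the substring check classifies requests, here is a minimal standalone sketch of the same matching logic, written as a console program with top-level statements. The `IsCrawler` helper, the shortened `markers` list, and the sample User-Agent strings are illustrative assumptions, not part of the middleware above.

```csharp
using System;
using System.Linq;

// Hypothetical, shortened marker list; the middleware uses a much longer one.
string[] markers = { "bot", "crawler", "spider", "slurp" };

// Returns true when the User-Agent contains any marker, ignoring case.
// A missing or empty header is treated as "not a crawler".
bool IsCrawler(string? userAgent) =>
    !string.IsNullOrEmpty(userAgent) &&
    markers.Any(m => userAgent.Contains(m, StringComparison.OrdinalIgnoreCase));

// "Googlebot" contains "bot", so this request would be counted.
Console.WriteLine(IsCrawler("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")); // True

// A typical desktop browser contains none of the markers.
Console.WriteLine(IsCrawler("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")); // False

// No User-Agent header at all.
Console.WriteLine(IsCrawler(null)); // False
```

Because short markers such as "bot" match anywhere in the string, this approach favors recall over precision: any client whose User-Agent happens to contain one of the substrings is counted as a crawler.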

Then, in Startup.cs, we can include our middleware conditionally, based on a feature flag, before we call app.UseEndpoints.
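A minimal sketch of that registration might look like the following. It assumes the project references Microsoft.FeatureManagement.AspNetCore, whose `UseMiddlewareForFeature<T>` extension adds a middleware only while the named feature is enabled, and that feature management is already registered in ConfigureServices. The `MapControllers` call and the surrounding pipeline are placeholders; Toggly's own package may offer a different helper.

```csharp
// Startup.cs (sketch under the assumptions stated above)
using Microsoft.AspNetCore.Builder;
using Microsoft.FeatureManagement;
using Web.Helpers;

public class Startup
{
    public void Configure(IApplicationBuilder app)
    {
        app.UseRouting();

        // Runs BotTrackerMiddleware only while the "BotScrape" feature flag is on.
        // Registered before UseEndpoints so it sees requests before they are handled.
        app.UseMiddlewareForFeature<BotTrackerMiddleware>("BotScrape");

        app.UseEndpoints(endpoints =>
        {
            // Placeholder: map whatever endpoints the application actually exposes.
            endpoints.MapControllers();
        });
    }
}
```

With this arrangement, turning the flag off skips the bot check entirely, so tracking can be switched on or off without redeploying.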


Next, in Toggly, we’ll go to Features under our Application.

Toggly Features

And we’ll add a definition for our BotScrape flag.

Add BotScrape Feature

Finally, we’ll define the metric on our Metrics tab.

Create BotScrape metric