Apache and 403 Responses — HTTP/2.0 vs. HTTP/1.1

I’ve spent a good part of the last two days trying to track down an issue that was bothering me. My server is tuned to send a lot of annoying bots to the scrap heap with Rewrite rules that return a 403 response. I also just converted the server to HTTP/2.0 (yeah, I know; quiet in the back).

However, many of the bots use HTTP/1.1. What was weird is that when you look at the Apache logs, you see entries like the following.

172.232.187.115 - - [06/May/2026:18:51:26 +0000] "GET / HTTP/1.1" 403 2877 "-" "Mozilla/5.0 (iPod; U; CPU iPhone OS 3_1 like Mac OS X) AppleWebKit/534.39.5 (KHTML, like Gecko) Version/3.0.5 Mobile/8B116 Safari/6534.39.5"

172.232.187.115 - - [06/May/2026:18:51:42 +0000] "GET / HTTP/2.0" 403 90 "-" "Mozilla/5.0 (iPod; U; CPU iPhone OS 3_1 like Mac OS X) AppleWebKit/534.39.5 (KHTML, like Gecko) Version/3.0.5 Mobile/8B116 Safari/6534.39.5"

Can anyone spot the issue? Well, if you look closely, you’ll see that the HTTP/1.1 response is recorded as being much larger than the HTTP/2.0 response for the same 403.

Guess what? This is an artifact of the way that Apache processes these requests! My friend Claude described it this way:

For HTTP/1.1, when [F] fires, Apache generates the full default error page first (2911 bytes), logs that size via %b, then ErrorDocument substitutes it with the 44-byte response before sending. The log records the pre-substitution size.

For HTTP/2.0, mod_http2 logs the post-substitution size (plus HTTP/2 frame overhead accounting for the extra 82 bytes above 44).
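For reference, the pattern in play is the standard combination of a mod_rewrite [F] flag with a custom ErrorDocument. A minimal sketch (the bot pattern and error page path here are illustrative, not my actual rules):

```apache
RewriteEngine On
# Send matching bots straight to a 403 (pattern is illustrative)
RewriteCond %{HTTP_USER_AGENT} "badbot" [NC]
RewriteRule ^ - [F]

# Replace Apache's default error page with a much smaller custom one
ErrorDocument 403 /errors/403.html
```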

It’s always fun to go off on a Snipe Hunt and learn a lot about the internals of software you use every day.

Attack Vector: Turkish ASNs

Over the last 3 weeks my new firewall deployment has seen a number of sustained HTTP attack attempts from AS212269, AS203771, AS212193, and a few others. All of these originate from Turkiye.

My local firewall is very aggressive — it’s my server and I can do what I want! — and I block large sections of the internet in an attempt to limit traffic to real humans as much as possible. So it was only through monitoring my live firewall stats that I was able to see these attack attempts.

These scanners aren’t particularly graceful. After encountering a DROP rule, they just…keep….going…and…going. They run for 2-4 hours (sometimes longer) without checking to see if they get a response. So why Turkiye and why now?

Turkiye has recently started appearing in the top-attacking-countries lists of a number of security providers. This appears to be the result of a large number of compromised IoT devices being folded into “DDoS-as-a-Service” (DDoSaaS) operations, which make it very easy for customers to use them as a starter kit for whatever purpose they are pursuing.

This is further amplified by the current geopolitical situation in the Persian Gulf (Iran/US conflict, closing of the Strait of Hormuz, etc.). Likely customers of these DDoSaaS operations are groups within Iran looking for a way to attack or annoy western organizations.

I will continue to monitor this, but it is always interesting to see how a little experimentation with a local firewall setup can lead to useful security findings.

Enterprise Shield on Dinosaur Hardware

There’s a certain kind of satisfaction that comes from taking something old and making it do something remarkable. This is the story of how a 2008 MacBook 13” aluminum — a machine from the same year the iPhone App Store launched — ended up running a multi-threaded, self-healing, boot-persistent IP threat blocking system protecting a production web server on Ubuntu 24.04. It took a full day of iterative development, a fair amount of debugging, and one very honest conversation about an 18-year-old piece of hardware.


The Starting Point

The project began with a script called Enterprise Shield v11.4. On paper it did what it promised: it blocked traffic from hostile Autonomous System Numbers (ASNs) and geographic regions by maintaining a massive ipset of known-bad IP ranges, then dropping packets matching that set at the firewall level. In practice, it was held together with duct tape.

The first code review found problems at every layer. There was a truncated grep statement in the country block loop — a literal syntax error that prevented the script from ever completing. The leading-zero stripping logic for CIDR normalisation ran in the wrong order, cleaning data after the validation regex had already rejected it. The script injected custom iptables rules directly while also running ufw --force reset, meaning UFW silently wiped those rules on every reload. And perhaps most practically damaging: it fetched IP data for every ASN serially, sleeping two seconds between each query, making a large blocklist a multi-hour operation.

The objective was clear: fix everything, make it fast, make it resilient, and make it understand its own hardware.


Understanding the Hardware

Before optimising anything, we needed to understand what we were working with. The machine is a 2008 MacBook with a Core 2 Duo processor — a 64-bit dual-core chip from the era when 4GB of RAM was considered ambitious. This one has been upgraded to 8GB, which turned out to matter significantly for one specific decision later.

The Core 2 Duo changes the calculus on parallelism. Modern CPUs handle process spawning cheaply. On a processor from 2008, every subprocess fork is measurably expensive, and context switching between background jobs has real overhead. This shaped nearly every optimisation decision that followed: eliminate unnecessary subprocess forks, use bash builtins instead of external binaries wherever possible, and be conservative with thread counts.

It also runs Ubuntu Server 24.04, which introduced a subtle wrinkle: the system ships with iptables-nft, a compatibility shim that translates iptables commands into nftables rules. Early in the project we suspected this would break the ipset integration — specifically the --match-set rule that does the actual packet dropping. A quick check of the live chain output confirmed it was working:

93  5448 DROP  ...  match-set blocked_asns src

Those 93 drops told us the integration was solid. We moved on.


Phase 1: Making It Correct

The first rewrite — v11.5 — focused entirely on correctness before touching performance.

The truncated grep was fixed. The UFW/iptables conflict was documented and mitigated by injecting the ipset DROP rule into /etc/ufw/before.rules, making it survive UFW reloads. The leading-zero stripping was reordered so it ran before validation, not after. The ipset restore file was given a flush directive so stale entries from previous partial runs couldn’t accumulate. The country feed fetches were given --fail flags so 404 error pages didn’t silently pass through as IP data.

Most importantly: the script was given a proper trap ... EXIT so temp files were always cleaned up, the root check was moved to the absolute first line, and every (( counter++ )) was replaced with counter=$(( counter + 1 )) — because in bash, arithmetic that evaluates to zero returns exit code 1, which set -e interprets as a fatal error.
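The arithmetic pitfall is easy to demonstrate. The sketch below runs each form in a child bash process (a subshell inside an `if` condition would have `set -e` suppressed, so a fresh process makes the failure observable):

```shell
#!/usr/bin/env bash
# (( counter++ )) post-increments, so the expression evaluates to the OLD
# value; in bash an arithmetic command whose value is 0 exits with status 1,
# which `set -e` treats as fatal.

risky_status=0
bash -c 'set -e; c=0; (( c++ )); echo "never reached"' || risky_status=$?
echo "risky form exit status: $risky_status"

# The safe form: a plain assignment always exits 0.
safe_count=$(bash -c 'set -e; c=0; c=$(( c + 1 )); echo "$c"')
echo "safe form counter: $safe_count"
```

The first command never prints "never reached" — the script dies at the increment.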


Phase 2: Making It Fast

With a correct foundation, the next challenge was the whois lookup bottleneck. The serial version queried RADB one ASN at a time with a two-second sleep between each. With 152 ASNs in the blocklist, that’s over five minutes of wall clock time before any actual data processing begins.

The first parallel version — v11.6 — used export -f to pass a bash worker function into xargs -P subshells. It looked right. It wasn’t. On many systems, xargs subshells don’t reliably inherit exported bash functions. Workers spawned successfully, registered their completion files, and wrote nothing. The blocklist came back at roughly one-third of its expected size. The failure was completely silent.

The fix was architectural. Instead of relying on function inheritance, the worker logic was written to a self-contained bash script at runtime — /tmp/shield_whois_worker.sh — and each background job executed that file directly. No inheritance, no environment dependencies, no silent failures.

The second parallel problem was subtler: all threads were hitting RADB simultaneously, triggering connection throttling that caused empty responses with no error code. RADB doesn’t say “rate limited.” It just stops returning data. The solution was per-worker random jitter (0–2.5 seconds) combined with inter-batch pausing — every 20 dispatches, all active workers drain and a 3-second pause lets RADB’s connection count settle before the next batch opens.

The final thread count settled at 4. Eight threads was causing the silent data loss. Four threads with batching gives full coverage with no throttling, and on a Core 2 Duo the overhead of managing 4 concurrent background jobs is well within budget.
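The dispatch pattern — a self-contained worker file, capped concurrency, per-worker jitter, and batch draining — can be sketched roughly like this. The worker here is a network-free stand-in for the real RADB query, and the names, ASNs, and timings are illustrative and shortened:

```shell
#!/usr/bin/env bash
# Stand-in worker file, analogous to /tmp/shield_whois_worker.sh
WORKER=$(mktemp) && OUT=$(mktemp)
cat > "$WORKER" <<'EOF'
#!/usr/bin/env bash
asn="$1"; out="$2"
sleep "0.$(( RANDOM % 3 ))"     # per-worker jitter (0-2.5s in the real script)
echo "routes-for-$asn" >> "$out"
EOF
chmod +x "$WORKER"

MAX_JOBS=4      # the thread count that proved safe on the Core 2 Duo
BATCH=20        # drain-and-pause interval (pause shortened here)
dispatched=0

for asn in AS{64512..64521}; do
  "$WORKER" "$asn" "$OUT" &     # execute the file directly: no export -f
  dispatched=$(( dispatched + 1 ))
  # Cap concurrency: block until a slot frees up
  while [ "$(jobs -rp | wc -l)" -ge "$MAX_JOBS" ]; do wait -n; done
  # Every BATCH dispatches, drain all workers and pause
  if [ $(( dispatched % BATCH )) -eq 0 ]; then
    wait; sleep 1
  fi
done
wait
echo "collected $(wc -l < "$OUT") of $dispatched results"
```

`wait -n` (bash 4.3+) is what keeps the pool topped up at exactly four jobs instead of dispatching in rigid waves.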


Phase 3: Making It Resilient

A firewall system that runs once nightly creates a specific failure mode: if something goes wrong with a data source — RADB is slow, a country feed returns an error, the network hiccups — the next scheduled run could silently shrink the blocklist without anyone noticing.

The delta check was the answer. After every run, the entry count is written to /var/lib/shield/last_entry_count. The following night, before committing the new ruleset, the script compares. If the new count is more than 10% below the previous run, the atomic swap is aborted entirely — the existing live ipset is preserved untouched — and an alert is written to a separate log file.
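In sketch form, the comparison looks like this (state path from the post; the counts are made up to trigger the abort branch):

```shell
#!/usr/bin/env bash
# Delta check: refuse to commit a blocklist that shrank by more than 10%.
STATE=$(mktemp)          # stands in for /var/lib/shield/last_entry_count
echo 343966 > "$STATE"

new_count=300000         # tonight's suspiciously small build
last=$(cat "$STATE")
threshold=$(( last * 90 / 100 ))

if [ "$new_count" -lt "$threshold" ]; then
  verdict="ABORT: $new_count is more than 10% below $last; keeping live set"
else
  verdict="OK: committing $new_count entries"
  echo "$new_count" > "$STATE"
fi
echo "$verdict"
```

On the abort path the state file is deliberately left untouched, so the next night's run is still compared against the last known-good count.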

“Atomic swap” is the key phrase here. The shield script never modifies the live ipset directly. It builds a complete replacement set in /tmp, populates it, then executes ipset swap blocked_asns-temp blocked_asns — a single kernel operation that is instantaneous and never leaves the firewall in a partially-updated state. The machine is always either running the old ruleset or the new one. There is no window where it’s running neither.


Phase 4: Surviving Reboots

This is where the project surfaced its most interesting architectural gap.

The ipset kernel module stores its data entirely in memory. Every reboot wipes it. The script saves a snapshot to /etc/ipset.conf after each run, but nothing was loading that snapshot back on boot. The result: after every reboot, the machine came up with an empty blocked_asns set. UFW loaded its rules, including the DROP rule that referenced blocked_asns — but the set it referenced didn’t exist. Traffic flowed freely until 2AM when the cron job fired.

The fix required two systemd services with precise ordering:

shield-ipset-restore.service   (Before ufw.service)
    └── ufw.service
          └── shield-iptables-restore.service  (After ufw.service)

The ipset service runs before UFW and loads the saved set. The iptables service runs after UFW and rebuilds the custom SHIELD-LOGIC iptables chain using iptables-restore --noflush, which merges the saved rules into UFW’s ruleset without disturbing UFW’s own chains.

Both services include first-boot guards: if their respective state files don’t exist yet (fresh install before the first cron run), they exit cleanly rather than failing and potentially delaying UFW startup.
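The restore unit can be sketched like this — the ConditionPathExists line implements the first-boot guard; beyond the unit names and paths mentioned above, the details are my assumption about the likely shape:

```ini
# /etc/systemd/system/shield-ipset-restore.service (sketch)
[Unit]
Description=Restore blocked_asns ipset before UFW loads its rules
DefaultDependencies=no
Before=ufw.service
After=local-fs.target
# First-boot guard: skip cleanly if no snapshot exists yet
ConditionPathExists=/etc/ipset.conf

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/sh -c 'ipset restore -exist < /etc/ipset.conf'

[Install]
WantedBy=multi-user.target
```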

After the first reboot with both services running, verification was clean:

Active: active (exited)   ← correct for a oneshot service
status=0/SUCCESS
shield-ipset-restore: blocklist restored from /etc/ipset.conf

Phase 5: The Operational Tooling

A blocking system is only as useful as its ability to respond to threats that aren’t in the scheduled blocklist. The companion tool — block_asn.sh — evolved through five versions across the session.

The original script had several problems: it saved to the wrong path (meaning penalty box entries vanished on reboot), it validated IP addresses with a pattern that accepted octets above 255, and it made one kernel call per route which was painfully slow for large ASNs.

The rewrite introduced two distinct modes:

Penalty box — adds ASN routes directly to the live ipset. No file writes. Effective immediately. Cleared automatically on the next 2AM cron run when the ipset is rebuilt from scratch.

Permanent — does everything the penalty box does, plus appends the ASN to /etc/blocklist_asns.txt with a timestamp and an operator-supplied reason note. Persists forever.

Later, a third mode was added: --cidr accepts a single IP range for penalty box injection. CIDRs are never written to the permanent blocklist by design — they’re too specific and ephemeral for a long-term list.

The most important optimisation was replacing the per-route injection loop with a single ipset restore call. For a 500-route ASN, the old approach was 500 process forks and 500 kernel netlink calls. The new approach is one of each. The practical difference is roughly 5 seconds versus 50 milliseconds.
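The generation side of that optimisation is worth showing: build the whole payload as text, then hand it to the kernel once. The final ipset call needs root, so it is left commented here; the routes are documentation-range examples:

```shell
#!/usr/bin/env bash
# Turn a list of routes into a single `ipset restore` payload.
ROUTES=$(mktemp) && PAYLOAD=$(mktemp)
printf '%s\n' 203.0.113.0/24 198.51.100.0/24 192.0.2.0/24 > "$ROUTES"

# One "add" line per route: one awk fork total, not one ipset fork per route
awk '{ print "add blocked_asns " $0 }' "$ROUTES" > "$PAYLOAD"
cat "$PAYLOAD"

# A single fork and a single kernel conversation (requires root):
#   ipset restore -exist < "$PAYLOAD"
```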

A before/after entry count snapshot provides transparent reporting on every injection — you know exactly how many routes were genuinely new versus already present.


The Bug That Was Hiding Everywhere

Late in the project, a test with CIDR 186.179.0.0/18 failed validation with “Invalid CIDR.” Tracing through the normalisation pipeline revealed a bug that had been quietly corrupting data all along.

The perl zero-stripping substitution s/(^|\.)0+\./$1./g was intended to fix malformed octets like 023.23. from RADB output. Instead, it matched any zero octet followed by a dot — including valid ones. 103.0.0.0/24 became 103..0.0/24. 5.0.0.0/8 became 5..0.0/8. Both silently failed validation and were dropped.

Every network with a zero in a non-terminal octet position — and there are many — had been invisible to the blocklist since the normalisation code was written.

The fix changes 0+ to 0+([0-9]+), requiring the match to include at least one additional digit after the leading zeros. Lone zeros are left alone, and multi-digit octets like 023 now normalise correctly. The fix was applied to both enterprise_shield.sh and block_asn.sh.

# Before (broken)
perl -pe 's/(^|\.)0+\./$1./g'

# After (correct)
perl -pe 's/(^|\.)0+([0-9]+)\./$1$2./g'

Results

At the end of the session, the system was running with:

  • 343,966 blocked IP ranges loaded in the live ipset, consuming approximately 9.8MB of kernel memory
  • Boot-persistent protection — full blocklist restored within 3 seconds of kernel start, before UFW processes its first rule
  • Nightly automated updates at 2AM with delta checking, atomic swaps, and structured logging
  • On-demand injection for immediate response via block_asn.sh
  • Full documentation covering installation, operation, monitoring, and uninstall

The final cron run after all fixes produced:

[INFO ] --- Run complete: status=SUCCESS entries=343966 elapsed=76s ---

76 seconds. On an 18-year-old machine. For a complete rebuild of a 344,000-entry firewall blocklist from live external data sources.


What Made It Work

Looking back across the session, a few principles drove the outcomes:

Fix correctness before optimising. The original script had bugs that would have made any performance work meaningless. Getting it right first meant the parallel version had a solid foundation to build on.

Understand the failure modes of your tools. export -f failing silently. RADB returning empty responses instead of errors when rate-limited. ipset restore erroring on an existing set without -exist. None of these produced clear error messages. Each required understanding what the tool was supposed to do versus what it actually did under pressure.

Instrument everything. The structured logging, delta checks, and before/after entry counts weren’t cosmetic additions. They were what allowed us to diagnose the shrinking entry count issue (thread pressure), the double-logging issue (cron redirect + direct file append), and the missing public IP (lookup happening during UFW teardown).

Respect the hardware. Reducing threads from 8 to 4, using bash builtins instead of forking date on every log line, sorting in RAM with a 1GB buffer — these decisions were driven by understanding that a Core 2 Duo is not a cloud VM. It has constraints. Working within them produced a faster, more stable result than ignoring them.
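The date example is concrete enough to show: bash (4.2+) can format timestamps with a printf builtin, so a log line costs zero forks. A sketch, not the shield script's actual logger:

```shell
#!/usr/bin/env bash
# Forking form: one external `date` process per log line
ts_fork=$(date '+%Y-%m-%d %H:%M:%S')

# Builtin form: printf %(...)T formats the time itself; -1 means "now"
printf -v ts_builtin '%(%Y-%m-%d %H:%M:%S)T' -1

echo "[INFO ] fork:    $ts_fork"
echo "[INFO ] builtin: $ts_builtin"
```

On a modern machine the difference is noise; on a Core 2 Duo logging hundreds of lines per run, it adds up.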


The Machine

The 2008 MacBook 13” aluminum is not a recommended platform for production server workloads. It draws more power than a modern ARM server, runs warmer, and has a shorter remaining hardware lifespan than purpose-built server equipment.

It’s also, as of this writing, blocking nearly 344,000 hostile IP ranges, rebuilding its blocklist every night, surviving reboots gracefully, and responding to threats on demand in under a second.

Sometimes the best server is the one you already have.

The overuse of no-store in Cache-Control Headers

Many of the sites that I work with have this habit of using a browser Cache-Control header without fully understanding what it means:

cache-control: max-age=0, no-cache, no-store, private

Everything else in that header is moot once no-store is added, as caches honour the most restrictive directive in the list. So the effective set of caching rules defined by that group of directives is simply

cache-control: no-store

Now, the issue comes when the visitor refreshes the page. They do not get the opportunity to REVALIDATE the content, as the browser has been told not to store the content anywhere.

If the goal is to actually force a visitor to REVALIDATE the content on every page view, then use this instead:

cache-control: max-age=0, no-cache, private

While this set of directives would seemingly prevent any caching, its actual effect is to force the browser to treat the content as stale and send a conditional request (If-Modified-Since, or If-None-Match when an ETag is available) asking the server to confirm that the copy it has stored is still valid.

Performing a revalidation rather than a full load reduces the amount of data transferred between client and server, which can improve performance and reduce CDN costs, especially at scale.
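On Apache, for instance, mod_headers can emit the revalidation-friendly header. A sketch only — the match scope and validator choice depend on your content, and none of this is a drop-in config:

```apache
<IfModule mod_headers.c>
  # Force revalidation on every view without forbidding storage
  Header set Cache-Control "max-age=0, no-cache, private"
</IfModule>

# A validator (ETag or Last-Modified) must be present, or the conditional
# request can never come back as a 304
FileETag MTime Size
```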

My First Patagonia Catalog

[NOTE: This post is restored from the Wayback Machine. It was initially published December 22, 2016 and lost during a database transfer sometime in the past.]

I can’t remember the exact year, but I know it was in the late 1980s, when I got my first Patagonia Catalogue (I am Canadian after all). It opened my eyes to some amazing outdoor adventures, as well as introducing me to the history of the company – there was a long company history article among the pages.

The product I remember the most from the catalogue? The Ironworker Climbing Pants. The concept of these has stuck with me for nearly 30 years. Pants so tough that they could survive the abuse of an ironworker and a climber on Half Dome.

But I also remember the crazy sailing and fishing products they had. It impressed me that the people who worked for Patagonia and designed the products weren’t just crazy stonewallers, but wanted to be a part of the outdoors, no matter where the outdoors were.

I have never owned anything from Patagonia. My kids wore Patagonia, when they were younger, as we had a fantastic Goodwill store when we lived in San Mateo and people were dropping off some amazing stuff during the crazy years of the boom.

As I have gotten older and more sedentary, I likely can’t fit into any of their products with my spreading middle-aged frame. I could buy some knock-off or one of the amazing brands that has appeared in the intervening years (I see The North Face everywhere right now – is this a hot brand or just better marketing?).

But this has not stopped my love of (and lust for) Patagonia products. Why would I desire something I could never get into or have any need for?

For the same reason I appreciate anything: the love Patagonia puts into their designs, the simplicity of their complexity, and the pride people have who wear their products not just as a fashion statement, but because they understand what Patagonia stands for.

Link Fixing and the Wayback Machine

This blog has been around for a long time, moved several times (both in hardware and physical locations), been ignored, and has become broken.

Since the start of the month (April 2022) I have been restoring the links and images on this blog from the Internet Archive’s Wayback Machine. If you didn’t want it out there anymore, the Wayback Machine will find it.

It will likely take time to restore the glory that once was the Newest Industry blog, and, yes, some posts will be removed, but it’s coming back.

Web Caching and Old Age

In 2002, I was invited to speak at an IBM conference in San Francisco. When it came time to give my presentation, no one showed up.

I had forgotten about it until I was perusing the Wayback Machine and found the PDF of my old presentation.

The interesting thing is that the discussion in this doc is still relevant, even though the web is a very different beast than it was in 2002. Caching headers and their deployment have not changed much since they were introduced.

And there are still entities out there who get them wrong.


If you like ancient web docs, check out what webperformance.org looked like in 2007. [Courtesy of the Wayback Machine]

Covid Daily Stats and the Question of China

One of my favorite places to get Covid stats from is the Our World In Data data explorer. They aggregate all the stats into a number of great visualizations that you can share with friends.

In this data are nuggets of information that get lost when you are surrounded by North American media. For example, did those of you in North America know that France and Germany were the centers of a new European Covid wave in April 2022?

In this cascade of data are some interesting signs of how our world really uses and abuses information. The NY Times reported on some of the weird Covid data emerging from China (Shanghai’s Low Covid Death Toll Revives Questions About China’s Numbers). The Our World In Data charts show just how unusual this information is.

Covid Case Counts – China

Covid Daily Deaths – China

While I believe that the methods used to control Covid in China are aggressive, they cannot be this successful. Full stop. The case counts are far lower than anywhere else in the world and the confirmed deaths are, well, remarkably low.

Unbelievably low, upon sober thought.

The battle that democratic India is waging to control the release of statistical models of its actual mortality rate change during Covid (India Is Stalling the W.H.O.’s Efforts to Make Global Covid Death Toll Public) shines an even brighter light on how a country with tighter controls can bury its actual mortality rate more effectively.

Throughout the Pandemic, there have been two battles: one to control the disease; the other to control the facts about the disease. In North America, the disinformation campaign has been incredibly strong; in China, it pales beside the no-information campaign.

Taking this data, anyone can shape a narrative that reflects their world view. But what narrative can you shape about no data?

Hüsker Dü’s Newest Industry and The World Today

This blog used to have a different name, but a few years ago, I let the registration of the domain lapse and someone else snapped it up. It was based around a Hüsker Dü song from Zen Arcade.

I’ve been listening to that album a lot lately, and this song keeps standing out as a timeless reminder of what we will do to ourselves if things get out of hand.

Listen carefully; these lyrics from 1982/3 still have a deep meaning.

Copyright © 2026 Performance Zen
