Now that you have seen the why of Enterprise Shield, this post presents the how. Migrating from a simple set of shell scripts and flat files makes the system more complex, but also far more manageable and scalable.
The added reporting capability also makes it easy to track who is trying to get in and to report on volume, location, network, and type of attack.
1. The Firewall Architecture: Chains and Sets
1.1 How Traffic Flows
Every inbound packet passes through this evaluation sequence before anything else happens. Enterprise Shield inserts itself at position 1 of the INPUT chain, ahead of UFW’s rules.
Inbound Packet (any source)
│
▼
┌─────────────────────────────┐
│ INPUT chain │ ← UFW manages this
│ [position 1] SHIELD-LOGIC ─┼──► jumps to the SHIELD-LOGIC chain below
│ [position 2+] UFW rules │
└─────────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ SHIELD-LOGIC chain │
│ │
│ 1. ESTABLISHED/RELATED → ACCEPT │
│ (existing connections pass through) │
│ │
│ 2. loopback (lo) → ACCEPT │
│ │
│ 3. LAN + own IP → ACCEPT │
│ (192.168.x.x, 127.x.x.x, public IP) │
│ │
│ 4. shield_allow → ACCEPT │
│ (manually whitelisted CIDRs/IPs) │
│ │
│ 5. shield_abuseipdb → DROP + LOG │
│ (AbuseIPDB flagged IPs, ≥90 score) │
│ │
│ 6. shield_block → DROP + LOG │
│ (ASN blocks + country blocks) │
│ │
│ 7. shield_penalty → DROP + LOG │
│ (time-limited penalty box) │
│ │
│ 8. shield_azure → AZURE-RATELIMIT │
│ (AS8075 / Microsoft Azure) │
│ │
│ 9. shield_hyperscaler → CLOUD-RATELIMIT │
│ (AWS, GCP, Oracle, Cloudflare) │
│ │
│ 10. RETURN → UFW handles remaining │
└──────────────────────────────────────────┘
Critical ordering note:
shield_abuseipdb fires before the Azure and hyperscaler rate-limit rules (steps 8-9). This means a known-bad Azure IP is dropped outright rather than merely rate-limited. This was an explicit design decision made during the final production verification.
1.2 The Rate-Limit Chains
Traffic that matches the Azure or hyperscaler ipsets is not simply blocked, because blocking it outright would break Bingbot (which runs on AS8075) and legitimate cloud-based monitoring tools. Instead, it flows into dedicated rate-limit chains:
AZURE-RATELIMIT chain
├── hashlimit: 3 connections/minute per source IP, burst 2
├── If within limit: ACCEPT
└── If over limit: DROP + LOG [SHIELD_AZURE_LIMIT]
CLOUD-RATELIMIT chain
├── hashlimit: 20 connections/minute per source IP, burst 8
├── If within limit: ACCEPT
└── If over limit: DROP + LOG [SHIELD_CLOUD_LIMIT]
Azure gets a tighter limit (3/min) because it’s the most frequently abused hyperscaler ASN against this server. AWS/GCP/Oracle/Cloudflare get more headroom (20/min) to accommodate legitimate crawler and monitoring traffic.
Known limitation: HTTP/1.1 keep-alive connections bypass the hashlimit entirely because ESTABLISHED,RELATED packets are accepted at step 1 before they ever reach the rate-limit chains. This is a fundamental iptables constraint, not a bug in Enterprise Shield.
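As a rough illustration of the two rate-limit chains, the sketch below builds the iptables argument lists from the limits quoted above. The function name, the short hashlimit bucket names (`azure`, `cloud`), and the choice to emit argument lists rather than call iptables directly are assumptions for illustration, not the actual restore.py code.

```python
import shlex


def ratelimit_rules(chain, bucket, per_minute, burst, tag):
    """Return iptables argument lists for one per-source-IP rate-limit chain."""
    return [
        ["-N", chain],
        # Within the limit: accept. hashlimit keeps one bucket per source IP.
        ["-A", chain, "-m", "hashlimit",
         "--hashlimit-name", bucket,
         "--hashlimit-mode", "srcip",
         "--hashlimit-upto", f"{per_minute}/minute",
         "--hashlimit-burst", str(burst),
         "-j", "ACCEPT"],
        # Over the limit: log with the tag rsyslog routes on, then drop.
        ["-A", chain, "-j", "LOG", "--log-prefix", f"[{tag}] "],
        ["-A", chain, "-j", "DROP"],
    ]


# Limits taken from the text: Azure 3/min burst 2, other clouds 20/min burst 8.
AZURE = ratelimit_rules("AZURE-RATELIMIT", "azure", 3, 2, "SHIELD_AZURE_LIMIT")
CLOUD = ratelimit_rules("CLOUD-RATELIMIT", "cloud", 20, 8, "SHIELD_CLOUD_LIMIT")

for args in AZURE + CLOUD:
    # Printed for inspection only; nothing is applied to the firewall here.
    print(shlex.join(["iptables", *args]))
```

Printing the commands instead of running them makes it easy to review the generated rules before they touch a live firewall.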
1.3 The Seven ipsets
┌────────────────────┬───────────────┬────────────────────────────────────────┐
│ ipset name         │ type          │ contents                               │
├────────────────────┼───────────────┼────────────────────────────────────────┤
│ shield_allow       │ hash:net      │ Manually whitelisted CIDRs/IPs         │
│ shield_abuseipdb   │ hash:net      │ AbuseIPDB flagged IPs (score ≥ 90)     │
│ shield_block       │ hash:net      │ ASN blocks + 43-country blocks         │
│ shield_penalty     │ hash:net      │ Time-limited penalty box               │
│ shield_azure       │ hash:net      │ AS8075 (Microsoft Azure) CIDRs         │
│ shield_hyperscaler │ hash:net      │ AWS, GCP, Oracle, Cloudflare CIDRs     │
│ shield_country     │ hash:net      │ Country blocks (may merge with _block) │
└────────────────────┴───────────────┴────────────────────────────────────────┘
All sets: maxelem 1000000, hashsize 131072
Total entries across all sets: ~500,000 CIDRs
1.4 The LOG Rules and Tag Format
Every DROP action in SHIELD-LOGIC includes a LOG rule that fires first. Each log message is tagged so rsyslog can route it:
| Tag | Meaning |
|---|---|
| [SHIELD_BLOCK] | Dropped by shield_block (ASN or country) |
| [SHIELD_ABUSEIPDB] | Dropped by shield_abuseipdb |
| [SHIELD_PENALTY] | Dropped by shield_penalty (penalty box) |
| [SHIELD_AZURE_LIMIT] | Rate-limited by AZURE-RATELIMIT chain |
| [SHIELD_CLOUD_LIMIT] | Rate-limited by CLOUD-RATELIMIT chain |
The log line format from iptables looks like:
Jun 03 14:22:11 creaky2 kernel: [SHIELD_BLOCK] IN=eth0 OUT= SRC=185.220.101.47 DST=x.x.x.x PROTO=TCP DPT=443 ...
rsyslog matches on the [SHIELD_ prefix and routes these lines to /var/log/enterprise_shield/hits.log, which hits_parser.py then consumes every 5 minutes.
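A minimal sketch of how one tagged log line could be turned into the fields stored later in firewall_hits. The regular expressions and the `parse_hit` name are assumptions for illustration; the real hits_parser.py may parse differently.

```python
import re
from typing import Optional

TAG_RE = re.compile(r"\[(SHIELD_[A-Z_]+)\]")
FIELD_RE = re.compile(r"\b(SRC|DPT|PROTO)=(\S+)")


def parse_hit(line: str) -> Optional[dict]:
    """Extract tag, source IP, destination port and protocol from a log line."""
    tag = TAG_RE.search(line)
    if tag is None:
        return None  # not an Enterprise Shield line
    fields = dict(FIELD_RE.findall(line))
    if "SRC" not in fields:
        return None  # malformed line: no source address
    dpt = fields.get("DPT")
    return {
        "shield_tag": tag.group(1),
        "src_ip": fields["SRC"],
        # ICMP and similar have no DPT field, so the port may be absent.
        "dst_port": int(dpt) if dpt and dpt.isdigit() else None,
        "protocol": fields.get("PROTO"),
    }


SAMPLE = ("Jun 03 14:22:11 creaky2 kernel: [SHIELD_BLOCK] IN=eth0 OUT= "
          "SRC=185.220.101.47 DST=x.x.x.x PROTO=TCP DPT=443")
print(parse_hit(SAMPLE))
```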
2. The Database Schema
The SQLite database is the authoritative state of the entire system. If you have the database, you can reconstruct everything.
2.1 Schema Overview
shield.db
│
├── cidr_blocks ← every CIDR the system manages
├── asn_registry ← all 447+ ASNs with their classification
├── country_registry ← 43 countries with ETag cache
├── abuseipdb_entries ← per-IP AbuseIPDB data, independent lifecycle
├── firewall_hits ← raw parsed hits from rsyslog (rolling window)
├── hits_hourly ← aggregated hourly rollup (long-term storage)
├── ip_enrichment ← per-IP enrichment cache (ip-api.com, Shodan)
├── campaigns ← computed attack campaign groupings (rebuilt each run)
├── system_state ← all runtime state: timestamps, flags, hashes
└── penalty_entries ← penalty box entries with expires_at (see section 7)
2.2 cidr_blocks — The Core Table
CREATE TABLE cidr_blocks (
id INTEGER PRIMARY KEY,
cidr TEXT NOT NULL, -- e.g. "185.220.0.0/16"
source TEXT NOT NULL, -- 'asn', 'country', 'manual', 'allow'
source_id TEXT, -- ASN number or country code
ipset_target TEXT NOT NULL, -- which ipset this CIDR belongs to
first_seen INTEGER, -- epoch timestamp
last_verified INTEGER, -- epoch of last successful WHOIS confirm
UNIQUE(cidr, ipset_target)
);
The ipset_target field is what drives the actual ipset membership. When the rebuild runs, it diffs this table against the live ipset state and applies only the delta.
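The diff described above is essentially two set differences. The sketch below assumes the live members have already been read into a Python set (for example by parsing `ipset save` output, which is not shown); the `compute_delta` name is hypothetical.

```python
import sqlite3


def compute_delta(conn, ipset_target, live_members):
    """Compare the DB's desired CIDRs with the live ipset members."""
    rows = conn.execute(
        "SELECT cidr FROM cidr_blocks WHERE ipset_target = ?", (ipset_target,)
    )
    wanted = {row[0] for row in rows}
    to_add = sorted(wanted - live_members)     # in DB, missing from the ipset
    to_delete = sorted(live_members - wanted)  # in the ipset, gone from DB
    return to_add, to_delete


# Tiny in-memory demonstration using a cut-down version of the table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE cidr_blocks (cidr TEXT, ipset_target TEXT)")
demo.executemany(
    "INSERT INTO cidr_blocks VALUES (?, ?)",
    [("185.220.0.0/16", "shield_block"), ("198.51.100.0/24", "shield_block")],
)
print(compute_delta(demo, "shield_block", {"185.220.0.0/16", "203.0.113.0/24"}))
```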
2.3 asn_registry — ASN Classification
CREATE TABLE asn_registry (
asn TEXT PRIMARY KEY, -- 'AS3209', 'AS8075', etc.
name TEXT, -- human-readable name from WHOIS
classification TEXT NOT NULL, -- 'block', 'azure', 'hyperscaler', 'allow'
last_whois INTEGER, -- epoch of last WHOIS lookup
fail_count INTEGER DEFAULT 0, -- consecutive WHOIS failures
note TEXT -- operator annotation
);
The classification field determines which ipset a CIDR ends up in. azure → shield_azure, hyperscaler → shield_hyperscaler, block → shield_block.
2.4 abuseipdb_entries — Separate Lifecycle
CREATE TABLE abuseipdb_entries (
ip TEXT PRIMARY KEY,
abuse_score INTEGER, -- 0-100 confidence score
country_code TEXT,
last_seen TEXT, -- from AbuseIPDB's "lastReportedAt"
refreshed_at INTEGER, -- epoch of our last fetch
in_ipset INTEGER DEFAULT 0 -- currently loaded into shield_abuseipdb?
);
AbuseIPDB entries are refreshed 5× daily (at 00:00, 05:00, 10:00, 15:00, 20:00 UTC) because of the free tier’s rate limit on the blacklist endpoint. The confidence threshold is 90 — only IPs with a score of 90 or above are loaded into the ipset. The table currently holds ~10,000 entries.
2.5 firewall_hits and hits_hourly — The Hit Pipeline
CREATE TABLE firewall_hits (
id INTEGER PRIMARY KEY,
hit_time INTEGER NOT NULL, -- epoch timestamp
src_ip TEXT NOT NULL,
dst_port INTEGER,
protocol TEXT,
shield_tag TEXT, -- SHIELD_BLOCK, SHIELD_ABUSEIPDB, etc.
log_line TEXT -- raw log line for debugging
);
CREATE TABLE hits_hourly (
hour_bucket INTEGER NOT NULL, -- epoch rounded to hour
src_ip TEXT NOT NULL,
shield_tag TEXT NOT NULL,
hit_count INTEGER DEFAULT 0,
PRIMARY KEY (hour_bucket, src_ip, shield_tag)
);
Raw hits accumulate in firewall_hits. The hits_rollup.py job runs at 03:00 daily and aggregates rows older than RAW_RETENTION_DAYS (default: 3 days) into hits_hourly, then deletes the raw rows. hits_hourly retains data for AGGREGATE_RETENTION_DAYS (default: 365 days).
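One way the rollup could be expressed is a single INSERT ... SELECT with an upsert, followed by a delete of the same rows. This is a sketch under assumptions: it needs SQLite 3.24 or newer for ON CONFLICT, it maps a NULL shield_tag to 'UNKNOWN' (an invented placeholder), and the `rollup` function name is hypothetical.

```python
import sqlite3
import time

RAW_RETENTION_DAYS = 3  # default quoted in the text


def rollup(conn, now=None):
    """Aggregate raw hits older than the retention window, then delete them."""
    now = int(time.time()) if now is None else now
    cutoff = now - RAW_RETENTION_DAYS * 86400
    with conn:  # one transaction: aggregate and delete together
        conn.execute(
            """
            INSERT INTO hits_hourly (hour_bucket, src_ip, shield_tag, hit_count)
            SELECT hit_time - (hit_time % 3600), src_ip,
                   COALESCE(shield_tag, 'UNKNOWN'), COUNT(*)
            FROM firewall_hits
            WHERE hit_time < ?
            GROUP BY 1, 2, 3
            ON CONFLICT(hour_bucket, src_ip, shield_tag)
            DO UPDATE SET hit_count = hit_count + excluded.hit_count
            """,
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM firewall_hits WHERE hit_time < ?", (cutoff,))
    return cur.rowcount  # number of raw rows removed
```

Running both statements in one transaction means a crash between them cannot leave hits counted twice or lost.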
2.6 system_state — No More Flat Files
CREATE TABLE system_state (
key TEXT PRIMARY KEY,
value TEXT
);
-- Key entries:
-- 'last_rebuild_time' : epoch of last successful rebuild
-- 'last_entry_count' : CIDR count at last successful rebuild
-- 'rebuild_in_progress' : '1' if a rebuild is currently running (crash detection)
-- 'last_public_ip' : server's external IP at last rebuild
-- 'config_hash' : SHA-256 of /etc/enterprise_shield/config.conf
-- 'last_abuseipdb_refresh' : epoch of last AbuseIPDB refresh
-- 'hits_parser_position' : byte offset in hits.log (resume parsing from here)
The rebuild_in_progress flag is the crash recovery mechanism. If the system reboots mid-rebuild, restore.py detects this flag on boot, logs the crash, and restores the ipsets from the last state committed to the database (see section 4.4).
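Reading and writing system_state can be wrapped in two small helpers. This is a sketch, not the actual db.py: the `get_state` and `set_state` names are assumptions, and the upsert syntax needs SQLite 3.24 or newer.

```python
import sqlite3


def set_state(conn, key, value):
    """Insert or overwrite one key in system_state (values stored as text)."""
    with conn:
        conn.execute(
            "INSERT INTO system_state (key, value) VALUES (?, ?) "
            "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
            (key, str(value)),
        )


def get_state(conn, key, default=None):
    """Return the stored value for key, or default if it was never set."""
    row = conn.execute(
        "SELECT value FROM system_state WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else default


# Demonstration against an in-memory database.
state_db = sqlite3.connect(":memory:")
state_db.execute("CREATE TABLE system_state (key TEXT PRIMARY KEY, value TEXT)")
set_state(state_db, "rebuild_in_progress", 1)   # rebuild starts
crashed = get_state(state_db, "rebuild_in_progress", "0") == "1"
print("crash detected" if crashed else "clean shutdown")
set_state(state_db, "rebuild_in_progress", 0)   # rebuild finished or recovered
```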
3. The Rebuild Flow
3.1 Nightly Rebuild (shield.py rebuild)
[cron: 30 2 * * *]
│
▼
┌─────────────────────────────────────┐
│ Acquire PID lock │
│ Set rebuild_in_progress = 1 in DB │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Check config SHA-256 hash │
│ If changed: re-import config file │
│ Destroy any orphan staging ipsets │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ WHOIS refresh (only stale ASNs) │
│ Staleness threshold: 14 days │
│ Typically 2-5 ASNs per night │
│ On failure: keep existing CIDRs, │
│ increment fail_count │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Country block refresh │
│ ETag conditional GET to GitHub │
│ 304 Not Modified: skip download │
│ 200 OK: parse and update DB │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Compute delta │
│ DB state vs. live ipset state │
│ Entries to ADD: new CIDRs in DB │
│ Entries to DELETE: removed CIDRs │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ DELTA SAFETY CHECK │
│ New count < (last × 0.95)? │
│ YES → ABORT, preserve current set │
│ NO → proceed │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Apply delta to staging ipsets │
│ ipset swap staging → live (atomic) │
│ ipset destroy staging sets │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Update DB: │
│ - last_rebuild_time │
│ - last_entry_count │
│ - rebuild_in_progress = 0 │
│ Release PID lock │
└─────────────────────────────────────┘
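The delta safety check in the flow above reduces to one comparison. The sketch below uses integer arithmetic to avoid floating-point edge cases at exactly 95%; the function name and the treatment of a first run (no previous count) are assumptions.

```python
def is_safe_rebuild(new_count: int, last_count: int, min_percent: int = 95) -> bool:
    """Return False when the new set shrank below min_percent of the last one."""
    if last_count <= 0:
        return True  # assumption: nothing to compare against on a first run
    # Equivalent to: not (new_count < last_count * 0.95)
    return new_count * 100 >= last_count * min_percent


print(is_safe_rebuild(480_000, 500_000))  # small shrink, proceed
print(is_safe_rebuild(300_000, 500_000))  # large shrink, abort and keep current set
```

A sudden large drop usually points at a failed upstream download or WHOIS outage rather than a real change, which is why the current set is kept in that case.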
3.2 The Atomic Swap
The ipset swap is the critical section of the rebuild. The live set is never empty:
shield_block_staging (new CIDRs)
│
│ ipset swap shield_block_staging shield_block
▼
shield_block (now has new CIDRs — atomically)
│
│ ipset destroy shield_block_staging
▼
(staging set gone)
If the swap fails for any reason, the original shield_block is untouched and the staging set is left for cleanup on the next run.
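The staging-then-swap sequence can be written as a single script for `ipset restore`, which accepts create, add, swap and destroy commands. The sketch below only generates that script; it assumes the staging set is built from a full CIDR list (the text applies a delta instead) and that the live set already exists with the same type. The `swap_script` name is hypothetical.

```python
import ipaddress


def swap_script(live, cidrs, maxelem=1_000_000, hashsize=131072):
    """Build an `ipset restore` script that fills a staging set and swaps it in."""
    staging = f"{live}_staging"
    lines = [
        f"create {staging} hash:net family inet hashsize {hashsize} maxelem {maxelem}"
    ]
    for cidr in cidrs:
        # Validate each entry so malformed input never reaches ipset.
        net = ipaddress.ip_network(cidr, strict=False)
        lines.append(f"add {staging} {net}")
    lines.append(f"swap {staging} {live}")   # atomic: live now has the new members
    lines.append(f"destroy {staging}")       # staging now holds the old members
    return "\n".join(lines) + "\n"


script = swap_script("shield_block", ["185.220.0.0/16", "198.51.100.0/24"])
print(script)
# To apply (requires root), something like:
#   subprocess.run(["ipset", "restore"], input=script, text=True, check=True)
```

Because the swap line comes after every add, an error while filling the staging set should stop the script before the live set is touched.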
4. The Boot Persistence Architecture
This is one of the most critical (and most debugged) parts of the system. Getting the ordering wrong leaves the server unprotected between reboot and first cron run.
4.1 The Two-Service Boot Sequence
Boot sequence:
kernel loads
│
▼
systemd-modules-load.service ← ensures ip_tables, ip_set kernel modules are loaded
│
▼
local-fs.target ← ensures /var/lib/enterprise_shield/ is mounted
│
▼
enterprise-shield-ipset-restore.service ← runs restore.py --ipsets-only
│ Loads all 7 ipsets from SQLite
│ Sets ipsets before UFW needs them
│
▼
ufw.service ← UFW loads its rules
│ The ipsets now exist when UFW's before.rules references them
│
▼
enterprise-shield-chain-restore.service ← runs restore.py --chains-only
Rebuilds SHIELD-LOGIC, AZURE-RATELIMIT, CLOUD-RATELIMIT chains
Inserts "INPUT -j SHIELD-LOGIC" at position 1
Uses --noflush so UFW's chains are preserved
4.2 Why the Ordering Matters
| Service | Must run BEFORE | Must run AFTER |
|---|---|---|
| ipset-restore | ufw.service | systemd-modules-load, local-fs.target |
| chain-restore | (nothing) | ufw.service, ipset-restore |
DefaultDependencies=no is deliberately not set on these units (a lesson learned from a failed boot cycle). In an early version it stripped the default dependencies too aggressively and prevented the kernel module loading dependency from being honoured.
4.3 What restore.py Does
# restore.py --ipsets-only
for each ipset in [shield_allow, shield_abuseipdb, shield_block,
                   shield_penalty, shield_azure, shield_hyperscaler]:
    create ipset if not exists
    bulk-load CIDRs from cidr_blocks WHERE ipset_target = ipset

# restore.py --chains-only
create SHIELD-LOGIC chain (flush if exists)
    add ESTABLISHED/RELATED ACCEPT rule
    add loopback ACCEPT rule
    add LAN + own IP ACCEPT rule (reads last_public_ip from system_state)
    add shield_allow ACCEPT rule
    add shield_abuseipdb DROP+LOG rule
    add shield_block DROP+LOG rule
    add shield_penalty DROP+LOG rule
    add shield_azure → AZURE-RATELIMIT rule
    add shield_hyperscaler → CLOUD-RATELIMIT rule
create AZURE-RATELIMIT chain with hashlimit rules
create CLOUD-RATELIMIT chain with hashlimit rules
iptables -I INPUT 1 -j SHIELD-LOGIC
4.4 Crash Recovery
If the server loses power mid-rebuild:
Next boot
│
▼
enterprise-shield-ipset-restore.service
│
▼
restore.py checks system_state WHERE key='rebuild_in_progress'
│
├── value = '0': normal restore from DB
│
└── value = '1': CRASH DETECTED
│
▼
Log CRITICAL to /var/log/enterprise_shield/restore.log
Load last known-good CIDRs from DB (last successfully committed state)
Continue with normal restore
Set rebuild_in_progress = 0
The last known-good state is whatever was in the database at the time of the crash. Because the DB commit happens after the ipset swap succeeds, any incomplete rebuild simply means the previous night’s CIDRs are loaded — which is correct behaviour.
5. The Hit Logging Pipeline
iptables LOG rule fires
│
│ Kernel log message is picked up by rsyslog
▼
rsyslog
│
│ rsyslog matches: if $msg contains '[SHIELD_'
▼
/var/log/enterprise_shield/hits.log
│
│ hits_parser.py runs every 5 minutes via cron
│ Resumes from byte offset stored in system_state.hits_parser_position
▼
firewall_hits table (SQLite)
│
│ hits_rollup.py runs at 03:00 daily
│ Aggregates rows older than RAW_RETENTION_DAYS into hits_hourly
│ Deletes aggregated raw rows
▼
hits_hourly table (SQLite)
│
│ deep_shield.py runs every 10 minutes
│ Queries both tables, enriches IPs, computes campaigns
▼
private threat analysis dashboard
│
│ public_shield.py runs hourly
▼
https://performancezen.com/shield/public_shield.html
(public-facing summary — sanitised, no internal data)
5.1 rsyslog Configuration
# /etc/rsyslog.d/10-enterprise-shield.conf
:msg, contains, "[SHIELD_" /var/log/enterprise_shield/hits.log
& stop
The & stop prevents these messages from also going into /var/log/syslog, keeping the main syslog uncluttered.
5.2 hits_parser.py — Resumable Parsing
The parser reads hits.log from the byte offset stored in system_state.hits_parser_position.
On each 5-minute run, it:
- Opens the file and seeks to the stored position
- Reads all new lines since the last run
- Parses each [SHIELD_*] log line for SRC=, DPT=, PROTO=
- Inserts rows into firewall_hits
- Updates hits_parser_position to the new end-of-file offset
If the log file is rotated (via logrotate), the parser detects the file is smaller than the stored offset and resets to position 0.
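The offset-and-rotation logic can be sketched in a few lines. The `read_new_lines` name is an assumption, and so is one small deviation from the description: the sketch advances the offset only to the end of the last complete line, so a line that is still being written is picked up on the next run.

```python
import os


def read_new_lines(path, offset):
    """Return (new complete lines, new offset), restarting at 0 after rotation."""
    if os.path.getsize(path) < offset:
        offset = 0  # file is smaller than the stored offset: it was rotated
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
    # Consume only up to the last newline; a partial trailing line waits.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

The file is opened in binary mode so the stored position is a true byte offset, matching what hits_parser_position records.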
6. The AbuseIPDB Integration
AbuseIPDB operates on a completely separate lifecycle from the main rebuild:
[cron: 0 0,5,10,15,20 * * *] (5× daily)
│
▼
abuseipdb.py refresh
│
├── Query AbuseIPDB /api/v2/blacklist
│ Parameters: confidenceMinimum=90, limit=10000
│
├── Parse response → list of {ip, abuseConfidenceScore, countryCode}
│
├── Diff against abuseipdb_entries table:
│ - New IPs: INSERT into table, add to shield_abuseipdb ipset
│ - Removed IPs: DELETE from table, remove from shield_abuseipdb ipset
│ - Unchanged IPs: update refreshed_at timestamp only
│
└── Log summary: X added, Y removed, Z unchanged
The AbuseIPDB ipset (shield_abuseipdb) is kept live-updated between the main nightly rebuilds. A known-bad IP that appears on the AbuseIPDB blacklist is blocked within at most 5 hours, without waiting for the 02:30 rebuild.
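The refresh steps above can be sketched as three small functions: build the request, parse the response, and diff against what is already known. The function names are hypothetical, the response field names (`ipAddress`, `abuseConfidenceScore`) follow the AbuseIPDB v2 blacklist format as I understand it, and no network call is made here.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.abuseipdb.com/api/v2/blacklist"


def build_request(api_key, confidence_minimum=90, limit=10000):
    """Prepare the blacklist request; pass the result to urllib.request.urlopen."""
    query = urllib.parse.urlencode(
        {"confidenceMinimum": confidence_minimum, "limit": limit}
    )
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"Key": api_key, "Accept": "application/json"},
    )


def parse_blacklist(payload):
    """Map each listed IP to its confidence score."""
    entries = json.loads(payload).get("data", [])
    return {e["ipAddress"]: e["abuseConfidenceScore"] for e in entries}


def diff_entries(fetched, known):
    """Split IPs into (added, removed, unchanged) relative to the known set."""
    current = set(fetched)
    return sorted(current - known), sorted(known - current), sorted(current & known)


sample = '{"data": [{"ipAddress": "192.0.2.10", "abuseConfidenceScore": 100}]}'
print(diff_entries(parse_blacklist(sample), {"198.51.100.5"}))
```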
7. The Penalty Box
The penalty box (shield_penalty ipset) handles time-limited blocks — typically IPs that have triggered specific application-layer rules or been manually added for investigation.
Adding to penalty box:
shield.py penalty add <ip> [--hours N] (default: 24 hours)
│
├── INSERT into penalty_entries table with expires_at timestamp
└── ipset add shield_penalty <ip>
Expiry check (every 15 minutes via cron):
penalty.py expire
│
├── SELECT from penalty_entries WHERE expires_at < NOW()
├── For each expired entry:
│ ipset del shield_penalty <ip>
│ DELETE from penalty_entries
└── Log: "Expired N penalty entries"
The penalty box also survives reboots — restore.py loads shield_penalty from the database on boot, but only entries whose expires_at is still in the future. Expired entries are not restored.
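A sketch of the expiry and boot-restore queries, assuming a penalty_entries table with just `ip` and `expires_at` (epoch seconds) columns; the real table may carry more. The caller would run `ipset del shield_penalty <ip>` for each expired address, which is not done here.

```python
import sqlite3
import time


def expire_penalties(conn, now=None):
    """Delete expired rows and return their IPs for removal from the ipset."""
    now = int(time.time()) if now is None else now
    expired = [
        row[0]
        for row in conn.execute(
            "SELECT ip FROM penalty_entries WHERE expires_at < ? ORDER BY ip", (now,)
        )
    ]
    with conn:
        conn.executemany(
            "DELETE FROM penalty_entries WHERE ip = ?", [(ip,) for ip in expired]
        )
    return expired


def active_penalties(conn, now=None):
    """IPs to reload on boot: only entries that expire in the future."""
    now = int(time.time()) if now is None else now
    rows = conn.execute(
        "SELECT ip FROM penalty_entries WHERE expires_at > ? ORDER BY ip", (now,)
    )
    return [row[0] for row in rows]
```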
8. Module Structure
All Python modules live at /usr/local/lib/enterprise_shield/:
/usr/local/lib/enterprise_shield/
├── shield.py ← Main CLI: rebuild, add, add-asn, check, status
├── restore.py ← Boot restore: --ipsets-only, --chains-only
├── abuseipdb.py ← AbuseIPDB refresh job (run from cron)
├── hits_parser.py ← rsyslog hits.log → firewall_hits table
├── hits_rollup.py ← firewall_hits → hits_hourly aggregation
├── penalty.py ← Penalty box expiry
├── deep_shield.py ← Private threat analysis dashboard generator
├── public_shield.py ← Public summary dashboard generator
├── db.py ← Database connection and schema management
└── config.py ← Configuration constants and file parsing