BotExorcist Documentation

Setup Guide, User Manual & Firewall Intelligence

Learn how to install BotExorcist, use Observation Mode, activate Protection Mode, understand IP scoring, review crawler intelligence, manage blocks, and protect your WordPress site from no-signal bot traffic.

Getting Started: Install BotExorcist and understand the core setup flow.

Observation Mode: Monitor suspicious traffic before enabling active enforcement.

Protection Mode: Block repeat offenders, no-signal visits, probes, and hostile IPs.

IP Control: Manage allowlists, blocklists, CIDR ranges, and firewall sync.

What BotExorcist Does

BotExorcist separates real traffic from automated noise.

BotExorcist is a WordPress bot protection and SEO firewall system designed to expose traffic that ordinary analytics often hides or blends into normal visitor data. It monitors requests, identifies no-signal behaviour, scores repeat offenders, protects verified search crawlers, and gives you direct control over IPs, CIDRs, ranges, ASNs, and suspicious traffic sources.

Detect

Find short visits, no-signal sessions, crawler noise, 404 probes, parameter abuse, and repeat offenders.

Score

Build evidence against IPs over time instead of treating every request as isolated.

Block

Stop hostile IPs, ranges, CIDRs, and ASNs depending on the evidence and active protection mode.

Report

Surface live traffic intelligence, crawler behaviour, PPC integrity signals, and blocked traffic evidence.

Core idea: real users send signals. Bots often arrive, hit a URL, do nothing useful, and disappear. BotExorcist helps you see that difference.

Operating Modes

Observation Mode vs Protection Mode

BotExorcist can operate in a monitoring-first mode or an active enforcement mode. The scoring system should continue in both modes, but enforcement only happens when Protection Mode is enabled.

Observation Mode

Records suspicious traffic.
Builds offence scores.
Shows which IPs would have breached tolerance.
Does not automatically block offenders.
Useful for learning the traffic profile before enforcement.

Protection Mode

Continues scoring every offender.
Blocks IPs that breach configured tolerance.
Uses active block short-circuiting for already-blocked traffic.
Can promote eligible blocks into hard enforcement.
Stops repeat abuse from consuming WordPress resources.

Important: switching from Observation Mode to Protection Mode should immediately enforce against IP cards already over the configured tolerance threshold, unless they are allowlisted or protected.
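The mode-switch rule can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `Mode`, `IPCard`, and `enforce_on_mode_switch` are hypothetical names, and the sketch assumes that reaching the tolerance value counts as a breach.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    OBSERVATION = "observation"
    PROTECTION = "protection"


@dataclass
class IPCard:
    # Hypothetical shape of an IP card; field names are assumptions.
    ip: str
    score: int
    allowlisted: bool = False
    protected: bool = False  # e.g. a verified Google or Bing crawler
    blocked: bool = False


def enforce_on_mode_switch(cards, new_mode, tolerance):
    """Return the IPs newly blocked by switching to new_mode."""
    if new_mode is not Mode.PROTECTION:
        return []  # Observation Mode records breaches but never blocks
    newly_blocked = []
    for card in cards:
        exempt = card.allowlisted or card.protected
        # Assumption: a score at or above tolerance counts as a breach.
        if card.score >= tolerance and not exempt and not card.blocked:
            card.blocked = True
            newly_blocked.append(card.ip)
    return newly_blocked
```

Scoring is untouched by the switch; only the enforcement step changes, which matches the rule that both modes keep building evidence.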

Live War Room

Reading the Live Signal Feed

The Live War Room shows recent traffic activity and labels each row by behaviour, risk, status, crawler identity, referrer, URL slug, and enforcement state. It is designed for fast review rather than deep forensic storage.

Signal	Meaning	Recommended Reading
Green / Normal	No major breach, no repeat abuse, and no elevated risk.	Generally safe traffic unless future behaviour changes.
Orange / Watch	IP has suspicious evidence or offence score above normal, but has not breached tolerance.	Worth watching. Not necessarily block-worthy yet.
Red / Blocked	IP is blocked, has breached tolerance, or is being denied by an active rule.	Confirmed enforcement or high-risk state.
CRAWLER	Traffic has crawler-like identity or verified crawler intelligence.	Google and Bing are protected when verified. Other crawlers are subject to policy.
BL	The IP currently appears on the blocklist.	Check whether the row is a historical request or a live denied request.

Display rule: URL columns may show only the slug to save space, while the full path remains stored in the evidence.

Colour Logic

Traffic colours should reflect current risk.

Green

Normal traffic. No major offence history. No known repeat abuse. Not in breach of tolerance.

Orange

Suspicious or worth watching. Usually used when offence score is elevated but not yet over the enforcement threshold.

Red

Blocked, in breach, or denied. This should be used for active block states, threshold breaches, and confirmed hostile patterns.
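As a rough sketch, the three colours can be derived from the current score and block state. The function name and the `watch_threshold` parameter are assumptions made for illustration; the documentation does not define where "elevated" begins.

```python
def traffic_colour(score, tolerance, blocked=False, watch_threshold=1):
    """Map an IP's current state to a feed colour.

    watch_threshold is a hypothetical setting: the score at which
    an IP stops being "normal" and becomes worth watching.
    """
    if blocked or score >= tolerance:
        return "red"      # blocked, in breach, or denied
    if score >= watch_threshold:
        return "orange"   # elevated score, not yet over tolerance
    return "green"        # no meaningful offence history
```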

Offence Scoring

How offence scoring should work

BotExorcist should record suspicious behaviour into an IP memory system. The score should rise when an IP repeats the same hostile or no-signal pattern. A single event may be suspicious. Repeated events become evidence.

Behaviour	Why It Matters	Typical Outcome
1-second / no-signal visit	The request arrived but produced no meaningful human signal.	Score increment and watch state.
Repeated no-signal events	Same IP repeatedly arrives without engagement.	Escalate score and block when threshold is breached.
404 probing	Bot scans for missing, old, or vulnerable URLs.	Score, evidence log, and possible block.
Parameter probing	IP tests query strings, search URLs, or suspicious parameter combinations.	Score and block after repeat tolerance.
Robots violation	Requester ignores disallowed paths.	Policy-based warning, score, or enforcement.
Same URL repeat abuse	IP or crawler keeps hammering the same slug without useful crawl diversity.	Watch, score, rate-limit, or block depending on policy.

Rule: an IP with 5 no-signal-style events within 72 hours should be considered over tolerance in Protection Mode, unless it is allowlisted or protected.
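The 5-events-in-72-hours rule is a sliding-window count. A minimal sketch, assuming event timestamps are available per IP (the function and constant names are hypothetical, and allowlist or protection checks are assumed to happen elsewhere):

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=72)  # from the rule above
TOLERANCE = 5                 # from the rule above


def over_tolerance(event_times, now, window=WINDOW, tolerance=TOLERANCE):
    """True if at least `tolerance` events fall inside the trailing window."""
    cutoff = now - window
    recent = [t for t in event_times if cutoff < t <= now]
    return len(recent) >= tolerance
```

Events older than the window simply stop counting, so an IP that goes quiet drops back under tolerance without any manual reset.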

Violation Governor

The enforcement decision must follow one route.

The Violation Governor is the logic layer that receives a violation, updates the offender memory, checks the enforcement threshold, and decides whether to block. The dashboard should display the decision, not recalculate it separately.

Violation detected: A no-signal event, probe, robots violation, trap hit, or repeat abuse pattern is recorded.

IP memory updated: The IP card receives the event, timestamp, offence type, path, and score contribution.

Threshold checked: The system compares the offender’s score and event count against the configured tolerance.

Protection Mode decides enforcement: If Protection Mode is active and tolerance is breached, the IP is blocked. In Observation Mode, the breach is recorded but not enforced.

Dashboard reports the outcome: The UI should show whether the request was denied now, or whether the IP was blocked after the historical request.
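The single-route idea can be sketched as one class that owns the decision, so the dashboard only reads its result. All names here (`Violation`, `OffenderCard`, `ViolationGovernor`, the outcome strings) are hypothetical, and the sketch simplifies the threshold check to score alone rather than score plus event count.

```python
from dataclasses import dataclass, field


@dataclass
class Violation:
    ip: str
    kind: str          # e.g. "no_signal", "404_probe", "robots_violation"
    path: str
    timestamp: float
    weight: int = 1    # hypothetical score contribution


@dataclass
class OffenderCard:
    score: int = 0
    events: list = field(default_factory=list)
    blocked: bool = False


class ViolationGovernor:
    def __init__(self, tolerance, protection_mode, exempt_ips=()):
        self.tolerance = tolerance
        self.protection_mode = protection_mode
        self.exempt_ips = set(exempt_ips)  # allowlisted or protected IPs
        self.memory = {}                   # ip -> OffenderCard

    def handle(self, violation):
        """Record one violation and return the single enforcement outcome."""
        # Exempt traffic is never scored in this sketch.
        if violation.ip in self.exempt_ips:
            return "exempt"
        # IP memory updated.
        card = self.memory.setdefault(violation.ip, OffenderCard())
        card.events.append(violation)
        card.score += violation.weight
        # Threshold checked (assumption: reaching tolerance is a breach).
        if card.score < self.tolerance:
            return "scored"
        # Protection Mode decides enforcement.
        if self.protection_mode:
            card.blocked = True
            return "blocked"
        return "breach_recorded"
```

Because `handle` returns the outcome, a dashboard can store and display that value instead of re-deriving it from raw scores.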

Safe Traffic

Protected traffic must pass through first.

Safe-lane checks should run before scoring, rate limiting, ASN blocking, or no-signal detection. This protects the site owner, WordPress admin actions, legitimate builder saves, and verified search crawlers.

Always protect

Logged-in admin/editor save routes.
Divi builder saves.
WordPress admin AJAX where appropriate.
Allowlisted IPs.
Sitemap routes.
Verified Googlebot and Bingbot.

Policy-based

Yandex and other non-protected verified crawlers.
AI crawler user-agents.
Hosting network traffic.
Cloud infrastructure visitors.
Suspicious but not-yet-breached IPs.

Important distinction: verified crawler does not automatically mean protected crawler. Google and Bing are protected when verified. Other crawlers can still be watched, limited, or blocked by policy.

Blocking & IP Control

Already blocked traffic should be cheap traffic.

Once an IP, CIDR, range, or ASN has already been blocked, BotExorcist should not investigate it again on every request. The best performance route is to deny it immediately and record only what it tried.

Block Type	Purpose	Best Use
IP Block	Blocks one specific IP address.	Manual blocks, repeat offenders, clear single-source abuse.
CIDR / Range Block	Blocks a broader network range.	When multiple abusive IPs appear from the same network.
ASN Block	Blocks or flags an autonomous system/network owner.	Only when evidence is strong and protected networks are excluded.
Allowlist	Exempts trusted IPs or verified crawler infrastructure.	Own IP, known admin IPs, verified Google/Bing crawler handling.

Performance rule: active block checks should happen early. If blocked, return 403 immediately, skip deep investigation, and write only a lightweight log entry.
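A minimal sketch of the early block check, using Python's standard `ipaddress` module for single IPs and CIDR ranges. `BlockList` and `early_gate` are hypothetical names; ASN blocks are left out because they need an external IP-to-ASN lookup that this documentation does not specify.

```python
import ipaddress


class BlockList:
    def __init__(self, ips=(), cidrs=()):
        self.ips = {ipaddress.ip_address(ip) for ip in ips}
        # strict=False accepts ranges written with host bits set.
        self.networks = [ipaddress.ip_network(c, strict=False) for c in cidrs]

    def is_blocked(self, ip):
        addr = ipaddress.ip_address(ip)
        if addr in self.ips:
            return True
        return any(addr in net for net in self.networks)


def early_gate(ip, blocklist, light_log):
    """Return 403 for blocked IPs, or None to continue normal handling."""
    if blocklist.is_blocked(ip):
        light_log.append(ip)  # lightweight entry only, no deep inspection
        return 403
    return None
```

The exact-IP set gives a constant-time hit for the most common case; the CIDR list is scanned only when that lookup misses.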

Crawler Intelligence

Crawler data is not just security data — it is SEO intelligence.

BotExorcist should help SEO teams see which crawlers are touching the site, when they last appeared, which pages they requested, and whether their behaviour is useful, excessive, or suspicious.

Verified Crawlers

Google and Bing should be verified, not trusted only because of user-agent text.

Pages Crawled

Review the last URLs touched by crawler systems and whether they are hitting important pages.

Crawl Waste

Detect bots repeatedly requesting thin, deleted, old, duplicate, or low-value URLs.

Recommended label: non-protected verified crawlers that hammer the same slug should be treated as crawler policy issues, not human no-signal bounces.
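One common way to verify Google and Bing crawlers, documented by both search engines, is a reverse DNS lookup followed by a forward lookup that must return the original IP. The sketch below assumes that approach; whether BotExorcist uses it is not stated here, and the function names and suffix table are illustrative. The default forward lookup handles IPv4 only.

```python
import socket

# Hostname suffixes published by the search engines for their crawlers.
PROTECTED_SUFFIXES = {
    "google": (".googlebot.com", ".google.com"),
    "bing": (".search.msn.com",),
}


def _reverse(ip):
    return socket.gethostbyaddr(ip)[0]        # PTR hostname


def _forward(host):
    return socket.gethostbyname_ex(host)[2]   # list of IPv4 addresses


def verify_crawler(ip, reverse=_reverse, forward=_forward):
    """Return "google", "bing", or None if the IP cannot be verified."""
    try:
        host = reverse(ip).rstrip(".").lower()
        for name, suffixes in PROTECTED_SUFFIXES.items():
            # The hostname must match AND resolve back to the same IP.
            if host.endswith(suffixes) and ip in forward(host):
                return name
    except OSError:
        pass  # DNS failure: treat as unverified
    return None
```

The lookups are injectable so results can be cached or tested without network access. A spoofed user-agent fails this check because the requester does not control the search engine's DNS records.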

AI Traffic

AI user-agents need confidence labels.

AI-looking user-agents should not automatically be treated as verified official AI infrastructure. A request may claim to be ChatGPT, Claude, or another AI system, but user-agents can be spoofed.

Confidence Label	Meaning	Action
AI UA detected	The user-agent looks like an AI tool or crawler.	Watch or manually block if unwanted.
AI UA + hosting	The AI-looking request came from a hosting/cloud network.	Treat with caution.
AI UA + cached intel	BotExorcist has matching IP intelligence from previous activity.	Review history and repeat behaviour.
Known Meta AI range	Traffic matches known Meta AI infrastructure/range logic.	Allow, watch, or block by policy.
Protected verified crawler	Verified Google/Bing-style protected search crawler logic.	Do not block automatically.
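The labels in the table can be assigned with a small decision function. This is a sketch only: the input flags are hypothetical, and the priority order (most specific evidence wins) is an assumption, since the table does not say which label applies when several match.

```python
def ai_confidence_label(ua_looks_ai, is_hosting_network=False,
                        has_cached_intel=False, in_known_meta_range=False,
                        is_protected_verified=False):
    """Return the confidence label for a request, or None if none applies."""
    if is_protected_verified:
        return "Protected verified crawler"
    if in_known_meta_range:
        return "Known Meta AI range"
    if not ua_looks_ai:
        return None
    # Assumed priority: cached intel outranks a bare hosting match.
    if has_cached_intel:
        return "AI UA + cached intel"
    if is_hosting_network:
        return "AI UA + hosting"
    return "AI UA detected"


# Actions taken from the table above.
LABEL_ACTIONS = {
    "AI UA detected": "Watch or manually block if unwanted.",
    "AI UA + hosting": "Treat with caution.",
    "AI UA + cached intel": "Review history and repeat behaviour.",
    "Known Meta AI range": "Allow, watch, or block by policy.",
    "Protected verified crawler": "Do not block automatically.",
}
```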

PPC Integrity

A click is not intent. Arrival is not interest.

PPC Integrity is designed to observe paid-click behaviour without blocking from that screen. Its purpose is to preserve evidence around paid arrivals, no-signal sessions, missing follow-up sessions, out-of-area traffic, and suspicious post-click behaviour.

Observation Only

PPC Integrity should not block traffic directly. Blocking paid-click IPs can hide recurring patterns and reduce evidence quality.

Evidence First

Track landing page, referrer, paid marker, dwell, click/scroll signals, form events, and follow-up session presence.

Drive-by principle: a paid-click IP that arrives, does not park, does not knock, does not ring the bell, does not scroll, and leaves immediately has shown no auditable sign of genuine interest.
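The drive-by principle can be expressed as a check over the evidence fields listed under Evidence First. The record shape, field names, and the one-second dwell gate below are assumptions for illustration; the result is a label for review, never a block.

```python
from dataclasses import dataclass


@dataclass
class PaidArrival:
    # Hypothetical evidence record for one paid click.
    landing_page: str
    dwell_seconds: float
    clicks: int = 0
    scrolls: int = 0
    form_events: int = 0
    follow_up_session: bool = False


def is_drive_by(arrival, min_dwell_seconds=1.0):
    """True if the paid arrival left immediately with no engagement signal."""
    engaged = (arrival.clicks or arrival.scrolls
               or arrival.form_events or arrival.follow_up_session)
    return arrival.dwell_seconds <= min_dwell_seconds and not engaged
```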

Static Error Pages

403 and 404 pages should stay lightweight.

BotExorcist can use static root-level error pages for fast denial. These pages should not load WordPress for every blocked hit. That keeps denied traffic cheap and reduces server load.

403 Page

Used when access is denied. Best served statically for hard-blocked traffic.

404 Page

Used for missing pages and probe visibility. Should not become an expensive WordPress load path for bots.

Blocked Page

Can display branded messaging while still keeping the denial route simple and fast.

Do not over-report from static 403 pages: a live beacon on every denied hit increases load. Server-log import is the better long-term reporting route.
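Server-log import can be sketched as a pass over access-log lines that counts 403 responses per IP. This assumes the Apache common or combined log format; the regular expression and function name are illustrative, and a real importer would also need to handle log rotation and malformed lines.

```python
import re
from collections import Counter

# Matches the start of an Apache common/combined log line:
# host ident user [time] "request" status ...
LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" (?P<status>\d{3}) '
)


def count_denied(lines):
    """Count 403 responses per client IP; unparseable lines are skipped."""
    hits = Counter()
    for line in lines:
        match = LINE.match(line)
        if match and match.group("status") == "403":
            hits[match.group("ip")] += 1
    return hits
```

Running this on a schedule keeps denied-hit reporting out of the request path, so a static 403 stays cheap.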

Performance & I/O

The shortest safe route is the best route.

BotExorcist should not treat every request like a full investigation. Clean traffic should be cheap. Safe traffic should be cheap. Blocked traffic should be cheapest. Only suspicious traffic should become expensive.

Traffic Type	Ideal Route	Why
Safe traffic	Allow early, minimal logging.	Protects admins, editors, APIs, sitemaps, and verified search crawlers.
Blocked traffic	403 immediately, lightweight log only.	Stops repeat offenders without wasting WordPress resources.
Unknown traffic	Observe lightly and wait for signals.	Avoids unnecessary lookups for ordinary visitors.
Suspicious traffic	Score, enrich, and evaluate threshold.	Only bad behaviour deserves deeper investigation.

Efficiency rule: do less for clean traffic, almost nothing for already-blocked traffic, and full investigation only for suspicious traffic.
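The four routes in the table reduce to an ordered series of cheap checks. In this sketch the `Route` names and boolean inputs are hypothetical; the order follows the rules stated elsewhere in this documentation, with the safe lane first and active blocks second.

```python
from enum import Enum


class Route(Enum):
    ALLOW_EARLY = "allow early, minimal logging"
    DENY_403 = "403 immediately, lightweight log only"
    INVESTIGATE = "score, enrich, and evaluate threshold"
    OBSERVE = "observe lightly and wait for signals"


def choose_route(is_safe, is_blocked, is_suspicious):
    """Pick the cheapest adequate route for a request."""
    if is_safe:
        return Route.ALLOW_EARLY   # safe lane runs before anything else
    if is_blocked:
        return Route.DENY_403      # already-blocked traffic stays cheap
    if is_suspicious:
        return Route.INVESTIGATE   # only here does work get expensive
    return Route.OBSERVE
```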

Setup Guide

Recommended setup sequence

Install BotExorcist: Upload and activate the plugin from WordPress Plugins. Confirm the BotExorcist admin menu appears.

Start in Observation Mode: Let the plugin monitor traffic, build IP cards, detect no-signal patterns, and reveal crawler behaviour.

Review the Live War Room: Watch green, orange, and red traffic states. Confirm that obvious bots and repeat offenders are being scored.

Configure IP Control: Add trusted IPs to the allowlist. Review existing blocklist entries. Sync hard blocks where needed.

Verify crawler protection: Make sure Google and Bing are verified and protected, not trusted by user-agent alone.

Enable Protection Mode: Once confidence is high, enable active enforcement so breached IPs are blocked automatically.

Watch server load: Confirm blocked traffic is being denied cheaply and that admin/editor saves still work normally.

Troubleshooting

Common issues and what they mean

Issue	Likely Cause	Check
Save spinner never ends	Admin save/API route is being inspected or blocked too deeply.	Confirm logged-in/editor save routes are in the safe lane.
Blocked row shows 200	The request was historical, and the IP was blocked after the request.	Message should say “IP later blocked,” not “denied by active rule.”
Bing/Google stats show zero	Dashboard card may be using narrow crawler logic.	Use shared crawler stat definitions across all dashboards.
No-signal graph looks too low	Graph may be counting only one no-signal source.	Check that the graph includes the block table, clearance queue, daily rollups, and canonical statuses.
403 static hits missing from dashboard	.htaccess stopped the request before WordPress loaded.	This is intentional for performance. Use server logs for those hits.
Debug log shows wpdb::prepare notices	A query has mismatched placeholders or no placeholders.	Patch the plugin query source, not WordPress core.

FAQ

Frequently asked questions

Should I block Googlebot?

No. Verified Googlebot should be protected. Never trust Googlebot purely by user-agent text, but once verified, it should not be blocked or rate-limited.

Should I block Bingbot?

No. Verified Bingbot should also be protected. Bing search traffic and crawler visibility are useful SEO signals.

Can I block Yandex?

Yes. Yandex can be genuine and still not be protected. Verified non-protected crawlers can be watched, limited, or blocked by policy if their behaviour is not useful to your site.

Why do some URLs show only the slug?

Slug-only display saves table space. The full URL/path should still be stored in the raw evidence and visible through hover/details where needed.

Why are static 403 hits not always in BotExorcist stats?

Hard .htaccess denials can happen before WordPress loads. That reduces server load, but it also means WordPress-level dashboards may not see those hits.

Does Observation Mode still score offenders?

Yes. Observation Mode should keep scoring and recording breaches. It should simply avoid automatic blocking until Protection Mode is enabled.

What happens when Protection Mode is enabled?

IPs already over tolerance should be eligible for immediate enforcement, unless they are allowlisted or protected.

What is a no-signal visit?

A no-signal visit is a request that does not produce meaningful human behaviour, such as dwell, click, scroll, or other interaction within the configured gate.

Need help reading the evidence?

Use the Live War Room, IP Control, Crawler Intelligence, and PPC Integrity sections together. One row shows what happened now. The IP card shows what happened over time.
