Setup Guide, User Manual & Firewall Intelligence
Learn how to install BotExorcist, use Observation Mode, activate Protection Mode, understand IP scoring, review crawler intelligence, manage blocks, and protect your WordPress site from no-signal bot traffic.
BotExorcist separates real traffic from automated noise.
BotExorcist is a WordPress bot protection and SEO firewall system designed to expose traffic that ordinary analytics often hides or blends into normal visitor data. It monitors requests, identifies no-signal behaviour, scores repeat offenders, protects verified search crawlers, and gives you direct control over IPs, CIDRs, ranges, ASNs, and suspicious traffic sources.
Find short visits, no-signal sessions, crawler noise, 404 probes, parameter abuse, and repeat offenders.
Build evidence against IPs over time instead of treating every request as isolated.
Stop hostile IPs, ranges, CIDRs, and ASNs depending on the evidence and active protection mode.
Surface live traffic intelligence, crawler behaviour, PPC integrity signals, and blocked traffic evidence.
Core idea: real users send signals. Bots often arrive, hit a URL, do nothing useful, and disappear. BotExorcist helps you see that difference.
Observation Mode vs Protection Mode
BotExorcist can operate in a monitoring-first mode or an active enforcement mode. The scoring system should continue in both modes, but enforcement only happens when Protection Mode is enabled.
Observation Mode:
- Records suspicious traffic.
- Builds offence scores.
- Shows which IPs would have breached tolerance.
- Does not automatically block offenders.
- Useful for learning the traffic profile before enforcement.
Protection Mode:
- Continues scoring every offender.
- Blocks IPs that breach configured tolerance.
- Uses active block short-circuiting for already-blocked traffic.
- Can promote eligible blocks into hard enforcement.
- Stops repeat abuse from consuming WordPress resources.
Important: switching from Observation Mode to Protection Mode should immediately enforce against IP cards already over the configured tolerance threshold, unless they are allowlisted or protected.
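The mode-switch rule can be pictured with a short Python sketch. It is illustrative only: the `IpCard` fields, the function name, and the choice to treat a score equal to the tolerance as a breach are assumptions, not BotExorcist internals.

```python
from dataclasses import dataclass


@dataclass
class IpCard:
    """Hypothetical stand-in for the evidence BotExorcist keeps per IP."""
    ip: str
    score: int
    allowlisted: bool = False
    protected: bool = False


def ips_to_enforce_on_switch(cards, tolerance):
    """Return the IPs that become enforceable the moment Protection Mode turns on.

    Assumption: a score at or above the tolerance counts as a breach.
    Allowlisted and protected cards are always skipped.
    """
    return [
        card.ip
        for card in cards
        if card.score >= tolerance and not (card.allowlisted or card.protected)
    ]


cards = [
    IpCard("203.0.113.10", score=7),
    IpCard("203.0.113.11", score=2),
    IpCard("198.51.100.5", score=9, allowlisted=True),
    IpCard("192.0.2.44", score=6, protected=True),
]
to_block = ips_to_enforce_on_switch(cards, tolerance=5)
```

Only the first card is selected here: the second is under tolerance, and the other two are exempt despite their scores.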
Reading the Live Signal Feed
The Live War Room shows recent traffic activity and labels each row by behaviour, risk, status, crawler identity, referrer, URL slug, and enforcement state. It is designed for fast review rather than deep forensic storage.
| Signal | Meaning | Recommended Reading |
|---|---|---|
| Green / Normal | No major breach, no repeat abuse, and no elevated risk. | Generally safe traffic unless future behaviour changes. |
| Orange / Watch | IP has suspicious evidence or offence score above normal, but has not breached tolerance. | Worth watching. Not necessarily block-worthy yet. |
| Red / Blocked | IP is blocked, has breached tolerance, or is being denied by an active rule. | Confirmed enforcement or high-risk state. |
| CRAWLER | Traffic has crawler-like identity or verified crawler intelligence. | Google and Bing are protected when verified. Other crawlers are subject to policy. |
| BL | The IP currently appears on the blocklist. | Check whether the row is a historical request or a live denied request. |
Display rule: URL columns may show only the slug to save space, while the full path remains stored in the evidence.
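As a rough illustration of the display rule, a slug can be derived from the stored path at render time, so the stored evidence itself is never shortened. The function name `display_slug` is hypothetical.

```python
from urllib.parse import urlsplit


def display_slug(url):
    """Return the last non-empty path segment for display, or "/" for the root.

    The full URL or path is left untouched in storage; this only affects
    what a table cell shows.
    """
    path = urlsplit(url).path
    segments = [segment for segment in path.split("/") if segment]
    return segments[-1] if segments else "/"
```

For example, `display_slug("/blog/2024/my-post/?utm_source=x")` yields `my-post`, while the raw evidence still holds the full path and query string.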
Traffic colours should reflect current risk.
Green: Normal traffic. No major offence history. No known repeat abuse. Not in breach of tolerance.
Orange: Suspicious or worth watching. Usually used when offence score is elevated but not yet over the enforcement threshold.
Red: Blocked, in breach, or denied. This should be used for active block states, threshold breaches, and confirmed hostile patterns.
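One way to read the three colour states is as a small decision function. This is a sketch under assumptions: it treats any score above zero as "elevated" and a score at or above tolerance as a breach, which may not match the plugin's actual thresholds.

```python
def row_colour(score, tolerance, blocked):
    """Map an IP's current state to a feed colour.

    Assumptions: "elevated" means any score above zero, and reaching the
    tolerance counts as a breach.
    """
    if blocked or score >= tolerance:
        return "red"
    if score > 0:
        return "orange"
    return "green"
```

Checking the blocked and breached states first means a blocked IP never shows as orange, even when its score is low.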
How offence scoring should work
BotExorcist should record suspicious behaviour into an IP memory system. The score should rise when an IP repeats the same hostile or no-signal pattern. A single event may be suspicious. Repeated events become evidence.
| Behaviour | Why It Matters | Typical Outcome |
|---|---|---|
| 1-second / no-signal visit | The request arrived but produced no meaningful human signal. | Score increment and watch state. |
| Repeated no-signal events | Same IP repeatedly arrives without engagement. | Escalate score and block when threshold is breached. |
| 404 probing | Bot scans for missing, old, or vulnerable URLs. | Score, evidence log, and possible block. |
| Parameter probing | IP tests query strings, search URLs, or suspicious parameter combinations. | Score and block after repeat tolerance. |
| Robots violation | Requester ignores disallowed paths. | Policy-based warning, score, or enforcement. |
| Same URL repeat abuse | IP or crawler keeps hammering the same slug without useful crawl diversity. | Watch, score, rate-limit, or block depending on policy. |
Rule: an IP with 5 no-signal style events within 72 hours should be considered over tolerance in Protection Mode, unless allowlisted or protected.
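The 5-events-in-72-hours rule amounts to a sliding-window count. The sketch below shows that arithmetic only; the function and constant names are illustrative, and the allowlist and protected checks are assumed to happen elsewhere.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=72)   # look-back period from the rule above
TOLERANCE = 5                  # no-signal style events allowed before breach


def over_tolerance(event_times, now, window=WINDOW, tolerance=TOLERANCE):
    """Return True when enough events fall inside the look-back window.

    Events older than the window, or timestamped after `now`, are ignored.
    """
    recent = [t for t in event_times if timedelta(0) <= now - t <= window]
    return len(recent) >= tolerance
```

An IP with events 1, 10, 30, 50, and 71 hours ago is over tolerance. If the oldest event were 80 hours ago, only four would count and the IP would stay in the watch state.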
The enforcement decision must follow one route.
The Violation Governor is the logic layer that receives a violation, updates the offender memory, checks the enforcement threshold, and decides whether to block. The dashboard should display the decision, not recalculate it separately.
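A minimal sketch of the single-route idea follows. Every violation goes through one `report` call, and the returned `Decision` is what a dashboard would display. The class layout, field names, and action strings are assumptions made for illustration, not the plugin's real interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    ip: str
    score: int
    breached: bool
    action: str  # "none", "would_block", or "block"


class ViolationGovernor:
    """Hypothetical single decision point for all violations."""

    def __init__(self, tolerance, protection_mode):
        self.tolerance = tolerance
        self.protection_mode = protection_mode
        self.scores = defaultdict(int)  # offender memory: ip -> score

    def report(self, ip, weight=1):
        """Record one violation, update memory, and return the decision."""
        self.scores[ip] += weight
        score = self.scores[ip]
        breached = score >= self.tolerance
        if not breached:
            action = "none"
        elif self.protection_mode:
            action = "block"
        else:
            action = "would_block"  # Observation Mode: record, do not enforce
        return Decision(ip, score, breached, action)
```

Because the dashboard only renders `Decision` objects, there is no second place where the threshold is recalculated and can drift out of step.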
Protected traffic must pass through first.
Safe-lane checks should run before scoring, rate limiting, ASN blocking, or no-signal detection. This protects the site owner, WordPress admin actions, legitimate builder saves, and verified search crawlers.
Safe lane (passes through first):
- Logged-in admin/editor save routes.
- Divi builder saves.
- WordPress admin AJAX where appropriate.
- Whitelisted IPs.
- Sitemap routes.
- Verified Googlebot and Bingbot.
Not automatically protected (still subject to scoring and policy):
- Yandex and other non-protected verified crawlers.
- AI crawler user-agents.
- Hosting network traffic.
- Cloud infrastructure visitors.
- Suspicious but not-yet-breached IPs.
Important distinction: verified crawler does not automatically mean protected crawler. Google and Bing are protected when verified. Other crawlers can still be watched, limited, or blocked by policy.
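The verified-versus-protected distinction can be sketched as a two-step check. The lane names and the `PROTECTED_CRAWLERS` set are illustrative assumptions; how verification itself is performed is outside this snippet.

```python
# Assumption: only these two crawler families are protected once verified.
PROTECTED_CRAWLERS = {"googlebot", "bingbot"}


def crawler_lane(claimed_name, verified):
    """Decide how a crawler claim should be treated.

    - "unverified": the user-agent claim alone earns no special handling.
    - "protected":  verified Google or Bing, passes through the safe lane.
    - "policy":     verified, but still watched, limited, or blocked by policy.
    """
    if not verified:
        return "unverified"
    if claimed_name.lower() in PROTECTED_CRAWLERS:
        return "protected"
    return "policy"
```

A verified Yandex crawler lands in the `policy` lane, and a request that merely claims to be Googlebot stays `unverified`.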
Already blocked traffic should be cheap traffic.
Once an IP, CIDR, range, or ASN has already been blocked, BotExorcist should not investigate it again on every request. The best performance route is to deny it immediately and record only what it tried.
| Block Type | Purpose | Best Use |
|---|---|---|
| IP Block | Blocks one specific IP address. | Manual blocks, repeat offenders, clear single-source abuse. |
| CIDR / Range Block | Blocks a broader network range. | When multiple abusive IPs appear from the same network. |
| ASN Block | Blocks or flags an autonomous system/network owner. | Only when evidence is strong and protected networks are excluded. |
| Allowlist | Exempts trusted IPs or verified crawler infrastructure. | Own IP, known admin IPs, verified Google/Bing crawler handling. |
Performance rule: active block checks should happen early. If blocked, return 403 immediately, skip deep investigation, and write only a lightweight log entry.
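The early block check can be sketched with Python's standard `ipaddress` module. The `BlockList` class and `handle_request` function are hypothetical, and ASN matching is left out because it needs an external IP-to-ASN data source.

```python
import ipaddress


class BlockList:
    """Hypothetical in-memory blocklist of single IPs and CIDR ranges."""

    def __init__(self, ips=(), cidrs=()):
        self.ips = {ipaddress.ip_address(ip) for ip in ips}
        self.networks = [ipaddress.ip_network(cidr) for cidr in cidrs]

    def is_blocked(self, ip):
        addr = ipaddress.ip_address(ip)
        if addr in self.ips:
            return True
        return any(addr in network for network in self.networks)


def handle_request(ip, path, blocklist, log):
    """Deny blocked traffic immediately; return None to continue processing."""
    if blocklist.is_blocked(ip):
        log.append((ip, path))  # lightweight entry: who tried what
        return 403
    return None  # not blocked: later stages decide what to do
```

Nothing beyond a set lookup and a range scan runs for a blocked request: no scoring, no enrichment, and no crawler verification.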
Crawler data is not just security data; it is SEO intelligence.
BotExorcist should help SEO teams see which crawlers are touching the site, when they last appeared, which pages they requested, and whether their behaviour is useful, excessive, or suspicious.
Google and Bing should be verified, not trusted only because of user-agent text.
Review the last URLs touched by crawler systems and whether they are hitting important pages.
Detect bots repeatedly requesting thin, deleted, old, duplicate, or low-value URLs.
Recommended label: non-protected verified crawlers that hammer the same slug should be treated as crawler policy issues, not human no-signal bounces.
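Verifying Google and Bing by more than user-agent text is commonly done with a reverse DNS lookup followed by a forward confirmation. The sketch below shows that general technique and is not a description of BotExorcist's own implementation. The domain suffixes reflect the search engines' published guidance as understood here and should be checked against their current documentation. The resolver functions are injectable so the logic can be tested without network access.

```python
import socket

# Assumption: hostname suffixes taken from public vendor guidance.
CRAWLER_DOMAINS = {
    "googlebot": (".googlebot.com", ".google.com"),
    "bingbot": (".search.msn.com",),
}


def _reverse(ip):
    """Reverse DNS: IP address to primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname):
    """Forward DNS: hostname to a list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def verify_crawler(ip, crawler, reverse=_reverse, forward=_forward):
    """Return True only if reverse and forward DNS both support the claim."""
    suffixes = CRAWLER_DOMAINS.get(crawler)
    if suffixes is None:
        return False
    try:
        hostname = reverse(ip).rstrip(".").lower()
        if not hostname.endswith(suffixes):
            return False
        return ip in forward(hostname)  # forward lookup must map back to the IP
    except OSError:  # covers socket.herror and socket.gaierror
        return False
```

DNS lookups are slow compared with the rest of a request, so results would normally be cached per IP rather than repeated on every hit.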
AI user-agents need confidence labels.
AI-looking user-agents should not automatically be treated as verified official AI infrastructure. A request may claim to be ChatGPT, Claude, or another AI system, but user-agents can be spoofed.
| Confidence Label | Meaning | Action |
|---|---|---|
| AI UA detected | The user-agent looks like an AI tool or crawler. | Watch or manually block if unwanted. |
| AI UA + hosting | The AI-looking request came from a hosting/cloud network. | Treat with caution. |
| AI UA + cached intel | BotExorcist has matching IP intelligence from previous activity. | Review history and repeat behaviour. |
| Known Meta AI range | Traffic matches known Meta AI infrastructure/range logic. | Allow, watch, or block by policy. |
| Protected verified crawler | Verified Google/Bing-style protected search crawler logic. | Do not block automatically. |
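The first four labels in the table can be assigned by a simple precedence check. The ordering used here (known range, then cached intel, then hosting, then the bare user-agent match) is an assumption, as are the parameter names. Protected search crawlers are handled by separate logic and are not covered.

```python
def ai_confidence_label(ua_looks_ai, from_hosting=False,
                        has_cached_intel=False, in_known_meta_range=False):
    """Return a confidence label for an AI-looking request, or None.

    Assumption: the most specific evidence wins when several flags are set.
    """
    if not ua_looks_ai:
        return None
    if in_known_meta_range:
        return "Known Meta AI range"
    if has_cached_intel:
        return "AI UA + cached intel"
    if from_hosting:
        return "AI UA + hosting"
    return "AI UA detected"
```

None of these labels says the request is official AI infrastructure. They only describe how much supporting evidence sits behind the user-agent claim.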
A click is not intent. Arrival is not interest.
PPC Integrity is designed to observe paid-click behaviour without blocking from that screen. Its purpose is to preserve evidence around paid arrivals, no-signal sessions, missing follow-up sessions, out-of-area traffic, and suspicious post-click behaviour.
PPC Integrity should not block traffic directly. Blocking paid-click IPs can hide recurring patterns and reduce evidence quality.
Track landing page, referrer, paid marker, dwell, click/scroll signals, form events, and follow-up session presence.
Drive-by principle: a paid-click IP that arrives, does not park, does not knock, does not ring the bell, does not scroll, and leaves immediately is not auditable genuine interest.
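The drive-by principle can be expressed as a check over the tracked signals. This is a sketch only: the `PaidSession` fields mirror the signals listed above, but the field names and the 2-second dwell gate are assumptions. In keeping with the observe-only rule, the function flags a session and never blocks.

```python
from dataclasses import dataclass


@dataclass
class PaidSession:
    """Hypothetical record of one paid-click arrival."""
    dwell_seconds: float
    clicks: int = 0
    scrolls: int = 0
    form_events: int = 0
    follow_up_session: bool = False


def is_drive_by(session, min_dwell_seconds=2.0):
    """Flag a paid arrival that left quickly and produced no signal at all."""
    has_signal = bool(
        session.clicks
        or session.scrolls
        or session.form_events
        or session.follow_up_session
    )
    return session.dwell_seconds < min_dwell_seconds and not has_signal
```

A single scroll or a later follow-up session is enough to clear the flag, which keeps the label reserved for arrivals with nothing auditable behind them.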
403 and 404 pages should stay lightweight.
BotExorcist can use static root-level error pages for fast denial. These pages should not load WordPress for every blocked hit. That keeps denied traffic cheap and reduces server load.
403 page: used when access is denied. Best served statically for hard-blocked traffic.
404 page: used for missing pages and probe visibility. Should not become an expensive WordPress load path for bots.
Either page can display branded messaging while still keeping the denial route simple and fast.
Do not over-report from static 403 pages: a live beacon on every denied hit increases load. Server-log import is the better long-term reporting route.
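A server-log import could look roughly like the sketch below, which counts 403 responses per IP from access-log lines. It assumes the widely used Apache common/combined log layout. The regular expression and function name are illustrative, and a real server may use a different log format.

```python
import re
from collections import Counter

# Assumption: lines follow the Apache common/combined log layout:
# ip ident user [time] "request" status size ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) '
)


def count_denied_hits(lines):
    """Count 403 responses per IP, skipping lines that do not parse."""
    denied = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "403":
            denied[match.group("ip")] += 1
    return denied
```

Run periodically against rotated logs, this gives denied-hit totals without adding any work to the denial path itself.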
The shortest safe route is the best route.
BotExorcist should not treat every request like a full investigation. Clean traffic should be cheap. Safe traffic should be cheap. Blocked traffic should be cheapest. Only suspicious traffic should become expensive.
| Traffic Type | Ideal Route | Why |
|---|---|---|
| Safe traffic | Allow early, minimal logging. | Protects admins, editors, APIs, sitemaps, and verified search crawlers. |
| Blocked traffic | 403 immediately, lightweight log only. | Stops repeat offenders without wasting WordPress resources. |
| Unknown traffic | Observe lightly and wait for signals. | Avoids unnecessary lookups for ordinary visitors. |
| Suspicious traffic | Score, enrich, and evaluate threshold. | Only bad behaviour deserves deeper investigation. |
Efficiency rule: do less for clean traffic, almost nothing for already-blocked traffic, and full investigation only for suspicious traffic.
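The routing table can be condensed into one ordered check, sketched below. The route names are made up for illustration, and the three boolean inputs are assumed to come from cheap lookups performed earlier in the request.

```python
def choose_route(is_safe, is_blocked, is_suspicious):
    """Pick the cheapest route that is still safe for this request.

    Order matters: the safe lane is checked first, then active blocks,
    and only then is any deeper work considered.
    """
    if is_safe:
        return "allow_early"          # minimal logging
    if is_blocked:
        return "deny_403"             # lightweight log only
    if is_suspicious:
        return "score_and_evaluate"   # the only expensive route
    return "observe_lightly"          # unknown traffic: wait for signals
```

Putting the safe lane first means an allowlisted admin IP is never denied, even if it also falls inside a blocked range.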
Recommended setup sequence
Common issues and what they mean
| Issue | Likely Cause | Check |
|---|---|---|
| Save spinner never ends | Admin save/API route is being inspected or blocked too deeply. | Confirm logged-in/editor save routes are in the safe lane. |
| Blocked row shows 200 | The request was historical, and the IP was blocked after the request. | Message should say “IP later blocked,” not “denied by active rule.” |
| Bing/Google stats show zero | Dashboard card may be using narrow crawler logic. | Use shared crawler stat definitions across all dashboards. |
| No-signal graph looks too low | Graph may be counting only one no-signal source. | Include block table, clearance queue, daily rollups, and canonical statuses carefully. |
| 403 static hits missing from dashboard | .htaccess stopped the request before WordPress loaded. | This is intentional for performance. Use server logs for those hits. |
| Debug log shows wpdb::prepare notices | A query has mismatched placeholders or no placeholders. | Patch the plugin query source, not WordPress core. |
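For the "Blocked row shows 200" case in the table above, the row message can be chosen by comparing the request time with the time the block was created. The function name and return values are illustrative assumptions.

```python
from datetime import datetime


def block_note(request_time, blocked_at):
    """Choose the message for a feed row whose IP is on the blocklist.

    Returns None when the IP has no block on record.
    """
    if blocked_at is None:
        return None
    if request_time < blocked_at:
        return "IP later blocked"        # historical request, served normally
    return "denied by active rule"       # request arrived after the block
```

A row logged an hour before the block gets the historical message, which explains why it can carry a 200 status without contradicting the blocklist.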
Frequently asked questions
Should I block Googlebot?
No. Verified Googlebot should be protected. Never trust Googlebot purely by user-agent text, but once verified, it should not be blocked or rate-limited.
Should I block Bingbot?
No. Verified Bingbot should also be protected. Bing search traffic and crawler visibility are useful SEO signals.
Can I block Yandex?
Yes. Yandex can be genuine and still not be protected. Verified non-protected crawlers can be watched, limited, or blocked by policy if their behaviour is not useful to your site.
Why do some URLs show only the slug?
Slug-only display saves table space. The full URL/path should still be stored in the raw evidence and visible through hover/details where needed.
Why are static 403 hits not always in BotExorcist stats?
Hard .htaccess denials can happen before WordPress loads. That reduces server load, but it also means WordPress-level dashboards may not see those hits.
Does Observation Mode still score offenders?
Yes. Observation Mode should keep scoring and recording breaches. It should simply avoid automatic blocking until Protection Mode is enabled.
What happens when Protection Mode is enabled?
IPs already over tolerance should be eligible for immediate enforcement, unless they are allowlisted or protected.
What is a no-signal visit?
A no-signal visit is a request that does not produce meaningful human behaviour, such as dwell, click, scroll, or other interaction within the configured gate.
Need help reading the evidence?
Use the Live War Room, IP Control, Crawler Intelligence, and PPC Integrity sections together. One row shows what happened now. The IP card shows what happened over time.