Spam Detection¶
Comprehensive guide to Mailborder's multi-layered spam detection system.
Overview¶
Mailborder detects spam with several engines and checks working together:
- Rspamd - Machine learning and statistical analysis
- SpamAssassin - Rule-based content filtering
- RBL Checks - Real-time blacklist queries
- Bayesian Learning - Adaptive filtering based on your email
- Custom Rules - Organization-specific keywords and patterns
Detection Flow¶
Incoming Email
↓
Connection Check (RBL, GeoIP)
↓
Envelope Analysis (SPF, DKIM, DMARC)
↓
Content Scanning (Rspamd + SpamAssassin)
↓
Bayesian Analysis
↓
Custom Rules
↓
Score Calculation
↓
Verdict (Pass/Quarantine/Reject)
Spam Scoring System¶
Score Thresholds¶
Mailborder assigns a spam score to each email (0-20+):
- 0-6.0 - Pass (deliver normally)
- 6.0-20.0 - Quarantine (suspicious)
- 20.0+ - Reject (obvious spam)
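The threshold ranges above can be sketched as a small shell function. This is only an illustration: `spam_verdict` is a hypothetical helper, not a Mailborder command, and it assumes a score exactly equal to a threshold takes the stricter verdict, which the ranges above do not settle.

```bash
# Hypothetical helper: map a spam score to a verdict.
# Defaults mirror the documented thresholds (quarantine 6.0, reject 20.0).
# Assumption: a score equal to a threshold gets the stricter verdict.
spam_verdict() {
  local score="$1" quarantine="${2:-6.0}" reject="${3:-20.0}"
  # awk does the floating-point comparison that plain shell tests cannot.
  awk -v s="$score" -v q="$quarantine" -v r="$reject" 'BEGIN {
    if (s + 0 >= r + 0)      print "reject"
    else if (s + 0 >= q + 0) print "quarantine"
    else                     print "pass"
  }'
}

spam_verdict 7.5   # prints: quarantine
```

Passing a second and third argument tries out other thresholds, for example `spam_verdict 5.5 5.0 15.0` for the stricter settings described under Adjusting Thresholds.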
Default Configuration¶
sudo mb-config get spam.threshold.pass
# Output: 6.0
sudo mb-config get spam.threshold.quarantine
# Output: 6.0
sudo mb-config get spam.threshold.reject
# Output: 20.0
Adjusting Thresholds¶
Stricter (catches more spam, higher risk of false positives):
sudo mb-config set spam.threshold.pass 5.0
sudo mb-config set spam.threshold.quarantine 5.0
sudo mb-config set spam.threshold.reject 15.0
More lenient (fewer false positives, may miss some spam):
sudo mb-config set spam.threshold.pass 7.0
sudo mb-config set spam.threshold.quarantine 7.0
sudo mb-config set spam.threshold.reject 25.0
Apply changes:
Rspamd Engine¶
Overview¶
Rspamd is Mailborder's primary spam detection engine, using:
- Machine learning algorithms
- Neural networks
- Statistical analysis
- URL analysis
- MIME parsing
- Fuzzy hashing
Configuration¶
Check Rspamd status:
View Rspamd configuration:
Test email with Rspamd:
Example output:
Score: 7.5 / 15.0
Action: add header
RCPT_COUNT(0.00)[]
MISSING_SUBJECT(1.00)[]
MIME_HTML_ONLY(0.50)[]
R_SUSPICIOUS_URL(2.50)[example.com]
BAYES_SPAM(3.50)[95.23%]
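In the example output, the total score is the sum of the per-symbol scores. A minimal sketch of that arithmetic, assuming symbol lines shaped like `NAME(score)[options]` as shown above (`sum_symbol_scores` is a hypothetical helper, not part of Rspamd or Mailborder):

```bash
# Hypothetical helper: total the per-symbol scores read from stdin.
# Lines that are not of the form NAME(score)[...] are ignored.
sum_symbol_scores() {
  # Splitting on parentheses leaves the numeric score in the second field.
  awk -F'[()]' 'NF >= 3 && $1 ~ /^[A-Z0-9_]+$/ { total += $2 }
                END { printf "%.2f\n", total }'
}

sum_symbol_scores <<'EOF'
RCPT_COUNT(0.00)[]
MISSING_SUBJECT(1.00)[]
MIME_HTML_ONLY(0.50)[]
R_SUSPICIOUS_URL(2.50)[example.com]
BAYES_SPAM(3.50)[95.23%]
EOF
# prints: 7.50
```

The result matches the `Score: 7.5` line, which makes it easy to see which symbols contribute most to a verdict.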
Rspamd Modules¶
Enabled modules:
Key modules:
- bayes - Bayesian classifier
- dkim - DKIM signature checking
- spf - SPF validation
- dmarc - DMARC policy enforcement
- rbl - DNS blacklists
- phishing - Phishing detection
- fuzzy_check - Fuzzy hashing
Training Bayesian Filter¶
Learn spam:
Learn ham (legitimate email):
Check statistics:
Example output:
Statfile: BAYES_SPAM type: sqlite3; length: 1.50M; free blocks: 0; total blocks: 192.38k; free: 0.00%; learned: 5000; users: 1; languages: 0
Statfile: BAYES_HAM type: sqlite3; length: 2.20M; free blocks: 0; total blocks: 281.60k; free: 0.00%; learned: 10000; users: 1; languages: 0
Total learns: 15000
Fuzzy Hashing¶
Fuzzy hashing detects spam variants with similar content.
Add to fuzzy storage:
Check fuzzy match:
SpamAssassin Engine¶
Overview¶
SpamAssassin provides rule-based spam detection with thousands of tests:
- Content analysis
- Header analysis
- DNS checks
- Bayesian filtering
- Network tests
Configuration¶
Check SpamAssassin status:
Test email:
Example output:
X-Spam-Status: Yes, score=8.2 required=5.0
X-Spam-Report:
* 2.5 BAYES_50 BODY: Bayes spam probability is 40 to 60%
* 1.0 HTML_MESSAGE BODY: HTML included in message
* 2.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
* 1.5 MISSING_HEADERS Missing required headers
* 1.2 SUSPICIOUS_RECIPS Suspicious recipient addresses
Custom Rules¶
Add custom rule:
# Custom rules
header LOCAL_BLOCKED_WORD Subject =~ /viagra|cialis|pills/i
describe LOCAL_BLOCKED_WORD Subject contains blocked pharmaceutical terms
score LOCAL_BLOCKED_WORD 5.0
# uri rules test link targets, including ones only present in HTML hrefs
uri LOCAL_SUSPICIOUS_LINK /bit\.ly|tinyurl\.com/i
describe LOCAL_SUSPICIOUS_LINK Contains suspicious shortened URL
score LOCAL_SUSPICIOUS_LINK 2.0
Reload SpamAssassin:
Managing Rules¶
Update rules:
Test rule syntax:
Real-time Blacklists (RBL)¶
Configured RBLs¶
List active RBLs:
Default RBLs:
- zen.spamhaus.org (comprehensive blocklist)
- bl.spamcop.net (spam sources)
- dnsbl.sorbs.net (relay and spam sources)
- cbl.abuseat.org (compromised machines)
Managing RBLs¶
Add RBL:
Remove RBL:
Test IP against RBL:
If listed, returns:
Check your mail server IP:
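MY_IP=$(curl -4 -s ifconfig.me)   # -4 forces an IPv4 answer
# Reverse the octets and query the blocklist zone (dig does not need root)
dig +short "$(echo "$MY_IP" | awk -F. '{print $4"."$3"."$2"."$1}').zen.spamhaus.org"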
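The lookup above works because a DNSBL is queried by reversing the IPv4 octets and appending the blocklist zone. As a sketch, the name construction can be wrapped in a function that also rejects input that is not a dotted IPv4 address; `dnsbl_query_name` is a hypothetical helper name, and it performs no DNS query itself.

```bash
# Hypothetical helper: build the DNSBL query name for an IPv4 address.
# Exits non-zero (and prints nothing) if the input does not have four
# dot-separated parts, e.g. when an IPv6 address is passed in.
dnsbl_query_name() {
  local ip="$1" zone="$2"
  printf '%s\n' "$ip" | awk -F. -v zone="$zone" '
    NF == 4 { print $4 "." $3 "." $2 "." $1 "." zone; ok = 1 }
    END     { exit !ok }'
}

dnsbl_query_name 192.0.2.25 zen.spamhaus.org
# prints: 25.2.0.192.zen.spamhaus.org
```

The printed name can then be passed to `dig +short`. An empty answer means the address is not listed on that zone.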
RBL Best Practices¶
False Positives
RBLs can cause false positives. Monitor quarantine for legitimate emails.
Recommended weights:
- Spamhaus ZEN: 3.0 (highly trusted)
- SpamCop: 2.5 (trusted)
- SORBS: 2.0 (occasional false positives)
- Smaller lists: 1.0-1.5 (test carefully)
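To see how these weights add up against the score thresholds, here is a rough sketch that sums the weight of every list a sender appears on. It assumes the weights are simply added together, which this guide does not state explicitly, and both function names are hypothetical.

```bash
# Hypothetical helper: weight for one RBL zone, taken from the list above.
rbl_weight() {
  case "$1" in
    zen.spamhaus.org) echo 3.0 ;;
    bl.spamcop.net)   echo 2.5 ;;
    dnsbl.sorbs.net)  echo 2.0 ;;
    *)                echo 1.0 ;;  # assumed low end for smaller lists
  esac
}

# Hypothetical helper: total weight for all zones that list the sender.
# Assumption: individual RBL weights are additive.
rbl_score() {
  local zone total=0
  for zone in "$@"; do
    total=$(awk -v t="$total" -v w="$(rbl_weight "$zone")" \
      'BEGIN { printf "%.1f", t + w }')
  done
  echo "$total"
}

rbl_score zen.spamhaus.org bl.spamcop.net   # prints: 5.5
```

Under that assumption, a sender listed on both Spamhaus ZEN and SpamCop already sits just below the default 6.0 quarantine threshold before any content is scanned, which is why lower weights suit lists with known false positives.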
Greylisting¶
How It Works¶
Greylisting temporarily rejects email from unknown senders. Legitimate mail servers retry; spammers typically don't.
First Attempt → Temporary Reject (450)
↓
Wait 5 minutes
↓
Retry → Accept
↓
Add to whitelist (future emails pass immediately)
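The flow above can be sketched as a stateless decision function. This is an illustration only, not Mailborder's implementation: `greylist_decision` is a hypothetical name, timestamps are passed in as epoch seconds so the logic can be tried without waiting, and the whitelist lifetime is assumed to be counted from the moment the sender was whitelisted. The two constants use the delay and expiry values from the configuration in this section.

```bash
GREYLIST_DELAY=300      # seconds a new sender must wait before retrying
GREYLIST_EXPIRE=604800  # seconds a whitelist entry stays valid (7 days)

# Hypothetical helper: decide what to do with one delivery attempt.
#   $1 = time of the sender's first attempt ("" if never seen)
#   $2 = current time
#   $3 = time the sender was whitelisted (optional, "" if not)
greylist_decision() {
  local first_seen="$1" now="$2" whitelisted_at="${3:-}"
  if [ -n "$whitelisted_at" ]; then
    # A whitelist entry that is still valid lets the message through.
    if [ $((now - whitelisted_at)) -lt "$GREYLIST_EXPIRE" ]; then
      echo "accept"
      return
    fi
    first_seen=""  # expired entry: treat the sender as new again
  fi
  if [ -z "$first_seen" ] || [ $((now - first_seen)) -lt "$GREYLIST_DELAY" ]; then
    # Unknown sender, or a retry that came back too soon.
    echo "tempfail 450"
  else
    # Retry after the delay: accept and remember the sender.
    echo "accept and whitelist"
  fi
}

greylist_decision "" 1000     # prints: tempfail 450
greylist_decision 1000 1300   # prints: accept and whitelist
```

A real implementation would also persist the first-attempt and whitelist timestamps per sender between calls; the function above leaves that to the caller.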
Configuration¶
Enable greylisting:
sudo mb-config set greylist.enabled true
sudo mb-config set greylist.delay 300 # 5 minutes
sudo mb-config set greylist.expire 604800 # 7 days
Whitelist known good senders:
Monitor greylisting:
Greylisting Exceptions¶
Automatic whitelist after:
- First successful delivery
- Domain reputation score > threshold
- SPF/DKIM/DMARC all pass
Custom Spam Keywords¶
Managing Keywords¶
Add blocked keyword:
sudo mb-spam-keyword add "viagra" --score 5.0
sudo mb-spam-keyword add "nigerian prince" --score 10.0
sudo mb-spam-keyword add "bitcoin wallet" --score 3.0
List keywords:
Remove keyword:
Keyword Types¶
Subject keywords:
Body keywords:
From address patterns:
Regular Expressions¶
Pattern matching:
sudo mb-spam-keyword add "/v[i1][a@]gr[a@]/" --regex --score 5.0
sudo mb-spam-keyword add "/c[i1][a@]l[i1]s/" --regex --score 5.0
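Before adding a pattern like the ones above, it can help to try it against a few sample subjects. The sketch below uses `grep -E` as a stand-in; Mailborder's own regex engine and case handling are not described here, so treat matches as approximate. `matches_pattern` is a hypothetical helper, and the pattern is given without the surrounding slashes.

```bash
# Hypothetical helper: succeed if the text matches the extended regex.
# Matching is case-sensitive here; whether Mailborder ignores case is
# not specified in this guide.
matches_pattern() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

for subject in "cheap v1@gra today" "vi@gr@ sale" "Vagrant box update"; do
  if matches_pattern 'v[i1][a@]gr[a@]' "$subject"; then
    echo "MATCH:    $subject"
  else
    echo "no match: $subject"
  fi
done
```

The first two subjects match and the third does not, which is a quick way to confirm a pattern catches obfuscated spellings without hitting ordinary words.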
Bayesian Learning¶
Training Data¶
Collect training samples:
Train Rspamd:
Train SpamAssassin:
Ongoing Training¶
From quarantine:
# User reports false positive
sudo mb-quarantine-release <message-id> --learn-ham
# User confirms spam
sudo mb-quarantine-delete <message-id> --learn-spam
Check Learning Progress¶
Rspamd statistics:
SpamAssassin statistics:
Example output:
0.000 0 3 0 non-token data: bayes db version
0.000 0 10000 0 non-token data: nspam
0.000 0 5000 0 non-token data: nham
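The `nspam` and `nham` lines show how many messages of each kind have been learned. With stock settings, SpamAssassin's Bayes classifier stays inactive until it has learned at least 200 of each (`bayes_min_spam_num` and `bayes_min_ham_num`); a Mailborder installation may override these. A small sketch that reads output in the format above and checks both counts, where `bayes_ready` is a hypothetical helper:

```bash
# Hypothetical helper: read statistics lines on stdin, print both counts,
# and exit 0 only if spam and ham each reach the minimum (default 200).
bayes_ready() {
  awk -v min="${1:-200}" '
    $NF == "nspam" { spam = $3 }
    $NF == "nham"  { ham = $3 }
    END {
      printf "spam=%d ham=%d\n", spam, ham
      exit !(spam >= min && ham >= min)
    }'
}

bayes_ready <<'EOF'
0.000 0 3 0 non-token data: bayes db version
0.000 0 10000 0 non-token data: nspam
0.000 0 5000 0 non-token data: nham
EOF
# prints: spam=10000 ham=5000
```

The exit status makes the check easy to use in a monitoring script, for example `some-stats-command | bayes_ready || echo "Bayes needs more training"`, with the real statistics command substituted in.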
Performance Tuning¶
Rspamd Workers¶
Adjust worker count:
Caching¶
Enable caching:
Cache hit statistics:
Scan Limits¶
Message size limits:
Timeout settings:
Monitoring Spam Detection¶
Statistics¶
Spam detection rates:
Example output:
Period: Last 24 hours
Total messages: 10,245
Passed: 8,120 (79.2%)
Quarantined: 1,890 (18.4%)
Rejected: 235 (2.3%)
Average spam score: 3.2
False positive rate: 0.8%
Per-domain statistics:
Real-time Monitoring¶
Watch spam filter log:
Watch Rspamd log:
Alerts¶
Configure threshold alerts:
sudo mb-config set spam.alert.threshold 50 # Alert if >50 spam/hour
sudo mb-config set spam.alert.email admin@example.com
Troubleshooting¶
Too Much Spam Getting Through¶
Symptoms:
- Spam reaching users' inboxes
- Low spam detection rate
Solutions:
- Lower thresholds:
- Enable all RBLs:
- Train Bayesian filter:
- Enable greylisting:
Too Many False Positives¶
Symptoms:
- Legitimate email in quarantine
- User complaints about missing emails
Solutions:
- Raise thresholds:
- Whitelist legitimate senders:
- Train with false positives:
- Review RBL weights:
Rspamd Not Working¶
Check service:
Test manually:
Restart Rspamd:
SpamAssassin Slow¶
Check processing time:
Disable slow rules:
Update and optimize:
Best Practices¶
Initial Setup¶
- Start with default thresholds - Adjust after monitoring
- Enable major RBLs - Spamhaus, SpamCop, Barracuda
- Collect training data - Both spam and legitimate email
- Enable greylisting - Effective with minimal impact
- Monitor for one week - Gather statistics before tuning
Ongoing Maintenance¶
- Review quarantine weekly - Check for false positives
- Monthly training - Update Bayesian classifiers
- Update signatures - Run mb-update regularly
- Monitor statistics - Watch false positive rate
- User feedback - Encourage spam reports
Whitelisting Strategy¶
Whitelist by priority:
1. Corporate partners
2. Critical services (banking, utilities)
3. Mailing lists
4. Known vendors
Don't whitelist:
- Free email providers (Gmail, Yahoo) - too broad
- Entire domains unless absolutely necessary
- Senders based on a single email - wait for a pattern
Performance Optimization¶
- Limit scan size - Skip huge attachments
- Cache aggressively - Reduce redundant scans
- Tune worker counts - Match CPU cores
- Disable unnecessary checks - Balance security vs. performance
See Also¶
- Virus Scanning - Antivirus configuration
- Whitelist/Blacklist - Sender filtering
- Quarantine Management - Managing quarantined emails
- SPF/DKIM/DMARC - Email authentication
- Configuration Reference - Detailed settings