Spam Detection¶

Comprehensive guide to Mailborder's multi-layered spam detection system.

Overview¶

Mailborder uses a sophisticated multi-engine approach to spam detection, combining:

Rspamd - Machine learning and statistical analysis
SpamAssassin - Rule-based content filtering
RBL Checks - Real-time blacklist queries
Bayesian Learning - Adaptive filtering based on your email
Custom Rules - Organization-specific keywords and patterns

Detection Flow¶

Incoming Email
    ↓
Connection Check (RBL, GeoIP)
    ↓
Envelope Analysis (SPF, DKIM, DMARC)
    ↓
Content Scanning (Rspamd + SpamAssassin)
    ↓
Bayesian Analysis
    ↓
Custom Rules
    ↓
Score Calculation
    ↓
Verdict (Pass/Quarantine/Reject)

Spam Scoring System¶

Score Thresholds¶

Mailborder assigns a spam score to each email (0-20+):

Below 6.0 - Pass (deliver normally)
6.0 to below 20.0 - Quarantine (suspicious)
20.0 and above - Reject (obvious spam)
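The threshold bands can be sketched as a small shell function. This is an illustration only, not how Mailborder computes verdicts internally; `classify_score` is a hypothetical helper, and it assumes a score that lands exactly on a threshold belongs to the higher band.

```shell
#!/bin/sh
# Map a spam score to a verdict using the default thresholds from this guide.
# Assumption: a score exactly on a threshold falls into the higher band.
classify_score() {
    score="$1"
    quarantine_at="${2:-6.0}"   # spam.threshold.quarantine
    reject_at="${3:-20.0}"      # spam.threshold.reject
    awk -v s="$score" -v q="$quarantine_at" -v r="$reject_at" 'BEGIN {
        if (s + 0 >= r + 0)      print "reject"
        else if (s + 0 >= q + 0) print "quarantine"
        else                     print "pass"
    }'
}

classify_score 3.2    # prints: pass
classify_score 7.5    # prints: quarantine
classify_score 22.0   # prints: reject
```

The optional second and third arguments let you try other thresholds, for example `classify_score 6.5 7.0 25.0`.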

Default Configuration¶

sudo mb-config get spam.threshold.pass
# Output: 6.0

sudo mb-config get spam.threshold.quarantine
# Output: 6.0

sudo mb-config get spam.threshold.reject
# Output: 20.0

Adjusting Thresholds¶

Stricter (catches more spam, at the risk of more false positives):

sudo mb-config set spam.threshold.pass 5.0
sudo mb-config set spam.threshold.quarantine 5.0
sudo mb-config set spam.threshold.reject 15.0

More lenient (fewer false positives, may miss some spam):

sudo mb-config set spam.threshold.pass 7.0
sudo mb-config set spam.threshold.quarantine 7.0
sudo mb-config set spam.threshold.reject 25.0

Apply changes:

sudo mb-config reload
sudo systemctl restart mb-filter
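Before applying new values, it can help to confirm they are still in ascending order. A minimal sketch, assuming the pass threshold must not exceed the quarantine threshold and the quarantine threshold must stay below the reject threshold; `validate_thresholds` is a hypothetical helper, not a Mailborder command.

```shell
#!/bin/sh
# Succeed only when pass <= quarantine < reject.
# Hypothetical helper: the ordering rule itself is an assumption.
validate_thresholds() {
    awk -v p="$1" -v q="$2" -v r="$3" 'BEGIN {
        exit !((p + 0 <= q + 0) && (q + 0 < r + 0))
    }'
}

if validate_thresholds 5.0 5.0 15.0; then
    echo "thresholds are consistent"
else
    echo "thresholds are out of order" >&2
fi
```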

Rspamd Engine¶

Overview¶

Rspamd is Mailborder's primary spam detection engine, using:

Machine learning algorithms
Neural networks
Statistical analysis
URL analysis
MIME parsing
Fuzzy hashing

Configuration¶

Check Rspamd status:

sudo systemctl status rspamd

View Rspamd configuration:

sudo cat /etc/rspamd/local.d/options.inc

Test email with Rspamd:

sudo rspamc < /path/to/email.eml

Example output:

Score: 7.5 / 15.0
Action: add header

RCPT_COUNT(0.00)[]
MISSING_SUBJECT(1.00)[]
MIME_HTML_ONLY(0.50)[]
R_SUSPICIOUS_URL(2.50)[example.com]
BAYES_SPAM(3.50)[95.23%]
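The total in the first line is the sum of the per-symbol scores. The sketch below recomputes it from lines in the `SYMBOL(score)[details]` layout shown in this example. The layout of real `rspamc` output may differ between versions, so treat the parsing pattern as an assumption; `sum_symbol_scores` is a hypothetical helper.

```shell
#!/bin/sh
# Sum per-symbol scores from lines shaped like SYMBOL(1.00)[details].
# Lines that do not match that shape (such as "Score: ...") are ignored.
sum_symbol_scores() {
    awk 'match($0, /^[A-Z0-9_]+\(-?[0-9]+\.[0-9]+\)/) {
        lp = index($0, "(")
        rp = index($0, ")")
        total += substr($0, lp + 1, rp - lp - 1)
    }
    END { printf "%.2f\n", total }'
}

sum_symbol_scores <<'EOF'
RCPT_COUNT(0.00)[]
MISSING_SUBJECT(1.00)[]
MIME_HTML_ONLY(0.50)[]
R_SUSPICIOUS_URL(2.50)[example.com]
BAYES_SPAM(3.50)[95.23%]
EOF
# prints: 7.50
```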

Rspamd Modules¶

Enabled modules:

sudo rspamadm configdump -m

Key modules:

bayes - Bayesian classifier
dkim - DKIM signature checking
spf - SPF validation
dmarc - DMARC policy enforcement
rbl - DNS blacklists
phishing - Phishing detection
fuzzy_check - Fuzzy hashing

Training Bayesian Filter¶

Learn spam:

sudo mb-rspamd-learn --spam /path/to/spam-folder/

Learn ham (legitimate email):

sudo mb-rspamd-learn --ham /path/to/ham-folder/

Check statistics:

sudo rspamc stat

Example output:

Statfile: BAYES_SPAM type: sqlite3; length: 1.50M; free blocks: 0; total blocks: 192.38k; free: 0.00%; learned: 5000; users: 1; languages: 0
Statfile: BAYES_HAM type: sqlite3; length: 2.20M; free blocks: 0; total blocks: 281.60k; free: 0.00%; learned: 10000; users: 1; languages: 0

Total learns: 15000
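The total is the sum of the `learned:` counters on the statfile lines. A minimal sketch that recomputes it from output in the layout shown above; `total_learns` is a hypothetical helper, and the sample lines are shortened for readability.

```shell
#!/bin/sh
# Add up the "learned: N;" counters from statfile lines read on stdin.
# In practice the input could come from: sudo rspamc stat | total_learns
total_learns() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "learned:") {
                n = $(i + 1)
                sub(/;$/, "", n)   # drop the trailing semicolon
                total += n
            }
    }
    END { print total + 0 }'
}

total_learns <<'EOF'
Statfile: BAYES_SPAM type: sqlite3; learned: 5000; users: 1; languages: 0
Statfile: BAYES_HAM type: sqlite3; learned: 10000; users: 1; languages: 0
EOF
# prints: 15000
```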

Fuzzy Hashing¶

Fuzzy hashing detects spam variants with similar content.

Add to fuzzy storage:

sudo rspamc -f 1 -w 10 fuzzy_add < spam.eml

Check for a fuzzy match by scanning the message and looking for fuzzy symbols in the result:

sudo rspamc < test.eml | grep -i fuzzy

SpamAssassin Engine¶

Overview¶

SpamAssassin provides rule-based spam detection with thousands of tests:

Content analysis
Header analysis
DNS checks
Bayesian filtering
Network tests

Configuration¶

Check SpamAssassin status:

sudo systemctl status spamassassin

Test email:

sudo spamassassin -t < /path/to/email.eml

Example output:

X-Spam-Status: Yes, score=8.2 required=5.0
X-Spam-Report:
    *  2.5 BAYES_50 BODY: Bayes spam probability is 40 to 60%
    *  1.0 HTML_MESSAGE BODY: HTML included in message
    *  2.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
    *  1.5 MISSING_HEADERS Missing required headers
    *  1.2 SUSPICIOUS_RECIPS Suspicious recipient addresses

Custom Rules¶

Add custom rule:

sudo nano /etc/spamassassin/local.cf

# Custom rules
header LOCAL_BLOCKED_WORD Subject =~ /viagra|cialis|pills/i
describe LOCAL_BLOCKED_WORD Subject contains blocked pharmaceutical terms
score LOCAL_BLOCKED_WORD 5.0

# A uri rule matches link targets, including those hidden behind HTML anchor text
uri LOCAL_SUSPICIOUS_LINK /bit\.ly|tinyurl\.com/i
describe LOCAL_SUSPICIOUS_LINK Contains suspicious shortened URL
score LOCAL_SUSPICIOUS_LINK 2.0
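A pattern can be tried against sample subjects before it goes into a rule. The sketch below uses `grep -E` as a stand-in for SpamAssassin's Perl regex engine. For a plain alternation like this one the two behave the same, but that does not hold for Perl-specific syntax. `subject_is_blocked` is a hypothetical helper.

```shell
#!/bin/sh
# Rough pre-check of the Subject pattern, case-insensitive like the /i flag.
subject_is_blocked() {
    printf '%s\n' "$1" | grep -Eiq 'viagra|cialis|pills'
}

for subject in "Cheap PILLS online" "Quarterly report"; do
    if subject_is_blocked "$subject"; then
        echo "match:    $subject"
    else
        echo "no match: $subject"
    fi
done
```

The pattern has no word boundaries, so a subject such as "Oil spills report" would also match. Adding `\b` around the alternation in the SpamAssassin rule is one way to tighten it.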

Reload SpamAssassin:

sudo systemctl reload spamassassin

Managing Rules¶

Update rules:

sudo sa-update
sudo systemctl restart spamassassin

Test rule syntax:

sudo spamassassin --lint

Real-time Blacklists (RBL)¶

Configured RBLs¶

List active RBLs:

sudo mb-rbl list

Default RBLs:

zen.spamhaus.org (comprehensive blocklist)
bl.spamcop.net (spam sources)
dnsbl.sorbs.net (relay and spam sources)
cbl.abuseat.org (compromised machines)

Managing RBLs¶

Add RBL:

sudo mb-rbl add zen.spamhaus.org --weight 3.0
sudo mb-rbl add bl.spamcop.net --weight 2.5

Remove RBL:

sudo mb-rbl remove dnsbl.example.com

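Test an IP against an RBL. The address is written with its octets reversed, followed by the RBL zone; this example checks 203.0.113.100: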

sudo dig +short 100.113.0.203.zen.spamhaus.org

If listed, returns:

127.0.0.2

Check your mail server IP:

MY_IP=$(curl -s ifconfig.me)
sudo dig +short $(echo $MY_IP | awk -F. '{print $4"."$3"."$2"."$1}').zen.spamhaus.org
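The octet reversal used in these lookups can be wrapped in a small function so the query name is built the same way every time. A minimal sketch for IPv4 only; `rbl_query_name` is a hypothetical helper.

```shell
#!/bin/sh
# Build the DNSBL query name for an IPv4 address by reversing its octets.
# Prints nothing if the input does not have exactly four dot-separated parts.
rbl_query_name() {
    ip="$1"
    zone="$2"
    printf '%s\n' "$ip" | awk -F. -v zone="$zone" 'NF == 4 {
        print $4 "." $3 "." $2 "." $1 "." zone
    }'
}

rbl_query_name 203.0.113.100 zen.spamhaus.org
# prints: 100.113.0.203.zen.spamhaus.org
```

The result can be passed straight to `dig +short`.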

RBL Best Practices¶

False Positives

RBLs can cause false positives. Monitor quarantine for legitimate emails.

Recommended weights:

Spamhaus ZEN: 3.0 (highly trusted)
SpamCop: 2.5 (trusted)
SORBS: 2.0 (occasional false positives)
Smaller lists: 1.0-1.5 (test carefully)
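To see how these weights interact with the score thresholds, the sketch below adds up the weights of every list an address appears on. It assumes each listing simply adds its weight to the message score, which this guide does not state explicitly; `rbl_score` is a hypothetical helper.

```shell
#!/bin/sh
# Sum the weights of the lists an address is found on.
# Assumption: each listing adds its weight to the spam score.
rbl_score() {
    total=0
    for weight in "$@"; do
        total=$(awk -v a="$total" -v b="$weight" 'BEGIN { printf "%.1f", a + b }')
    done
    echo "$total"
}

rbl_score 3.0 2.5        # Spamhaus ZEN + SpamCop, prints: 5.5
rbl_score 3.0 2.5 2.0    # all three lists,        prints: 7.5
```

Under that assumption, two listings alone stay below a 6.0 quarantine threshold, while three cross it without any content signals.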

Greylisting¶

How It Works¶

Greylisting temporarily rejects email from unknown senders. Legitimate mail servers retry; spammers typically don't.

First Attempt → Temporary Reject (450)
    ↓
Wait 5 minutes
    ↓
Retry → Accept
    ↓
Add to whitelist (future emails pass immediately)

Configuration¶

Enable greylisting:

sudo mb-config set greylist.enabled true
sudo mb-config set greylist.delay 300  # 5 minutes
sudo mb-config set greylist.expire 604800  # 7 days
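The effect of the two timers can be sketched as a decision function. This is an illustration of the general greylisting technique, not Mailborder's implementation. It assumes `greylist.expire` is the lifetime of a sender record, after which the sender is treated as new; `greylist_decision` is a hypothetical helper.

```shell
#!/bin/sh
# Decide how to treat a delivery attempt.
# $1: epoch seconds of the sender's first attempt, or empty if never seen
# $2: current time in epoch seconds
greylist_decision() {
    first_seen="$1"
    now="$2"
    delay="${3:-300}"        # greylist.delay
    expire="${4:-604800}"    # greylist.expire (assumed: record lifetime)

    if [ -z "$first_seen" ]; then
        echo "defer"         # unknown sender: temporary reject (450)
        return
    fi
    age=$((now - first_seen))
    if [ "$age" -lt "$delay" ]; then
        echo "defer"         # retried before the delay elapsed
    elif [ "$age" -gt "$expire" ]; then
        echo "defer"         # record expired: treat as a new sender
    else
        echo "accept"
    fi
}

greylist_decision "" 1000      # prints: defer
greylist_decision 1000 1100    # prints: defer
greylist_decision 1000 1400    # prints: accept
```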

Whitelist known good senders:

sudo mb-whitelist add sender@trusted.com
sudo mb-whitelist add @corporate-partner.com

Monitor greylisting:

sudo grep greylist /var/log/mailborder/mb-filter.log | tail -n 20

Greylisting Exceptions¶

Automatic whitelist after:

First successful delivery
Domain reputation score > threshold
SPF/DKIM/DMARC all pass

Custom Spam Keywords¶

Managing Keywords¶

Add blocked keyword:

sudo mb-spam-keyword add "viagra" --score 5.0
sudo mb-spam-keyword add "nigerian prince" --score 10.0
sudo mb-spam-keyword add "bitcoin wallet" --score 3.0

List keywords:

sudo mb-spam-keyword list

Remove keyword:

sudo mb-spam-keyword remove "viagra"

Keyword Types¶

Subject keywords:

sudo mb-spam-keyword add "RE: Invoice" --location subject --score 2.0

Body keywords:

sudo mb-spam-keyword add "click here now" --location body --score 3.0

From address patterns:

sudo mb-spam-keyword add "@suspicious-domain.com" --location from --score 5.0

Regular Expressions¶

Pattern matching:

sudo mb-spam-keyword add "/v[i1][a@]gr[a@]/" --regex --score 5.0
sudo mb-spam-keyword add "/c[i1][a@]l[i1]s/" --regex --score 5.0
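Obfuscation patterns are easy to get subtly wrong, so it is worth trying them against a few samples first. The sketch below uses `grep -E` with case-insensitive matching. Whether `--regex` keywords are matched case-insensitively in Mailborder is an assumption here, and `matches_pattern` is a hypothetical helper.

```shell
#!/bin/sh
# Test a pattern (without the surrounding slashes) against a sample string.
matches_pattern() {
    printf '%s\n' "$2" | grep -Eiq "$1"
}

pattern='v[i1][a@]gr[a@]'
for sample in "viagra" "V1@GRA" "vi4gra"; do
    if matches_pattern "$pattern" "$sample"; then
        echo "match:    $sample"
    else
        echo "no match: $sample"
    fi
done
```

The third sample is not matched because `4` is not in the `[a@]` class. Each substitution spammers use has to be listed explicitly.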

Bayesian Learning¶

Training Data¶

Collect training samples:

# Spam folder
mkdir -p /tmp/spam-training

# Ham (legitimate) folder
mkdir -p /tmp/ham-training

Train Rspamd:

sudo mb-rspamd-learn --spam /tmp/spam-training/
sudo mb-rspamd-learn --ham /tmp/ham-training/

Train SpamAssassin:

sudo sa-learn --spam /tmp/spam-training/
sudo sa-learn --ham /tmp/ham-training/

Ongoing Training¶

From quarantine:

# User reports false positive
sudo mb-quarantine-release <message-id> --learn-ham

# User confirms spam
sudo mb-quarantine-delete <message-id> --learn-spam

Check Learning Progress¶

Rspamd statistics:

sudo rspamc stat

SpamAssassin statistics:

sudo sa-learn --dump magic

Example output:

0.000          0          3          0  non-token data: bayes db version
0.000          0      10000          0  non-token data: nspam
0.000          0       5000          0  non-token data: nham
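SpamAssassin's Bayes classifier is not used until it has learned a minimum number of both spam and ham messages (200 each by default, via `bayes_min_spam_num` and `bayes_min_ham_num`). A minimal sketch that reads the `nspam` and `nham` counters from output in the layout above and checks them against that minimum; `bayes_ready` is a hypothetical helper.

```shell
#!/bin/sh
# Read `sa-learn --dump magic` output on stdin and report whether both
# counters reach the minimum (default 200, SpamAssassin's stock setting).
bayes_ready() {
    awk -v min="${1:-200}" '
        /non-token data: nspam$/ { nspam = $3 }
        /non-token data: nham$/  { nham = $3 }
        END {
            printf "nspam=%d nham=%d ", nspam, nham
            print ((nspam + 0 >= min + 0 && nham + 0 >= min + 0) ? "ready" : "needs more training")
        }'
}

bayes_ready <<'EOF'
0.000          0          3          0  non-token data: bayes db version
0.000          0      10000          0  non-token data: nspam
0.000          0       5000          0  non-token data: nham
EOF
# prints: nspam=10000 nham=5000 ready
```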

Performance Tuning¶

Rspamd Workers¶

Adjust worker count:

sudo nano /etc/rspamd/local.d/options.inc

workers {
    normal {
        count = 4;  # Increase for high volume
    }
    controller {
        count = 1;
    }
}

Caching¶

Enable caching:

sudo mb-config set spam.cache.enabled true
sudo mb-config set spam.cache.ttl 3600  # 1 hour

Cache hit statistics:

sudo redis-cli info stats | grep hits

Scan Limits¶

Message size limits:

sudo mb-config set spam.max_scan_size 10485760  # 10 MB

Timeout settings:

sudo mb-config set spam.scan_timeout 30  # seconds
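The size limit is given in bytes (10485760 = 10 × 1024 × 1024). A minimal sketch of the comparison, assuming messages larger than the limit are skipped rather than scanned; `within_scan_limit` is a hypothetical helper.

```shell
#!/bin/sh
# Succeed if a message file is small enough to be scanned.
within_scan_limit() {
    file="$1"
    max_bytes="${2:-10485760}"             # spam.max_scan_size (10 MB)
    size=$(( $(wc -c < "$file") ))         # arithmetic strips any padding
    [ "$size" -le "$max_bytes" ]
}

sample=$(mktemp)
printf 'Subject: test\n\nbody\n' > "$sample"
if within_scan_limit "$sample"; then
    echo "scan"
else
    echo "skip scan"
fi
rm -f "$sample"
```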

Monitoring Spam Detection¶

Statistics¶

Spam detection rates:

sudo mb-spam-stats

Example output:

Period: Last 24 hours
Total messages: 10,245
Passed: 8,120 (79.3%)
Quarantined: 1,890 (18.4%)
Rejected: 235 (2.3%)

Average spam score: 3.2
False positive rate: 0.8%
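The percentages are each count divided by the total message count. A minimal sketch for recomputing them from raw counts, for example when aggregating several reporting periods; `pct` is a hypothetical helper.

```shell
#!/bin/sh
# Print part/total as a percentage with one decimal place.
pct() {
    awk -v part="$1" -v total="$2" 'BEGIN {
        if (total + 0 == 0) { print "n/a"; exit }
        printf "%.1f\n", 100 * part / total
    }'
}

pct 1890 10245   # quarantined share, prints: 18.4
pct 235 10245    # rejected share,    prints: 2.3
```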

Per-domain statistics:

sudo mb-spam-stats --domain example.com

Real-time Monitoring¶

Watch spam filter log:

sudo tail -f /var/log/mailborder/mb-filter.log | grep SPAM

Watch Rspamd log:

sudo tail -f /var/log/rspamd/rspamd.log

Alerts¶

Configure threshold alerts:

sudo mb-config set spam.alert.threshold 50  # Alert if >50 spam/hour
sudo mb-config set spam.alert.email admin@example.com

Troubleshooting¶

Too Much Spam Getting Through¶

Symptoms:

Spam reaching users' inboxes
Low spam detection rate

Solutions:

Lower thresholds:

sudo mb-config set spam.threshold.pass 5.0
sudo mb-config set spam.threshold.quarantine 5.0

Enable all RBLs:

sudo mb-rbl add zen.spamhaus.org --weight 3.0
sudo mb-rbl add bl.spamcop.net --weight 2.5
sudo mb-rbl add dnsbl.sorbs.net --weight 2.0

Train Bayesian filter:

sudo mb-rspamd-learn --spam /path/to/spam-examples/
sudo sa-learn --spam /path/to/spam-examples/

Enable greylisting:

sudo mb-config set greylist.enabled true

Too Many False Positives¶

Symptoms:

Legitimate email in quarantine
User complaints about missing emails

Solutions:

Raise thresholds:

sudo mb-config set spam.threshold.pass 7.0
sudo mb-config set spam.threshold.quarantine 7.0

Whitelist legitimate senders:

sudo mb-whitelist add @trusted-domain.com
sudo mb-whitelist add partner@company.com

Train with false positives:

sudo mb-quarantine-release <message-id> --learn-ham

Review RBL weights:

sudo mb-rbl list
# Reduce weights or remove problematic RBLs
sudo mb-rbl remove dnsbl.sorbs.net

Rspamd Not Working¶

Check service:

sudo systemctl status rspamd
sudo journalctl -u rspamd -n 50

Test manually:

sudo rspamc ping
# Expected: pong

sudo rspamc stat
# Should show statistics

Restart Rspamd:

sudo systemctl restart rspamd
sudo systemctl restart mb-filter

SpamAssassin Slow¶

Check processing time:

sudo grep "processing time" /var/log/mailborder/mb-filter.log | tail -n 20

Disable slow rules:

sudo nano /etc/spamassassin/local.cf

# Disable network tests
score URIBL_BLACK 0
score RAZOR2_CHECK 0

Update and optimize:

sudo sa-update
sudo sa-compile
sudo systemctl restart spamassassin

Best Practices¶

Initial Setup¶

Start with default thresholds - Adjust after monitoring
Enable major RBLs - Spamhaus, SpamCop, Barracuda
Collect training data - Both spam and legitimate email
Enable greylisting - Effective with minimal impact
Monitor for one week - Gather statistics before tuning

Ongoing Maintenance¶

Review quarantine weekly - Check for false positives
Monthly training - Update Bayesian classifiers
Update signatures - Run mb-update regularly
Monitor statistics - Watch false positive rate
User feedback - Encourage spam reports

Whitelisting Strategy¶

Whitelist by priority:

1. Corporate partners
2. Critical services (banking, utilities)
3. Mailing lists
4. Known vendors

Don't whitelist:

Free email providers (Gmail, Yahoo) - too broad
Entire domains unless absolutely necessary
Senders based on a single email - wait for a pattern

Performance Optimization¶

Limit scan size - Skip huge attachments
Cache aggressively - Reduce redundant scans
Tune worker counts - Match CPU cores
Disable unnecessary checks - Balance security vs. performance
