Performance Monitoring¶

Monitor system performance and identify bottlenecks across the host, the mail pipeline, the database, Redis, and the web interface.

System Resources¶

CPU Monitoring¶

Real-time monitoring:

top -b -n 1 | head -n 20
htop  # Interactive view

CPU usage by process:

ps aux --sort=-%cpu | head -n 10

Service-specific CPU:

systemctl status mb-rpcd | grep CPU
systemctl status mb-filter | grep CPU
systemctl status clamd@scan | grep CPU

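Sampled and historical CPU usage:

sar -u 1 10  # Live sampling: 10 samples, 1 second apart
sar -u       # Today's recorded history (requires sysstat data collection to be enabled)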

Memory Monitoring¶

Current memory usage:

free -m

Detailed memory breakdown:

cat /proc/meminfo
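The MemAvailable field in /proc/meminfo is usually a better indicator of memory headroom than MemFree, because it accounts for reclaimable cache. The following is a minimal sketch, assuming a kernel that reports MemAvailable (Linux 3.14 or later); the function name mem_available_pct is hypothetical and not part of Mailborder.

```shell
# Print available memory as a whole-number percentage of total memory.
# Reads /proc/meminfo-style text on stdin so it can be tried with sample data.
mem_available_pct() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      if (total > 0) printf "%d\n", avail * 100 / total
      else print "n/a"
    }'
}

# Usage: mem_available_pct < /proc/meminfo
```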

Top memory consumers:

ps aux --sort=-%mem | head -n 10

Service memory usage:

systemctl status mb-rpcd | grep Memory
systemctl status clamd@scan | grep Memory
systemctl status redis-server | grep Memory

Memory pressure:

# Check for OOM killer activity
dmesg | grep -i "out of memory"
grep -i "killed process" /var/log/syslog

Disk I/O Monitoring¶

Disk usage:

df -h

Inode usage:

df -i

Real-time disk I/O:

iostat -x 1 10

Top disk I/O processes:

iotop -o  # Only show active processes

Disk performance:

# Test read speed
sudo hdparm -t /dev/sda

# Test write speed (writes a 1 GiB file; if /tmp is a tmpfs mount, point of= at a path on the disk under test)
dd if=/dev/zero of=/tmp/test bs=1M count=1024 conv=fdatasync
rm /tmp/test

Network Monitoring¶

Network interfaces:

ip -s link

Active connections:

netstat -tunapl | grep ESTABLISHED | wc -l

Connection by service:

ss -tunap | grep ":25 "    # SMTP
ss -tunap | grep ":443 "   # HTTPS
ss -tunap | grep ":11332 " # Rspamd

Network throughput:

iftop -i eth0
nload eth0

Bandwidth usage:

vnstat -l  # Live traffic
vnstat -d  # Daily statistics

Email Processing Metrics¶

Queue Monitoring¶

Queue depth:

sudo postqueue -p | tail -n 1
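For scripting, it helps to reduce the summary line to a bare number. The sketch below assumes the usual postqueue -p summary format ("-- 12 Kbytes in 3 Requests." or "Mail queue is empty"); the function name queue_depth is hypothetical.

```shell
# Print the number of queued messages from `postqueue -p` output on stdin.
# Prints 0 when the summary line is missing or the queue is empty.
queue_depth() {
  tail -n 1 | awk '
    $4 == "in" && $5 ~ /^[0-9]+$/ { n = $5 }
    END { print n + 0 }'
}

# Usage: sudo postqueue -p | queue_depth
```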

Queue age:

sudo find /var/spool/postfix/deferred -type f -mtime +1 | wc -l

Messages per hour:

sudo grep "status=sent" /var/log/mail.log | \
  grep "$(date +'%b %_d %H')" | wc -l

Processing time (ten longest delivery delays, in seconds):

sudo grep -o "delay=[0-9.]*" /var/log/mail.log | \
  cut -d= -f2 | sort -n | tail -n 10
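A list of the largest delays says little about typical behaviour, so a count, average, and maximum can be useful alongside it. This is a sketch only, assuming Postfix delivery lines that contain a delay=SECONDS field; delay_stats is a hypothetical helper name.

```shell
# Summarise Postfix delivery delays (seconds) from log lines on stdin.
# Matches "delay=" only; the separate "delays=" breakdown field is ignored.
delay_stats() {
  grep -o 'delay=[0-9.]*' | cut -d= -f2 | awk '
    {
      d = $1 + 0
      sum += d
      if (n == 0 || d > max) max = d
      n++
    }
    END {
      if (n) printf "count=%d avg=%.2f max=%.2f\n", n, sum / n, max
      else print "count=0"
    }'
}

# Usage: sudo cat /var/log/mail.log | delay_stats
```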

Filter Performance¶

Scan times:

sudo tail -n 1000 /var/log/mailborder/mb-filter.log | \
  grep "scan_time" | \
  awk '{sum+=$NF; count++} END {if (count) print "Average:", sum/count, "ms"; else print "No scan_time entries found"}'

ClamAV performance:

sudo grep "clamd" /var/log/mailborder/mb-filter.log | \
  grep "time" | tail -n 20

Rspamd performance:

curl -s http://localhost:11334/stat | jq

Processing Statistics¶

Daily email volume:

sudo mb-filter-stats --daily

Spam detection rate:

sudo mb-filter-stats --spam-rate --last 24h

Virus detection:

sudo grep "FOUND" /var/log/clamav/clamav.log | wc -l

Database Performance¶

Connection Monitoring¶

Active connections:

sudo mysql -e "SHOW PROCESSLIST"

Connection pool status:

sudo mysql -e "SHOW STATUS LIKE 'Threads_%'"

Max connections:

sudo mysql -e "SHOW VARIABLES LIKE 'max_connections'"

Query Performance¶

Slow queries:

sudo mysql -e "SHOW GLOBAL STATUS LIKE 'Slow_queries'"

Query cache (MariaDB and MySQL 5.7 or earlier; the query cache was removed in MySQL 8.0):

sudo mysql -e "SHOW STATUS LIKE 'Qcache%'"

Recent slow queries:

sudo tail -n 50 /var/log/mysql/slow.log

Table Performance¶

Table sizes:

SELECT
  table_name,
  ROUND(((data_length + index_length) / 1024 / 1024), 2) AS size_mb,
  table_rows,
  ROUND((data_length / 1024 / 1024), 2) AS data_mb,
  ROUND((index_length / 1024 / 1024), 2) AS index_mb
FROM information_schema.TABLES
WHERE table_schema = 'mailborder'
ORDER BY (data_length + index_length) DESC
LIMIT 10;

Fragmentation:

SELECT
  table_name,
  ROUND((data_free / 1024 / 1024), 2) AS fragmented_mb
FROM information_schema.TABLES
WHERE table_schema = 'mailborder'
  AND data_free > 0
ORDER BY data_free DESC;

Lock Monitoring¶

Table locks:

sudo mysql -e "SHOW OPEN TABLES WHERE In_use > 0"

Lock waits:

sudo mysql -e "SHOW STATUS LIKE 'Table_locks_waited'"

Redis Performance¶

Memory Usage¶

Memory stats:

redis-cli INFO memory

Key count:

redis-cli DBSIZE

Eviction stats:

redis-cli INFO stats | grep evicted

Performance Metrics¶

Operations per second:

redis-cli INFO stats | grep instantaneous_ops_per_sec

Hit rate:

redis-cli INFO stats | grep keyspace_hits
redis-cli INFO stats | grep keyspace_misses
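The two counters above only become a hit rate once they are combined as hits / (hits + misses). A minimal sketch of that calculation, reading INFO stats output on stdin; the function name redis_hit_rate is hypothetical.

```shell
# Print the keyspace hit rate as a percentage from `redis-cli INFO stats` output.
# redis-cli terminates lines with CRLF, so carriage returns are stripped first.
redis_hit_rate() {
  tr -d '\r' | awk -F: '
    $1 == "keyspace_hits"   { hits = $2 }
    $1 == "keyspace_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total > 0) printf "%.2f\n", hits * 100 / total
      else print "n/a"
    }'
}

# Usage: redis-cli INFO stats | redis_hit_rate
```

Both counters are cumulative since the server started (or since the last CONFIG RESETSTAT), so the result is a lifetime average rather than a recent rate.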

Latency:

redis-cli --latency
redis-cli --latency-history

Slow Commands¶

Enable slow log:

redis-cli CONFIG SET slowlog-log-slower-than 10000  # 10ms
redis-cli CONFIG SET slowlog-max-len 128

View slow commands:

redis-cli SLOWLOG GET 10

Web Interface Performance¶

Response Times¶

Test endpoint:

curl -o /dev/null -s -w '%{time_total}\n' https://localhost/

Multiple requests:

for i in {1..10}; do
  curl -o /dev/null -s -w '%{time_total}\n' https://localhost/
done | awk '{sum+=$1; count++} END {print "Average:", sum/count, "seconds"}'

PHP-FPM Metrics¶

Pool status:

curl -s http://localhost/fpm-status

Key metrics:

- Active processes
- Idle processes
- Max children reached
- Slow requests
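These four values can be pulled out of the plain-text status page for logging or alerting. The sketch below assumes the default text output of the PHP-FPM status page ("name: value" lines); fpm_summary is a hypothetical helper name.

```shell
# Extract the key PHP-FPM pool metrics from status-page text on stdin.
fpm_summary() {
  awk -F': *' '
    $1 == "active processes"     { active = $2 }
    $1 == "idle processes"       { idle = $2 }
    $1 == "max children reached" { maxed = $2 }
    $1 == "slow requests"        { slow = $2 }
    END {
      printf "active=%d idle=%d max_children_reached=%d slow=%d\n",
             active, idle, maxed, slow
    }'
}

# Usage: curl -s http://localhost/fpm-status | fpm_summary
```

A non-zero max_children_reached value generally indicates that the pool hit its process limit at some point since it started.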

Process list:

curl -s "http://localhost/fpm-status?full"

Nginx Metrics¶

Active connections:

curl -s http://localhost/nginx_status
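The status page is easier to track over time when reduced to one line. This sketch assumes the endpoint is served by Nginx's stub_status module in its standard text format; nginx_status_summary is a hypothetical helper name.

```shell
# Condense stub_status output on stdin into a single key=value line.
nginx_status_summary() {
  awk '
    /^Active connections:/ { active = $3 }
    /^Reading:/            { reading = $2; writing = $4; waiting = $6 }
    END {
      printf "active=%d reading=%d writing=%d waiting=%d\n",
             active, reading, writing, waiting
    }'
}

# Usage: curl -s http://localhost/nginx_status | nginx_status_summary
```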

Access log analysis:

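# Requests per hour (assumes the timestamp is field 4, as in the default log formats)
sudo awk '{print $4}' /var/log/nginx/access.log | \
  cut -d: -f2 | sort | uniq -c

# Slowest response times (assumes a custom log_format that ends with $request_time;
# the default "combined" format does not record it)
sudo awk '{print $NF}' /var/log/nginx/access.log | \
  sort -n | tail -n 100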
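A 95th percentile is often a more stable summary of response times than the raw tail of the sorted list. A minimal sketch using the nearest-rank method, assuming one numeric value per input line; percentile95 is a hypothetical helper name.

```shell
# Print the 95th percentile (nearest-rank method) of numbers on stdin.
percentile95() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) { print "n/a"; exit }
      # ceil(0.95 * NR), computed with integer arithmetic
      rank = int((NR * 95 + 99) / 100)
      print v[rank]
    }'
}

# Usage (same log_format assumption as above):
#   sudo awk '{print $NF}' /var/log/nginx/access.log | percentile95
```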

Service Health Checks¶

Automated Monitoring¶

Create health check script:

sudo tee /usr/local/bin/mb-health-check.sh << 'EOF'
#!/bin/bash

# Exit status follows the Nagios convention: 0 = OK, 1 = WARNING, 2 = CRITICAL
STATUS=0
warn() { echo "WARNING: $1"; if [ "$STATUS" -lt 1 ]; then STATUS=1; fi; }
crit() { echo "CRITICAL: $1"; STATUS=2; }

# Service checks
for service in mb-rpcd mb-filter mb-milter clamd@scan redis-server nginx php8.2-fpm; do
  if ! systemctl is-active --quiet "$service"; then
    crit "$service is not running"
  fi
done

# Disk space (root filesystem)
DISK_USAGE=$(df -P / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt 90 ]; then
  warn "Disk usage at ${DISK_USAGE}%"
fi

# Memory (used as a percentage of total)
MEM_USAGE=$(free | awk 'NR==2 {printf "%.0f", $3*100/$2}')
if [ "$MEM_USAGE" -gt 90 ]; then
  warn "Memory usage at ${MEM_USAGE}%"
fi

# Queue depth (QUEUE_SIZE is empty when the queue is empty)
QUEUE_SIZE=$(postqueue -p | tail -n 1 | awk '{print $5}')
if [ -n "$QUEUE_SIZE" ] && [ "$QUEUE_SIZE" -gt 100 ]; then
  warn "Mail queue at $QUEUE_SIZE messages"
fi

# Database
if ! mysqladmin ping -h localhost &>/dev/null; then
  crit "Database not responding"
fi

# Redis
if ! redis-cli ping &>/dev/null; then
  crit "Redis not responding"
fi

# Report success only when no check failed
if [ "$STATUS" -eq 0 ]; then
  echo "All checks passed"
fi
exit "$STATUS"
EOF

sudo chmod +x /usr/local/bin/mb-health-check.sh

Run health check:

sudo /usr/local/bin/mb-health-check.sh

Schedule periodic checks:

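# Append to root's crontab (piping only the new line to "crontab -" would replace all existing entries)
(sudo crontab -l 2>/dev/null; echo "*/5 * * * * /usr/local/bin/mb-health-check.sh | logger -t mb-health") | sudo crontab -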

Performance Alerting¶

Email Alerts¶

Configure alerts:

sudo tee /etc/mailborder/alerts.conf << 'EOF'
# Alert thresholds
CPU_THRESHOLD=80
MEM_THRESHOLD=85
DISK_THRESHOLD=90
QUEUE_THRESHOLD=200

# Alert email
ALERT_EMAIL="admin@example.com"
EOF

Alert script:

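sudo tee /usr/local/bin/mb-alert.sh << 'EOF'
#!/bin/bash
source /etc/mailborder/alerts.conf

# Total CPU usage = 100 - idle, taken from the second vmstat sample
# (the first sample reports averages since boot)
CPU=$(vmstat 1 2 | tail -n 1 | awk '{print 100 - $15}')
if [ "$CPU" -gt "$CPU_THRESHOLD" ]; then
  echo "CPU usage at ${CPU}%" | mail -s "Mailborder Alert: High CPU" "$ALERT_EMAIL"
fi

# Memory used as a percentage of total
MEM=$(free | awk 'NR==2 {printf "%.0f", $3*100/$2}')
if [ "$MEM" -gt "$MEM_THRESHOLD" ]; then
  echo "Memory usage at ${MEM}%" | mail -s "Mailborder Alert: High Memory" "$ALERT_EMAIL"
fi

# Root filesystem usage
DISK=$(df -P / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$DISK" -gt "$DISK_THRESHOLD" ]; then
  echo "Disk usage at ${DISK}%" | mail -s "Mailborder Alert: Low Disk Space" "$ALERT_EMAIL"
fi

# Mail queue depth (QUEUE is empty when the queue is empty)
QUEUE=$(postqueue -p | tail -n 1 | awk '{print $5}')
if [ -n "$QUEUE" ] && [ "$QUEUE" -gt "$QUEUE_THRESHOLD" ]; then
  echo "Mail queue at $QUEUE messages" | mail -s "Mailborder Alert: Mail Queue Backlog" "$ALERT_EMAIL"
fi
EOF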

sudo chmod +x /usr/local/bin/mb-alert.sh

Integration with Monitoring Systems¶

Nagios/Icinga:

# Install the NRPE agent
sudo apt install nagios-nrpe-server

# Configure check (NRPE reads the command's exit status: 0 = OK, 1 = WARNING, 2 = CRITICAL)
sudo tee -a /etc/nagios/nrpe.cfg << 'EOF'
command[check_mailborder]=/usr/local/bin/mb-health-check.sh
EOF

sudo systemctl restart nagios-nrpe-server

Prometheus:

# Install node_exporter
sudo apt install prometheus-node-exporter

# Custom metrics script (prints the queue size in the Prometheus text exposition format)
sudo tee /usr/local/bin/mb-metrics.sh << 'EOF'
#!/bin/bash
echo "# HELP mailborder_queue_size Current mail queue size"
echo "# TYPE mailborder_queue_size gauge"
QUEUE=$(postqueue -p | tail -n 1 | awk '{print $5}')
echo "mailborder_queue_size ${QUEUE:-0}"
EOF

sudo chmod +x /usr/local/bin/mb-metrics.sh
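One common way to get a custom gauge like this into Prometheus is node_exporter's textfile collector, which reads files ending in .prom from a configured directory. The sketch below writes the metric atomically so the collector never sees a half-written file. The function name write_queue_metric is hypothetical, and the directory /var/lib/prometheus/node-exporter in the usage comment is an assumption about the Debian package default; check the --collector.textfile.directory setting on your system.

```shell
# Write the queue-size gauge to <dir>/mailborder.prom via a temp file and rename.
# $1 = queue size (defaults to 0 when empty), $2 = textfile collector directory
write_queue_metric() {
  queue_size="${1:-0}"
  out_dir="$2"
  # The temp name does not end in .prom, so the collector ignores it
  tmp=$(mktemp "${out_dir}/mailborder.prom.XXXXXX") || return 1
  {
    echo "# HELP mailborder_queue_size Current mail queue size"
    echo "# TYPE mailborder_queue_size gauge"
    echo "mailborder_queue_size ${queue_size}"
  } > "$tmp"
  chmod 644 "$tmp"
  mv "$tmp" "${out_dir}/mailborder.prom"
}

# Usage (directory is an assumption, see above):
#   QUEUE=$(postqueue -p | tail -n 1 | awk '{print $5}')
#   write_queue_metric "$QUEUE" /var/lib/prometheus/node-exporter
```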

Performance Baselines¶

Establish Baselines¶

Collect baseline metrics:

# CPU baseline (user-space CPU percentage as reported by top)
echo "$(date '+%F %T') CPU Baseline: $(top -bn1 | grep "Cpu(s)" | awk '{print $2}')" | sudo tee -a /var/log/mailborder/baseline.log

# Memory baseline (used memory in MB)
echo "$(date '+%F %T') Memory Baseline: $(free -m | awk 'NR==2 {print $3}')" | sudo tee -a /var/log/mailborder/baseline.log

# Processing rate
RATE=$(sudo mb-filter-stats --rate)
echo "$(date '+%F %T') Processing Rate: $RATE emails/hour" | sudo tee -a /var/log/mailborder/baseline.log

Compare to baseline:

# Current vs baseline comparison
sudo tail -n 20 /var/log/mailborder/baseline.log
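Reading the log shows the recorded values, but a comparison needs the relative change between a baseline and a current reading. A minimal sketch of that calculation; pct_change is a hypothetical helper name and the numbers in the usage comment are illustrative only.

```shell
# Print the percentage change from a baseline value ($1) to a current value ($2).
pct_change() {
  awk -v base="$1" -v cur="$2" 'BEGIN {
    if (base == 0) { print "n/a"; exit }
    printf "%+.1f%%\n", (cur - base) * 100 / base
  }'
}

# Usage: compare current used memory (MB) against a recorded baseline of 2048 MB
#   pct_change 2048 "$(free -m | awk 'NR==2 {print $3}')"
```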

Performance Dashboards¶

Create Simple Dashboard¶

Web-based status page:

sudo tee /var/www/html/status.php << 'EOF'
<?php
// Restrict access to this page (for example by source IP) before exposing it; it reveals system details.
header('Content-Type: application/json');

// Run a shell command and return its output without the trailing newline
function run($cmd) {
    return trim((string) shell_exec($cmd));
}

$status = [
    'services' => [
        'mb-rpcd' => run('systemctl is-active mb-rpcd'),
        'mb-filter' => run('systemctl is-active mb-filter'),
        'database' => run('mysqladmin ping 2>&1'),
    ],
    'queue_size' => (int) run('postqueue -p | tail -n 1 | awk \'{print $5}\''),
    'cpu_usage' => (float) run('top -bn1 | grep "Cpu(s)" | awk \'{print $2}\' | cut -d\'%\' -f1'),
    'memory_pct' => (int) run('free | awk \'NR==2 {printf "%.0f", $3*100/$2}\''),
    'disk_usage' => (int) run('df -h / | awk \'NR==2 {print $5}\' | sed \'s/%//\''),
];

echo json_encode($status, JSON_PRETTY_PRINT);
?>
EOF

Access dashboard:

curl -s https://localhost/status.php | jq

Tuning Recommendations¶

Based on metrics:

High CPU usage:

- Increase worker processes
- Enable caching
- Optimize database queries
- Consider hardware upgrade

High memory usage:

- Reduce cache sizes
- Limit concurrent processes
- Optimize mb-rpcd fork limit
- Add more RAM

High disk I/O:

- Move logs to separate disk
- Optimize database writes
- Use SSD for database
- Increase buffer sizes

Slow email processing:

- Increase filter workers
- Tune ClamAV limits
- Optimize Rspamd configuration
- Review policy complexity

Database bottlenecks:

- Add indexes
- Optimize queries
- Increase connection pool
- Archive old data
