Spam (junk email) filtering in Debian GNU/Linux

Chris Lale

       chrislale@users.sourceforge.net

      

Revision History
Revision 0.1    23 January 2004    Revised by: cl
Initial release

Table of Contents
1. Introduction
1.1. What is spam?
1.2. What can you do about spam?
1.3. Email software components
1.4. Mail on a Debian system
1.5. Outline of the installation
1.6. Information you need before you start
1.7. Names used in this document
2. evolution - email client providing the mail user agent (MUA)
2.1. About evolution
2.2. Install evolution
2.3. evolution's 'Inbox' folder
2.4. Configure evolution to send mail to and retrieve mail from a remote server
2.5. Test evolution's configuration
3. exim and procmail
3.1. About exim - mail transport agent
3.2. About procmail - mail delivery agent
3.3. Local mail
3.4. .forward file
4. fetchmail - mail-retrieval and forwarding utility
4.1. About fetchmail
4.2. Install fetchmail
4.3. Configure fetchmail
4.4. Test fetchmail
5. fetchmail daemon - run fetchmail in the background
5.1. About the fetchmail daemon
5.2. How to invoke the fetchmail daemon
5.3. Configure the fetchmail daemon
6. spamassassin
6.1. About spamassassin
6.2. procmail and spamassassin
6.3. Local mail
6.4. Putting it all together
7. What next?
A. Appendix: About this document
A.1. Copyright information
A.2. Latest version
A.3. Bugs, errors and mistakes
A.4. Spelling, punctuation and grammar
A.5. Conventions used in this document
A.6. Bibliography

1. Introduction


1.3. Email software components

1.3.1. Incoming mail

The software that manages email on your computer falls into three loosely defined categories: Mail Transport Agent, Mail Delivery Agent and Mail User Agent. The Generic Mail Flow article listed in the bibliography describes how mail passes between them. These mail agents are often incorporated into a single software application (e.g. evolution, mozilla mail, etc). Figure 1 shows a typical system for receiving email.

Figure 1: the major components of a mail system dealing with incoming mail.

  1. Mail Transport Agent (MTA), sometimes referred to as a Mail Transfer Agent. A mail transport agent is software that enables mail to travel from one mail system to another. Your provider's server uses an MTA to send you your mail. You must have an MTA at your end of the connection. A standard Debian installation uses exim, but you may come across others, e.g. sendmail, smail, qmail, etc. Typically, the MTA passes the mail on to an MDA.

  2. Mail Delivery Agent (MDA), sometimes referred to as a Local Delivery Agent (LDA). A mail delivery agent is software that takes mail from an MTA and passes it on to a mailbox or another MTA. A standard Debian installation uses procmail, although there are others, e.g. deliver.

  3. Mail User Agent (MUA). A mail user agent is the software that enables a user to write, send, receive and read email. A standard Debian installation uses mail for local mail. Other mail user agents include elm, pine, eudora, pegasus, etc.

Mail from your provider normally reaches you using the POP3 protocol. A protocol is a set of rules that governs how things communicate over the network (see the Linux dictionary).


1.4. Mail on a Debian system


1.4.4. Example of a spam email modified using spamassassin

1.4.4.1. Added headers

The version of spamassassin in Woody is 2.20-1woody. The important header for basic filtering is X-Spam-Flag. It is set to “YES” when spamassassin has identified enough known features of spam messages.

X-Spam-Status: Yes, hits=5.7 required=5.0
     tests=VIAGRA,BASE64_ENC_TEXT,SUBJ_ALL_CAPS version=2.20
X-Spam-Flag: YES
X-Spam-Level: *****
X-Spam-Checker-Version: SpamAssassin 2.20 
     (devel $Id: spam.html.en,v 1.1 2004/04/30 21:50:38 chrislale Exp $)
X-Spam-Prev-Content-Type: text/plain
X-Spam-Prev-Content-Transfer-Encoding: base64
X-Evolution-Source: mbox:/var/mail/local-user
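
You can also check the flag from the command line. The following is only a minimal sketch and is not part of spamassassin: the function name is_spam is made up for this example, and it assumes that a single message has been saved in its own file, with the headers ending at the first empty line.

```shell
# Hypothetical helper (not part of spamassassin): succeeds if the header
# block of the message file "$1" contains "X-Spam-Flag: YES".
is_spam() {
    # sed prints from the first line up to the first empty line, which is
    # the header block; grep then looks for the flag in that block only.
    sed -n '1,/^$/p' "$1" | grep -q '^X-Spam-Flag:[[:space:]]*YES'
}
```

For example, is_spam message.txt && echo "flagged as spam" prints the text only when the flag is present in the headers.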
     

1.4.4.3. Lines added to the top of the message

spamassassin 2.20 adds these lines to the top of the message. (Later versions of spamassassin do this differently.)

SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam.  The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM: 
SPAM: Content analysis details:   (5.7 hits, 5 required)
SPAM: Hit! (2.4 points)  BODY: Plugs Viagra
SPAM: Hit! (1.4 points)  Message text disguised using base-64 encoding
SPAM: Hit! (1.9 points)  Subject is all capitals
SPAM: 
SPAM: -------------------- End of SpamAssassin results ---------------------
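
Notice that the points on the 'Hit!' lines add up to the total on the 'Content analysis details' line (2.4 + 1.4 + 1.9 = 5.7). The sketch below repeats that sum for a message saved in a file. It assumes the spamassassin 2.20 report layout shown above, and the function name spam_report_total is made up for this example.

```shell
# Hypothetical helper: add up the points on the "SPAM: Hit!" lines of the
# message file "$1" (spamassassin 2.20 report layout only).
spam_report_total() {
    awk '
        /^SPAM: Hit! \(/ {
            # Strip the prefix so that the score becomes the first field.
            sub(/^SPAM: Hit! \(/, "")
            total += $1
        }
        END { printf "%.1f\n", total }
    ' "$1"
}
```

Run it as spam_report_total message.txt and compare the result with the 'hits' figure in the report.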
     

1.5. Outline of the installation

There are several stages in the installation process. There are good reasons for this. If you try to install and configure all the software in one go, you are more likely to run into difficulty because there is simply more to go wrong. You may also find it much more difficult to trace faults. By installing and testing in stages, you can go from each stage to the next one with confidence, knowing that your configuration works correctly. Of course, you may need to change one or two configuration settings in the next stage. However, if you make changes one at a time and a fault occurs, you do not have to look very far to find out why.

If you have a dialup connection you probably retrieve external mail from a POP3 server, and send email via an SMTP server. Your Internet Service Provider (ISP) or web hosting provider gave you this information when you set up your account with them. You can also obtain the details you need by looking at the working configuration of a computer that you currently use to connect to your provider. As a last resort you can contact your provider and ask for the information.


2. evolution - email client providing the mail user agent (MUA)


2.4. Configure evolution to send mail to and retrieve mail from a remote server

You must tell evolution the names of your provider's POP3 and SMTP servers, and your username and password on these servers. Run evolution and configure it using the Mail Settings dialogue from the Tools menu.

Menu bar 
   Tools
      Mail Settings ...  
         Add
Mail settings dialogue 
   Accounts tab  
      Add
   

This brings up the evolution Account Assistant's Mail Configuration dialogue.

Mail Configuration dialogue
   Click "Next" to begin
   Next
   

In the Identity dialogue, enter the full name and email address that you use on your provider's server.

   Identity dialogue
      Required information 
         Full name: firstname surname
         Email address: remote-user-name@example.net
      Next
   

In the Receiving Email dialogue, enter details of the email server that you receive emails from. (This is usually a POP3 server.)

Receiving Email dialogue 
   Server Type: POP
   Configuration
      Host: mail.example.net
      Username: remote-user-name
   Authentication
      Authentication Type: Password
      Remember this password
   Next
Receiving Email dialogue
   Checking for New Mail
      Automatically check for new mail every 10 minute(s)
   Next
   

Enter details of the email server that you send emails to. (This is usually an SMTP server.)

Sending Email dialogue
   Server Type: SMTP
   Server Configuration
       Host: smtp.example.net
Next
   

Enter a name for evolution to identify the account. You can choose anything within reason. One approach is to use your provider's name. You might use 'Example' if your email address is 'remote-user-name@example.net'.

   Account Management dialogue
      Account Information
         Name: Example
         Next
   Done 
Finish
   

You may have other accounts that you wish to add. You will save yourself some work by waiting until the whole system is set up and working. You can add further accounts at any time in exactly the same way.


3. exim and procmail

3.1. About exim - mail transport agent

exim is a mail transport agent (MTA).

The Exim manual describes exim like this: exim “contains facilities for verifying incoming sender and recipient addresses, for refusing mail from specified hosts, networks, or senders, and for controlling mail relaying”. (See also the Debian Reference.) It sounds complicated! This is where Debian comes to the rescue. You install exim from a Debian package (.deb file), so exim is probably already configured and working.

exim is started through the /etc/inetd.conf file, so it is available every time you boot your computer (see Exim's README.Debian file).
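
If you want to see how exim is set up on your own system, you can look for it in /etc/inetd.conf. This is only a sketch: the function name exim_inetd_lines is made up for this example, and the exact wording of the entry on your system may differ, so the function simply prints whatever active lines mention exim.

```shell
# Hypothetical helper: print the active (uncommented) lines of the
# inetd-style configuration file "$1" that mention exim.
exim_inetd_lines() {
    # Drop comment lines first, then keep only the lines that mention exim.
    grep -v '^[[:space:]]*#' "$1" | grep 'exim'
}
```

Run it as exim_inetd_lines /etc/inetd.conf. No output means that no active entry mentions exim.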

You may find the newbieDoc article “A simple configuration of Exim” useful.


3.3. Local mail

The local mail system is available to all local users (all those who log in to your computer). In a standard Debian installation there is always an account for the Root user. There should be at least one other account for a normal user - your account. (You should never use the Root account unless it is absolutely necessary.)

Time for a quick test! You will need a mail user agent (MUA) to send and receive messages. You could use mutt, but mail is enough for this test. Run mail from the command line.

Send yourself a message using mail. Enter a subject and a short message, but do not worry about sending a carbon copy (Cc:). Use the 'enter' key for each new line, and 'control' key + 'd' key to end each section.

$ mail local-user 
Subject: local message
hello
ctrl-D
Cc:
ctrl-D
$ 
   

Now check that the message has reached your local mailbox. Look inside the /var/mail/local-user file using the cat command.

$ cat /var/mail/local-user
From local-user@local-host Wed Feb 04 12:11:46 2004 
Return-path: <local-user@local-host> 
Envelope-to: local-user@local-host 
Received: from local-user by local-host with local (Exim 3.35 #1 (Debian))
     id 1AoLsg-0000in-00
     for <local-user@local-host>; Wed, 04 Feb 2004 12:11:46 +0000
To: <local-user@local-host> 
Subject: local message 
Message-Id: <E1AoLsg-0000in-00@local-host>
From: local-user <local-user@local-host>
Date: Wed, 04 Feb 2004 12:11:46 +0000
Status: O
hello
$  
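
Once the mailbox holds more than one message, reading it with cat becomes tedious. The sketch below shows two quick checks on an mbox file such as /var/mail/local-user. The function names are made up for this example, and the checks rely on the layout shown above: each message starts with a line beginning 'From ' (followed by a space) and has a 'Subject:' header.

```shell
# Hypothetical helpers for an mbox file such as /var/mail/local-user.

# Count the messages: each one starts with a "From " separator line
# ("From" followed by a space, unlike the "From:" header).
mbox_count() {
    grep -c '^From ' "$1"
}

# Print every line that begins with "Subject:" (normally one per message).
mbox_subjects() {
    grep '^Subject:' "$1"
}
```

For example, mbox_count /var/mail/local-user prints 1 after the test above.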
   

Read the message using mail. Use the 'enter' key to open the message and enter 'q' to quit.

$ mail 
Mail version 8.1.2 01/15/2001.  Type ? for help. 
"/var/mail/local-user": 1 message 1 new 
>N  1 local-user@local-host      Wed Feb 04 12:11   14/394   local message 
hello
& q
Saved 1 message in /home/local-user/mbox
$
   

mail saves the messages that you have read in the /home/local-user/mbox file. You can check this too.

$ cat /home/local-user/mbox
   

4. fetchmail - mail-retrieval and forwarding utility

4.1. About fetchmail

In a previous section on configuring Evolution to send and retrieve mail, you configured evolution to receive mail directly from your provider's server. The next step is to get a different application (fetchmail) to do the job instead! There is a good reason for this. You must examine each message for spam before it reaches evolution. fetchmail will collect the mail and forward it to the local mailbox (/var/mail/local-user). exim, procmail and spamassassin will deal with spam on the way to the mailbox (see figure 5). You may find the newbieDoc article “Fetchmail configuration” useful.

Don't worry about spamassassin yet - get fetchmail working first.

Figure 5: Fetchmail delivers incoming remote mail into the local mail system.

fetchmail retrieves the mail from your provider's server and forwards it to your local email folder (see figure 5). If you logged into your computer as local-user, fetchmail will place the mail in /var/mail/local-user. This means that you must reconfigure evolution to retrieve the mail from /var/mail/local-user instead of directly from the remote POP3 account. This is so that the mail can be processed before evolution receives it.


4.3. Configure fetchmail

fetchmail looks for its configuration details in a hidden file in the current local user's home directory (/home/local-user/.fetchmailrc). It is possible to create this file with a text editor. However, there is a configuration tool (fetchmailconf) that makes building the configuration less prone to typos and other mistakes.

Invoke fetchmailconf from the command line as a normal user. It runs as an X11 application.

$ fetchmailconf
   

Choose 'Configure fetchmail'.

fetchmail launcher dialogue
   Configure fetchmail
   

Choose the 'Novice Configuration'.

fetchmail configurator dialogue
   Novice Configuration 
   

fetchmail will repeatedly check the remote mail server for new mail. The time between each repeat is called the poll interval. A short interval (between 10 and 30 seconds) is useful for testing - you do not want to spend hours staring at your modem lights! You may wish to change this value in a permanent set up to something like 300 seconds (5 minutes). Set the poll interval and the remote mail server's name.

fetchmail novice configurator dialogue
   Fetchmail Run Controls 
      Poll interval: 30 (seconds) 
   Remote Mail Server Configurations 
      New Server:  mail.example.net 
   

Either double-click or press the <Enter> key.

Enter the protocol of the remote mail server and the user name you use to log into it.

Fetchmail host mail.example.net dialogue
   Protocol 
      POP3
   User entries for mail.example.net
      New user: remote-user-name
   

Either double-click or press the <Enter> key.

Enter your password on the remote server.

Fetchmail user remote-user-name querying mail.example.net dialogue
   Authentication
      Password: ******* 
   

Enter the username you use to log into the local host (your computer).

   Local names
      New name: local-user
   

Delete remote-user-name since there is no such local user.

      remote-user-name delete
   

Save and exit in reverse order.

Fetchmail user remote-user-name querying mail.example.net dialogue
   OK

Fetchmail host mail.example.net dialogue
   OK

fetchmail configurator dialogue
   Save

fetchmail launcher
   Quit
   

Check the result by looking in ~/.fetchmailrc. The last line should look like this:

poll mail.example.net with protocol POP3 user 'remote-user-name' there with password '*******' is 'local-user' here
   

Notice that the password is not secure. It is saved in an ordinary user's file. Later on you will move the configuration statement to a safer location.
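
If you prefer to create the file by hand rather than with fetchmailconf, the sketch below writes the same one-line statement and makes the file readable and writeable only by its owner. The function name write_fetchmailrc and the order of its arguments are made up for this example, and it assumes that none of the values contains a single quote.

```shell
# Hypothetical helper: write a one-line fetchmail configuration to file "$1"
# for server "$2", remote user "$3", password "$4" and local user "$5".
write_fetchmailrc() {
    (
        # Inside this subshell, new files are created with mode 0600.
        umask 077
        printf "poll %s with protocol POP3 user '%s' there with password '%s' is '%s' here\n" \
            "$2" "$3" "$4" "$5" > "$1"
    )
    # Tighten the mode as well, in case the file already existed.
    chmod 600 "$1"
}
```

For example: write_fetchmailrc ~/.fetchmailrc mail.example.net remote-user-name 'your-password' local-user. Note that this overwrites any existing file.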


4.4. Test fetchmail

Note: Collect your email before testing

It is wise to download all your email, using your existing email client software, before going on. It is possible that some email may be lost during testing if something goes wrong.

It is time to send yourself another email. Compose it in and send it from evolution as before. (See the section on testing Evolution's configuration.) fetchmail should receive it a little while later and forward it to your local-user's local mailbox /var/mail/local-user. You can check this by looking in /var/mail/local-user or by using the mail command at the command prompt. (See the section on local mail.)


4.4.2. Connect to the remote mail server and re-test fetchmail

Connect to the remote server using the Modem Lights applet (Gnome) or pon and poff (command line). Re-run the test. You may need to repeat the test a few times until the remote server responds.

fetchmail launcher dialogue
   Test fetchmail
    

(Repeat as necessary.)

This is what you get when everything is working properly but you have no mail to collect.

fetchmail run window 
   Running fetchmail -d0 -v --nosyslog
      fetchmail:  5.9.11 querying mail.example.net (protocol POP3) ...
      ...
      fetchmail: POP3< +OK ...
      ...
      No mail for remote-user-name at mail.example.net ...
      ...
      fetchmail: normal termination, status 1
      Done.
Exit
    

If you do have mail to collect, the 'fetchmail run window' will say so.
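
The 'status 1' in the output above is not a fault. According to the fetchmail manual page, an exit status of 0 means that mail was retrieved and 1 means that there was no mail to collect; other values indicate problems. If you run fetchmail from the command line, a small sketch like this one turns the status into words. The function name fetchmail_status is made up for this example.

```shell
# Hypothetical helper: describe a fetchmail exit status given as "$1".
fetchmail_status() {
    case "$1" in
        0) echo "mail retrieved" ;;
        1) echo "no mail to collect" ;;
        *) echo "error (status $1), see the fetchmail manual page" ;;
    esac
}
```

For example: fetchmail -d0 -v --nosyslog; fetchmail_status $?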


4.4.3. Send and receive mail from the remote server

Send yourself (local-user) a message from evolution with the subject 'Fetchmail test 1'. You saw how to do this in the section on testing Evolution's configuration.

Run fetchmail to collect the message.

fetchmail launcher dialogue
   Run fetchmail
    

The fetchmail launcher dialogue opens a new 'fetchmail run window'.

fetchmail run window
   Running fetchmail -d0
      ...
      1 message for remote-user-name at mail.example.net (993 octets). 
      reading message remote-user-name at mail.example.net:1 of 1 (993 octets) 
      flushed
      Done.
Exit
    

fetchmail not only checks for mail, it also downloads it to your local mailbox and flushes (deletes) it from the remote server.


4.4.4. Check that fetchmail has delivered the test email

Use mail to fetch the message from your local mailbox. Invoke mail at the command prompt. Read the mail, then type q at the prompt (&) to quit.

$ mail 
Mail version 8.1.2 01/15/2001.  Type ? for help. 
"/var/mail/local-user": 1 message 1 new 
>N  1 local-user@local-host      Wed Feb 04 12:11   14/394   Fetchmail test 1
& q
$
    

4.4.5. Configure evolution to retrieve email from the user's local account

Note: Leave evolution's other settings unchanged
 

You only need to change the way evolution receives mail. evolution will continue to send out-going mail directly to the remote SMTP server.

In the section on configuring Evolution, you configured evolution to collect mail directly from the remote mail server. fetchmail does this job now and saves the mail locally in /var/mail/local-user. What you really want is for the email to end up in evolution again. To do this you must reconfigure evolution to retrieve the mail from /var/mail/local-user.

Run evolution. If you have several accounts, temporarily disable all but one account. You can modify the configuration of other accounts, or install new ones, when you have got the whole installation running satisfactorily.

Menu bar
   Tools  
      Mail Settings ... 
Mail Settings dialogue
   Accounts
    

Select an account to be disabled and disable it.

   Disable
OK
    

(Repeat, if necessary, until only one account is enabled.)

Edit the one remaining account to change the server type from 'POP' to 'Local delivery' from /var/mail/local-user.

Menu bar
   Tools  
      Mail Settings ... 
Mail Settings dialogue
   Accounts
    

Make sure that the active account is selected, then edit it.

      Edit
    

Change the details for server type.

Evolution Account Editor dialogue
Receiving Mail tab
   Server Type: Local delivery
   Configuration
      Path: /var/mail/local-user
   OK
Mail Settings dialogue
OK
    

5. fetchmail daemon - run fetchmail in the background

5.1. About the fetchmail daemon

The fetchmailconf launcher allows you to run fetchmail by clicking on the Run fetchmail button (see the section on sending and receiving from a remote server). This is not very convenient for everyday use. However, you may have noticed that the Run fetchmail button invokes the command fetchmail -d0. The '-d' switch controls daemon mode: followed by a poll interval in seconds it means 'run fetchmail as a daemon', while '-d0' asks for a single run in the foreground. A daemon is 'a program which runs for an extended period' ... 'in the background, usually unnoticed' (see the Linux Dictionary). The fetchmail -d command works because you installed the fetchmail daemon when you installed fetchmail as a system-wide service in the section on installing Fetchmail.


5.3. Configure the fetchmail daemon


5.3.2. /etc/fetchmailrc

This file is needed when you start or stop the daemon using /etc/init.d/fetchmail. It must contain the configuration statement for polling the remote server that fetchmailconf produced. (There is a typo in this file: 'pool' should be spelled 'poll'.)

poll mail.example.net with protocol POP3 user 'remote-user-name' there with password '*******' is 'local-user' here
    

There is an example statement in the /etc/fetchmailrc created during installation. The example statement ends with a semicolon (;). The semicolon does not seem to be necessary as long as the statement is written all on one line.

Leave the defaults unchanged - only batchlimit is uncommented.

batchlimit 100
    

Change owner to user fetchmail and group to root.

Warning: Security
 

Notice that the password is not encrypted. The /etc/fetchmailrc file must belong to the special user fetchmail, and be readable and writeable only by owner fetchmail.

# chown --verbose fetchmail:root /etc/fetchmailrc
ownership of `/etc/fetchmailrc' retained as fetchmail:root
    

Set permissions to read and write only for owner.

# chmod --verbose 0600 /etc/fetchmailrc
mode of `/etc/fetchmailrc' changed to 0600 (rw-------)
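
You can confirm the result at any time without changing anything. The following sketch reads the mode from the output of ls. The function name is_owner_only is made up for this example, and it checks the mode only, not the owner.

```shell
# Hypothetical helper: succeeds only if file "$1" has mode rw------- (0600).
is_owner_only() {
    # The first ten characters of "ls -ld" output are the file type and mode.
    [ "$(ls -ld "$1" | cut -c1-10)" = "-rw-------" ]
}
```

For example: is_owner_only /etc/fetchmailrc && echo "permissions are safe"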
    

5.3.4. Test the fetchmail daemon

Stop the fetchmail daemon. You must be the root user to do this.

$ su
Password: *******
# /etc/init.d/fetchmail stop
Stopping mail retrieval agent: fetchmail.
    

You will get an error if you have an incorrect configuration.

Stopping mail retrieval agent: system-wide fetchmail not running.)
    

Start the daemon.

# /etc/init.d/fetchmail start
Starting mail retrieval agent: fetchmail.
# exit
$
    

You will get an error if you have an incorrect configuration.

Starting mail retrieval agent: fetchmail (failed!)
    

The most likely cause of an error is a mistake in either /etc/fetchmailrc or /etc/default/fetchmail.
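
One mistake is easy to check for mechanically: the 'pool' misspelling of 'poll' mentioned in the section on /etc/fetchmailrc. The sketch below is not a full syntax check and the function name check_poll_statement is made up for this example. It only looks for an active line starting with 'poll' and warns about one starting with 'pool'. You need to be root to read /etc/fetchmailrc.

```shell
# Hypothetical helper: report whether the run control file "$1" has an
# active 'poll' statement, and warn about the 'pool' misspelling.
check_poll_statement() {
    if grep -q '^[[:space:]]*pool[[:space:]]' "$1"; then
        echo "typo: 'pool' should be 'poll'"
        return 1
    fi
    if grep -q '^[[:space:]]*poll[[:space:]]' "$1"; then
        echo "ok"
    else
        echo "no poll statement found"
        return 1
    fi
}
```

Run it as check_poll_statement /etc/fetchmailrc.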

Once the daemon is running, send yourself a test mail from evolution via the remote server. Connect to your provider's server, just as you did when testing Evolution's configuration, then compose and send the message.

Compose a message dialogue
   To: remote-user-name@example.net
   Subject: Fetchmail daemon test1
   Send
    

Give the message a couple of minutes to reach the remote server and to be returned. Keep an eye on your modem lights and you should see the regular polling of mail.example.net and the email downloading to /var/mail/local-user. Retrieve the message using evolution's Send / Receive button.

Button bar
   Send / Receive
    

(Repeat as necessary.)

Hopefully, everything works correctly. evolution downloads messages just as before.


6. spamassassin


6.3. Local mail

Local mail still collects in /var/mail/local-user. Previously, you would have read local mail using the mail command. It is best not to do this now because mail read in this way would end up in /home/local-user/mbox. Instead, you can filter on the name of your local computer and divert local mail into a new folder called Local. (Substitute the name of your computer for local-host. You can check what this is using the command hostname.)

Menu bar
   Tools
      Filters
Filters dialog
   Incoming 
      Filter rules
        Add
Edit Rule dialog
   Rule name: Move to local folder
   If box 
      Recipients contains @local-host
   Then box
      Move to folder
         Click here to select a folder
            New
               Create new folder dialog
                  Folder name: Local
                  Folder type: Mail
                  Specify where to create the folder: Inbox
            OK
OK
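
If you are not sure what to type into the 'Recipients contains' box, the sketch below prints it for you. The function name local_recipient_pattern is made up for this example. It uses the name you give it, or the output of the hostname command if you give none.

```shell
# Hypothetical helper: print the text for the filter's "Recipients contains"
# box. Uses "$1" if given, otherwise the output of the hostname command.
local_recipient_pattern() {
    printf '@%s\n' "${1:-$(hostname)}"
}
```

For example, local_recipient_pattern local-host prints @local-host.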
   

7. What next?

The Woody version of spamassassin is rather old. You can keep spamassassin up to date by downloading the latest version from CPAN.

You can add vipul's razor to your system. vipul's razor allows you to report spam that gets through your current system to a database. It is a collaborative database with many contributors. vipul's razor uses the online database to weed out the spam.


A. Appendix: About this document

A.1. Copyright information

Copyright ©2004 Chris Lale. Permission is granted to copy, distribute and/or modify this document with no Invariant Sections, with no Front-Cover texts and with no Back-Cover Texts under the terms of the GNU Free Documentation License, version 1.1 or any later version, published by the Free Software Foundation. A copy of the license can be found at http://www.fsf.org/copyleft/fdl.html.


A.6. Bibliography

Linux dictionary

The Linux Documentation Project's Linux dictionary at http://www.tldp.org/LDP/Linux-Dictionary/html/index.html (online) or http://www.tldp.org/guides.html (download).

Syslog(3)

The manual page of the System Logger. Enter the command man 3 syslog at the command prompt.

Generic Mail Flow

An article in the online Linux Journal at http://www.linuxjournal.com/modules.php?op=modload&name=NS-lj-issues/issue46&file=2516s2

Debian Reference

(Osamu Aoki) Chapter 9 “Tuning a Debian system” includes brief notes about exim, fetchmail and procmail at http://qref.sourceforge.net/

Exim manual

Install the exim-doc package, then view the files in /usr/share/doc/exim. Start by looking at /usr/share/doc/exim/oview.txt with a text editor or viewer. Alternatively, install the exim-doc-html package, then view the file /usr/share/doc/exim/manual.html/oview.html with a web browser.

A simple configuration of Exim

(Oohara Yuuma). This newbieDoc article describes how to configure Exim for local use. http://newbiedoc.sourceforge.net/networking/exim.html

Exim README.Debian

Information about the “Debianisation” of exim is held in the file /usr/share/doc/exim/README.Debian.

eximon

X monitor for the exim mail transport agent.

Fetchmail configuration

(Oohara Yuuma) This newbieDoc article describes how to configure fetchmail. http://newbiedoc.sourceforge.net/networking/fetchmail.html