1. Introduction

1.1. What is spam?

Spam is junk email. You may come across other synonyms for spam including UCE (Unsolicited Commercial Email) and UBE (Unsolicited Bulk Email).

To spam is to mass-mail unrequested identical or nearly-identical email messages, particularly those containing advertising. The term is used especially when the mail addresses have been culled from network traffic or databases without the consent of the recipients (see the Linux Dictionary).

Firewalls do not protect you against spam, but you can use other software to identify and filter out junk email.

1.2. What can you do about spam?

One thing everyone can do is to be careful when sending email to multiple addresses. Use Bcc: (blind carbon copy) to send a message to multiple addresses, and leave the To: and Cc: fields empty. This means that none of the recipients sees any of the other addresses, so the addresses are less likely to be picked up by potential spammers.
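
As an illustration of the Bcc: advice, here is a minimal Python sketch that builds a message with every recipient in the Bcc: field and nothing in To: or Cc:. The function name build_bcc_message and all the addresses are hypothetical; it uses only the standard email package and does not send anything.

```python
from email.message import EmailMessage


def build_bcc_message(sender, recipients, subject, body):
    """Build a message whose recipients appear only in the Bcc: field."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["Subject"] = subject
    # To: and Cc: are deliberately left unset.
    msg["Bcc"] = ", ".join(recipients)
    msg.set_content(body)
    return msg


# Hypothetical addresses, for illustration only.
message = build_bcc_message(
    "me@example.org",
    ["alice@example.org", "bob@example.org"],
    "Club newsletter",
    "Hello everyone.\n",
)
```

If such a message is later passed to SMTP.send_message in Python's smtplib, that method takes the recipients from the Bcc: field but does not transmit the Bcc: header itself.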

If your ISP (Internet Service Provider) or web hosting provider has the facility to filter out spam, use it. It saves you from wasting time and money downloading rubbish.

If your provider does not filter email, or if you still have some spam coming through your provider's filter, read on. This document explains how to set up email filtering on your own home computer.

1.3. Email software components

1.3.1. Incoming mail

The software that manages email on your computer falls into three loosely defined categories: Mail Transport Agent, Mail Delivery Agent and Mail User Agent. There is an article describing generic mail flow. These mail agents are often incorporated into a single software application (e.g. evolution, mozilla mail). Figure 1 shows a typical system for receiving email.

Figure 1: the major components of a mail system dealing with incoming mail.

  1. Mail Transport Agent (MTA), sometimes referred to as a Mail Transfer Agent. A mail transport agent is software that enables mail to travel from one mail system to another. Your provider's server uses an MTA to send you your mail. You must have an MTA at your end of the connection. A standard Debian installation uses exim, but you may come across others, e.g. sendmail, smail and qmail. Typically, the MTA passes the mail on to an MDA.

  2. Mail Delivery Agent (MDA), sometimes referred to as a Local Delivery Agent (LDA). A mail delivery agent is software that takes mail from an MTA and passes it on to a mailbox or another MTA. A standard Debian installation uses procmail, although there are others, e.g. deliver.

  3. Mail User Agent (MUA). A mail user agent is the software that enables a user to write, send, receive and read email. A standard Debian installation uses mail for local mail. Other mail user agents include elm, pine, eudora, pegasus, etc.

Mail from your provider normally reaches you using the POP3 protocol. A protocol is a set of rules that govern how things communicate over the network (see the Linux Dictionary).

1.4. Mail on a Debian system

1.4.4. Example of a spam email modified using spamassassin

1.4.4.1. Added headers

The version of spamassassin in Woody is 2.20-1woody. The important header for basic filtering is X-Spam-Flag. It is set to “YES” when spamassassin has identified enough known features of spam messages.

X-Spam-Status: Yes, hits=5.7 required=5.0
     tests=VIAGRA,BASE64_ENC_TEXT,SUBJ_ALL_CAPS version=2.20
X-Spam-Flag: YES
X-Spam-Level: *****
X-Spam-Checker-Version: SpamAssassin 2.20 
     (devel $Id: x27.html.en,v 1.1 2004/04/30 21:50:38 chrislale Exp $)
X-Spam-Prev-Content-Type: text/plain
X-Spam-Prev-Content-Transfer-Encoding: base64
X-Evolution-Source: mbox:/var/mail/local-user
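
To illustrate how a filter might act on the X-Spam-Flag header, here is a minimal Python sketch using the standard email package. It is not part of spamassassin or procmail. The function name is_flagged_spam and the Subject line in the sample are assumptions made for the example; the X-Spam headers are copied from the listing above.

```python
from email import message_from_string


def is_flagged_spam(raw_message):
    """Return True if the message carries 'X-Spam-Flag: YES'."""
    msg = message_from_string(raw_message)
    # A missing header is treated as "not flagged".
    flag = msg.get("X-Spam-Flag", "")
    return flag.strip().upper() == "YES"


# Sample message; the Subject line is hypothetical.
sample = (
    "Subject: BUY NOW\n"
    "X-Spam-Status: Yes, hits=5.7 required=5.0\n"
    "     tests=VIAGRA,BASE64_ENC_TEXT,SUBJ_ALL_CAPS version=2.20\n"
    "X-Spam-Flag: YES\n"
    "X-Spam-Level: *****\n"
    "\n"
    "Message body.\n"
)

flagged = is_flagged_spam(sample)
```

A delivery step could use the result to decide whether a message goes to the normal mailbox or to a separate spam folder.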
     

1.4.4.3. Lines added to the top of the message

spamassassin 2.20 adds these lines to the top of the message. (Later versions of spamassassin do this differently.)

SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam.  The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM: 
SPAM: Content analysis details:   (5.7 hits, 5 required)
SPAM: Hit! (2.4 points)  BODY: Plugs Viagra
SPAM: Hit! (1.4 points)  Message text disguised using base-64 encoding
SPAM: Hit! (1.9 points)  Subject is all capitals
SPAM: 
SPAM: -------------------- End of SpamAssassin results ---------------------
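
The report lines can also be read mechanically. The following Python sketch extracts the total, the threshold and the individual hits from a report in the 2.20 layout shown above. It assumes the lines always follow that layout; the regular expressions and the function name parse_report are illustrative and are not taken from spamassassin itself.

```python
import re

# Patterns matching the report layout shown above (an assumption).
SUMMARY_RE = re.compile(
    r"^SPAM: Content analysis details:\s+"
    r"\((\d+(?:\.\d+)?) hits, (\d+(?:\.\d+)?) required\)"
)
HIT_RE = re.compile(r"^SPAM: Hit! \((\d+(?:\.\d+)?) points\)\s+(.*)$")


def parse_report(lines):
    """Return (total, required, hits); hits is a list of (points, text)."""
    total = required = None
    hits = []
    for line in lines:
        summary = SUMMARY_RE.match(line)
        if summary:
            total = float(summary.group(1))
            required = float(summary.group(2))
            continue
        hit = HIT_RE.match(line)
        if hit:
            hits.append((float(hit.group(1)), hit.group(2)))
    return total, required, hits


report = [
    "SPAM: Content analysis details:   (5.7 hits, 5 required)",
    "SPAM: Hit! (2.4 points)  BODY: Plugs Viagra",
    "SPAM: Hit! (1.4 points)  Message text disguised using base-64 encoding",
    "SPAM: Hit! (1.9 points)  Subject is all capitals",
]
total, required, hits = parse_report(report)
```

In this example the three hits (2.4, 1.4 and 1.9 points) add up to the reported total of 5.7, which exceeds the required 5, so the message is marked as spam.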
     

1.5. Outline of the installation

There are several stages in the installation process. There are good reasons for this. If you try to install and configure all the software in one go, you are more likely to run into difficulty because there is simply more to go wrong. You may also find it much more difficult to trace faults. By installing and testing in stages, you can go from each stage to the next one with confidence, knowing that your configuration works correctly. Of course, you may need to change one or two configuration settings in the next stage. However, if you make changes one at a time and a fault occurs, you do not have to look very far to find out why.

If you have a dialup connection you probably retrieve external mail from a POP3 server, and send email via an SMTP server. Your Internet Service Provider (ISP) or web hosting provider gave you this information when you set up your account with them. You can also obtain the details you need by looking at the working configuration of a computer that you currently use to connect to your provider. As a last resort you can contact your provider and ask for the information.

1.6. Information you need before you start

You need some detailed information about your ISP (Internet Service Provider) or web hosting provider. If you have printed this document, you may wish to fill in the details in this table.

1.7. Names used in this document

The examples in this document use pseudonyms to represent usernames, computer names etc. If you have printed this document, you may wish to use this table to make a note of the real names on your system.