Setting up Dspam as a filter for Postfix on Debian Etch

I recently switched from amavisd-new and Spamassassin to Dspam on my mail server. Although I was quite satisfied with the results of my previous setup, I found it was using too many resources (especially RAM).

In this article I will describe my current setup, which is largely inspired by existing how-tos (see the last section).

I take for granted that you have a working Postfix setup and already know the basics of Debian system administration. As usual, comments are welcome!

Short introduction to Dspam

Dspam is described by its author as “a scalable and open-source content-based spam filter designed for multi-user enterprise systems”, and its accuracy is reported to range from 99.5% to 99.95%!

Its main strengths compared to other spam filtering tools (and the reasons I chose Dspam as an alternative to amavisd-new/SA) are:

  • Low resource requirements: it is written in C and is thus very light
  • Usability: you don’t need to be an expert in Perl or anything else to configure Dspam to fit your needs (one main configuration file, documented options…), and it can even be administered through a web interface (which won’t be described here as I don’t use it)
  • Accuracy: although I had no precise idea of its results before using it, I had read a lot of users’ comments reporting that it works great even on small setups after a period of training, and it proved to be true!
  • Support for Clamav: although Dspam is a spam filter engine, it can also filter infected messages through Clamav, without the need to set up another filter such as clamSMTP.

Please refer to the project homepage for a complete list of features and an accurate description of the product.

Description of my setup

I administer a home server used only by me and a few relatives. Some of them use their @kirya.net address only as a forwarding address to their own personal address, in order to filter out spam and infected messages.

Although it is possible to set up a per-user configuration, all the users share the same data (tokens) stored in a MySQL database.

The subject of messages detected as spam is changed to contain e.g. [SPAM]. Infected messages are bounced (and thus not delivered to the users). Dspam support for Clamav is quite recent, and it is possible either to bounce the infected messages or to treat them as spam; I use a patch to extend the bounce behavior to send a notice to the original recipient (see below).

Dspam training is made easy with the special aliases: false positives are forwarded to ham@mydomain.com whereas spam messages are sent to spam@mydomain.com.

Installing Dspam

Dspam is now part of the current Debian stable release (Etch), and no backport is needed.

Also note that mailgraph can easily be adapted to take dspam virus notifications into account, see this entry in the Debian BTS.

To install dspam, simply run:

sudo aptitude install dspam libdspam7-drv-mysql

As usual, required dependencies will be pulled automatically. I suggest you follow the debconf instructions to create the MySQL database which will be used to store all the data (signatures, tokens, stats).

It would also be good to add a specific MySQL user with restricted rights on the newly created database only. Choose a strong password, e.g. one generated by pwgen -cn 16.

For the rest of the article, I will assume that your database is named dspam and that the MySQL user you created is dspam-maint.
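If pwgen is not at hand, a comparable password can be produced with a few lines of Python. This is only a minimal sketch: the function name generate_password is made up for this example, and unlike pwgen it makes no attempt to produce pronounceable passwords. It only mimics the effect of the -c (at least one capital letter) and -n (at least one digit) options.

```python
import secrets
import string

# Letters and digits only, similar in spirit to what pwgen -cn produces.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length=16):
    """Return a random alphanumeric password with at least one capital and one digit."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        has_capital = any(c.isupper() for c in candidate)
        has_digit = any(c.isdigit() for c in candidate)
        if has_capital and has_digit:
            return candidate


if __name__ == "__main__":
    print(generate_password())
```

The secrets module is used rather than random because it draws from the operating system's cryptographic random source.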

Setting up Dspam

First, let’s configure dspam to use the MySQL database. The required information should be stored in /etc/dspam/dspam.d/mysql.conf, provided by libdspam7-drv-mysql. As it contains the password for the database, this file should not be readable by everyone (standard permissions are -rw-r----- dspam root).

Dspam is not chrooted, so you may prefer to use the socket rather than the network connection: comment out the line beginning with MySQLPort, and set the MySQLServer variable to /var/run/mysqld/mysqld.sock.

Then, you will have to edit the main configuration file /etc/dspam/dspam.conf to fit your needs. Here are the options I use for my configuration:

# cat /etc/dspam/dspam.conf | grep -v '^#' | grep -v '^[ ]*$'
Home /var/spool/dspam
StorageDriver /usr/lib/dspam/libmysql_drv.so
DeliveryHost        127.0.0.1
DeliveryPort        10026
DeliveryIdent       "DSPAM-Daemon"
DeliveryProto       SMTP
OnFail error
Trust root
Trust dspam
Trust mail
Trust mailnull 
Trust smmsp
Trust daemon
TrainingMode teft
TestConditionalTraining on
Feature whitelist
Algorithm graham burton
PValue graham
SupressWebStats on
Preference "spamAction=tag"
Preference "signatureLocation=headers"	# 'message' or 'headers'
Preference "showFactors=off"
Preference "spamSubject=[SPAM]"
Notifications	off
LocalMX 127.0.0.1
SystemLog off
UserLog   off
Opt out
TrackSources spam
ParseToHeaders off
ChangeModeOnParse on
ChangeUserOnParse off
ClamAVPort	3310
ClamAVHost	127.0.0.1
ClamAVResponse 	accept
ServerPID              /var/run/dspam.pid
ServerMode auto
ServerPass.Relay1	"secret"
ServerParameters	"--deliver=innocent"
ServerIdent		"localhost.localdomain"
ServerDomainSocketPath  "/var/run/dspam.sock"
ClientHost	/var/run/dspam.sock
ClientIdent	"secret@Relay1"
ProcessorBias on
Include /etc/dspam/dspam.d/

Also do not forget to edit /etc/dspam/dspam.d/mysql.conf.

As you can see, I use a client/server configuration, which proves to be very effective. The “Opt” option is very important: if you want to activate spam filtering for all users by default, set it to “out”; otherwise you will have to manually define all the users who want to use the filtering system. As I don’t use the web interface, I disabled logging and stats. The integration with clamav is really easy, and only needs 3 lines in this configuration!

Dspam uses a signature system to remember all incoming messages, and it is more convenient to put this signature in the headers rather than at the end of the body (see the signatureLocation preference).
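The listing above was produced by filtering comments and blank lines out of dspam.conf with two grep calls. The same filtering can be sketched in Python, which may be handy when comparing configurations in a script. The function name active_lines and the sample text are made up for this example; the rules simply mirror the two grep patterns (lines starting with #, and lines that are empty or contain only spaces).

```python
import re


def active_lines(text):
    """Keep lines that are neither comments nor empty/space-only."""
    kept = []
    for line in text.splitlines():
        if line.startswith("#"):        # same as grep -v '^#'
            continue
        if re.fullmatch(r"[ ]*", line):  # same as grep -v '^[ ]*$'
            continue
        kept.append(line)
    return kept


# Hypothetical excerpt of a configuration file, for illustration only.
SAMPLE = "\n".join([
    "# Home directory",
    "Home /var/spool/dspam",
    "",
    "   ",
    "#TrainingMode toe",
    "TrainingMode teft",
])

if __name__ == "__main__":
    for line in active_lines(SAMPLE):
        print(line)
```

As with the grep version, a line holding an option followed by a trailing comment is kept as it is.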

I won’t explain all the other options here, as the default configuration file is pretty well commented.

So that all the users share the same data, you’ll need to configure a shared group by creating a file called group in /var/spool/dspam/ with the following single line:

dspam:shared:*

It is very easy to define several groups for different types of users. This is very well documented in the dspam README, in section 2.1 “Configuring groups”.

Postfix integration

After setting up dspam, we have to tell postfix to use it! There are plenty of ways of doing that. If you have installed the dspam-doc package, one is documented in /usr/share/doc/dspam-doc/postfix.txt.gz, with example configuration files under /usr/share/doc/dspam-doc/postfix/.

Filter incoming mail

Instead of a simple content filter, I have chosen to follow an approach suggested by Richard Patterson on the dspam mailing list, which filters incoming mail only. This method requires the postfix-pcre package. To sum up:

  • Add a file called dspam_filter_access under /etc/postfix containing:
    # Everything beginning with either ham or spam avoids the filter
    /^(spam|ham)@.*$/ OK
     
    # The rest is redirected to be filtered
    /./ FILTER dspam:dspam
  • Open main.cf and find your smtpd_client_restrictions line
  • Append the following to the end of it (including the leading comma): , check_client_access pcre:/etc/postfix/dspam_filter_access
  • Add the following transport to your master.cf:
    dspam                 unix    -       n       n       -       -    pipe
      flags=Ru user=dspam argv=/usr/bin/dspam --client --deliver=innocent,spam --user ${recipient} --mail-from=${sender}

You also need to make sure Dspam only gets one address at a time by adding dspam_destination_recipient_limit = 1 to the postfix main.cf.

Filter transport

You’ll also need to configure Postfix to listen on a local port for re-injection. This is where DSPAM sends back the “good” mail (or alternatively, tagged mail also). Add this to your master.cf:

localhost:10026 inet  n -       n       -       -        smtpd
  -o content_filter=
  -o receive_override_options=no_unknown_recipient_checks,no_header_body_checks
  -o smtpd_helo_restrictions=
  -o smtpd_client_restrictions=
  -o smtpd_sender_restrictions=
  -o smtpd_recipient_restrictions=permit_mynetworks,reject
  -o mynetworks=127.0.0.0/8
  -o smtpd_authorized_xforward_hosts=127.0.0.0/8

Special aliases to train Dspam

Append the aliases to /etc/aliases:

ham: ham@ham.ham
spam: spam@spam.spam

and run postalias /etc/aliases to refresh the postfix database.

Then, a special transport has to be configured in /etc/postfix/transports:

spam.spam       dspam-retrain:spam
ham.ham         dspam-retrain:innocent

You need to add transport_maps = hash:/etc/postfix/transports in main.cf and run postmap /etc/postfix/transports.

Then add the following to /etc/postfix/master.cf:

dspam-retrain         unix    -       n       n       -      -     pipe
  flags=Rhq user=dspam argv=/usr/bin/dspam --client --mode=teft --class=$nexthop --source=error --user dspam

You cannot set up simple aliases using a pipe to dspam as the permissions of the configuration files are too restrictive, and this setup would require setuid executables somewhere.

The transport approach allows dspam to run under the dspam user UID (user=dspam). Note that the other parameter, --user dspam, has to be changed if you use several shared groups (or no group at all).

Duplicate X-DSPAM-Signature headers must be avoided, as they would prevent the signature from being retrieved for spam reporting. This happens when you receive messages from a server already running Dspam, and spammers could also forge these headers to prevent you from training your database.

To avoid this issue, it was suggested to me to ignore the “previous” headers before mail is passed to dspam. With postfix header_checks, this is:
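To make the chain alias, transport and pipe command easier to follow, here is a rough Python model of it. It is a simplified sketch, not a description of how Postfix works internally: the dictionaries just copy the aliases and transport entries shown above, and the function name retrain_command is made up for this example.

```python
# Copies of the /etc/aliases and /etc/postfix/transports entries above.
ALIASES = {
    "ham": "ham@ham.ham",
    "spam": "spam@spam.spam",
}
TRANSPORT_MAP = {
    "spam.spam": "dspam-retrain:spam",
    "ham.ham": "dspam-retrain:innocent",
}


def retrain_command(local_part):
    """Return the dspam argv for mail sent to a training alias, or None."""
    target = ALIASES.get(local_part)
    if target is None:
        return None
    domain = target.split("@", 1)[1]
    entry = TRANSPORT_MAP.get(domain)
    if entry is None:
        return None
    # "transport:nexthop": the nexthop part ends up in --class=$nexthop.
    _transport, nexthop = entry.split(":", 1)
    return [
        "/usr/bin/dspam", "--client", "--mode=teft",
        "--class=" + nexthop, "--source=error", "--user", "dspam",
    ]


if __name__ == "__main__":
    print(" ".join(retrain_command("spam")))
    print(" ".join(retrain_command("ham")))
```

The point of the fake spam.spam and ham.ham domains is only to carry the class name (spam or innocent) through the transport table into the dspam command line.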

/^(X-DSPAM-.*)/      IGNORE

Also add nested_header_checks= to your main.cf file so that postfix doesn’t delete the X-DSPAM-* headers in attached messages. Without this line, the signatures cannot be retrieved from the nested message.
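The effect of the header_checks rule on the top-level headers can be illustrated with a small Python sketch. It assumes that Python's re module behaves like PCRE for this simple pattern, and uses case-insensitive matching as Postfix does by default for pcre tables. The function name drop_dspam_headers and the sample headers are made up for this example.

```python
import re

# Same pattern as in header_checks.
DSPAM_HEADER = re.compile(r"^(X-DSPAM-.*)", re.IGNORECASE)


def drop_dspam_headers(headers):
    """Return the header lines that the IGNORE rule would let through."""
    return [h for h in headers if not DSPAM_HEADER.match(h)]


# Hypothetical headers of a message that already went through another Dspam.
INCOMING = [
    "From: someone@example.org",
    "X-DSPAM-Result: Innocent",
    "X-DSPAM-Signature: 0123456789abcdef",
    "Subject: hello",
]

if __name__ == "__main__":
    for header in drop_dspam_headers(INCOMING):
        print(header)
```

Only the foreign X-DSPAM-* lines are dropped; the local dspam then adds its own signature afterwards.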

Another useful tip is to prevent unwanted use of the dspam aliases. In my case, I only accept mail sent to spam@ and ham@ from my local network (or any authenticated user):

  • Create a dspam_check_aliases file stating:
    /^.*(spam|ham)@.*$/ REJECT
  • Run postmap dspam_check_aliases (this step should be unnecessary for a pcre: table, which Postfix reads as plain text)
  • Add the following lines to your smtpd_recipient_restrictions:
    check_recipient_access pcre:/etc/postfix/dspam_check_aliases,
    check_sender_access pcre:/etc/postfix/dspam_check_aliases

    Be sure you have the permit_mynetworks (or permit_sasl_authenticated if using SASL) before these lines.

This way, mail sent to or from the spam@ or ham@ aliases will be rejected by Postfix (error code 554), unless it is sent from your local network.
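Before relying on the dspam_check_aliases pattern, it can be useful to check which addresses it catches. The following Python sketch does that, assuming Python's re module behaves like PCRE for this simple pattern and using case-insensitive matching as Postfix does by default. The function name is_rejected and the test addresses are made up for this example.

```python
import re

# Same pattern as in dspam_check_aliases.
ALIAS_PATTERN = re.compile(r"^.*(spam|ham)@.*$", re.IGNORECASE)


def is_rejected(address):
    """Return True if the address matches the REJECT pattern."""
    return ALIAS_PATTERN.search(address) is not None


if __name__ == "__main__":
    for address in ("spam@example.com", "ham@example.com",
                    "alice@example.com", "graham@example.com"):
        print(address, is_rejected(address))
```

Note that the leading .* means any local part ending in spam or ham matches, so an address such as graham@example.com would be caught as well. If that is a concern on your system, a stricter pattern may be worth considering.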

Testing

Edit /etc/default/dspam to change the START=no variable to START=yes.

Note that it may be worth activating debug in this file during testing. Debug logs are stored in /var/log/dspam/dspam.debug.

After (re)starting both dspam and postfix, you should be able to see that dspam is running, as the headers of incoming mail contain things like:

X-DSPAM-Result: Innocent
X-DSPAM-Processed: Sun Apr 16 09:06:11 2006
X-DSPAM-Confidence: 0.9928
X-DSPAM-Probability: 0.0000
X-DSPAM-Signature: 4441ece3267971621324435

If this is not the case, look in your mail logs and dspam debug logs.
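When testing with many messages, checking the headers by hand gets tedious. Here is a minimal Python sketch that extracts the X-DSPAM-* headers from a raw message using the standard email module. The function name dspam_headers is made up for this example, and the Subject line and body of the sample message are invented; the X-DSPAM-* values are the ones shown above.

```python
from email import message_from_string

# Sample message: the X-DSPAM-* headers come from the example above,
# the Subject and body are placeholders.
RAW = (
    "X-DSPAM-Result: Innocent\n"
    "X-DSPAM-Processed: Sun Apr 16 09:06:11 2006\n"
    "X-DSPAM-Confidence: 0.9928\n"
    "X-DSPAM-Probability: 0.0000\n"
    "X-DSPAM-Signature: 4441ece3267971621324435\n"
    "Subject: test\n"
    "\n"
    "Body\n"
)


def dspam_headers(raw):
    """Return a dict of the X-DSPAM-* headers found in a raw message."""
    message = message_from_string(raw)
    return {
        name: value
        for name, value in message.items()
        if name.upper().startswith("X-DSPAM-")
    }


if __name__ == "__main__":
    for name, value in dspam_headers(RAW).items():
        print(name, "=", value)
```

An empty result for a freshly received message would suggest that the mail did not go through dspam at all.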

Results

After 4 weeks running dspam, I am really happy with it. My server now has a lot of free RAM; the following graph shows the RAM usage on the day I switched from amavisd-new to dspam:

[Image: Venus RAM graph]

dspam_stats reports the following results:

dspam TP: 1015 TN: 3893 FP: 0 FN: 103 SC: 45 NC: 0

This means no false positives (!), and an accuracy of around 98%, which is about the same as with Spamassassin.

And after one year:

dspam:
                TP True Positives:          24649
                TN True Negatives:          35404
                FP False Positives:             4
                FN False Negatives:           601
                SC Spam Corpusfed:            956
                NC Nonspam Corpusfed:           2
                TL Training Left:               0
                SHR Spam Hit Rate          97.62%
                HSR Ham Strike Rate:        0.01%
                OCA Overall Accuracy:      99.00%
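The percentages above can be recomputed from the raw counters. The formulas in this Python sketch are my assumption of how the rates are derived (I have not checked them against the dspam_stats source), but they do reproduce the reported figures. The function name rates is made up for this example.

```python
def rates(tp, tn, fp, fn):
    """Compute hit rate, strike rate and accuracy (in percent) from the counters."""
    total = tp + tn + fp + fn
    return {
        # Share of all spam that was caught.
        "spam_hit_rate": 100.0 * tp / (tp + fn),
        # Share of all ham that was wrongly flagged as spam.
        "ham_strike_rate": 100.0 * fp / (tn + fp),
        # Share of all messages that were classified correctly.
        "overall_accuracy": 100.0 * (tp + tn) / total,
    }


four_weeks = rates(tp=1015, tn=3893, fp=0, fn=103)
one_year = rates(tp=24649, tn=35404, fp=4, fn=601)

if __name__ == "__main__":
    for label, result in (("4 weeks", four_weeks), ("1 year", one_year)):
        print(label)
        for name, value in result.items():
            print("  %s: %.2f%%" % (name, value))
```

With the one-year counters this gives 97.62%, 0.01% and 99.00%, and with the four-week counters an overall accuracy of roughly 98%.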

Further resources

I would encourage you to read other tutorials and articles about dspam. The essentials are in the dspam README.

Other links can be found on the official website as well as on the wiki.
