The FI and MU mail systems provide some services to limit the spread of infected mail and to detect at least part of the (usually commercial) spam. This page describes the overall operation of the system and the options for user configuration. Note that there are two levels of control: the first is the university mail server and the second is the faculty mail server.

This protection is not and cannot be 100% effective. Even so, do not open attachments that you have received from unknown senders (or from senders who look familiar at first glance when the message itself looks suspicious)!

A serious risk with any mail filtering is a false sense of security. Expect some viruses to be detected, but not necessarily all of them!

An essential part of the antivirus protection of a computer running MS Windows is a high-quality antivirus program installed directly on the computer. Antivirus licenses are purchased for computers owned by FI (ask for installation instructions). Students can use one of the freely available antivirus programs on their own computers, for example avast!.

If your mail is redirected using a .forward file, no anti-spam check is performed. If you want your mail to be both forwarded and checked, use the procmail program.

Checking incoming mail at the university level

Before entering the university network, each incoming e-mail is scanned by an antivirus program and goes through a comprehensive anti-spam routine (in particular so-called greylisting). A positive finding in either of these two tests results in the mail being dropped immediately and irreversibly, without the knowledge of the intended recipient. If both tests are negative, the mail is forwarded to the university network with the following headers added:
  • X-Muni-Spam-TestIP - IP address of the server from which the mail was received;
  • X-Muni-Envelope-From - the e-mail address of the sender as given in the mail envelope.
With careful handling, this information can be used for additional filtering at the user level before delivery to the mailbox (e.g. with procmail).
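As an illustration of how these two headers could be read by a user-level filter, here is a minimal Python sketch. The sample message, its addresses and the function name muni_headers are hypothetical; only the two header names come from the text above.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message; the IP and addresses are placeholders.
RAW_MESSAGE = (
    "X-Muni-Spam-TestIP: 192.0.2.10\n"
    "X-Muni-Envelope-From: bounce@example.org\n"
    "From: Some Sender <sender@example.com>\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)


def muni_headers(raw_message):
    """Extract the university-level headers and compare the two sender addresses."""
    msg = message_from_string(raw_message)
    envelope_from = msg.get("X-Muni-Envelope-From")
    # parseaddr returns (display name, address); keep only the address part.
    header_from = parseaddr(msg.get("From", ""))[1]
    return {
        "relay_ip": msg.get("X-Muni-Spam-TestIP"),
        "envelope_from": envelope_from,
        "header_from": header_from,
        "sender_mismatch": envelope_from is not None
        and envelope_from.lower() != header_from.lower(),
    }
```

A mismatch between the envelope sender and the From header is common for legitimate mailing lists, so it is best treated as a hint rather than as a verdict.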

Checking incoming mail at the faculty level

Mail coming from the internal network of the faculty does not go through any anti-spam check. If mail comes from outside and is delivered to the destination mailbox on Anxur or Aisa (for this purpose, forwarding via procmail also counts as delivery to the destination mailbox), it normally passes through the following anti-spam stages, in the order listed:
  1. whitelist / blacklist
  2. SpamAssassin
  3. dSpam

Whitelisting / blacklisting

A whitelist is, in the simplest sense, a list of addresses from which incoming mail is never classified as spam, unconditionally and at all times. A whitelist can also contain wildcard entries thanks to the special character '*', which stands for any number (even zero) of arbitrary characters. Thus, "*" followed by a domain whitelists all addresses in that domain. There are two levels of whitelist: global (maintained by CVT, valid for all users of Anxur or Aisa) and user (valid for the incoming mail of a single user). At FI, the whitelist is implemented using SpamAssassin (see below). You can define your own whitelist by adding any number of lines of the following form to the file ~/.spamassassin/user_prefs:
whitelist_from ADDRESS
For this configuration file to be taken into account, you must:
  • grant the "x" right for others on your home directory and on the directory ~/.spamassassin
  • grant the "r" right for others on the file ~/.spamassassin/user_prefs
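The two permission requirements above can be sketched in Python as follows. This is only an illustration: the function name grant_spamassassin_access is hypothetical, and the sketch assumes a POSIX file system and that the directory and file already exist.

```python
import stat
from pathlib import Path


def grant_spamassassin_access(home):
    """Add the 'x' right for others on the directories and 'r' on user_prefs."""
    home = Path(home)
    sa_dir = home / ".spamassassin"
    prefs = sa_dir / "user_prefs"
    # Others need 'x' on both directories to be able to traverse them.
    for directory in (home, sa_dir):
        mode = stat.S_IMODE(directory.stat().st_mode)
        directory.chmod(mode | stat.S_IXOTH)
    # Others need 'r' on the preferences file to be able to read it.
    mode = stat.S_IMODE(prefs.stat().st_mode)
    prefs.chmod(mode | stat.S_IROTH)
```

The sketch only adds the listed bits and leaves all other permission bits unchanged, so it does not make the home directory listable by others.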
Similarly, anti-spam checking can be bypassed by whitelisting keywords in the Subject header. Any mail whose subject contains even a single keyword as a substring also avoids anti-spam checking and is always delivered normally. The keyword list is kept in the same file as the whitelist (the same access rights requirements apply), but the line format is as follows:
whitelist_subject WhitelistedString
The blacklist (by sender and by subject) is the dual of the whitelist: a matching message is marked as spam without any further checks. The blacklist is configured in the same way as the whitelist, except that you write blacklist_from in place of whitelist_from and blacklist_subject in place of whitelist_subject.
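To get a feel for how '*' patterns and subject keywords match, here is a rough Python approximation. It is not SpamAssassin's own matcher: it uses Python's fnmatch (which also understands '?' and '[...]'), it assumes case-insensitive address matching and case-sensitive subject keywords, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase


def address_matches(address, patterns):
    """Return True if the address matches any '*' pattern (case-insensitive)."""
    lowered = address.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


def subject_matches(subject, keywords):
    """Return True if any keyword occurs in the subject as a substring."""
    return any(keyword in subject for keyword in keywords)
```

For example, the pattern "*@example.org" matches "alice@example.org" but not "alice@example.org.evil.test", because the pattern has to cover the whole address.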


SpamAssassin

This filter performs heuristic analysis. It defines a set of phenomena (rules; mostly the presence of a word or phrase) whose occurrence is detected in mail. This set is fixed; it can only be changed manually by the administrator (faculty-wide) or by the user (for their own address). Each phenomenon is assigned a weight (a real number) that expresses its significance. A positive weight marks phenomena that are characteristic of spam, a negative weight marks phenomena that are typical of regular mail. The sum of the weights of all phenomena that occur in the mail (repeated occurrences of the same phenomenon are ignored) is called the score. If the score is greater than or equal to 7, the mail is marked as spam and does not enter the dSpam check at all.
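The scoring rule just described (sum the weights of the distinct phenomena found, then compare with the threshold) can be sketched in a few lines of Python. The weight table is illustrative only: 4.5 and -2 are taken from the rule examples on this page, while the weight for NO_REAL_NAME is an invented placeholder.

```python
# Illustrative weights; NO_REAL_NAME's value is a made-up placeholder.
WEIGHTS = {
    "FI_CHEAPOEM": 4.5,
    "NO_REAL_NAME": 2.5,
    "FI_ISMU": -2.0,
}


def spam_score(hits, weights):
    """Sum the weights of the phenomena found; duplicates count only once."""
    return sum(weights[name] for name in set(hits))


def is_spam(hits, weights, required=7.0):
    """Mail is marked as spam when the score reaches the threshold."""
    return spam_score(hits, weights) >= required
```

With these weights, a mail hitting FI_CHEAPOEM twice and NO_REAL_NAME once scores 7.0 and is marked as spam; if FI_ISMU also matches, the score drops to 5.0 and the mail passes on to the next stage.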

In mail that was processed by SpamAssassin you will find an X-Spam-Status header from which you can read the results of the analysis.
X-Spam-Status: Yes, score=14.3 required=7.0 tests=FI_NOTFROMFI,
	NO_REAL_NAME,UNDISC_RECIPS autolearn=disabled version=3.1.9
This header says that:
  • the mail has been marked as spam
  • its score is 14.3
  • the phenomena that contributed to the score are listed in the tests field
The meaning of the individual phenomena is usually irrelevant, but you can look it up in the X-Spam-Report header that SpamAssassin adds to every mail it considers spam.
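If you want to process this header in a script, the following Python sketch shows one way to pull the verdict, the score and the list of tests out of an X-Spam-Status value, including the folded continuation line. The function name is hypothetical, and the sketch assumes the field layout shown in the example above.

```python
import re

# The header value from the example above, including the folded second line.
STATUS = (
    "Yes, score=14.3 required=7.0 tests=FI_NOTFROMFI,\n"
    "\tNO_REAL_NAME,UNDISC_RECIPS autolearn=disabled version=3.1.9"
)


def parse_spam_status(value):
    """Parse an X-Spam-Status value into verdict, score, threshold and tests."""
    match = re.match(r"\s*(Yes|No),\s*(.*)", value, re.DOTALL)
    if match is None:
        raise ValueError("unrecognized X-Spam-Status value")
    # Folding may insert whitespace after commas inside the tests list.
    rest = re.sub(r",\s+", ",", match.group(2))
    fields = dict(item.split("=", 1) for item in rest.split() if "=" in item)
    return {
        "is_spam": match.group(1) == "Yes",
        "score": float(fields["score"]),
        "required": float(fields["required"]),
        "tests": [name for name in fields.get("tests", "").split(",") if name],
    }
```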

You can configure SpamAssassin's behavior to some extent in the previously mentioned file ~/.spamassassin/user_prefs (the same access permission requirements apply). The following sections describe how to use the file for three basic configuration changes; details about them and about other possible modifications are given by the command man Mail::SpamAssassin::Conf .

Change the minimum spam score

By default, any mail with a score equal to or greater than 7 is marked as spam. To change this threshold, insert the following line (replace X with the desired value):
required_hits X
Raising the threshold is relatively safe; the only likely effect is that more spam goes unrecognized. Lowering the threshold is not recommended.

Defining your own phenomena

Most often, you will want to identify certain text strings or patterns in the mail that say something about whether the whole mail is spam or harmless. In SpamAssassin, these patterns are defined using Perl regular expressions. To introduce a phenomenon called RULE (with verbal description DESC and score X) that searches the header HDR for a string matching the expression EXP, insert the following lines into the configuration file:
header    RULE  HDR =~ EXP
describe  RULE  DESC
score     RULE  X
For example:
header    FI_CHEAPOEM  Subject =~ /cheap\s+oem\s+soft/i
describe  FI_CHEAPOEM  Cheap OEM Soft..
score     FI_CHEAPOEM  4.5
To introduce a phenomenon called RULE (with verbal description DESC and score X) that searches the body of the mail for a string matching the expression EXP, insert the following lines into the configuration file:
body      RULE  EXP
describe  RULE  DESC
score     RULE  X
For example:
body      FI_ISMU  /IS\s*MU/
describe  FI_ISMU  IS MU
score     FI_ISMU  -2
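Before adding a pattern to the configuration file, it can help to try it on a few sample strings. The two example patterns above are simple enough to behave the same way in Python's re module, so a quick check could look like this sketch (this is an assumption that holds for these two patterns, not for Perl regular expressions in general; the /i modifier corresponds to re.IGNORECASE):

```python
import re

# Python equivalents of the two example patterns above.
CHEAP_OEM = re.compile(r"cheap\s+oem\s+soft", re.IGNORECASE)
IS_MU = re.compile(r"IS\s*MU")


def matches(pattern, text):
    """Return True if the pattern occurs anywhere in the text."""
    return pattern.search(text) is not None
```

Note that the second pattern also matches inside longer words: the text "THIS MUST" contains "IS MU", so a rule like FI_ISMU would fire on it. Trying patterns on such strings first helps to avoid unintended matches.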

Change the score of a globally defined phenomenon

It is possible to change (with effect only for your own mail) the score of phenomena that you did not introduce yourself. The implementation is intuitive: adding the following line to the configuration file changes the score of the phenomenon RULE to X:
score  RULE  X


dSpam

The Bayesian statistical filter dSpam also recognizes phenomena (text substrings) in mail, assigns each of them a certain score and, considering all the phenomena found, determines whether or not the mail is spam. However, its behavior differs from SpamAssassin's.

The set of phenomena and their scores change on their own, without explicit intervention in the configuration by the administrator or the user. The filter defines and modifies phenomena and scores in the so-called learning (training) phase, based on examining e-mails whose category (spam or non-spam) is explicitly given in advance. In other words: in the learning phase, the filter is shown e-mails that are known to be spam, and the filter (using Bayes' formula, hence the type of filter) adjusts its idea of what the spam it is supposed to recognize looks like. The same applies to non-spam. Learning takes place constantly (every day the filter is retrained on the order of dozens of new training e-mails), and the filter forgets only slowly what it has learned before. This creates a very sensitive detection mechanism that is tailored to the specific environment, that is, to the specific ideas of the administrator and users about what spam looks like and what non-spam looks like. In addition, this mechanism evolves over time and so keeps up with the ever-changing features of spam.
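The learning idea can be illustrated with a deliberately tiny Bayesian filter in Python. This is a generic textbook sketch, not dSpam's actual algorithm: the class name, the whitespace tokenizer, the add-one smoothing and the way per-token probabilities are combined are all assumptions made for illustration.

```python
from collections import Counter
from math import prod


class TinyBayes:
    """A toy Bayesian filter trained on messages with a known category."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.spam_messages = 0
        self.ham_messages = 0

    def learn(self, text, is_spam):
        """Training phase: count in how many messages each token occurs."""
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_messages += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_messages += 1

    def token_probability(self, token):
        """Estimate P(spam | token) from smoothed per-category frequencies."""
        in_spam = (self.spam_tokens[token] + 1) / (self.spam_messages + 2)
        in_ham = (self.ham_tokens[token] + 1) / (self.ham_messages + 2)
        return in_spam / (in_spam + in_ham)

    def spam_probability(self, text):
        """Combine the token probabilities, treating tokens as independent."""
        probabilities = [self.token_probability(t) for t in set(text.lower().split())]
        spam = prod(probabilities)
        ham = prod(1.0 - p for p in probabilities)
        return spam / (spam + ham)
```

Each call to learn shifts the token probabilities a little, which is why regular training on fresh spam and non-spam lets such a filter follow changes in what spam looks like. A token the filter has never seen gets probability 0.5 and therefore does not influence the verdict.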

When dSpam was first deployed at FI, it was trained on a non-trivial set of spam and non-spam that the administrator had accumulated over the preceding months. Since then, dSpam has been learning mostly automatically. The sources of training spam are:

  • e-mails coming to a certain e-mail address (a so-called honeypot) in the FI domain that does not belong to any user and that is published conspicuously on the Internet so that it gets picked up by spam producers (represented by programs that are usually not very sophisticated)
  • emails coming to

The source of training non-spam is e-mails coming to .

Every e-mail processed by dSpam is given additional headers that state the verdict of dSpam's evaluation as well as the reasoning behind it. The key headers are X-DSPAM-Result and X-DSPAM-Factors . For example:

X-DSPAM-Result: Spam
X-DSPAM-Factors: 15,
	liable+for, 0.00448,
	liable, 0.00673,
	shall+not, 0.00738,
	Offers+e, 0.99000,
	Offers+Microsoft, 0.99000,
	MSN+shall, 0.99000,
	mail+communications, 0.99000,
	WA+98052, 0.99000,
	target="_blank">More+Newsletters, 0.99000,
	This+shall, 0.99000,
	Feature+Offers, 0.99000,
	not+unsubscribe, 0.99000,
	content+nor, 0.99000,
	©2008, 0.99000,
	Newsletters+|, 0.99000
Mail with these headers:
  • matched neither the whitelist nor the blacklist, and SpamAssassin did not label it as spam
  • was marked as spam by dSpam
  • received this verdict because of the text patterns listed in the header X-DSPAM-Factors

The text patterns need no description here: they state directly what was found suspicious or harmless in the mail. Patterns with a number greater than 0.5 in the header push the mail toward the spam side; the other patterns push it toward the non-spam side. The closer the number is to 1 or to 0, the more informative the pattern is. In practical terms, the example shows that dSpam currently regards the string "©2008" in the body or headers, or the word "Feature" separated by whitespace from the word "Offers", as a strong sign of spam. Conversely, the appearance of the word "liable" strongly suggests regular mail.
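The reading of X-DSPAM-Factors described above can be sketched in Python. The sample value is a shortened excerpt of the header shown earlier (four factors instead of fifteen), the function names are hypothetical, and the parser assumes that the patterns themselves contain no commas.

```python
# Shortened excerpt of the example header: count, then pattern/probability pairs.
FACTORS = (
    "4,\n"
    "\tliable+for, 0.00448,\n"
    "\tliable, 0.00673,\n"
    "\tFeature+Offers, 0.99000,\n"
    "\t©2008, 0.99000"
)


def parse_dspam_factors(value):
    """Return a list of (pattern, probability) pairs from the header value."""
    parts = [part.strip() for part in value.split(",")]
    count = int(parts[0])
    factors = [(token, float(p)) for token, p in zip(parts[1::2], parts[2::2])]
    if len(factors) != count:
        raise ValueError("factor count does not match the header")
    return factors


def split_by_side(factors):
    """Separate patterns that point to spam (> 0.5) from the others."""
    spam_side = [token for token, p in factors if p > 0.5]
    ham_side = [token for token, p in factors if p <= 0.5]
    return spam_side, ham_side
```

For the excerpt above, the spam side contains "Feature+Offers" and "©2008", while "liable+for" and "liable" fall on the non-spam side.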

Disable or restrict anti-spam scanning

Coarser filter configuration can be done using a simple application. Its operation is intuitive; its capabilities include:

  • specifying the mailbox where dSpam and/or SpamAssassin will put spam
  • turning off one or both filters completely
  • activating notifications of received spam
  • controlling the duplicate-mail filter
Settings made by this application are saved in the file ~/.procmail.setup .

Important notes on FI's anti-spam infrastructure

It is precisely the training addresses mentioned above that serve to correct the Bayesian filter when it has made a mistake. A misclassified message (a false-negative spam or a false-positive regular message) must be sent to one of these addresses not by forwarding but by redirecting (bouncing) it: in mutt use the "b" key; in Thunderbird this is made available by the mailredirect plugin. To recap: false negatives are bounced to the spam address, false positives to the non-spam address. By alerting the filter to its errors, you help improve anti-spam checking across the whole faculty.

It follows that these addresses can be misused, by mistake or intentionally, to disorient the filter by sending nonsensical or deliberately misclassified mail (spam to the non-spam address, or non-spam to the spam address). To prevent this, the contents of these mailboxes are checked regularly by an administrator, who removes noise and messages that only a limited group of users might wish to reclassify. In the future, "user spam databases" are expected to be introduced: bouncing incorrectly classified mail to these addresses would then change the behavior of the filter not for all users, but only for the user to whom the mail was originally delivered.

Another consequence of the above is that bouncing an e-mail that was incorrectly marked as spam by SpamAssassin no longer seems to make sense, since only dSpam can learn in this way. That is only half true. dSpam does extract a little knowledge even from an e-mail that never reached it (because, as has been said, if SpamAssassin decides that an e-mail is spam, it is not presented to dSpam at all). But even apart from this little bit of knowledge, bouncing such a message is still worthwhile, because the administrator learns of possible defects in the SpamAssassin configuration, which has to be edited manually (SpamAssassin cannot learn automatically). Such a message points to a defect in the SpamAssassin configuration that should be removed globally. It should be noted, however, that the administrator must be very restrained in adjusting the global configuration, since what suits one user may not suit another. Keep in mind that you can also influence SpamAssassin through your local configuration.

Whitelisting of whole domains, e.g. whitelist_from * , whitelist_from * etc., deserves a special comment. Such a local configuration change is dangerous in that it can let into your regular mailbox spam that would otherwise have been correctly recognized and that certainly was not sent by any real person from the given domains. It is quite common for spam to disguise its origin by giving some known Internet address as its sender, and this includes addresses in the whitelisted domain or any other. Besides partially concealing its origin, spam thereby gets around exactly those bold whitelists that cover entire domains rather than individual addresses. In practice, introducing such a whitelist leads both to the user's frustration with a seemingly malfunctioning anti-spam system and to perpetual bouncing of "whitelist spam" to the spam training address in the hope that this will solve the problem. However, the only possible solution is to remove the problematic whitelist entry.