This page is machine-translated to make it more accessible to English-speaking readers.


The FI and MU mail systems provide several services that limit the spread of viruses through incoming mail and detect, at least partially, unsolicited (usually commercial) mail, i.e. spam. This page describes how the system works overall and what users can configure. Note that checking takes place at two levels: first on the university mail server, then on the faculty mail server.

This protection is not and cannot be 100% reliable. The usual rule still applies: do not open attachments received from unknown senders (or from senders who seem familiar at first sight, if the message looks suspicious)!

A serious threat that accompanies any mail filtering is a false sense of security. Expect that some viruses will be detected, but not necessarily all of them!

An essential part of anti-virus protection for an MS Windows computer is a good antivirus program installed on that computer, with a virus database that is updated regularly (ideally about once a day). For FI-owned computers, antivirus software licenses are purchased (please contact ). Students can use one of the freely available antivirus programs on their own computers, such as avast!.

If your mail is redirected using a .forward file, no anti-spam check takes place. If you want your mail to be both forwarded and checked, use the procmail program.

Checking incoming mail at the university level

Before entering the university network, each incoming mail is scanned by an antivirus program on the server and passes through a robust anti-spam check (chiefly greylisting). A positive finding in either of these two tests results in the mail being dropped immediately and irrevocably, without the intended recipient being notified. If both tests are negative, the mail is forwarded into the university network with, among others, the following headers added:
  • X-Muni-Spam-TestIP - the IP address of the server from which the mail was received;
  • X-Muni-Envelope-From - the e-mail address of the sender as given in the mail envelope.
If handled with care, this information can be used for additional user-level filtering before the mail is delivered to the mailbox (e.g. with procmail).
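
As an illustration of reading these two headers, here is a minimal Python sketch. The sample message, its IP address and its e-mail addresses are invented for the example, and the helper name university_headers is hypothetical; only the two header names come from the text above.

```python
from email import message_from_string

# Invented sample message; only the two X-Muni-* header names come from this page.
RAW = """\
X-Muni-Spam-TestIP: 192.0.2.10
X-Muni-Envelope-From: sender@example.org
From: Someone <someone@example.com>
Subject: Hello

Body text.
"""

def university_headers(raw_message):
    """Return the headers added by the university-level check (None if absent)."""
    msg = message_from_string(raw_message)
    return {
        "test_ip": msg.get("X-Muni-Spam-TestIP"),
        "envelope_from": msg.get("X-Muni-Envelope-From"),
    }

info = university_headers(RAW)
print(info)
```

A user-level filter could, for instance, compare the envelope sender with the From header, but what to do with the values is left open here.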

Checking incoming mail at the faculty level

Mail originating from the faculty's internal network does not undergo any anti-spam check. Mail that originates outside and is delivered to the destination mailbox by Anxur or Aisa (see details ; note, however, that forwarding via procmail also counts as delivery to the destination mailbox) passes by default through the following anti-spam phases, in this order:
  1. whitelist / blacklist
  2. SpamAssassin
  3. dSpam

Whitelisting / blacklisting

A whitelist is, in the simplest sense, a list of addresses whose incoming mail must never be marked as spam, without exception and under all circumstances. A whitelist may also contain patterns with the special character '*', which stands for any number (including zero) of arbitrary characters. Therefore, "" will whitelist all addresses in the domain . There are two levels of whitelist: global (maintained by CVT, valid for the mail of all users on Anxur or Aisa) and user (valid for the incoming mail of a single user). At FI, the whitelist is implemented by SpamAssassin (see below). You can define your own whitelist by adding any number of lines of the following form to the file ~/.spamassassin/user_prefs:
whitelist_from WhitelistedAddress
For this configuration file to be taken into account, you must:
  • grant the "x" permission for others on your home directory and on the directory ~/.spamassassin
  • grant the "r" permission for others on the file ~/.spamassassin/user_prefs
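
The two permission requirements above can be sketched in Python as follows. This is only an illustration under assumptions: it works on a throwaway temporary directory that stands in for the home directory, the helper names are hypothetical, and on the real system you would apply the same bits to your actual home directory, ~/.spamassassin and user_prefs.

```python
import os
import stat
import tempfile

def grant_for_others(path, bit):
    """Add one permission bit for 'others' to path, keeping all existing bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | bit)

def others_have(path, bit):
    """Check whether 'others' hold the given permission bit on path."""
    return bool(os.stat(path).st_mode & bit)

# Demo on a temporary directory standing in for the home directory.
with tempfile.TemporaryDirectory() as home:
    sa_dir = os.path.join(home, ".spamassassin")
    prefs = os.path.join(sa_dir, "user_prefs")
    os.mkdir(sa_dir, 0o700)
    with open(prefs, "w") as f:
        f.write("# user preferences\n")
    os.chmod(prefs, 0o600)

    grant_for_others(home, stat.S_IXOTH)    # "x" for others on the home directory
    grant_for_others(sa_dir, stat.S_IXOTH)  # "x" for others on ~/.spamassassin
    grant_for_others(prefs, stat.S_IROTH)   # "r" for others on user_prefs

    ok = (others_have(home, stat.S_IXOTH)
          and others_have(sa_dir, stat.S_IXOTH)
          and others_have(prefs, stat.S_IROTH))

print(ok)
```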
Similarly, you can bypass the anti-spam check by whitelisting keywords in the Subject mail header. Any mail whose subject contains one of the keywords as a substring also skips the anti-spam check and is always delivered normally. The keywords are listed in the same file as the whitelisted addresses (the same access rights apply), but each line has the following format:
whitelist_subject WhitelistedString
The blacklist (by sender and by subject) is the dual of the whitelist: a matching mail skips the checks and is always marked as spam. The blacklist is configured in the same way as the whitelist; just write blacklist_from instead of whitelist_from and blacklist_subject instead of whitelist_subject.
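
To make the meaning of '*' concrete, here is a minimal Python sketch of the matching described above. It is not SpamAssassin's own code: the function names and the example.org / example.com addresses are invented, and case-insensitive matching is an assumption of this sketch.

```python
import re

def pattern_to_regex(pattern):
    """Translate a list entry into a regex: '*' means any number (even zero) of characters."""
    parts = [re.escape(part) for part in pattern.split("*")]
    # Assumption of this sketch: addresses are compared case-insensitively.
    return re.compile(".*".join(parts), re.IGNORECASE)

def is_listed(address, patterns):
    """True if the whole address matches at least one pattern of the list."""
    return any(pattern_to_regex(p).fullmatch(address) for p in patterns)

# Hypothetical user whitelist: one whole domain and one single address.
whitelist = ["*@example.org", "friend@example.com"]

print(is_listed("anyone@example.org", whitelist))
print(is_listed("stranger@example.com", whitelist))
```

The same function applies unchanged to a blacklist; only the consequence of a match differs.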


SpamAssassin

This filter performs a heuristic analysis. It defines a fixed set of phenomena (rules; mostly the presence of a word or phrase) whose occurrence it looks for in each mail. The set is fixed: it can only be changed manually, either by the administrator (with global validity) or by the user (valid for that user's address only). Each phenomenon is assigned a weight (a real number) that expresses its significance. Positive weights belong to phenomena characteristic of spam, negative weights to phenomena typical of regular mail. The sum of the weights of all phenomena found in the mail (repeated occurrences of the same phenomenon are counted only once) is called the score. If the score is greater than or equal to 7, the mail is marked as spam and does not enter the dSpam check at all.
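
The scoring described in the paragraph above can be sketched in a few lines of Python. The rule names and weights below are invented for the illustration; only the threshold of 7 and the rule that duplicate occurrences are counted once come from the text.

```python
THRESHOLD = 7.0  # default limit stated on this page

def spam_score(weights, hits):
    """Sum the weights of the phenomena found; each phenomenon counts once."""
    return sum(weights[name] for name in set(hits))

def is_spam(score, threshold=THRESHOLD):
    """A mail is marked as spam when its score reaches the threshold."""
    return score >= threshold

# Hypothetical rule weights and the phenomena found in one mail.
weights = {"RULE_A": 4.5, "RULE_B": 3.0, "RULE_C": -2.0}
hits = ["RULE_A", "RULE_B", "RULE_A"]  # RULE_A was found twice

score = spam_score(weights, hits)
print(score, is_spam(score))
```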

In mail that was processed by SpamAssassin you will find an X-Spam-Status header, from which the result of the analysis can be read. For example:
X-Spam-Status: Yes, score=14.3 required=7.0 tests=FI_NOTFROMFI,
	NO_REAL_NAME,UNDISC_RECIPS autolearn=disabled version=3.1.9
From this header we can read that:
  • the mail was marked as spam
  • its score is 14.3
  • the phenomena that contributed to the score are listed in the tests field
The meaning of the individual phenomena is usually unimportant, but you can look it up in the X-Spam-Report header, which SpamAssassin inserts into every mail it considers to be spam.
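
If you want to read the X-Spam-Status header in a script, the following Python sketch shows one possible way. The function name parse_spam_status is hypothetical, and the parsing assumes the layout of the example header shown above (a verdict followed by key=value fields, with the tests list folded after a comma).

```python
import re

# Value of the X-Spam-Status header from the example on this page.
HEADER = ("Yes, score=14.3 required=7.0 tests=FI_NOTFROMFI,\n"
          "\tNO_REAL_NAME,UNDISC_RECIPS autolearn=disabled version=3.1.9")

def parse_spam_status(value):
    """Split an X-Spam-Status value into verdict, score, threshold and tests."""
    flat = re.sub(r"\s+", " ", value.strip())  # undo header folding
    flat = re.sub(r",\s+", ",", flat)          # rejoin the comma-separated tests
    verdict = flat.split(",", 1)[0]
    fields = dict(re.findall(r"(\w+)=(\S+)", flat))
    return {
        "is_spam": verdict.lower() == "yes",
        "score": float(fields["score"]),
        "required": float(fields["required"]),
        "tests": fields["tests"].split(","),
    }

status = parse_spam_status(HEADER)
print(status)
```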

You can configure SpamAssassin's behavior to a certain extent in the previously mentioned file ~/.spamassassin/user_prefs (the access-right requirements described above apply). The following paragraphs describe three basic uses of this file; details of these and other possible settings are given by the command man Mail::SpamAssassin::Conf.

Change the minimum spam score

By default, any mail with a score of 7 or more is marked as spam. If you want to change this limit, add the following line (replace X with the desired real value):
required_hits X
Raising the limit is relatively safe, although the amount of spam that goes unrecognized will probably increase. Lowering the limit is not recommended.

Defining your own phenomena

It is often desirable to identify certain text strings or patterns in the mail that say something about whether the whole mail is spam or harmless. In SpamAssassin, these patterns are defined using Perl regular expressions. To introduce a phenomenon called RULE (with a verbal description DESC and score X) that searches for a string matching the expression EXP in the mail header HDR, add the following lines to the configuration file:
header    RULE  HDR =~ EXP
describe  RULE  DESC
score     RULE  X
For example:
header    FI_CHEAPOEM  Subject =~ /cheap\s+oem\s+soft/i
describe  FI_CHEAPOEM  Cheap OEM Soft..
score     FI_CHEAPOEM  4.5
To introduce a phenomenon called RULE (with a verbal description DESC and score X) that searches for a string matching the expression EXP in the body of the mail, add these lines to the configuration file:
body      RULE  EXP
describe  RULE  DESC
score     RULE  X
For example:
body      FI_ISMU  /IS\s*MU/
describe  FI_ISMU  IS MU
score     FI_ISMU  -2
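
Before adding a pattern to the configuration file, it can help to try it on a few sample strings. The two expressions above use only constructs that behave the same in Perl and in Python's re module, so a quick check can be sketched in Python; the sample strings are invented, and this does not reproduce how SpamAssassin preprocesses a mail before matching.

```python
import re

# The two patterns from the examples above; /.../i becomes re.IGNORECASE.
SUBJECT_RULE = re.compile(r"cheap\s+oem\s+soft", re.IGNORECASE)
BODY_RULE = re.compile(r"IS\s*MU")

samples = [
    "Get CHEAP  OEM software now",   # matches the Subject pattern
    "Log in to IS MU before Friday", # matches the body pattern
    "THIS MUST be read",             # also matches the body pattern
    "is mu",                         # no match: the body pattern is case-sensitive
]

for text in samples:
    print(repr(text), bool(SUBJECT_RULE.search(text)), bool(BODY_RULE.search(text)))
```

Note that the body pattern has no word boundaries, so it also matches inside "THIS MUST"; whether that matters depends on the mail you receive.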

Change the score of a globally defined phenomenon

You can also change (with validity for your mail only) the scores of phenomena that you did not introduce yourself. Doing so is straightforward: to change the score of RULE to X, add the following line to the configuration file:
score  RULE  X


dSpam

The statistical Bayesian filter dSpam also recognizes text substrings (phenomena) in the mail, assigns each of them a certain score and, after considering all the phenomena found, decides whether the examined mail is spam or not. Its behavior, however, differs from SpamAssassin's.

The set of phenomena and their scores changes by itself, without any explicit intervention in the configuration by the administrator or the user. The filter defines and modifies phenomena and scores in a so-called learning phase, based on examining mails that are explicitly labeled as spam or non-spam. In other words: in the learning phase, the filter is presented with mails that are known to be spam, and the filter itself (using Bayes' formula, hence the type of filter) adjusts its idea of what spam roughly looks like, so that it can recognize it in the real filtered mail. The same applies to non-spam (ham). Learning is continuous (every day the filter is retrained on dozens of new learning mails), and the filter slowly forgets what it learned earlier. This creates a very sensitive detection mechanism that is tailored to a specific environment, that is, to the ideas of particular users and administrators about what spam does and does not look like. In addition, the mechanism evolves over time and so keeps up with the ever-changing features of spam.
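
The learning idea can be illustrated with a toy Bayesian filter in Python. This is only a sketch of the general technique, not dSpam's actual algorithm: the class TinyBayes, the whitespace tokenization, the equal priors, the smoothing constants and the training texts are all assumptions made for the example.

```python
from collections import Counter

class TinyBayes:
    """Toy Bayesian filter: learns token statistics from labeled mails."""

    def __init__(self):
        self.spam = Counter()  # number of spam mails containing each token
        self.ham = Counter()   # number of regular mails containing each token
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, text, is_spam):
        tokens = set(text.lower().split())
        if is_spam:
            self.spam.update(tokens)
            self.n_spam += 1
        else:
            self.ham.update(tokens)
            self.n_ham += 1

    def token_probability(self, token):
        """P(spam | token) by Bayes' formula, equal priors, add-one smoothing."""
        p_token_spam = (self.spam[token] + 1) / (self.n_spam + 2)
        p_token_ham = (self.ham[token] + 1) / (self.n_ham + 2)
        return p_token_spam / (p_token_spam + p_token_ham)

    def spam_probability(self, text):
        """Combine the token probabilities, treating tokens as independent."""
        p_spam, p_ham = 1.0, 1.0
        for token in set(text.lower().split()):
            p = self.token_probability(token)
            p_spam *= p
            p_ham *= 1.0 - p
        return p_spam / (p_spam + p_ham)

f = TinyBayes()
f.learn("cheap oem software offers", is_spam=True)
f.learn("cheap pills offers", is_spam=True)
f.learn("seminar schedule for monday", is_spam=False)
f.learn("thesis review schedule", is_spam=False)

print(f.token_probability("cheap"), f.token_probability("schedule"))
print(f.spam_probability("cheap offers today"))
```

Values above 0.5 point toward spam and values below 0.5 toward regular mail; a token the filter has never seen gets exactly 0.5 in this sketch.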

When dSpam was first installed at FI, it was trained on a non-trivial set of spam and regular mail (ham) that the administrator had collected over the preceding months. Since then, dSpam has been learning mostly automatically. The sources of spam for learning are:

  • mails arriving at an FI address that belongs to no user and that is conspicuously published on the Internet so that spam producers pick it up (they are represented by programs that normally lack the intelligence to detect honeypots)
  • mails coming to address

The source of regular mail (ham) for learning is mail coming to the address .

Each mail processed by dSpam is enriched with headers that carry dSpam's verdict as well as the justification for it. The key headers are X-DSPAM-Result and X-DSPAM-Factors. For example:

X-DSPAM-Result: Spam
X-DSPAM-Factors: 15,
	liable+for, 0.00448,
	liable, 0.00673,
	shall+not, 0.00738,
	Offers+e, 0.99000,
	Offers+Microsoft, 0.99000,
	MSN+shall, 0.99000,
	mail+communications, 0.99000,
	WA+98052, 0.99000,
	target="_blank">More+Newsletters, 0.99000,
	This+shall, 0.99000,
	Feature+Offers, 0.99000,
	not+unsubscribe, 0.99000,
	content+nor, 0.99000,
	©2008, 0.99000,
	Newsletters+|, 0.99000
A mail with these headers:
  • matched neither the whitelist nor the blacklist, and SpamAssassin did not mark it as spam
  • was marked as spam by dSpam
  • received this verdict because of the text patterns listed in the X-DSPAM-Factors header

The text patterns need no description here: they state directly what was found in the mail and judged suspicious or harmless. Patterns with a number greater than 0.5 in the header tilt the mail toward spam; the other patterns tilt it the opposite way. The larger the number, the more strongly the pattern points toward spam. In practical terms, this example shows that dSpam currently considers it a strong sign of spam when "©2008", or "Feature" separated by white space from "Offers", appears in the body or headers. On the contrary, the occurrence of the word "liable" strongly suggests regular mail.
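
The reading of X-DSPAM-Factors described above can be sketched in Python. The function name parse_factors is hypothetical, and the parser assumes the layout of the example header (a count on the first line, then one "pattern, number" pair per line).

```python
# Value of the X-DSPAM-Factors header from the example on this page.
FACTORS = """15,
    liable+for, 0.00448,
    liable, 0.00673,
    shall+not, 0.00738,
    Offers+e, 0.99000,
    Offers+Microsoft, 0.99000,
    MSN+shall, 0.99000,
    mail+communications, 0.99000,
    WA+98052, 0.99000,
    target="_blank">More+Newsletters, 0.99000,
    This+shall, 0.99000,
    Feature+Offers, 0.99000,
    not+unsubscribe, 0.99000,
    content+nor, 0.99000,
    ©2008, 0.99000,
    Newsletters+|, 0.99000"""

def parse_factors(value):
    """Return the declared count and a list of (pattern, number) pairs."""
    lines = value.splitlines()
    count = int(lines[0].strip().rstrip(","))
    factors = []
    for line in lines[1:]:
        line = line.strip().rstrip(",")
        if not line:
            continue
        # Split at the last comma, so commas inside a pattern would survive.
        pattern, number = line.rsplit(",", 1)
        factors.append((pattern.strip(), float(number)))
    return count, factors

count, factors = parse_factors(FACTORS)
spam_side = [p for p, n in factors if n > 0.5]
ham_side = [p for p, n in factors if n <= 0.5]
print(count, len(spam_side), ham_side)
```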

Turn off or limit anti-spam control

The coarsest filter configuration can be done with a simple application . Its use is intuitive; its capabilities include:

  • specifying the mailbox into which dSpam and/or SpamAssassin will put spam
  • completely switching off one or both filters
  • activating spam notification
  • controlling how the filters handle duplicate mails
The settings made by this application are saved in the file ~/.procmail.setup.

Important notes on the FI anti-spam infrastructure

The addresses and serve to teach the Bayesian filter when it has made a mistake. An incorrectly classified mail (a false negative or a false positive) must not be forwarded to one of those addresses but redirected (redirect, bounce; in mutt the "b" key, in Thunderbird this is provided by the mailredirect plug-in). To recap: bounce false negatives to , and false positives to . By reporting the filter's mistakes, you help improve the anti-spam check for everyone.

It follows from the above that these addresses can be abused, by mistake or on purpose, to mislead the filter by sending it meaningless or deliberately misclassified mail. To prevent this, their contents are regularly checked by an administrator, who removes noise as well as mails that only a limited group of users would want reclassified. In the future, "user spam databases" are expected to be introduced: bouncing an incorrectly classified mail to these addresses will then change the filter's behavior not for all users, but only for the user who originally received the mail in question.

Another apparent consequence is that it makes no sense to bounce mail that SpamAssassin wrongly marked as spam to , because only dSpam can learn from it. This is only half true. dSpam does extract some knowledge from a mail that was never handed to it before (once SpamAssassin decides that a mail is spam, dSpam is not consulted at all). But even if we ignore this bit of knowledge, bouncing is still worthwhile, because the administrator thereby learns about possible defects in the SpamAssassin configuration, which has to be modified manually (SpamAssassin cannot learn automatically). In this way you can draw attention to a defect in the SpamAssassin configuration that should be fixed globally. Note, however, that the administrator must be very cautious about global configuration changes, since what suits one user may not suit others. Keep in mind that you can also influence SpamAssassin through your local configuration.

Whitelisting entire domains, whitelist_from * , whitelist_from * etc., deserves a special comment. Such a local configuration is dangerous in that it can let into your regular mailbox spam that would otherwise be correctly recognized and that certainly no real person from these domains sent. It is quite common for spam to camouflage its origin by declaring any well-known Internet address as its sender, including addresses in , or any other. Besides partially concealing its origin, spam thereby slips more easily through those bold whitelists that include entire domains rather than individual addresses. In practice, introducing such a whitelist leads to the user's disillusionment with the anti-spam system and to relentless bouncing of "whitelisted spam" to in the hope that the problem will be solved. The only possible solution, however, is to remove the problematic entry from the whitelist.