Network and Systems Services, Computing and Information Services, Texas A&M University

Greylisting incoming e-mail to TAMU

What is Greylisting?

Greylisting is a method that can be used until it is possible to verify senders. Its behavior relies on the standard behavior required of properly operating mail servers, notably (excerpted from RFC 2821):

  • 4.5.4.1 Sending Strategy
    [...] mail that cannot be transmitted immediately MUST be queued and periodically retried by the sender.
  • 4.2.1 Reply Code Severities and Theory
    [...]
    4yz Transient Negative Completion reply
    The command was not accepted, and the requested action did not occur. However, the error condition is temporary and the action may be requested again. [ ... ] A rule of thumb to determine whether a reply fits into the 4yz or the 5yz category (see below) is that replies are 4yz if they can be successful if repeated without any change in command form or in properties of the sender or receiver (that is, the command is repeated identically and the receiver does not put up a new implementation.)
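The reply-code severities quoted above can be illustrated with a small sketch. The function name `reply_severity` and the labels it returns are hypothetical, chosen only for this illustration; the mapping from the first digit to a severity class follows the RFC text.

```python
def reply_severity(code: int) -> str:
    """Classify an SMTP reply code by its first digit (illustrative labels)."""
    if not 200 <= code <= 599:
        raise ValueError(f"not a reply code handled by this sketch: {code}")
    labels = {
        2: "success",            # 2yz: positive completion
        3: "intermediate",       # 3yz: positive intermediate, more input expected
        4: "transient failure",  # 4yz: may succeed if repeated unchanged
        5: "permanent failure",  # 5yz: do not repeat as-is
    }
    return labels[code // 100]


print(reply_severity(451))  # a greylisting deferral falls in the 4yz class
```

Only the 4yz class obliges the sender to queue the message and try again later, which is the behavior greylisting depends on.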

So, the first time a particular off-campus host tries to send mail from a given sender address to a given tamu.edu address, we respond with a 400-level (4yz) reply telling the remote site to try again later. Once a configured timeout has passed, the next attempt by that mail server to send e-mail from that sender address to that tamu.edu address will succeed. Future deliveries will see no delay, because the system records that the message was properly queued and retried.
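The process described above can be sketched as a lookup keyed on the (sending host, sender address, recipient address) combination. This is a minimal illustration, not the software actually running on the TAMU servers: the class name `Greylist`, the 60-second delay, the in-memory storage, and the specific reply codes 451 and 250 are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Set, Tuple

# (sending host, envelope sender, envelope recipient)
Triplet = Tuple[str, str, str]


class Greylist:
    """Toy greylist: defer unknown triplets, accept them after a delay."""

    def __init__(self, min_delay: float = 60.0,
                 clock: Callable[[], float] = time.time) -> None:
        self.min_delay = min_delay        # assumed timeout, in seconds
        self.clock = clock                # injectable so it can be tested
        self.first_seen: Dict[Triplet, float] = {}
        self.passed: Set[Triplet] = set()

    def check(self, host: str, sender: str, recipient: str) -> int:
        """Return the SMTP reply code to give for this delivery attempt."""
        key = (host, sender.lower(), recipient.lower())
        if key in self.passed:
            return 250                    # already retried properly: no delay
        now = self.clock()
        first = self.first_seen.get(key)
        if first is None:
            self.first_seen[key] = now    # first attempt: remember and defer
            return 451
        if now - first < self.min_delay:
            return 451                    # retried too soon: defer again
        self.passed.add(key)              # proper retry: accept from now on
        return 250
```

In this sketch a retry that arrives before the timeout is deferred again without resetting the timer, so a sender that retries on any schedule eventually gets through.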

For a detailed description of the process and the initial proposal that was sent to the local system administrators, please see the following documents:


Known Issues

We have seen a number of hosts that fail to comply with the SMTP protocol in various ways.
Ignoring 400-level SMTP response
The most notable problem involves mailers that entirely ignore the 400-level response and incorrectly attempt to continue with the SMTP dialog. The following servers have corrected the problem in later versions and only need to be upgraded to handle this SMTP situation properly:
Long initial delay
Our server is configured with an extremely small delay for first-time messages (and zero delay for all subsequent messages). If the first message takes more than a few minutes (and in particular, more than an hour), that says more about the sending server and its configured retry delay than anything else. Since temporary problems can happen at any time on any server, the retry delay configured on the sending side indicates how much importance that side places on its e-mail being sent in a timely fashion. Asking your postmaster to shorten the retry interval would benefit all outgoing mail, not just mail to TAMU. The standard recommends at least 2 tries within the first hour.
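To see why the observed delay depends mostly on the sender, consider this small worked sketch. The function name `first_delivery_delay`, the 60-second greylist delay, and both retry schedules are hypothetical values picked for illustration; they are not the actual TAMU configuration.

```python
from typing import Optional, Sequence


def first_delivery_delay(retry_offsets: Sequence[float],
                         greylist_delay: float) -> Optional[float]:
    """Seconds until a greylisted first message gets through.

    retry_offsets are the sender's retry times, in seconds after the
    initial attempt (which is always deferred). The message is accepted
    at the first retry that arrives at or after the greylist delay.
    """
    for offset in sorted(retry_offsets):
        if offset >= greylist_delay:
            return offset
    return None  # the sender never retried late enough


# A sender that retries after 5, 15 and 30 minutes.
prompt_sender = [300, 900, 1800]
# A sender whose first retry comes only after 4 hours.
slow_sender = [4 * 3600]

print(first_delivery_delay(prompt_sender, greylist_delay=60))
print(first_delivery_delay(slow_sender, greylist_delay=60))
```

With the same short delay on the receiving side, the first sender is held up for five minutes and the second for four hours; the difference comes entirely from the sender's retry schedule.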

Other
Other servers do not have a well-identified fix to make them compliant. For those cases, the following two procedures exist:
  1. Manually perform the "resend" that your server should be doing automatically:
    1. If it has been a few minutes and the message has not arrived, just re-send your message once to the intended recipient. The recipient might receive 2 copies (one from each send), but you will never have to repeat that process: future messages will pass through without any delay (with only a single sending of the message).
  2. Request help in identifying the problem if the above does not work to immediately pass your e-mail:
    1. E-mail <helpdesk@tamu.edu> with the bounce message
    2. We will extract the information needed from that message
    3. We will verify the information (which can take quite a while if a bounce message is not provided to give us a starting point for the search) and determine whether or not the problem stemmed from a communication with our server.
    4. If feasible, we might assist the process from this end by manually updating the incomplete entries for which the sending server is not performing the retry needed for completion (though please try step 1 above first).

Those same problems on the sender side can cause other problems as well, though. Take, for instance, the following scenario:

  • a host uses LDAP for delivery routing and its LDAP is unavailable, or its spool filesystem fills
  • that host will respond to delivery attempts with a 400-level message (just like greylisting does)
  • being a 400-level message, the sending system MUST queue and retry the message, not bounce it
  • there is nothing the receiving side can do to "workaround" the situation until the root cause of the problem is corrected
Functionally, there is no difference between the above scenario and how greylisting behaves. The sending side therefore needs to correct its system so that it does not incorrectly fail delivery attempts in these other situations too, where no manual action on the receiving side can work around the broken server or configuration.
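The sender-side obligation in the scenario above can be sketched as a single decision. The function name `sender_action`, its return labels, and the five-day give-up default are assumptions for this illustration; the point is only that the decision does not depend on why the receiver answered with a 400-level code.

```python
def sender_action(reply_code: int, queued_seconds: float,
                  give_up_after: float = 5 * 24 * 3600) -> str:
    """Decide what a compliant sender does with the receiver's reply.

    queued_seconds is how long the message has been in the queue;
    give_up_after is an assumed overall limit before bouncing.
    """
    severity = reply_code // 100
    if severity == 2:
        return "delivered"
    if severity == 5:
        return "bounce"          # permanent failure: report to the sender
    if severity == 4:
        # Transient failure: keep the message queued and retry later,
        # whether the cause is greylisting, an LDAP outage or a full spool.
        return "retry" if queued_seconds < give_up_after else "bounce"
    raise ValueError(f"unexpected reply code: {reply_code}")


print(sender_action(451, queued_seconds=0))
```

A server that returns "bounce" on the first 4yz reply fails here in exactly the same way it fails against a greylisting receiver.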
