E-Mail problems: a detective story

A few weeks ago, the company I work for changed its internet provider. Faster access, lower price, you know the drill.

Problems soon appeared, namely DNS issues that took some time to solve, but in the end everything worked OK.

Then complaints started to arrive: people had sent e-mails to us that were never received or answered…

A quick check of the Postfix logs showed that those customers' servers were always being disconnected.

The SMTP transaction looked something like this:

connect from smtp.domain.com[X.Y.Z.L]
lost connection after DATA (0 bytes) from smtp.domain.com[X.Y.Z.L]
disconnect from smtp.domain.com[X.Y.Z.L]

So it looked like those customers' servers connected and then disconnected after a few seconds.
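To see which remote servers are affected, the log can be scanned for this pattern. The following is a minimal sketch, assuming the log lines have the same shape as the excerpt above; the function name and the sample data are illustrative only.

```python
import re
from collections import Counter

# Matches lines such as:
#   lost connection after DATA (0 bytes) from smtp.domain.com[X.Y.Z.L]
LOST_AFTER_DATA = re.compile(
    r"lost connection after DATA(?: \(\d+ bytes\))? "
    r"from (?P<host>\S+)\[(?P<addr>[^\]]+)\]"
)


def count_lost_after_data(lines):
    """Count 'lost connection after DATA' events per (host, address) pair."""
    counts = Counter()
    for line in lines:
        match = LOST_AFTER_DATA.search(line)
        if match:
            counts[(match.group("host"), match.group("addr"))] += 1
    return counts


# Sample lines copied from the excerpt above (placeholders, not real hosts).
sample = [
    "connect from smtp.domain.com[X.Y.Z.L]",
    "lost connection after DATA (0 bytes) from smtp.domain.com[X.Y.Z.L]",
    "disconnect from smtp.domain.com[X.Y.Z.L]",
]

for (host, addr), n in count_lost_after_data(sample).most_common():
    print(f"{host} [{addr}]: {n}")
```

To run it against a real log, pass an open file instead of the sample list. The log path depends on the system (on Ubuntu it is usually /var/log/mail.log, but check your syslog configuration).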

A quick search on Google (:-) ) showed that it might be a firewall issue, an MTU issue or a Postfix bug (!).

We checked and double-checked the firewall. It was OK, and anyway it wouldn’t explain why other servers had no problem connecting and transferring mail.

The MTU issue: OK, we had changed the internet provider, but all our firewall interfaces are on Ethernet. Anyway, we changed the MTU to 1492: ifconfig eth0 mtu 1492. It didn’t solve the problem. Back to square one.

Postfix bug: It had been working fine for the last 6 months, so very strange… We disabled PostGrey (the GLD Daemon). Not a greylisting issue. Bummer.

So we enabled the Postfix debugging features by adding the line debug_peer_list = smtp.domain.com (instructions here: http://www.postfix.org/DEBUG_README.html ) and waited for a connection.

Anyway, to cut a long story short, we found out that the whole SMTP transaction was OK until the DATA portion, where it looked like the other e-mail server disconnected. What we did see was that we had a long list of RBLs to check, and it was taking too long to check every one of them, so it looked like the other e-mail server disconnected after a time-out period, even during the email transfer process. I think that is a bug on the other peer, but the issue was ours…

So we cut our RBL list in half, keeping only the NJABL, Spamcop and Spamhaus lists, and the servers with problems immediately started to connect and transfer their queued messages to us.

Moral of the story: The longer the RBL list, the longer it takes to process incoming data. Some email servers will just “barf” at these long delays.
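To make the cost concrete: each RBL check is a DNS query for the client's IPv4 address with its octets reversed, followed by the list's zone. The sketch below builds those query names and computes a rough upper bound on the added delay. It assumes the lookups run one after another and that each one can take up to a fixed timeout; the 5-second timeout is a made-up figure for illustration, not a measured Postfix value.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name a DNSBL lookup resolves: reversed IPv4 octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def worst_case_delay(zones, timeout_per_lookup):
    """Upper bound on added delay if every lookup times out, run sequentially."""
    return len(zones) * timeout_per_lookup


# Zone names are illustrative; check each list's documentation for the current one.
zones = ["zen.spamhaus.org", "bl.spamcop.net"]
client_ip = "192.0.2.99"  # documentation address (RFC 5737), not a real sender

for zone in zones:
    print(dnsbl_query_name(client_ip, zone))

# Hypothetical 5-second timeout per lookup.
print(worst_case_delay(zones, timeout_per_lookup=5.0), "seconds worst case")
```

With a dozen lists instead of two, the same arithmetic gives a worst case of a minute, which is long enough for an impatient sending server to give up.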

Right now: Zero problems.

Shorewall firewall on Ubuntu 8.04 LTS server doesn’t start on boot

On one of my machines I have a pretty annoying situation: if the Ubuntu-based firewall reboots, the firewall doesn’t start automatically…

This is pretty annoying because it means that after a power failure, manual intervention is needed to restore external access through the firewall to the internal servers.

[EDIT] The problem is that the firewall is started from the init scripts and is running, but the rules for port forwarding are not active. You need to make sure that the following options are enabled in shorewall.conf:

STARTUP_ENABLED=Yes

IP_FORWARDING=On
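A quick way to verify both options after an upgrade or a config change is to parse the file and compare. This is a minimal sketch: it assumes simple KEY=VALUE lines with `#` comments, and the sample text is a hypothetical fragment, not a real shorewall.conf.

```python
def parse_shell_style_conf(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip().strip('"')
    return settings


# The two options named above, compared case-insensitively.
REQUIRED = {"STARTUP_ENABLED": "yes", "IP_FORWARDING": "on"}


def missing_settings(text):
    """Return the required options that are absent or set to another value."""
    settings = parse_shell_style_conf(text)
    return sorted(
        key for key, wanted in REQUIRED.items()
        if settings.get(key, "").lower() != wanted
    )


# Hypothetical fragment where the firewall is not enabled at startup.
sample = """
# fragment of a hypothetical shorewall.conf
STARTUP_ENABLED=No
IP_FORWARDING=On
"""

print(missing_settings(sample))
```

For a real check, read the file contents from disk first. The usual location is /etc/shorewall/shorewall.conf, but that path is an assumption about your installation.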

Ubuntu: The perfect mail gateway server

I’m setting up a new firewall and email gateway server using Ubuntu 8.04 LTS. As in a previous post, I’m following (not always by the book) the Howtoforge guide to setting up an email/spam gateway: Ubuntu mail gateway.

Three things didn’t go quite as expected, so here they are, engraved forever in the internet stone:

1st) General: In Webmin, under Others->Server and System Status, the Apache server monitor always reports Apache as down. The solution is to go to Servers->Apache Webserver and select the Module Config link at the top. At the bottom, the option Path to Apache PID file is set to Automatic. Change it to point to /var/run/apache2.pid and save. The Webmin monitor should now show the Apache status as OK.

2nd) Mail: I really like Mailscanner and its partner Mailwatch. One issue I had was that Mailwatch didn’t show any option to delete or release messages that were in quarantine. The cause was missing folder permissions: Mailwatch couldn’t access the quarantine directory. Running chown -R postfix:www-data /var/spool/MailScanner and chown -R postfix:www-data /var/lib/MailScanner/ did the trick, and I can now delete and release quarantined messages.

3rd) Mail relay: After setting up Postfix, all incoming messages were refused on the external interface because Postfix denied relaying. Please note that I’m using this server to receive mail from the internet, check that each message is safe (no virus, no spam, no phishing, and so on), and then forward it to our internal mail servers. So I have a relay_domains file that lists our domains, and a transport file that specifies where the “real” mail servers are, but even with this, Postfix was always refusing the mail.

The solution for this issue lies in the empty mydestination option. Setting this option to mydestination = hash:/etc/postfix/relay_domains and then stopping and restarting Postfix did the trick.

Regarding the Howtoforge manual, I skipped some of the steps, like FuzzyOCR, and removed Bind9 from the server.

As I progress in setting up and configuring the server, if anything is worth mentioning, I’ll post it here.

Linux mail gateway

For 4 years I’ve run a Mandrake-based firewall with Postfix and Mailscanner where I work. I really, really liked Mailscanner, but for my colleagues the setup was “too complicated”. So I moved to EFW, the Endian Firewall community edition. What it gains in ease of use it lacks in flexibility.

Finally my prayers were heard, and I’m going to move again to a custom-built, full-fledged mail gateway with Mailscanner. Check out: this howto.

Linux firewalls

Where I work, despite being a (small) Windows shop, nobody trusts ISA Server as a firewall… 🙂 so we have had Linux running non-stop as a firewall/proxy since 2003, with Postfix, Mailscanner, Spamassassin and iptables, and it has been doing a fine job.
So far so good, but I thought that after 5 years of non-stop service I should look for something easier to manage for my Linux-challenged colleagues 🙂

I looked at basically two solutions, IPCop and Endian Firewall:

IPCop: It is basically oriented toward the home user. Mail processing is done through an SMTP proxy that doesn’t look too solid. It’s also an add-on to the basic IPCop system.

Endian Firewall: It looks like it’s IPCop-based, but mail processing is done with Postfix, Amavisd and Spamassassin. It also scans mail with ClamAV.

Both solutions have web-based interfaces, traffic graphs, and almost no need to go into a shell. I prefer Mailscanner to Amavisd for mail filtering. In Mailscanner, blocked e-mails can be unblocked and delivered to the user without much trouble. In Amavisd you must feed them into the system again because the “blocked” format is raw, so if you really need that blocked email, the only way I know (so far) is to use Outlook Express to view and forward it.

Both systems lack basic tools like wget, nslookup, dig and whois that can help debug your internet connection. You need to add them after installing, and that can be quite a challenge.

Also, the clamd daemon doesn’t seem too solid. It has a habit of crashing without leaving any trace or information in the log files… In my original firewall system we used McAfee for Linux and it always worked, but we are also paying for it…

So until clamd started crashing constantly last week I had a good impression of EFW, but I’ll replace the virus scanner, using the command-line clamscan instead of the clamd daemon. The clamd people must sort out its instability issues as soon as possible. It’s not EFW’s fault.