Subject: pkgsrc 'postfix-stress' option (Re: PATCH version 2: Stress-dependent server personality)
Group: Postfix-users
From: Geert Hendrickx
Date: 14 Jan 2008


 

FWIW: I've included the postfix-2.4-stress-patch as an option in NetBSD's
pkgsrc/mail/postfix package. It's disabled by default but can be enabled
with option "postfix-stress".

Geert


On Tue, Sep 04, 2007 at 05:10:21PM -0400, Wietse Venema wrote:
> This is an update of yesterday's stress patch for Postfix 2.3 and
> later.
>
> - This version triggers a "postfix reload" for the affected network
> service, so that all servers switch to stress-aware mode. This
> change also helped to make a few code simplifications.
>
> - The logfile warning message now points to a non-existent document,
> http://www.postfix.org/STRESS_README.html, that will be based on
> information from this document and on feedback from the Postfix
> community. Planning for the future...
>
> Attached you will find a patch relative to yesterday's patch.
>
> The problem
> ===========
>
> Last week some ratware was causing trouble by connecting to SMTP
> servers and keeping server ports occupied for a long time.
>
> Symptoms:
>
> - Postfix logs ``service "smtp" (25) has reached its process limit''.
>
> - SMTP clients have to wait a long time before the server responds.
>
> - The maillog shows lots of "lost connection after CONNECT" messages.
>
> - netstat shows lots of SMTP connections in FIN_WAIT1/2 state.
>
> While Postfix will drop connections when a client hammers the server,
> until now it had no specific response against connections from a
> large number of different clients.
>
> Generic workarounds
> ===================
>
> As a first step, you can mitigate too many connections by specifying
> more smtpd(8) processes in master.cf (don't forget "postfix reload"),
> but you can do only so much without running out of memory, sockets,
> or something else that Postfix needs.
>
> When increasing the number of processes becomes impractical, you can
> try to make Postfix spend less time per SMTP client:
>
> - Eliminate useless or redundant RBL lookups (people often use
> multiple Spamhaus RBLs that include each other)
>
> - Reject non-existent recipients early:
>       smtpd_recipient_restrictions =
>           reject_unlisted_recipient
>           permit_mynetworks
>           reject_unauth_destination ...
>
> - Use "421" reply codes for botnet-related RBLs or for selected
> non-RBL restrictions. This causes Postfix 2.3 and later to
> disconnect immediately without waiting for QUIT.
>
> - Don't use before-queue content filters or body_checks.
>
> - Reduce smtpd_timeout and smtpd_hard_error_limit. This may
> interfere with legitimate mail.
>
> - Specify "smtpd_peername_lookup = no" (Postfix 2.3 and later).
> Beware, this is a desperate measure; it can save a lot of time,
> but it breaks all access controls that depend on client hostnames.
>
> The stress-mode workaround
> ==========================
>
> The idea is to change Postfix behavior under stress: terminate SMTP
> sessions sooner, so that Postfix can help more clients in the same
> amount of time, but do this only when really necessary. The stress
> patch provides a way to do that. It is simple enough that it can
> be adopted into the legacy and stable Postfix releases.
>
> The patch works as follows. When all SMTP server processes become
> busy, the Postfix master daemon logs a warning and requests that
> each running SMTP server process terminate as soon as its SMTP
> session ends. From this point on, Postfix creates SMTP server
> processes that have "-o stress=yes" on their command line, until
> the problematic condition has not happened for at least 1000 seconds.
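
[Illustration, not part of the patch.] The timer behaviour described above
can be sketched in a few lines of Python. This is a simplified model, not
the Postfix master daemon: the Service class, its spawn() and finish()
methods, and the restarts counter are hypothetical names. Only the
1000-second window and the "yes" / empty stress value come from the
message. The sketch assumes the limit counts as reached when forking one
more process would bring the total up to max_proc.

```python
STRESS_WINDOW = 1000  # seconds; the value quoted in the message


class Service:
    """Hypothetical model of one network service managed by the master."""

    def __init__(self, max_proc):
        self.max_proc = max_proc
        self.total_proc = 0
        self.stress_expire_time = 0
        self.restarts = 0  # stands in for asking running servers to exit

    def spawn(self, now):
        """Start one server process; return the value passed as -o stress=..."""
        # Assumption: forking the last allowed process counts as overload.
        if self.total_proc + 1 >= self.max_proc:
            if self.stress_expire_time < now:
                # Entering stress mode: existing servers are told to
                # terminate once their current SMTP session ends.
                self.restarts += 1
            # Every overload event pushes the expiry time forward.
            self.stress_expire_time = now + STRESS_WINDOW
        self.total_proc += 1
        return "yes" if self.stress_expire_time > now else ""

    def finish(self):
        """A server process exited."""
        self.total_proc -= 1


if __name__ == "__main__":
    svc = Service(max_proc=2)
    for t in (0, 10):
        print(t, repr(svc.spawn(now=t)))
```

In this model a process started at any time before the expiry still gets
"yes", even if the service is no longer full, which matches the "has not
happened for at least 1000 seconds" wording.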
>
> As shown below, the "stress=yes" parameter setting can be used to
> make main.cf configuration settings stress dependent.
>
> WARNING: the settings in the example are very aggressive and may
> affect legitimate mail delivery. But they will help you receive at
> least some mail while you're being flooded by worms, spam, or
> backscatter. Some mail is better than no legitimate mail at all.
>
> 1 /etc/postfix/main.cf:
> 2     smtpd_hard_error_limit = ${stress?1}${stress:20}
> 3     smtpd_timeout = ${stress?5}${stress:300}
> 4     smtpd_banner = $myhostname ESMTP $mail_name${stress? (condition RED)}
>
> NOTES:
>
> - The example looks ugly because main.cf does not implement ${name?x:y}
> syntax. This can't be fixed without major incompatible changes.
>
> - Line 2 uses a reduced smtpd_hard_error_limit under conditions of
> stress, causing Postfix to hang up quickly after rejecting a
> command, without waiting for the client to send "QUIT". This
> won't affect legitimate single-recipient mail, but will delay
> mailing list traffic when you have subscriptions to accounts that
> no longer exist.
>
> - Line 3 uses a reduced SMTP server read/write timeout under stress.
> This causes Postfix to drop connections from ratware that would
> otherwise keep your SMTP ports occupied. But it may cause delays
> with mail from very slow client implementations.
>
> - Line 4 helps you to monitor your SMTP server's stress level
> remotely. It doesn't address the overload condition itself.
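
[Illustration, not part of the patch.] To see what the ${stress?x}${stress:y}
pairs evaluate to, here is a rough Python sketch of the two conditional
forms: ${name?text} yields text when the parameter is non-empty, and
${name:text} yields text when it is empty. The expand() helper is
hypothetical and handles only these two forms without nesting; it is not
how Postfix parses main.cf, and it leaves plain $name references untouched.

```python
import re

# Matches ${name?text} and ${name:text}; no nesting, no plain $name.
_CONDITIONAL = re.compile(r"\$\{(\w+)([?:])([^}]*)\}")


def expand(value, params):
    """Expand the two conditional forms using the given parameter values."""
    def replace(match):
        name, operator, text = match.groups()
        non_empty = bool(params.get(name, ""))
        if operator == "?":
            return text if non_empty else ""
        return "" if non_empty else text

    return _CONDITIONAL.sub(replace, value)


normal = {"stress": ""}        # master passes "-o stress=" when idle
overload = {"stress": "yes"}   # master passes "-o stress=yes" under load

if __name__ == "__main__":
    for setting in ("${stress?1}${stress:20}", "${stress?5}${stress:300}"):
        print(setting, "->", expand(setting, normal), "/", expand(setting, overload))
```

With these inputs, line 2 of the example resolves to 20 normally and 1
under stress, and line 3 to 300 normally and 5 under stress.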
>
> Testing
> =======
>
> To test, either specify "smtpd -o stress=yes" in master.cf, or
> create a test-only smtpd server on a non-default port. Note: the
> stress feature does not work for servers that listen only on localhost.
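
[Illustration, not part of the patch.] One way to check a test server from
another machine is to read its greeting and look for the marker that the
smtpd_banner example adds under stress. The sketch below is an assumption
on my part, not something shipped with Postfix: read_banner() and
banner_shows_stress() are hypothetical helpers, and the marker string only
applies if you used the banner setting from the example.

```python
import socket


def banner_shows_stress(banner, marker="(condition RED)"):
    """True if an SMTP greeting carries the stress marker from the example."""
    return banner.startswith("220") and marker in banner


def read_banner(host, port, timeout=10.0):
    """Connect, read the first chunk of the server greeting, and disconnect."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return sock.recv(1024).decode("ascii", "replace").strip()
```

For example, banner_shows_stress(read_banner("mail.example.com", 2525))
would report whether a test-only smtpd on port 2525 is currently started
with a non-empty stress value; the host name and port here are
placeholders.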
>
> There is no configuration parameter to permanently disable stress
> mode. Adding one would greatly increase the footprint of the patch
> and the likelihood of patch errors. A configuration parameter will
> likely be introduced in the Postfix 2.5 experimental release.
>
> Limitations
> ===========
>
> None of the above can provide the protection that you can get from
> a front-end daemon process that screens connections and keeps the
> suspicious ones away from the MTA. But that is a different project.
> See, for example, OpenBSD spamd at http://www.openbsd.org/spamd/.
>
> Under non-stress conditions, the Postfix master daemon creates SMTP
> server processes with "-o stress=", that is, an empty parameter
> value. Getting rid of this artifact would involve too many changes
> for the stable Postfix releases.
>
> The "-o stress=yes" argument has no effect on the cleanup server,
> the queue manager, and other processes. It works only for servers
> that receive mail from the network, and only when all processes for
> that service are busy.
>
> Wietse
>
> Patch relative to yesterday's patch. To apply:
> $ patch -p0 <this-message
>
> Don't forget to "postfix stop" and "postfix start".
>
> diff -cr /tmp/postfix-2.5-20070824/src/master/master_avail.c src/master/master_avail.c
> *** /tmp/postfix-2.5-20070824/src/master/master_avail.c Tue Sep 4 16:48:15 2007
> --- src/master/master_avail.c Tue Sep 4 16:16:36 2007
> ***************
> *** 76,81 ****
> --- 76,82 ----
> static void master_avail_event(int event, char *context)
> {
> MASTER_SERV *serv = (MASTER_SERV *) context;
> + time_t now;
> int n;
>
> if (event == 0) /* XXX Can this happen? */
> ***************
> *** 84,89 ****
> --- 85,116 ----
> for (n = 0; n < serv->listen_fd_count; n++)
> event_disable_readwrite(serv->listen_fd[n]);
> } else {
> +
> + /*
> + * When all servers for a public internet service are busy, we log a
> + * warning, suggest workarounds, and remain silent until the warning
> + * expires, 1000 seconds later. At the same time, we start creating
> + * server processes with "-o stress=yes" on the command line, and
> + * keep creating such processes until the process count has stayed
> + * below the limit for at least 1000 seconds. This provides a mimimal
> + * solution that can be adopted into legacy and stable Postfix
> + * releases.
> + *
> + * This is not the right place to update serv->stress_param_val in
> + * response to stress level changes. Doing so would would contaminate
> + * the code that implements "postfix reload" with stress management
> + * implementation details, creating a source of future bugs. Instead,
> + * we update simple counters or flags here, and use their values to
> + * determine the proper serv->stress_param_val value when exec-ing a
> + * server process.
> + */
> + if (serv->stress_param_val != 0
> + && !MASTER_LIMIT_OK(serv->max_proc, serv->total_proc + 1)) {
> + now = event_time();
> + if (serv->stress_expire_time < now)
> + master_restart_service(serv);
> + serv->stress_expire_time = now + 1000;
> + }
> master_spawn(serv);
> }
> }
> ***************
> *** 101,121 ****
> * monitoring the socket for connection requests. All this under the
> * restriction that we have sufficient resources to service a connection
> * request.
> - *
> - * When all servers for a public internet service are busy, we log a
> - * warning, suggest workarounds, and remain silent until the warning
> - * expires, 1000 seconds later. At the same time, we start creating
> - * server processes with "-o stress=yes" on the command line, and keep
> - * creating such processes until the process count has stayed below the
> - * limit for at least 1000 seconds. This provides a mimimal solution that
> - * can be adopted into legacy and stable Postfix releases.
> - *
> - * This is not the right place to update serv->stress_param_val in response
> - * to stress level changes. Doing so would would contaminate the code
> - * that implements "postfix reload" with stress management implementation
> - * details, creating a source of future bugs. Instead, we update simple
> - * counters or flags here, and use their values to determine the proper
> - * serv->stress_param_val value when exec-ing a server process.
> */
> if (msg_verbose)
> msg_info("%s: avail %d total %d max %d", myname,
> --- 128,133 ----
> ***************
> *** 128,144 ****
> event_enable_read(serv->listen_fd[n], master_avail_event,
> (char *) serv);
> } else if (serv->stress_param_val != 0
> ! && ((now = event_time()),
> ! (serv->stress_expire_time = now + 1000),
> ! (now > serv->busy_warn_time + 1000))) {
> serv->busy_warn_time = now;
> msg_warn("service \"%s\" (%s) has reached its process limit \"%d\": "
> "new clients may experience noticeable delays",
> serv->ext_name, serv->name, serv->max_proc);
> msg_warn("to avoid this condition, increase the process count "
> "in master.cf or reduce the service time per client");
> ! msg_warn("you may also make main.cf options dependent on the "
> ! "existence of a non-empty \"stress\" parameter value");
> }
> }
> }
> --- 140,154 ----
> event_enable_read(serv->listen_fd[n], master_avail_event,
> (char *) serv);
> } else if (serv->stress_param_val != 0
> ! && (now = event_time()) - serv->busy_warn_time > 1000) {
> serv->busy_warn_time = now;
> msg_warn("service \"%s\" (%s) has reached its process limit \"%d\": "
> "new clients may experience noticeable delays",
> serv->ext_name, serv->name, serv->max_proc);
> msg_warn("to avoid this condition, increase the process count "
> "in master.cf or reduce the service time per client");
> ! msg_warn("see http://www.postfix.org/STRESS_README.html for "
> ! "examples of stress-dependent configuration settings");
> }
> }
> }
> diff -cr /tmp/postfix-2.5-20070824/src/master/master_spawn.c src/master/master_spawn.c
> *** /tmp/postfix-2.5-20070824/src/master/master_spawn.c Tue Sep 4 16:48:15 2007
> --- src/master/master_spawn.c Tue Sep 4 14:18:30 2007
> ***************
> *** 224,233 ****
> vstring_sprintf(env_gen, "%s=%o", MASTER_GEN_NAME, master_generation);
> if (putenv(vstring_str(env_gen)) < 0)
> msg_fatal("%s: putenv: %m", myname);
> ! /* Enable stress mode WHILE forking the last process, not AFTER. */
> ! if (serv->stress_param_val
> ! && (serv->total_proc + 1 >= serv->max_proc
> ! || serv->stress_expire_time > event_time()))
> serv->stress_param_val[0] = CONFIG_BOOL_YES[0];
>
> execvp(serv->path, serv->args->argv);
> --- 224,230 ----
> vstring_sprintf(env_gen, "%s=%o", MASTER_GEN_NAME, master_generation);
> if (putenv(vstring_str(env_gen)) < 0)
> msg_fatal("%s: putenv: %m", myname);
> ! if (serv->stress_param_val && serv->stress_expire_time > event_time())
> serv->stress_param_val[0] = CONFIG_BOOL_YES[0];
>
> execvp(serv->path, serv->args->argv);
>


