Discussion:
[Sqlgrey-users] Frequent disconnect/reconnect issues
Alex
2015-08-11 23:13:30 UTC
Permalink
Hi,

I've been running sqlgrey on fedora for quite a while, set up to use
DB_Cluster with a single master set up with mysql (mariadb). Recently,
I've started to receive a large number of "sqlgrey encountered an SQL
error and triggered a reconnection" messages, followed immediately by
a reconnection message.

It seems to happen at all times during the day, not just when the
network or server is congested or busy.

There are three read_hosts, including the master itself, but one of
the hosts has multiple DNS names, so it may be listed twice.

I don't exactly remember how the database schema was set up, which is
kind of a problem if I ever had to set it up again.

I've included the sqlgrey.conf config below, in hopes someone had some
ideas how to troubleshoot this. This is the config that's been working
fine for more than a year.

# cat sqlgrey.conf |grep '^[a-z|A-Z]'
loglevel = 3
log_override = whitelist:1,grey:3,spam:2
reconnect_delay = 5
awl_age = 31
group_domain_level = 10
db_type = mysql
db_name = sqlgrey
db_host = ns1.example.com;mysql_connect_timeout=1
db_port = default
db_user = sqlgrey
db_pass = mypass
db_cleanup_hostname=ns1.example.com
db_cleandelay = 1800
clean_method = sync
db_cluster = on
read_hosts=localhost,ns1.example.com,mail02.example.com,mail03.example.com,mail01.example.com
prepend = 1
admin_mail = ***@messaging.sprintpcs.com

Thanks,
Alex

------------------------------------------------------------------------------
Alex
2015-08-14 16:58:14 UTC
Permalink
Hi guys,

I sent the message below a few days ago, and haven't received any
responses. I'm still having difficulty with this and really don't know
how to resolve it. It appears to happen to all three systems, two of
which are on the same gigabit network, and the other is on a remote
cable connection (which is the one that least frequently fails).

Would rebuilding the database help? I haven't even been able to find
an error in the logs related to this, only the email that is sent
every time it happens.

Thanks,
Alex
Post by Alex
Hi,
I've been running sqlgrey on fedora for quite a while, set up to use
DB_Cluster with a single master set up with mysql (mariadb). Recently,
I've started to receive a large number of "sqlgrey encountered an SQL
error and triggered a reconnection" messages, followed immediately by
a reconnection message.
It seems to happen at all times during the day, not just when the
network or server is congested or busy.
There are three read_hosts, including the master itself, but one of
the hosts has multiple DNS names, so it may be listed twice.
I don't exactly remember how the database schema was set up, which is
kind of a problem if I ever had to set it up again.
I've included the sqlgrey.conf config below, in hopes someone had some
ideas how to troubleshoot this. This is the config that's been working
fine for more than a year.
# cat sqlgrey.conf |grep '^[a-z|A-Z]'
loglevel = 3
log_override = whitelist:1,grey:3,spam:2
reconnect_delay = 5
awl_age = 31
group_domain_level = 10
db_type = mysql
db_name = sqlgrey
db_host = ns1.example.com;mysql_connect_timeout=1
db_port = default
db_user = sqlgrey
db_pass = mypass
db_cleanup_hostname=ns1.example.com
db_cleandelay = 1800
clean_method = sync
db_cluster = on
read_hosts=localhost,ns1.example.com,mail02.example.com,mail03.example.com,mail01.example.com
prepend = 1
Thanks,
Alex
------------------------------------------------------------------------------
Karl O. Pinc
2015-08-15 01:32:42 UTC
Permalink
I prefer to stay far away from MySQL/MariaDB.
Having said that it seems
like the first thing you should do is instrument the db
so that you can tell when connections might be lost for
some reason. You get the message when a db connection
goes bad for some reason or another. (For me, it's when
my Postgres DB restarts due to security update.)

You could ask the MariaDB people for help with this.

If you followed the install directions your db was created
the same way as everybody else's. You could re-create
it the same way, but I don't see a point until you
know what's wrong.

I'd expect sqlgrey to be configured to log to the
mail log, although I suppose yours could be different.
This will give you timestamps to use when looking through
the db log.

On Fri, 14 Aug 2015 12:58:14 -0400
Post by Alex
Hi guys,
I sent the message below a few days ago, and haven't received any
responses. I'm still having difficulty with this and really don't know
how to resolve it. It appears to happen to all three systems, two of
which are on the same gigabit network, and the other is on a remote
cable connection (which is the one that least frequently fails).
Would rebuilding the database help? I haven't even been able to find
an error in the logs related to this, only the email that is sent
every time it happens.
Thanks,
Alex
Post by Alex
Hi,
I've been running sqlgrey on fedora for quite a while, set up to use
DB_Cluster with a single master set up with mysql (mariadb).
Recently, I've started to receive a large number of "sqlgrey
encountered an SQL error and triggered a reconnection" messages,
followed immediately by a reconnection message.
It seems to happen at all times during the day, not just when the
network or server is congested or busy.
There are three read_hosts, including the master itself, but one of
the hosts has multiple DNS names, so it may be listed twice.
I don't exactly remember how the database schema was set up, which
is kind of a problem if I ever had to set it up again.
I've included the sqlgrey.conf config below, in hopes someone had
some ideas how to troubleshoot this. This is the config that's been
working fine for more than a year.
# cat sqlgrey.conf |grep '^[a-z|A-Z]'
loglevel = 3
log_override = whitelist:1,grey:3,spam:2
reconnect_delay = 5
awl_age = 31
group_domain_level = 10
db_type = mysql
db_name = sqlgrey
db_host = ns1.example.com;mysql_connect_timeout=1
db_port = default
db_user = sqlgrey
db_pass = mypass
db_cleanup_hostname=ns1.example.com
db_cleandelay = 1800
clean_method = sync
db_cluster = on
read_hosts=localhost,ns1.example.com,mail02.example.com,mail03.example.com,mail01.example.com
prepend = 1
Thanks,
Alex
------------------------------------------------------------------------------
_______________________________________________
Sqlgrey-users mailing list
https://lists.sourceforge.net/lists/listinfo/sqlgrey-users
Karl <***@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

------------------------------------------------------------------------------
Alex
2015-08-18 15:58:38 UTC
Permalink
Hi guys,
Post by Karl O. Pinc
I'd expect sqlgrey to be configured to log to the
mail log, although I suppose yours could be different.
This will give you timestamps to use when looking through
the db log.
I've figured out some new information on this problem.

Aug 18 06:06:19 mail01 sqlgrey: warning: Use of uninitialized value
$DBI::errstr in concatenation (.) or string at /usr/sbin/sqlgrey line
250.
Aug 18 06:06:19 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('return','nl2go.com','91.121.21','2015-08-18
05:59:38',NOW()):#012, reconnecting to DB
Aug 18 06:06:19 mail01 sqlgrey: dbaccess: Using DBIx:DBCluster
Aug 18 06:06:19 mail01 sqlgrey: dbaccess: Read_hosts: localhost,
ns1.example.com, mail02.example.com, mail03.example.com,
mail01.example.com

Aug 17 01:43:40 mail02 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('financeiro','teamplayy.com.br','91.121.21','2015-08-17
01:37:00',NOW()):#012, reconnecting to DB
Aug 17 01:53:29 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('shefdemwp','uwerim.ru','91.121.21','2015-08-17
01:46:49',NOW()):#012, reconnecting to DB
Aug 17 05:57:58 mail02 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('conor.burrell','businessandfinance.com','91.121.21','2015-08-17
05:51:17',NOW()):#012, reconnecting to DB
Aug 17 09:43:48 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('claudinei.moreira','alper.com.br','91.121.21','2015-08-17
09:37:12',NOW()):#012, reconnecting to DB
Aug 17 10:26:43 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012UPDATE domain_awl SET last_seen = NOW(), first_seen =
first_seen WHERE sender_domain = NULL AND src = NULL:#012Can't connect
to MySQL server on 'ns1.example.com' (111), reconnecting to DB
Aug 17 12:44:38 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('memberservices=example-marketplace.com__#-21wpwe5yvrgp4h','rlie15guji343ap6.mnrzh7tg8d0vr194.m8vlizz.8-amgceai.na28.bnc.salesforce.com','136.146.208','2015-08-17
12:39:23',NOW()):#012, reconnecting to DB

Our server was changed to "example" in the above. It looks like maybe
some records were corrupted? Multiple entries contain the 91.121.21
address. Should I just truncate all the tables and let them rebuild
themselves with new email data or perhaps there's a more precise fix?

Thanks,
Alex

------------------------------------------------------------------------------
Alex
2015-08-19 17:33:49 UTC
Permalink
Hi guys,

I'm still having database problems. Can someone tell me under what
circumstances the database would expect to see a value that doesn't
exist?

Aug 19 00:07:39 mail02 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('shimvebvl','uwerim.ru','91.121.21','2015-08-19
00:00:57',NOW()):#012, reconnecting to DB

And why this would cause a disconnect/reconnect to occur?

I've searched all three systems in the DB_Cluster, and none of them
have any of these values. Why would sqlgrey think it's there in the
first place?

This appears to be the original occurrence of that entry, just a few
minutes prior:

Aug 19 00:00:57 mail02 postfix/smtpd[24228]: NOQUEUE: reject: RCPT
from webboundaries.com[91.121.21.199]: 450 4.7.1
<***@example.com>: Recipient address rejected: Greylisted
for 5 minutes; from=<***@uwerim.ru>
to=<***@example.com> proto=ESMTP helo=<webboundaries.com>

It then failed open, where postfix accepted it and was spamassassin
marked it as spam.

Thanks,
Alex
Post by Alex
Hi guys,
Post by Karl O. Pinc
I'd expect sqlgrey to be configured to log to the
mail log, although I suppose yours could be different.
This will give you timestamps to use when looking through
the db log.
I've figured out some new information on this problem.
Aug 18 06:06:19 mail01 sqlgrey: warning: Use of uninitialized value
$DBI::errstr in concatenation (.) or string at /usr/sbin/sqlgrey line
250.
Aug 18 06:06:19 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('return','nl2go.com','91.121.21','2015-08-18
05:59:38',NOW()):#012, reconnecting to DB
Aug 18 06:06:19 mail01 sqlgrey: dbaccess: Using DBIx:DBCluster
Aug 18 06:06:19 mail01 sqlgrey: dbaccess: Read_hosts: localhost,
ns1.example.com, mail02.example.com, mail03.example.com,
mail01.example.com
Aug 17 01:43:40 mail02 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('financeiro','teamplayy.com.br','91.121.21','2015-08-17
01:37:00',NOW()):#012, reconnecting to DB
Aug 17 01:53:29 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('shefdemwp','uwerim.ru','91.121.21','2015-08-17
01:46:49',NOW()):#012, reconnecting to DB
Aug 17 05:57:58 mail02 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('conor.burrell','businessandfinance.com','91.121.21','2015-08-17
05:51:17',NOW()):#012, reconnecting to DB
Aug 17 09:43:48 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('claudinei.moreira','alper.com.br','91.121.21','2015-08-17
09:37:12',NOW()):#012, reconnecting to DB
Aug 17 10:26:43 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012UPDATE domain_awl SET last_seen = NOW(), first_seen =
first_seen WHERE sender_domain = NULL AND src = NULL:#012Can't connect
to MySQL server on 'ns1.example.com' (111), reconnecting to DB
Aug 17 12:44:38 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('memberservices=example-marketplace.com__#-21wpwe5yvrgp4h','rlie15guji343ap6.mnrzh7tg8d0vr194.m8vlizz.8-amgceai.na28.bnc.salesforce.com','136.146.208','2015-08-17
12:39:23',NOW()):#012, reconnecting to DB
Our server was changed to "example" in the above. It looks like maybe
some records were corrupted? Multiple entries contain the 91.121.21
address. Should I just truncate all the tables and let them rebuild
themselves with new email data or perhaps there's a more precise fix?
Thanks,
Alex
------------------------------------------------------------------------------
Karl O. Pinc
2015-08-19 17:42:30 UTC
Permalink
On Wed, 19 Aug 2015 13:33:49 -0400
Post by Alex
Hi guys,
I'm still having database problems. Can someone tell me under what
circumstances the database would expect to see a value that doesn't
exist?
Aug 19 00:07:39 mail02 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('shimvebvl','uwerim.ru','91.121.21','2015-08-19
00:00:57',NOW()):#012, reconnecting to DB
And why this would cause a disconnect/reconnect to occur?
I believe you have it backwards. The message
says it can't insert the value. Because it can't
insert the value you wouldn't expect to see the
value in the db. It's probably not inserting the
value because it's connection to the db is broken.
So it tries to reconnect to the db.

Something, somewhere, probably not in sqlgrey,
probably not even in perl -- although all these
probablies are just a guess, is breaking the
connection to the db.

As I mentioned before, the most straightforward
approach is to instrument your db and get logs
of when connections break or otherwise go away.
Your db must know when this happens.


Karl <***@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

------------------------------------------------------------------------------
Lionel Bouton
2015-08-19 17:43:30 UTC
Permalink
Hi,

I don't have answers to all your questions but could you explain why you
are using db_cluster and how/why you set the replication to 5 MySQL
databases? This seems overkill and needlessly complex.

Can you reproduce your problem with a single database and db_cluster off?

Best regards,

Lionel


------------------------------------------------------------------------------
Alex
2015-08-19 18:12:21 UTC
Permalink
Hi,
Post by Lionel Bouton
I don't have answers to all your questions but could you explain why you
are using db_cluster and how/why you set the replication to 5 MySQL
databases? This seems overkill and needlessly complex.
It's actually three databases - two of them have multiple CNAMEs that
appear to be chosen on occasion.

I'm using db_cluster because I have three mail servers for the same
domain with the same weight, resulting in a reconnect potentially
reaching a different mail server, otherwise resulting in another 5
minute greylisting delay.

I thought that was the correct way to address this situation?
Post by Lionel Bouton
Can you reproduce your problem with a single database and db_cluster off?
For the reason described above, I don't think it would be a good idea
to disconnect them from each other.

This configuration seemed to run just fine for months.

There are entries in the mysql error logs when I deliberately
disconnected the master from the slave, but nothing around the time
the dbaccess error was triggered.

Should I look at recording every INSERT in the logs somehow?

Thanks,
Alex

------------------------------------------------------------------------------
Lionel Bouton
2015-08-19 18:38:43 UTC
Permalink
Post by Alex
Hi,
Post by Lionel Bouton
I don't have answers to all your questions but could you explain why you
are using db_cluster and how/why you set the replication to 5 MySQL
databases? This seems overkill and needlessly complex.
It's actually three databases - two of them have multiple CNAMEs that
appear to be chosen on occasion.
You shouldn't use 5 entries in read_hosts but only 2 or 3 depending on
what kind of load distribution between your databases you want. But it's
probably not relevant in your case.
Post by Alex
I'm using db_cluster because I have three mail servers for the same
domain with the same weight,
Do you have system load problems which mandates you having 3 mail
servers? It's usually not a good idea to have several active mail
servers unless you really need them. It's a general recommendation but
in the case of SQLgrey some spammers try to send mails to all servers at
the same time and it can confuse SQLgrey which tries to insert 3 times
the same entry because each instance can't known that the 2 others are
trying too. In this case SQLgrey will handle this as a database error
(because it can't guess if it's not one) and try to reconnect. With a
single server this is not possible.
Post by Alex
resulting in a reconnect potentially
reaching a different mail server, otherwise resulting in another 5
minute greylisting delay.
I thought that was the correct way to address this situation?
No: db_cluster is only meant to solve rare performance issues, has some
drawbacks (replication delay of most replication solutions means slaves
are out of sync with the master which can confuse SQLgrey in some cases)
and requires a bit of DBA work to setup properly. Unless your master
database is so loaded that SQLgrey starts to timeout you shouldn't use
dbcluster.
Post by Alex
Post by Lionel Bouton
Can you reproduce your problem with a single database and db_cluster off?
For the reason described above, I don't think it would be a good idea
to disconnect them from each other.
I don't understand what you mean by "disconnect them from each other".
Disconnecting databases from each other? Disconnecting mail servers from
database servers?

Your database should already be kept in sync by MySQL replication
(although I learned not to trust it myself) so disabling db_cluster
shouldn't have any impact on the database, you will simply write and
read from your single master (ns1.example.com) instead of writing to it
and reading from read_hosts. The slaves won't be used but you can keep
them if you really need to re-enable db_cluster later.
Disconnecting mail servers from the database servers (other than
ns1.example.com) should have zero impact: they are supposed to have the
same data anyway (in fact that's not true there's a replication delay
which can cause problems so disabling db_cluster can actually solve
problems there).
Post by Alex
This configuration seemed to run just fine for months.
Sometimes things work by chance even if they aren't set up properly
until the right set of conditions making them fail happen.

Best regards,

Lionel

------------------------------------------------------------------------------
Alex
2015-08-20 01:09:42 UTC
Permalink
Hi,
Post by Lionel Bouton
Post by Alex
Post by Lionel Bouton
I don't have answers to all your questions but could you explain why you
are using db_cluster and how/why you set the replication to 5 MySQL
databases? This seems overkill and needlessly complex.
It's actually three databases - two of them have multiple CNAMEs that
appear to be chosen on occasion.
You shouldn't use 5 entries in read_hosts but only 2 or 3 depending on
what kind of load distribution between your databases you want. But it's
probably not relevant in your case.
It's not five. It's really only two, unless you count itself. The
other two are localhost and a CNAME entry.
Post by Lionel Bouton
Post by Alex
I'm using db_cluster because I have three mail servers for the same
domain with the same weight,
Do you have system load problems which mandates you having 3 mail
servers? It's usually not a good idea to have several active mail
servers unless you really need them.
Yes, for load and redundancy reasons. The three servers receive about
400k messages per day, at peak.
Post by Lionel Bouton
It's a general recommendation but
in the case of SQLgrey some spammers try to send mails to all servers at
the same time and it can confuse SQLgrey which tries to insert 3 times
the same entry because each instance can't known that the 2 others are
trying too. In this case SQLgrey will handle this as a database error
(because it can't guess if it's not one) and try to reconnect. With a
single server this is not possible.
Okay, that's interesting. And you believe it's possible that's what
happening here?
Post by Lionel Bouton
Post by Alex
resulting in a reconnect potentially
reaching a different mail server, otherwise resulting in another 5
minute greylisting delay.
I thought that was the correct way to address this situation?
No: db_cluster is only meant to solve rare performance issues, has some
drawbacks (replication delay of most replication solutions means slaves
are out of sync with the master which can confuse SQLgrey in some cases)
and requires a bit of DBA work to setup properly. Unless your master
database is so loaded that SQLgrey starts to timeout you shouldn't use
dbcluster.
Except if the master becomes inaccessible, no?
Post by Lionel Bouton
Post by Alex
Post by Lionel Bouton
Can you reproduce your problem with a single database and db_cluster off?
For the reason described above, I don't think it would be a good idea
to disconnect them from each other.
I don't understand what you mean by "disconnect them from each other".
Disconnecting databases from each other? Disconnecting mail servers from
database servers?
Your database should already be kept in sync by MySQL replication
(although I learned not to trust it myself) so disabling db_cluster
shouldn't have any impact on the database, you will simply write and
read from your single master (ns1.example.com) instead of writing to it
and reading from read_hosts. The slaves won't be used but you can keep
them if you really need to re-enable db_cluster later.
I think I've been confusing MySQL replication and dbcluster as being
something that must be used together.
Post by Lionel Bouton
Disconnecting mail servers from the database servers (other than
ns1.example.com) should have zero impact: they are supposed to have the
same data anyway (in fact that's not true there's a replication delay
which can cause problems so disabling db_cluster can actually solve
problems there).
Okay, great. I'll disable dbcluster and follow up. I was following the README:

If [mail1] gets a new request, it'll greylist, respond "450" and
stick client into the "connect" table.
Now heres the problem. Client backs off, and comes back later to
try again, but there is no
garantee it'll get [mail1] again. Infact, if it doesnt, the
greylisting will happen all over. The solution
is to use DBClustering. Each mailserver STILL has its own
SQL-server, but we add a master-sql
and let the local sql-servers be replication slaves:

I don't have the load balancer, as shown in the drawing in the README.
I guess I don't fully understand the purpose of db_cluster.

Thanks so much,
Alex

------------------------------------------------------------------------------
Urban, Frank
2015-08-20 05:45:56 UTC
Permalink
Hi Alex,

here my configuring working since 10 years without any problem. I just got new servers and transferred the config and database from Mysql to Mariadb also without any problem.
In the last 365 days Sqlgerey blocked 150.000.000 emails here. It was more in the past up to 1.000.000.000 per year.
We still have only 20% SPAM after Greylisting.

We are using 7 servers. mail1 to mail7
The Mariadb is installed as master on mail6 and uses mail7 as a replication server.
mail6 is used as AWL and Bayes database for SpamAssassin
mail6 is used as master for Sqlgrey and mail7 as reading host

That's the config used on all servers

/etc/sqlgrey/sqlgrey.conf

db_host = mail6
db_cluster = on
read_hosts = mail7,mail6
db_cleanup_hostname = mail6


Greetings

Frank

-----Ursprüngliche Nachricht-----
Von: Alex [mailto:***@gmail.com]
Gesendet: Donnerstag, 20. August 2015 03:10
An: SQLgrey users mailing-list
Betreff: Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

Hi,
Post by Lionel Bouton
Post by Alex
Post by Lionel Bouton
I don't have answers to all your questions but could you explain why
you are using db_cluster and how/why you set the replication to 5
MySQL databases? This seems overkill and needlessly complex.
It's actually three databases - two of them have multiple CNAMEs that
appear to be chosen on occasion.
You shouldn't use 5 entries in read_hosts but only 2 or 3 depending on
what kind of load distribution between your databases you want. But
it's probably not relevant in your case.
It's not five. It's really only two, unless you count itself. The other two are localhost and a CNAME entry.
Post by Lionel Bouton
Post by Alex
I'm using db_cluster because I have three mail servers for the same
domain with the same weight,
Do you have system load problems which mandates you having 3 mail
servers? It's usually not a good idea to have several active mail
servers unless you really need them.
Yes, for load and redundancy reasons. The three servers receive about 400k messages per day, at peak.
Post by Lionel Bouton
It's a general recommendation but
in the case of SQLgrey some spammers try to send mails to all servers
at the same time and it can confuse SQLgrey which tries to insert 3
times the same entry because each instance can't known that the 2
others are trying too. In this case SQLgrey will handle this as a
database error (because it can't guess if it's not one) and try to
reconnect. With a single server this is not possible.
Okay, that's interesting. And you believe it's possible that's what happening here?
Post by Lionel Bouton
Post by Alex
resulting in a reconnect potentially reaching a different mail
server, otherwise resulting in another 5 minute greylisting delay.
I thought that was the correct way to address this situation?
No: db_cluster is only meant to solve rare performance issues, has
some drawbacks (replication delay of most replication solutions means
slaves are out of sync with the master which can confuse SQLgrey in
some cases) and requires a bit of DBA work to setup properly. Unless
your master database is so loaded that SQLgrey starts to timeout you
shouldn't use dbcluster.
Except if the master becomes inaccessible, no?
Post by Lionel Bouton
Post by Alex
Post by Lionel Bouton
Can you reproduce your problem with a single database and db_cluster off?
For the reason described above, I don't think it would be a good idea
to disconnect them from each other.
I don't understand what you mean by "disconnect them from each other".
Disconnecting databases from each other? Disconnecting mail servers
from database servers?
Your database should already be kept in sync by MySQL replication
(although I learned not to trust it myself) so disabling db_cluster
shouldn't have any impact on the database, you will simply write and
read from your single master (ns1.example.com) instead of writing to
it and reading from read_hosts. The slaves won't be used but you can
keep them if you really need to re-enable db_cluster later.
I think I've been confusing MySQL replication and dbcluster as being something that must be used together.
Post by Lionel Bouton
Disconnecting mail servers from the database servers (other than
ns1.example.com) should have zero impact: they are supposed to have
the same data anyway (in fact that's not true there's a replication
delay which can cause problems so disabling db_cluster can actually
solve problems there).
Okay, great. I'll disable dbcluster and follow up. I was following the README:

If [mail1] gets a new request, it'll greylist, respond "450" and stick client into the "connect" table.
Now heres the problem. Client backs off, and comes back later to try again, but there is no
garantee it'll get [mail1] again. Infact, if it doesnt, the greylisting will happen all over. The solution
is to use DBClustering. Each mailserver STILL has its own SQL-server, but we add a master-sql
and let the local sql-servers be replication slaves:

I don't have the load balancer, as shown in the drawing in the README.
I guess I don't fully understand the purpose of db_cluster.

Thanks so much,
Alex
Lionel Bouton
2015-08-20 10:33:44 UTC
Permalink
Post by Alex
Hi,
Post by Lionel Bouton
Post by Alex
Post by Lionel Bouton
I don't have answers to all your questions but could you explain why you
are using db_cluster and how/why you set the replication to 5 MySQL
databases? This seems overkill and needlessly complex.
It's actually three databases - two of them have multiple CNAMEs that
appear to be chosen on occasion.
You shouldn't use 5 entries in read_hosts but only 2 or 3 depending on
what kind of load distribution between your databases you want. But it's
probably not relevant in your case.
It's not five. It's really only two, unless you count itself. The
other two are localhost and a CNAME entry.
Then you shouldn't have 5 entries in read_hosts as I already said.
Seriously if you don't understand why 5 entries for 3 servers can cause
problems you shouldn't even think about touching db_cluster: you are
clearly using it wrong.
Post by Alex
Post by Lionel Bouton
Post by Alex
I'm using db_cluster because I have three mail servers for the same
domain with the same weight,
Do you have system load problems which mandates you having 3 mail
servers? It's usually not a good idea to have several active mail
servers unless you really need them.
Yes, for load and redundancy reasons. The three servers receive about
400k messages per day, at peak.
This is only <5 messages per second. Hardly something a single server
can't handle (unless you use costly antispam measures that you can't
delegate to backend servers).

This kind of redundancy is mostly superfluous with SMTP: if you have a
spare server ready to replace your mail server (ie: affect it the public
IP of your single MX to it or redirect it if you have some kind of
reverse proxy), you only have to start it to get the messages queued at
the origin delivered to it.
Post by Alex
Post by Lionel Bouton
It's a general recommendation but
in the case of SQLgrey some spammers try to send mails to all servers at
the same time and it can confuse SQLgrey which tries to insert 3 times
the same entry because each instance can't known that the 2 others are
trying too. In this case SQLgrey will handle this as a database error
(because it can't guess if it's not one) and try to reconnect. With a
single server this is not possible.
Okay, that's interesting. And you believe it's possible that's what
happening here?
It's a known problem so yes it's possible.
Post by Alex
Post by Lionel Bouton
Post by Alex
resulting in a reconnect potentially
reaching a different mail server, otherwise resulting in another 5
minute greylisting delay.
I thought that was the correct way to address this situation?
No: db_cluster is only meant to solve rare performance issues, has some
drawbacks (replication delay of most replication solutions means slaves
are out of sync with the master which can confuse SQLgrey in some cases)
and requires a bit of DBA work to setup properly. Unless your master
database is so loaded that SQLgrey starts to timeout you shouldn't use
dbcluster.
Except if the master becomes inaccessible, no?
Even with db_cluster on if the master becomes inaccessible SQLgrey stops
greylisting and let everything pass. If it can't write to the database
there's no way it can do its job.
Post by Alex
Post by Lionel Bouton
Post by Alex
Post by Lionel Bouton
Can you reproduce your problem with a single database and db_cluster off?
For the reason described above, I don't think it would be a good idea
to disconnect them from each other.
I don't understand what you mean by "disconnect them from each other".
Disconnecting databases from each other? Disconnecting mail servers from
database servers?
Your database should already be kept in sync by MySQL replication
(although I learned not to trust it myself) so disabling db_cluster
shouldn't have any impact on the database, you will simply write and
read from your single master (ns1.example.com) instead of writing to it
and reading from read_hosts. The slaves won't be used but you can keep
them if you really need to re-enable db_cluster later.
I think I've been confusing MySQL replication and dbcluster as being
something that must be used together.
I don't know why you think this after our exchanges. They don't have to
work *together* but if the replication doesn't work properly SQLgrey
with db_cluster will surely fail to work properly.
Post by Alex
Post by Lionel Bouton
Disconnecting mail servers from the database servers (other than
ns1.example.com) should have zero impact: they are supposed to have the
same data anyway (in fact that's not true there's a replication delay
which can cause problems so disabling db_cluster can actually solve
problems there).
If [mail1] gets a new request, it'll greylist, respond "450" and
stick client into the "connect" table.
Now heres the problem. Client backs off, and comes back later to
try again, but there is no
garantee it'll get [mail1] again. Infact, if it doesnt, the
greylisting will happen all over. The solution
is to use DBClustering. Each mailserver STILL has its own
SQL-server, but we add a master-sql
I don't have the load balancer, as shown in the drawing in the README.
I guess I don't fully understand the purpose of db_cluster.
db_cluster support wasn't written by me and I never used it so I just
read its README again to check.
I think it doesn't really explain when you should use it properly, just
ignore any advice in it not linked to database load problems.

Best regards,

Lionel

------------------------------------------------------------------------------
Jean-Michel Pouré - GOOZE
2015-08-15 11:18:02 UTC
Permalink
Post by Alex
Would rebuilding the database help?
MySQL is dead, has no community, no future (belongs to Oracle)and we cannot loose time debugging which does not and will not ever work. 
Maybe you should make a clean installation of PostgreSQL and use pgAdmin3 GUI to query the database remotely. 
This is a very clean interface which will give all information.
Kind regards,Jean-Michel
Karl O. Pinc
2015-08-15 15:49:40 UTC
Permalink
On Sat, 15 Aug 2015 13:18:02 +0200
Post by Jean-Michel Pouré - GOOZE
Post by Alex
Would rebuilding the database help?
MySQL is dead, has no community, no future (belongs to Oracle)and we
cannot loose time debugging which does not and will not ever work.
That's not entirely fair. There is MariaDB, the future of MySQL
moving forward.

I have technical issues with MySQL/MaraiaDB, but putting those aside
the serious problem that has always existed with them is
that their documentation is not free (as in Libre). A db is no
good without documentation, which matters if you care
about vendor lock-in.



Karl <***@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

------------------------------------------------------------------------------
Alex
2015-08-15 16:30:20 UTC
Permalink
Hi,

Thanks everyone for your help. It worked with mariadb for months and
months without this problem. I'm going to try and dump the database
and restore it, since that's the easiest.

Thanks again,
Alex
Post by Karl O. Pinc
On Sat, 15 Aug 2015 13:18:02 +0200
Post by Jean-Michel Pouré - GOOZE
Post by Alex
Would rebuilding the database help?
MySQL is dead, has no community, no future (belongs to Oracle)and we
cannot loose time debugging which does not and will not ever work.
That's not entirely fair. There is MariaDB, the future of MySQL
moving forward.
I have technical issues with MySQL/MaraiaDB, but putting those aside
the serious problem that has always existed with them is
that their documentation is not free (as in Libre). A db is no
good without documentation, which matters if you care
about vendor lock-in.
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
------------------------------------------------------------------------------
_______________________________________________
Sqlgrey-users mailing list
https://lists.sourceforge.net/lists/listinfo/sqlgrey-users
------------------------------------------------------------------------------
Jean-Michel Pouré - GOOZE
2015-08-15 17:56:06 UTC
Permalink
Post by Karl O. Pinc
There is MariaDB, the future of MySQL
moving forward.
Quoting website :***© 2015 MariaDB Corporation Ab except as noted. MariaDB is a trademark or registered trademarks of MariaDB Corporation Ab in the European Union and United States of America and/or other countries. This website is proud to run on MariaDB. MySQL is a trademark of Oracle Corporation Inc. Trademarks Statement | Privacy Policy***
IMHO, This is a company picking-up free software, not a real community relying on individuals, like Kernel.org, X.Org, PostgreSQL.org.
If you need reliable solutions, use a software made and released by communities. If you have access to a dedicated server, justinstall SQLgrey and PostgreSQL and you will be able to run thousands of queries per second, reliably.
The big issue with MySQL is that it dies silently and you don't know what happened to you data, because at some pointMySQL is using query caching and you cannot trace.
Anyway, I wron't start a flame war. Go and use MariaDB if you think this is the future. But don't expect to run more than 10.000 queries per second reliably.
Karl O. Pinc
2015-08-15 19:17:45 UTC
Permalink
On Sat, 15 Aug 2015 19:56:06 +0200
Post by Jean-Michel Pouré - GOOZE
If you need reliable solutions,
use a software made and released by communities.
I would say the following: If you need a reliable, by which
I mean long-term, solution use software which
has a community _and_ can be forked by that community.

Even better, use software with a copyleft license.
Copyleft prevents proprietary interests from taking
the software and competing with the community with
the intent to marginalize it. (E.g. The BSD license
let Sun and later Apple marginalize the BSD based OSs.)


Karl <***@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

------------------------------------------------------------------------------
Continue reading on narkive:
Loading...