Guam crashes frequently (might be SSL related)
Closed, Resolved (Public)

Description

rpm -qv guam
guam-0.7.1-3.2.el7.kolab_16.x86_64

/var/log/guam/console.log:

gen_fsm <0.14289.0> in state passthrough terminated with reason: buf_error in zlib:call/3
CRASH REPORT Process <0.14289.0> with 0 neighbours exited with reason: buf_error in zlib:call/3 in gen_fsm:terminate/7 line 611

/var/log/guam/crash.log

=CRASH REPORT====
  crasher:
    initial call: eimap:init/1
    pid: <0.14275.0>
    registered_name: []
    exception exit: {{buf_error,[{zlib,call,3,[]},{zlib,deflate,3,[]},{eimap,deflated,2,[{file,"src/eimap.erl"},{line,483}]},{eimap,passthrough,2,[{file,"src/eimap.erl"},{line,174}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,503}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]},[{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,611}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}
    ancestors: [<0.14151.0>,<0.91.0>,kolab_guam_sup,<0.86.0>]
    messages: []
    links: [#Port<0.11484>,<0.14151.0>,#Port<0.11483>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 1598
    stack_size: 27
    reductions: 2858
  neighbours:
2016-02-26 12:34:11 =ERROR REPORT====
** State machine <0.14283.0> terminating 
** Last event in was {passthrough,<<>>} (for all states)
** When State == passthrough
**      Data  == {state,"127.0.0.1",9993,true,true,{sslsocket,{gen_tcp,#Port<0.11492>,tls_connection,undefined},<0.14284.0>},<<"post.woelkchen.xyz Cyrus IMAP 2.5.7-Kolab-2.5.7-1.2.el7.kolab_16 server ready">>,2,{[],[]},{command,<<"EG0001">>,undefined,<<"COMPRESS DEFLATE">>,<0.14283.0>,compress,#Fun<eimap_command_compress.parse.2>},undefined,none,true,<0.14166.0>,<<>>,#Port<0.11495>,#Port<0.11496>}
** Reason for termination = 
** {buf_error,[{zlib,call,3,[]},{zlib,deflate,3,[]},{eimap,deflated,2,[{file,"src/eimap.erl"},{line,483}]},{eimap,passthrough,2,[{file,"src/eimap.erl"},{line,174}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,503}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}

I exchanged a few lines with kanarip about the issue yesterday, and it seems to be SSL related.
The corresponding snippet from /etc/guam/sys.config:

{ tls_config, [
    { certfile, "/etc/letsencrypt/live/post.woelkchen.xyz/cert.pem" },
    { cacertfile, "/etc/letsencrypt/live/post.woelkchen.xyz/chain.pem" },
    { keyfile, "/etc/letsencrypt/live/post.woelkchen.xyz/privkey.pem" }
] }

Initially I tried to concatenate my entire certificate chain into a single file (key + cert + intermediate CA, in that order), but Guam only delivered my cert and not the CA. This behaviour seems to be intentional; see: http://erlang.org/pipermail/erlang-questions/2014-August/080512.html

With my current configuration I do have a working chain that is accepted by my client, but Guam crashes constantly...
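
For reference, splitting a concatenated bundle back into the three files that the snippet above points at can be sketched in Python. This is only an illustration: the function name split_pem_bundle is made up, and it assumes the bundle holds exactly one private key and that the first certificate is the server's own, as in the key + cert + intermediate order described above.

```python
import re

# Matches one PEM block, e.g. "-----BEGIN CERTIFICATE----- ... -----END CERTIFICATE-----".
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\r?\n.*?\r?\n-----END \1-----",
    re.DOTALL,
)


def split_pem_bundle(bundle: str) -> dict:
    """Split a key + cert + intermediates bundle into three PEM strings.

    Assumption: the first CERTIFICATE block is the server certificate and
    every later one belongs to the CA chain.
    """
    keys, certs = [], []
    for match in PEM_BLOCK.finditer(bundle):
        label = match.group(1)
        block = match.group(0) + "\n"
        if label.endswith("PRIVATE KEY"):
            keys.append(block)
        elif label == "CERTIFICATE":
            certs.append(block)
    if len(keys) != 1 or not certs:
        raise ValueError("expected exactly one private key and at least one certificate")
    return {
        "keyfile": keys[0],                # private key only
        "certfile": certs[0],              # server certificate only
        "cacertfile": "".join(certs[1:]),  # intermediate CA certificate(s)
    }
```

Each returned value can then be written to its own file and referenced from the keyfile, certfile and cacertfile entries respectively.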

Details

Ticket Type
Task

Event Timeline

vanmeeuwen subscribed.

Can you disable the use of compression in the client, or revert rG534d19b37ec57efe9754c6cf8a22a59cbdcd7b6f?
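
For context on why compression is the suspect: the report above shows the last event as {passthrough,<<>>}, an empty binary, and the failure inside zlib:deflate while a COMPRESS DEFLATE command appears in the state data. Whether empty input is really the trigger is an assumption drawn from that trace, not something confirmed here. The Python sketch below only illustrates the general idea of a COMPRESS=DEFLATE-style stream (raw DEFLATE with a sync flush after each chunk) that skips empty chunks instead of handing them to the compressor. The class name is hypothetical, this is not Guam's code, and Python's zlib binding will not necessarily reproduce the Erlang buf_error.

```python
import zlib


class DeflateStream:
    """One direction pair of a raw-DEFLATE stream, in the style of RFC 4978."""

    def __init__(self):
        # Negative wbits selects raw DEFLATE (no zlib header or checksum).
        self._deflater = zlib.compressobj(
            zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS
        )
        self._inflater = zlib.decompressobj(-zlib.MAX_WBITS)

    def deflate(self, chunk: bytes) -> bytes:
        # Guard: nothing to send, so do not touch the compressor at all.
        if not chunk:
            return b""
        # Sync flush so the peer can decode everything sent so far.
        return self._deflater.compress(chunk) + self._deflater.flush(zlib.Z_SYNC_FLUSH)

    def inflate(self, chunk: bytes) -> bytes:
        # Same guard on the receiving side.
        if not chunk:
            return b""
        return self._inflater.decompress(chunk)
```

Two instances, one per peer, can exchange chunks in sequence; an empty chunk in between produces no output and leaves the stream state untouched.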

turbomettwurst claimed this task.

Thanks, that seems to have 'fixed' the issue for now.
While I am at it:
Would it be helpful for you guys if I tracked down the offending piece of client software?

Should the problem already be fixed in this version?

rpm -qv guam
guam-0.7.2-2.4.el7.kolab_16.x86_64

I am asking because I found this in my log today:

Apr 22 08:42:15 kolab.example.eu postfix/smtpd[1291]: connect from gateway[172.16.178.100]
Apr 22 08:42:15 kolab.example.eu postfix/smtpd[1291]: disconnect from gateway[172.16.178.100]
Apr 22 08:42:27 kolab.example.eu postfix/smtpd[1291]: connect from gateway[172.16.178.100]
Apr 22 08:42:27 kolab.example.eu postfix/smtpd[1291]: lost connection after AUTH from gateway[172.16.178.100]
Apr 22 08:42:27 kolab.example.eu postfix/smtpd[1291]: disconnect from gateway[172.16.178.100]
Apr 22 08:42:40 kolab.example.eu guam[1224]: 08:42:40.498 [error] gen_fsm <0.139.0> in state passthrough terminated with reason: buf_error in zlib:call/3
Apr 22 08:42:40 kolab.example.eu guam[1224]: 08:42:40.498 [error] CRASH REPORT Process <0.139.0> with 0 neighbours exited with reason: buf_error in zlib:call/3 in gen_fsm:terminate/7 line 611
Apr 22 08:42:40 kolab.example.eu imaps[1180]: inittls: Loading hard-coded DH parameters
Apr 22 08:42:40 kolab.example.eu imaps[1180]: starttls: TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits reused) no authentication
Apr 22 08:42:41 kolab.example.eu imaps[1180]: login: localhost [127.0.0.1] vorname.name@kolab.example.eu PLAIN+TLS User logged in SESSIONID=<kolab.example.eu-1180-1461307360-1-12789336562900413385>
Apr 22 08:42:44 kolab.example.eu guam[1224]: 08:42:44.458 [error] SSL: certify: ssl_alert.erl:92:Fatal error: certificate unknown
Apr 22 08:42:44 kolab.example.eu guam[1224]: 08:42:44.458 [error] gen_server <0.118.0> terminated with reason: no function clause matching inet:tcp_close({sslsocket,nil,{#Port<0.1369>,{config,{ssl_options,tls,[{3,3},{3,2},{3,1}],verify_none,{#Fun<ssl.7...>,...},...},...}}}) line 1504
Apr 22 08:42:44 kolab.example.eu guam[1224]: 08:42:44.458 [error] CRASH REPORT Process <0.118.0> with 0 neighbours exited with reason: no function clause matching inet:tcp_close({sslsocket,nil,{#Port<0.1369>,{config,{ssl_options,tls,[{3,3},{3,2},{3,1}],verify_none,{#Fun<ssl.7...>,...},...},...}}}) line 1504 in gen_server:terminate/7 line 792
Apr 22 08:42:44 kolab.example.eu guam[1224]: 08:42:44.458 [error] Supervisor {<0.91.0>,kolab_guam_listener} had child session started with {kolab_guam_session,start_link,undefined} at <0.118.0> exit with reason no function clause matching inet:tcp_close({sslsocket,nil,{#Port<0.1369>,{config,{ssl_options,tls,[{3,3},{3,2},{3,1}],verify_none,{#Fun<ssl.7...>,...},..
Apr 22 08:42:45 kolab.example.eu imaps[1145]: starttls: TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits reused) no authentication
Apr 22 08:42:46 kolab.example.eu imaps[1145]: login: localhost [127.0.0.1] vorname.name@kolab.example.eu PLAIN+TLS User logged in SESSIONID=<kolab.example.eu-1145-1461307365-1-16250946059461746899>
Apr 22 08:42:46 kolab.example.eu guam[1224]: 08:42:46.648 [error] gen_fsm <0.164.0> in state passthrough terminated with reason: buf_error in zlib:call/3
Apr 22 08:42:46 kolab.example.eu guam[1224]: 08:42:46.648 [error] CRASH REPORT Process <0.164.0> with 0 neighbours exited with reason: buf_error in zlib:call/3 in gen_fsm:terminate/7 line 611
Apr 22 08:42:46 kolab.example.eu guam[1224]: 08:42:46.775 [error] gen_fsm <0.158.0> in state passthrough terminated with reason: buf_error in zlib:call/3
Apr 22 08:42:46 kolab.example.eu guam[1224]: 08:42:46.775 [error] CRASH REPORT Process <0.158.0> with 0 neighbours exited with reason: buf_error in zlib:call/3 in gen_fsm:terminate/7 line 611
Apr 22 08:42:46 kolab.example.eu imaps[1172]: starttls: TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits reused) no authentication
Apr 22 08:42:47 kolab.example.eu imaps[1074]: starttls: TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits reused) no authentication
Apr 22 08:42:47 kolab.example.eu imaps[938]: starttls: TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits reused) no authentication
Apr 22 08:42:47 kolab.example.eu imaps[1172]: login: localhost [127.0.0.1] vorname.name@kolab.example.eu PLAIN+TLS User logged in SESSIONID=<kolab.example.eu-1172-1461307366-1-12864858444493004401>
Apr 22 08:42:47 kolab.example.eu imaps[938]: login: localhost [127.0.0.1] vorname.name@kolab.example.eu PLAIN+TLS User logged in SESSIONID=<kolab.example.eu-938-1461307367-1-10015394383486591173>
Apr 22 08:42:47 kolab.example.eu imaps[1074]: login: localhost [127.0.0.1] vorname.name@kolab.example.eu PLAIN+TLS User logged in SESSIONID=<kolab.example.eu-1074-1461307367-1-14608959089089413488>
Apr 22 08:42:52 kolab.example.eu guam[1224]: 08:42:52.052 [error] gen_fsm <0.175.0> in state passthrough terminated with reason: buf_error in zlib:call/3
Apr 22 08:42:52 kolab.example.eu guam[1224]: 08:42:52.052 [error] CRASH REPORT Process <0.175.0> with 0 neighbours exited with reason: buf_error in zlib:call/3 in gen_fsm:terminate/7 line 611

Read here:
https://git.kolab.org/T913#18604

Switching to STARTTLS solved my problem...

While deploying Kolab, I found Guam crashing on SSL connections (and not working with iOS devices). In addition, the groupware folder filtering isn't working for some clients (Apple Mail, to name one).

After digging around, I finally built a custom package for Guam, based on the Winterfell package (guam-0.8-7.3), with the addition of the patch found in https://git.kolab.org/T1144. That solved the problem with the groupware folder filtering.

Concerning the SSL crashes, I've set up HAProxy to handle SSL and started Guam without SSL, only STARTTLS. Guam still crashes a couple of times, but it's restarted every 15 minutes, and HAProxy is set up to connect directly to Cyrus whenever Guam is down. The concept was first implemented with a basic stunnel, and that works fine too.
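
To make the fallback idea concrete, here is a rough Python sketch of the selection logic, not my actual HAProxy configuration: probe the preferred backend for an IMAP greeting and fall through to the next one if it doesn't answer. The function names, backend names and ports are all placeholders picked for illustration.

```python
import socket


def imap_greeting_ok(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plaintext IMAP server greets us with '* OK'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            greeting = conn.recv(512)
    except OSError:
        # Refused, timed out, or reset: treat the backend as down.
        return False
    return greeting.startswith(b"* OK")


def pick_backend(backends, probe=imap_greeting_ok):
    """Return the name of the first backend whose probe succeeds, else None.

    `backends` is an ordered list of (name, host, port); earlier entries
    are preferred, later ones are fallbacks.
    """
    for name, host, port in backends:
        if probe(host, port):
            return name
    return None


# Hypothetical layout: Guam first, direct Cyrus as the fallback.
BACKENDS = [
    ("guam", "127.0.0.1", 1143),
    ("cyrus", "127.0.0.1", 2143),
]
```

Calling pick_backend(BACKENDS) returns "guam" while Guam answers, "cyrus" when only Cyrus does, and None when neither is reachable.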

I'll try to find time to come up with a better bug report, but in the meantime, maybe someone will find this information useful, as my setup is working fine for now.

My Erlang skills tend to zero, but I'll also try to debug things on my test server (it may be faster for me to go for a py-guam and rewrite Guam in Python, but I guess Erlang was carefully chosen, so I'd rather try to help than rewrite things).

Thanks for Kolab, it kicks ass :)