Channel: Serverphorums.com - HAProxy

HAProxy 1.6.3: 100% cpu utilization for >17 days with 1 connection (5 replies)

Hi,

First of all, thanks for a great product that is working extremely well for
Flipkart!

I saw many similar issues posted earlier by others, but could not find a
thread where this is resolved or fixed in a newer release. We are using
Ubuntu 16.04 with the distro HAProxy (1.6.3), and see that HAProxy spins at
100% CPU with 1-10 TCP connections, sometimes just 1: a stale connection
that does not seem to belong to any frontend session. strace with -T shows
the following:

epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000020>
epoll_wait(0, [], 200, 0) = 0 <0.000009>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [{EPOLLIN|EPOLLHUP|EPOLLRDHUP, {u32=2, u64=2}}], 200, 0) = 1 <0.000006>
epoll_wait(0, [{EPOLLIN, {u32=11, u64=11}}], 200, 0) = 1 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000029>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000021>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000011>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [{EPOLLIN, {u32=7, u64=7}}], 200, 0) = 1 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000007>
epoll_wait(0, [{EPOLLOUT, {u32=2, u64=2}}], 200, 0) = 1 <0.000015>
epoll_wait(0, [], 200, 0) = 0 <0.000007>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000016>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000008>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000017>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [{EPOLLIN, {u32=10, u64=10}}], 200, 0) = 1 <0.000009>
epoll_wait(0, [{EPOLLIN|EPOLLRDHUP, {u32=10, u64=10}}], 200, 0) = 1 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000016>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000006>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000005>
epoll_wait(0, [], 200, 0) = 0 <0.000017>
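
In the trace above, almost every epoll_wait() call is made with a timeout of
0 (the fourth argument) and returns 0 events within a few microseconds, which
is what a busy loop looks like from the outside. For anyone who wants to put
a number on that from a longer capture, here is a minimal sketch in Python.
The function name summarize_epoll and the regular expression are my own and
assume the strace -T line format shown above; this is not an HAProxy or
strace tool.

```python
import re

# Matches lines such as: epoll_wait(0, [], 200, 0) = 0 <0.000006>
# \s+ before the elapsed time tolerates lines that were wrapped by a mail client.
EPOLL_RE = re.compile(
    r"epoll_wait\(\d+, \[(?P<events>.*)\], \d+, (?P<timeout>-?\d+)\)"
    r"\s+=\s+(?P<ret>-?\d+)\s+<(?P<elapsed>[\d.]+)>"
)

def summarize_epoll(trace_text):
    """Count epoll_wait calls and the zero-timeout calls that returned no events."""
    calls = 0
    empty_zero_timeout = 0
    seconds_in_syscall = 0.0
    for m in EPOLL_RE.finditer(trace_text):
        calls += 1
        seconds_in_syscall += float(m.group("elapsed"))
        # timeout == 0 means "do not block"; ret == 0 means "nothing was ready".
        if m.group("timeout") == "0" and m.group("ret") == "0":
            empty_zero_timeout += 1
    return {
        "calls": calls,
        "empty_zero_timeout": empty_zero_timeout,
        "seconds_in_syscall": seconds_in_syscall,
    }
```

Feeding it the text of a saved strace capture gives the share of calls that
polled without blocking and found nothing to do; a healthy idle process would
normally block in epoll_wait() with a non-zero timeout instead.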

The single connection has this session information:
0xd1d790: [06/May/2017:02:44:37.373636] id=286529830 proto=tcpv4 source=a.a.a.a:35297
  flags=0x1ce, conn_retries=0, srv_conn=0xca4000, pend_pos=(nil)
  frontend=fe-fe-fe-fe-fe-fe (id=3 mode=tcp), listener=? (id=1) addr=b.b.b.b:5667
  backend=be-be-be-be-be-be (id=4 mode=tcp) addr=c.c.c.c:11870
  server=d.d.d.d (id=4) addr=d.d.d.d:5667
  task=0xd1d710 (state=0x04 nice=0 calls=1117789229 exp=<PAST>, running age=12d11h)
  si[0]=0xd1d988 (state=CLO flags=0x00 endp0=CONN:0xd771c0 exp=<NEVER>, et=0x000)
  si[1]=0xd1d9a8 (state=EST flags=0x10 endp1=CONN:0xccadb0 exp=<NEVER>, et=0x000)
  co0=0xd771c0 ctrl=NONE xprt=NONE data=STRM target=LISTENER:0xc76ae0
      flags=0x002f9000 fd=55 fd.state=00 fd.cache=0 updt=0
  co1=0xccadb0 ctrl=tcpv4 xprt=RAW data=STRM target=SERVER:0xca4000
      flags=0x0020b310 fd=9 fd_spec_e=22 fd_spec_p=0 updt=0
  req=0xd1d7a0 (f=0x80a020 an=0x0 pipe=0 tofwd=-1 total=0)
      an_exp=<NEVER> rex=? wex=<NEVER>
      buf=0x6e9120 data=0x6e9134 o=0 p=0 req.next=0 i=0 size=0
  res=0xd1d7e0 (f=0x8000a020 an=0x0 pipe=0 tofwd=0 total=0)
      an_exp=<NEVER> rex=<NEVER> wex=<NEVER>
      buf=0x6e9120 data=0x6e9134 o=0 p=0 rsp.next=0 i=0 size=0
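
The task line in this dump carries both a calls counter and an age, so
dividing one by the other gives a rough average of how often the task has
been woken up over the life of the session. A minimal sketch of that
arithmetic in Python follows; the helper names age_to_seconds and
task_call_rate are made up for illustration, and the age parsing assumes
only the d/h/m/s suffixes seen in output like "12d11h".

```python
import re

# Seconds per unit suffix; assumes ages are written like "12d11h" or "5m12s".
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}

def age_to_seconds(age):
    """Convert an age string such as '12d11h' into seconds."""
    return sum(int(n) * UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([dhms])", age))

def task_call_rate(dump_text):
    """Return average task calls per second, or None if the fields are missing."""
    calls = re.search(r"\bcalls=(\d+)", dump_text)
    age = re.search(r"\bage=([0-9dhms]+)", dump_text)
    if not calls or not age:
        return None
    seconds = age_to_seconds(age.group(1))
    if seconds == 0:
        return None
    return int(calls.group(1)) / seconds
```

For the session above this works out to roughly a thousand calls per second
averaged over the 12d11h age, which seems far too many for a connection that
has moved no data (total=0 in both directions).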

Build options (output of haproxy -vv):
HA-Proxy version 1.6.3 2015/12/25
Copyright 2000-2015 Willy Tarreau <willy@haproxy.org>

Build options :
TARGET = linux2628
CPU = generic
CC = gcc
CFLAGS = -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2
OPTIONS = USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1

Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.2g-fips 1 Mar 2016
Running on OpenSSL version : OpenSSL 1.0.2g 1 Mar 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.38 2015-11-23
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

We have 3 systems running the identical configuration and haproxy binary,
and the 100% CPU usage has been ongoing for the last 17 days on one system.
The client connection is no longer present. I am assuming that a haproxy
reload would solve this as the frontend connection is not present, but have
not tested it out yet. Since this box is in production, I am unable to do
invasive debugging (e.g. gdb).

Please let me know if this is fixed in a later release, or if any more
information would help find the root cause.

thanks,
- Krishna
