workerman/gateway-worker process trace: restart_syscall(<... resuming interrupted restart_syscall ...>

Tinywan

Running status

----------------------------------------------GLOBAL STATUS----------------------------------------------------
Workerman version:3.5.31          PHP version:7.4.16
start time:2021-08-17 14:07:15   run 336 days 21 hours   
load average: 0.02, 0, 0         event-loop:\Workerman\Events\Event
3 workers       7 processes
worker_name    exit_status      exit_count
Register       0                0
BusinessWorker 0                16
BusinessWorker 64000            5
Gateway        0                0
----------------------------------------------PROCESS STATUS---------------------------------------------------
pid     memory  listening                  worker_name    connections send_fail timers  total_request qps    status
1957    6M      text://172.24.171.109:1236 Register       6           0         0       7762771       0      [idle]
1962    17.77M  websocket://0.0.0.0:9502   Gateway        30          913913    5       117133951     0      [idle]
1963    17.77M  websocket://0.0.0.0:9502   Gateway        21          920443    5       117666307     0      [idle]
8701    N/A     none                       BusinessWorker N/A         N/A       N/A     N/A           N/A    [busy]
9574    10M     none                       BusinessWorker 3           0         1       23086407      0      [idle]
9576    10M     none                       BusinessWorker 3           0         1       23071418      0      [idle]
10215   10M     none                       BusinessWorker 3           0         1       1024          0      [idle]
----------------------------------------------PROCESS STATUS---------------------------------------------------
Summary 70M     -                          -              66          1834356   13      288721878     0      [Summary] 

Tracing process 8701

8701 N/A none BusinessWorker N/A N/A N/A N/A N/A [busy]

strace -ttp 8701
strace: Process 8701 attached
11:30:30.102970 restart_syscall(<... resuming interrupted restart_syscall ...>

) = 0
11:30:51.310705 poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:30:51.310797 poll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, 60000^Cstrace: Process 8701 detached
 <detached ...>

The trace stays blocked as shown above, with no response at all.
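To put a number on how long the process sat in one call, the `strace -tt` timestamps can be compared line by line. The following is a minimal sketch in Python, not part of the original post; the function name `find_stalls` and the 5-second threshold are illustrative assumptions. The gap it reports is the time between the start of one syscall and the start of the next, so it also includes any user-space time in between.

```python
import re
from datetime import datetime

# Matches the HH:MM:SS.microseconds prefix that `strace -tt` puts on each line.
TS = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) (.*)$")

def find_stalls(trace_lines, threshold_s=5.0):
    """Return (gap_seconds, syscall_text) for each traced call that was
    followed by a pause of at least threshold_s before the next call.
    Hypothetical helper; does not handle traces that cross midnight."""
    stalls = []
    prev = None  # (timestamp, text) of the previous timestamped line
    for line in trace_lines:
        m = TS.match(line.strip())
        if not m:
            continue  # continuation lines such as ") = 0" carry no timestamp
        ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        if prev is not None:
            gap = (ts - prev[0]).total_seconds()
            if gap >= threshold_s:
                stalls.append((gap, prev[1]))
        prev = (ts, m.group(2))
    return stalls

# Lines copied from the blocked trace of pid 8701 above.
sample = [
    "11:30:30.102970 restart_syscall(<... resuming interrupted restart_syscall ...>",
    ") = 0",
    "11:30:51.310705 poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)",
    "11:30:51.310797 poll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, 60000",
]
stalls = find_stalls(sample)
for gap, call in stalls:
    print(f"{gap:.1f}s in {call}")
```

On the sample this flags the `restart_syscall` line: about 21 seconds passed between attaching and the next `poll`.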

A healthy process should instead produce a trace like the following:

11:41:44.166569 recvfrom(12, "+OK\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 5
11:41:44.166655 poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:41:44.166744 sendto(12, "*3\r\n$4\r\nHGET\r\n$10\r\nlive_layer\r\n$"..., 45, MSG_DONTWAIT, NULL, 0) = 45
11:41:44.166830 poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:41:44.166908 poll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=12, revents=POLLIN}])
11:41:44.168704 recvfrom(12, "$257\r\n{\"msg_type\":\"layer_message"..., 8192, MSG_DONTWAIT, NULL, NULL) = 265
11:41:44.168806 poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:41:44.168887 sendto(12, "*3\r\n$4\r\nHGET\r\n$10\r\nlive_layer\r\n$"..., 45, MSG_DONTWAIT, NULL, 0) = 45
11:41:44.168989 poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:41:44.169069 poll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=12, revents=POLLIN}])
11:41:44.170843 recvfrom(12, "$254\r\n{\"msg_type\":\"layer_message"..., 8192, MSG_DONTWAIT, NULL, NULL) = 262
11:41:44.170977 sendto(11, "\0\0\1\271\5\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0j\1\0\0\0\0\0\0{\"ms"..., 441, 0, NULL, 0) = 441
11:41:44.171176 alarm(0)                = 30
11:41:44.172994 recvfrom(10, "\0\0\0l\3\254\30\253m\7\320\254\30\253m\341N\0\0\0\27\1%\36\0\0\0=a:2:"..., 65535, 0, NULL, NULL) = 108
11:41:44.173124 alarm(30)               = 0
11:41:44.173218 close(12)               = 0
11:41:44.173300 stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=362, ...}) = 0
11:41:44.173387 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 12
11:41:44.173467 fstat(12, {st_mode=S_IFREG|0644, st_size=277, ...}) = 0
11:41:44.173544 read(12, "127.0.0.1\tlocalhost\n\n# The follo"..., 4096) = 277
11:41:44.173624 read(12, "", 4096)      = 0
11:41:44.173699 close(12)               = 0
11:41:44.173807 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 12
11:41:44.173889 connect(12, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.53")}, 16) = 0
11:41:44.173991 poll([{fd=12, events=POLLOUT}], 1, 0) = 1 ([{fd=12, revents=POLLOUT}])
11:41:44.174072 sendto(12, ":\271\1\0\0\1\0\0\0\0\0\0\24r-bp1ob5zk2hcwfs58n"..., 61, MSG_NOSIGNAL, NULL, 0) = 61
11:41:44.174233 poll([{fd=12, events=POLLIN}], 1, 2000) = 1 ([{fd=12, revents=POLLIN}])
11:41:44.174327 ioctl(12, FIONREAD, [77]) = 0
4 answers

walkor

Run lsof -np 8701 and check which resource fd 12 refers to.

  • Tinywan 2022-07-20

    It was urgent for the business, so I simply restarted it.

Tinywan

Checking the log in workerman.log:

2022-07-20 11:29:39 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:39 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:40 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:40 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:41 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:41 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:41 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:41 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:41 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:42 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:42 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:42 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:42 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:42 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:43 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:44 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:44 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:44 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:45 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:45 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:45 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:46 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:47 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:47 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:47 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:48 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:48 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:48 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:49 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:49 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:50 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:50 pid:1962 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
2022-07-20 11:29:51 pid:1963 SendBufferToWorker fail. May be the send buffer are overflow. See http://doc2.workerman.net/send-buffer-overflow.html
  • walkor 2022-07-20

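    This is most likely caused by the BusinessWorker process being stuck at poll([{fd=12, so that BusinessWorker process could not receive the messages sent by the Gateway for a long time.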

  • Tinywan 2022-07-20

    OK, next time I will troubleshoot by following this manual page: https://www.workerman.net/doc/gateway-worker/send-buffer-overflow.html

  • Tinywan 18 days ago

    This problem has occurred again.

    lsof -np 10261
    COMMAND   PID USER   FD      TYPE    DEVICE SIZE/OFF      NODE NAME
    php     10261  www  rtd       DIR     252,1     4096         2 /
    php     10261  www  txt       REG     252,1 53003800   1452431 /usr/local/php-7.4/bin/php
    php     10261  www  mem       REG     252,1    97176    401620 /lib/x86_64-linux-gnu/libnsl-2.27.so
    php     10261  www  mem       REG     252,1    47576    401625 /lib/x86_64-linux-gnu/libnss_nis-2.27.so
    php     10261  www  mem       REG     252,1    39744    401621 /lib/x86_64-linux-gnu/libnss_compat-2.27.so
    php     10261  www  mem       REG     252,1    26936    401622 /lib/x86_64-linux-gnu/libnss_dns-2.27.so
    php     10261  www  mem       REG     252,1    47568    401623 /lib/x86_64-linux-gnu/libnss_files-2.27.so
    php     10261  www  mem       REG     252,1   913448   1580661 /usr/local/libevent-2.1.12/lib/libevent_core-2.1.so.7.0.1
    php     10261  www  mem       REG     252,1   527336   1580665 /usr/local/libevent-2.1.12/lib/libevent_extra-2.1.so.7.0.1
    php     10261  www  mem       REG     252,1    98952   1580673 /usr/local/libevent-2.1.12/lib/libevent_openssl-2.1.so.7.0.1
    php     10261  www  mem       REG     252,1  1033840   1322683 /usr/local/php-7.4/lib/php/extensions/no-debug-non-zts-20190902/event.so
    php     10261  www  mem       REG     252,1  2850040   1322630 /usr/local/php-7.4/lib/php/extensions/no-debug-non-zts-20190902/redis.so
    php     10261  www  mem       REG     252,1  3004224   1053795 /usr/lib/locale/locale-archive
    php     10261  www  mem       REG     252,1    39208    401615 /lib/x86_64-linux-gnu/libcrypt-2.27.so
    php     10261  www  mem       REG     252,1   300888   1062361 /usr/lib/x86_64-linux-gnu/libhx509.so.5.0.0
    php     10261  www  mem       REG     252,1    60400   1062355 /usr/lib/x86_64-linux-gnu/libheimbase.so.1.0.0
    php     10261  www  mem       REG     252,1   165880   1062359 /usr/lib/x86_64-linux-gnu/libwind.so.0.0.0
    php     10261  www  mem       REG     252,1    31032   1050749 /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4
    php     10261  www  mem       REG     252,1    88680   1062348 /usr/lib/x86_64-linux-gnu/libroken.so.18.1.0
    php     10261  www  mem       REG     252,1   217560   1062357 /usr/lib/x86_64-linux-gnu/libhcrypto.so.4.1.0
    php     10261  www  mem       REG     252,1   661696   1062353 /usr/lib/x86_64-linux-gnu/libasn1.so.8.0.0
    php     10261  www  mem       REG     252,1   573464   1062363 /usr/lib/x86_64-linux-gnu/libkrb5.so.26.0.0
    php     10261  www  mem       REG     252,1    35360   1062365 /usr/lib/x86_64-linux-gnu/libheimntlm.so.0.1.0
    php     10261  www  mem       REG     252,1    14256    401498 /lib/x86_64-linux-gnu/libkeyutils.so.1.5
    php     10261  www  mem       REG     252,1    75776   1050819 /usr/lib/x86_64-linux-gnu/libtasn1.so.6.5.5
    php     10261  www  mem       REG     252,1  1237640   1049002 /usr/lib/x86_64-linux-gnu/libp11-kit.so.0.3.0
    php     10261  www  mem       REG     252,1    84032    393675 /lib/x86_64-linux-gnu/libgpg-error.so.0.22.0
    php     10261  www  mem       REG     252,1   265712   1062367 /usr/lib/x86_64-linux-gnu/libgssapi.so.3.0.0
    php     10261  www  mem       REG     252,1   109296   1062369 /usr/lib/x86_64-linux-gnu/libsasl2.so.2.0.25
    php     10261  www  mem       REG     252,1    43616   1051640 /usr/lib/x86_64-linux-gnu/libkrb5support.so.0.1
    php     10261  www  mem       REG     252,1    14248    393352 /lib/x86_64-linux-gnu/libcom_err.so.2.1
    php     10261  www  mem       REG     252,1   199104   1051642 /usr/lib/x86_64-linux-gnu/libk5crypto.so.3.1
    php     10261  www  mem       REG     252,1   877056   1051638 /usr/lib/x86_64-linux-gnu/libkrb5.so.3.3
    php     10261  www  mem       REG     252,1   526688   1050765 /usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2
    php     10261  www  mem       REG     252,1   219304   1050801 /usr/lib/x86_64-linux-gnu/libnettle.so.6.4
    php     10261  www  mem       REG     252,1   211704   1050773 /usr/lib/x86_64-linux-gnu/libhogweed.so.4.4
    php     10261  www  mem       REG     252,1  1461856   1049007 /usr/lib/x86_64-linux-gnu/libgnutls.so.30.14.10
    php     10261  www  mem       REG     252,1  1562664   1050823 /usr/lib/x86_64-linux-gnu/libunistring.so.2.1.0
    php     10261  www  mem       REG     252,1  1159864    393318 /lib/x86_64-linux-gnu/libgcrypt.so.20.2.1
    php     10261  www  mem       REG     252,1 26904112   1050728 /usr/lib/x86_64-linux-gnu/libicudata.so.60.2
    php     10261  www  mem       REG     252,1    55544   1050152 /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2.10.8
    php     10261  www  mem       REG     252,1   327024   1050153 /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2.10.8
    php     10261  www  mem       REG     252,1   305456   1051635 /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2
    php     10261  www  mem       REG     252,1    55136   1061308 /usr/lib/x86_64-linux-gnu/libpsl.so.5.2.0
    php     10261  www  mem       REG     252,1   113584   1062378 /usr/lib/x86_64-linux-gnu/librtmp.so.1
    php     10261  www  mem       REG     252,1   116656   1048672 /usr/lib/x86_64-linux-gnu/libidn2.so.0.3.3
    php     10261  www  mem       REG     252,1   153352   1062376 /usr/lib/x86_64-linux-gnu/libnghttp2.so.14.15.2
    php     10261  www  mem       REG     252,1   153984    393687 /lib/x86_64-linux-gnu/liblzma.so.5.2.2
    php     10261  www  mem       REG     252,1   144976    401628 /lib/x86_64-linux-gnu/libpthread-2.27.so
    php     10261  www  mem       REG     252,1  2030928    401613 /lib/x86_64-linux-gnu/libc-2.27.so
    php     10261  www  mem       REG     252,1    96616    401608 /lib/x86_64-linux-gnu/libgcc_s.so.1
    php     10261  www  mem       REG     252,1    80328   1063298 /usr/lib/x86_64-linux-gnu/libzip.so.4.0.0
    php     10261  www  mem       REG     252,1    87912   1063117 /usr/lib/x86_64-linux-gnu/libexslt.so.0.8.17
    php     10261  www  mem       REG     252,1   247952   1063118 /usr/lib/x86_64-linux-gnu/libxslt.so.1.1.29
    php     10261  www  mem       REG     252,1  1082648   1050774 /usr/lib/x86_64-linux-gnu/libsqlite3.so.0.8.6
    php     10261  www  mem       REG     252,1   496784   1063281 /usr/lib/x86_64-linux-gnu/libonig.so.4.0.0
    php     10261  www  mem       REG     252,1  1796104   1050770 /usr/lib/x86_64-linux-gnu/libicuuc.so.60.2
    php     10261  www  mem       REG     252,1  2754872   1050756 /usr/lib/x86_64-linux-gnu/libicui18n.so.60.2
    php     10261  www  mem       REG     252,1    55304   1050760 /usr/lib/x86_64-linux-gnu/libicuio.so.60.2
    php     10261  www  mem       REG     252,1   735704   1050817 /usr/lib/x86_64-linux-gnu/libfreetype.so.6.15.0
    php     10261  www  mem       REG     252,1   424648   1063485 /usr/lib/x86_64-linux-gnu/libjpeg.so.8.1.2
    php     10261  www  mem       REG     252,1   202672   1061305 /usr/lib/x86_64-linux-gnu/libpng16.so.16.34.0
    php     10261  www  mem       REG     252,1   518600   1062379 /usr/lib/x86_64-linux-gnu/libcurl.so.4.5.0
    php     10261  www  mem       REG     252,1   116960    393760 /lib/x86_64-linux-gnu/libz.so.1.2.11
    php     10261  www  mem       REG     252,1  2917216   1062948 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
    php     10261  www  mem       REG     252,1   577312   1062957 /usr/lib/x86_64-linux-gnu/libssl.so.1.1
    php     10261  www  mem       REG     252,1  1834232   1050779 /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.4
    php     10261  www  mem       REG     252,1    14560    401616 /lib/x86_64-linux-gnu/libdl-2.27.so
    php     10261  www  mem       REG     252,1  1700792    401617 /lib/x86_64-linux-gnu/libm-2.27.so
    php     10261  www  mem       REG     252,1    66728    393641 /lib/x86_64-linux-gnu/libbz2.so.1.0.4
    php     10261  www  mem       REG     252,1  1594864   1050839 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
    php     10261  www  mem       REG     252,1    31680    401630 /lib/x86_64-linux-gnu/librt-2.27.so
    php     10261  www  mem       REG     252,1    97072    401629 /lib/x86_64-linux-gnu/libresolv-2.27.so
    php     10261  www  mem       REG     252,1   179152    401609 /lib/x86_64-linux-gnu/ld-2.27.so
    php     10261  www  mem       REG     252,1    26376   1051162 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
    php     10261  www    0u      CHR     136,2      0t0         5 /dev/pts/2 (deleted)
    php     10261  www    1w      CHR       1,3      0t0         6 /dev/null
    php     10261  www    2w      CHR       1,3      0t0         6 /dev/null
    php     10261  www    3u     IPv4 668010317      0t0       TCP 172.24.171.109:37484->172.30.237.31:6379 (ESTABLISHED)
    php     10261  www    4u      CHR     136,2      0t0         5 /dev/pts/2 (deleted)
    php     10261  www    5u  a_inode      0,13        0     10598 [eventpoll]
    php     10261  www    6r     FIFO      0,12      0t0 668010669 pipe
    php     10261  www    7w     FIFO      0,12      0t0 668010669 pipe
    php     10261  www    8u     IPv4 668010670      0t0       TCP 172.24.171.109:52992->172.24.171.109:rmtcfg (ESTABLISHED)
    php     10261  www    9u     IPv4 668010678      0t0       TCP 172.24.171.109:37496->172.30.237.31:6379 (ESTABLISHED)
    php     10261  www   10u     IPv4 668010685      0t0       TCP 172.24.171.109:57894->172.24.171.109:cisco-sccp (ESTABLISHED)
    php     10261  www   11u     IPv4 668010686      0t0       TCP 172.24.171.109:39910->172.24.171.109:2001 (ESTABLISHED)
    php     10261  www   12u     IPv4 999893004      0t0       TCP 172.24.171.109:52572->172.30.237.31:6379 (ESTABLISHED)
  • Tinywan 18 days ago

    Memory and CPU usage

    Tasks: 105 total,   1 running,  68 sleeping,   0 stopped,   0 zombie
    %Cpu0  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    KiB Mem :  8021284 total,  3142400 free,   358348 used,  4520536 buff/cache
    KiB Swap:        0 total,        0 free,        0 used.  7333556 avail Mem 
  • Tinywan 18 days ago

    Process trace

    # sudo strace -ttp 10261
    strace: Process 10261 attached
    13:52:26.597217 restart_syscall(<... resuming interrupted poll ...>
    
    ) = 0
    13:53:13.815227 poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
    13:53:13.815302 poll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, 60000
    
  • Tinywan 18 days ago

    Use lsof -p $pid | grep $fd to see which external resource the process is waiting on

    lsof -p 10261 | grep 12
    php     10261  www  mem       REG     252,1   913448   1580661 /usr/local/libevent-2.1.12/lib/libevent_core-2.1.so.7.0.1
    php     10261  www  mem       REG     252,1   527336   1580665 /usr/local/libevent-2.1.12/lib/libevent_extra-2.1.so.7.0.1
    php     10261  www  mem       REG     252,1    98952   1580673 /usr/local/libevent-2.1.12/lib/libevent_openssl-2.1.so.7.0.1
    php     10261  www  mem       REG     252,1  1237640   1049002 /usr/lib/x86_64-linux-gnu/libp11-kit.so.0.3.0
    php     10261  www  mem       REG     252,1   265712   1062367 /usr/lib/x86_64-linux-gnu/libgssapi.so.3.0.0
    php     10261  www  mem       REG     252,1 26904112   1050728 /usr/lib/x86_64-linux-gnu/libicudata.so.60.2
    php     10261  www  mem       REG     252,1    87912   1063117 /usr/lib/x86_64-linux-gnu/libexslt.so.0.8.17
    php     10261  www  mem       REG     252,1   577312   1062957 /usr/lib/x86_64-linux-gnu/libssl.so.1.1
    php     10261  www    6r     FIFO      0,12      0t0 668010669 pipe
    php     10261  www    7w     FIFO      0,12      0t0 668010669 pipe
    php     10261  www   12u     IPv4 999893004      0t0       TCP iZbp1hoizi4geybteffl6zZ:52572->172.30.237.31:6379 (ESTABLISHED)
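The plain `grep 12` above also matches library paths, sizes, and device numbers that happen to contain "12", so only the last row is the one of interest. A minimal sketch, assuming the default lsof column layout shown above (the helper name `rows_for_fd` is hypothetical), that keeps only rows whose FD column is exactly the requested descriptor:

```python
import re

def rows_for_fd(lsof_output, fd):
    """Return lsof rows whose FD column (4th field) is exactly `fd`.
    Hypothetical helper; assumes the default COMMAND PID USER FD ... layout."""
    matches = []
    for line in lsof_output.splitlines():
        cols = line.split()
        if len(cols) < 4:
            continue
        # FD looks like "12u" or "6r": digits plus optional mode/lock letters.
        # Entries such as "mem", "txt", "rtd" or the "FD" header do not match.
        m = re.fullmatch(r"(\d+)\D*", cols[3])
        if m and int(m.group(1)) == fd:
            matches.append(line.strip())
    return matches

# Three rows copied from the grep output above.
sample = """\
php     10261  www  mem       REG     252,1   913448   1580661 /usr/local/libevent-2.1.12/lib/libevent_core-2.1.so.7.0.1
php     10261  www    6r     FIFO      0,12      0t0 668010669 pipe
php     10261  www   12u     IPv4 999893004      0t0       TCP iZbp1hoizi4geybteffl6zZ:52572->172.30.237.31:6379 (ESTABLISHED)
"""
fd12 = rows_for_fd(sample, 12)
for row in fd12:
    print(row)
```

lsof can also do this selection itself: `lsof -a -p 10261 -d 12` ANDs the process filter with a descriptor filter.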
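  • walkor 18 days ago

    It looks like the process is waiting for data from 172.30.237.31:6379. Is this a self-hosted Redis or Alibaba Cloud Redis?

  • Tinywan 18 days ago

    Alibaba Cloud.

  • walkor 18 days ago

    With Alibaba Cloud Redis it is best to use a timer that periodically sends heartbeat data to keep the connection alive, at an interval under 60 seconds, for example 55.
    Alibaba Cloud Redis may clean up a connection after it has been idle for 1 minute, and it does so without sending a FIN packet. The redis extension therefore cannot tell that the connection is unusable; it assumes the connection is still alive, but the data never arrives.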
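The heartbeat idea can be sketched independently of any framework. The following Python sketch is an illustration under stated assumptions, not walkor's code and not the Workerman or redis extension API: `Heartbeat`, `encode_resp`, and the injectable `send` and `clock` parameters are hypothetical names. It sends a Redis `PING` once the connection has been idle for 55 seconds, and encodes it in the same array-of-bulk-strings wire format that is visible in the `sendto` lines of the strace output.

```python
import time

HEARTBEAT_INTERVAL = 55  # seconds; under the roughly 60 s idle cutoff described above

def encode_resp(*parts):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for p in parts:
        data = p.encode() if isinstance(p, str) else p
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)

class Heartbeat:
    """Sends PING through `send` once the connection has been idle for `interval` seconds."""

    def __init__(self, send, interval=HEARTBEAT_INTERVAL, clock=time.monotonic):
        self._send = send          # callable that writes bytes to the Redis socket
        self._interval = interval
        self._clock = clock        # injectable so the logic can be exercised without waiting
        self._last_activity = clock()

    def touch(self):
        """Record that real traffic just went over the connection."""
        self._last_activity = self._clock()

    def tick(self):
        """Call from a periodic timer; returns True if a PING was sent."""
        if self._clock() - self._last_activity < self._interval:
            return False
        self._send(encode_resp("PING"))
        self.touch()
        return True

# Demonstration with a fake clock and a list standing in for the socket.
sent = []
now = [0.0]
hb = Heartbeat(sent.append, clock=lambda: now[0])
now[0] = 30.0
first = hb.tick()   # idle for 30 s: nothing is sent
now[0] = 56.0
second = hb.tick()  # idle for 56 s: a PING is sent
```

In a real worker, `tick` would be driven by whatever periodic timer the framework offers, with `send` bound to the worker's own Redis connection.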

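  • Tinywan 18 days ago

    Right. From what I have checked so far, the Alibaba Cloud Redis side does release the connection after a while, but the gateway side keeps trying to connect. Thanks, group owner!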

  • yohe 18 days ago

    1

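  • changepll 18 days ago

    About adding a timer for Alibaba Cloud Redis: does that mean adding a scheduled task that connects every 55 seconds and queries an arbitrary key?

  • walkor 18 days ago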

Tinywan

Found a related post: Alibaba Cloud silently drops TCP connections that have been idle for a long time, without sending a FIN or RST packet to either end.

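About a year or two ago, I ran into a strange Redis connection problem on Alibaba Cloud: every ten minutes or so, the Redis client library in our service reported a timeout while connecting to the Redis server. After a lot of effort, I found that Alibaba Cloud drops TCP connections that have been idle for a long time, without sending a FIN or RST packet to either end. Our Redis server did not have the tcp_keepalive option enabled at the time, so on the Redis server side the connection still existed in the Linux conntrack table. On the Redis client side, the connection pool reused the connection for get and set, found it broken, and closed it, so the corresponding local port on the client was reclaimed. When the Redis client later reused that local port to open a connection to the Redis server, the four-tuple <client_ip, client_port, redis-server, 6379> was still in the ESTABLISHED state in the server-side conntrack table, so the TCP SYN packet from the client was dropped, and what the Redis client observed was a connection timeout.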

https://zhuanlan.zhihu.com/p/52622856
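The quoted post attributes the stale entry to tcp_keepalive being off on the Redis server. The same kernel mechanism can be switched on per socket from the client side. The following is a minimal Python sketch under assumptions: the helper name `enable_keepalive` and the timing values are illustrative, and neither the post nor this thread confirms that keepalive probes are enough to stop the cloud network from dropping an idle connection.

```python
import socket

def enable_keepalive(sock, idle_s=30, interval_s=10, probes=3):
    """Turn on TCP keepalive so the kernel probes an idle connection
    and eventually reports a dead peer. Values are illustrative assumptions."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning options use Linux names; skip any the platform lacks.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the connection is dropped
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)

# No connection is made here; the options are only set and read back.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

With the idle time set well under one minute, the kernel would generate traffic on an otherwise silent connection, which is the same goal as the application-level heartbeat suggested earlier in the thread.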

adminv

This is great: we get free training from the expert's hard-won lesson.
