How do I track down the cause of busy processes? I have read many related posts and none of them solved it

z

strace -ttp

11:34:51.761589 recvfrom(7, "VV\345\204\300K\227\0\335\7\0\0\0\0\0\0\0T\0\0\0\3cursor\0;\0\0"..., 101, 0, NULL, NULL) = 101
11:34:51.761738 poll([{fd=8, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:34:51.761814 sendto(8, "*2\r\n$6\r\nEXISTS\r\n$19\r\ndbdata_co"..., 42, MSG_DONTWAIT, NULL, 0) = 42
11:34:51.761909 poll([{fd=8, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:34:51.761979 poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
11:34:51.762032 recvfrom(8, ":1\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4
11:34:51.762113 poll([{fd=8, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:34:51.762160 sendto(8, "*2\r\n$3\r\nGET\r\n$19\r\ndbdata_count"..., 39, MSG_DONTWAIT, NULL, 0) = 39
11:34:51.762219 poll([{fd=8, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:34:51.762260 poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
11:34:51.762439 recvfrom(8, "$7\r\n3096842\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 13
11:34:51.762616 sendmsg(7, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\354\0\0\0", iov_len=4}, {iov_base="\302K\227\0", iov_len=4}, {iov_base="\0\0\0\0", iov_len=4}, {iov_base="\335\7\0\0", iov_len=4}, {iov_base="\0\0\0\0", iov_len=4}, {iov_base="\0", iov_len=1}, {iov_base="\327\0\0\0\2find\0\10\0\0\0product\0\3filter\0\16\0"..., iov_len=215}], msg_iovlen=7, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 236
11:34:51.762746 recvfrom(7, 0x2c6c0c0, 4, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
11:34:51.762819 poll([{fd=7, events=POLLIN|POLLERR|POLLHUP}], 1, 299999) = 1 ([{fd=7, revents=POLLIN}])
11:34:51.763003 recvfrom(7, "i\0\0\0", 4, 0, NULL, NULL) = 4
11:34:51.763067 recvfrom(7, "yV\345\204\302K\227\0\335\7\0\0\0\0\0\0\0T\0\0\0\3cursor\0;\0\0"..., 101, 0, NULL, NULL) = 101
11:34:51.763195 poll([{fd=8, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
11:34:51.763272 sendto(8, "*2\r\n$6\r\nEXISTS\r\n$19\r\ndbdata_co"..., 42, MSG_DONTWAIT, NULL, 0) = 42
11:34:51.763368 poll([{fd=8, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 1 ([{fd=8, revents=POLLIN}])
11:34:51.763424 recvfrom(8, ":", 1, MSG_PEEK, NULL, NULL) = 1
11:34:51.763505 poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
11:34:51.763574 recvfrom(8, ":1\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4
11:34:51.763652 poll([{fd=8, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0strace: Process 22913 detached
^C[root@s218663 ]# lsof -nPp 22913

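Running kill -SIGALRM on the process above has no effect. The output keeps scrolling, and after Ctrl+C it takes quite a while before strace exits.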

lsof -nPp

php     22913  www  mem    REG     253,0     19248   67224700 /usr/lib64/libdl-2.17.so
php     22913  www  mem    REG     253,0   1136944   67224702 /usr/lib64/libm-2.17.so
php     22913  www  mem    REG     253,0     14424   67224769 /usr/lib64/libutil-2.17.so
php     22913  www  mem    REG     253,0    995840   67225072 /usr/lib64/libstdc++.so.6.0.19
php     22913  www  mem    REG     253,0     43712   67224765 /usr/lib64/librt-2.17.so
php     22913  www  mem    REG     253,0    109976   67224761 /usr/lib64/libresolv-2.17.so
php     22913  www  mem    REG     253,0     40600   67224698 /usr/lib64/libcrypt-2.17.so
php     22913  www  mem    REG     253,0    163312   67224685 /usr/lib64/ld-2.17.so
php     22913  www    0r  FIFO       0,9       0t0  904733306 pipe
php     22913  www    1w  FIFO       0,9       0t0  904733307 pipe
php     22913  www    2w  FIFO       0,9       0t0  904733308 pipe
php     22913  www    3u   REG     253,0         0  100664776 /tmp/.ZendSem.dOIuh6 (deleted)
php     22913  www    4r   REG     253,0        92  108152689 /www/wwwroot/webman/start.php
php     22913  www    5u  sock       0,7       0t0  905615217 protocol: TCP
php     22913  www    6u  IPv4 904723315       0t0        TCP *:16701 (LISTEN)
php     22913  www    7u  IPv4 904741326       0t0        TCP 127.0.0.1:49320->127.0.0.1:27017 (ESTABLISHED)
php     22913  www    8u  IPv4 904748195       0t0        TCP 127.0.0.1:53248->127.0.0.1:6379 (ESTABLISHED)
[root@s218663 ]# php start.php status
Workerman[start.php] status 
----------------------------------------------GLOBAL STATUS----------------------------------------------------
Workerman version:4.1.14          PHP version:8.1.13
start time:2024-03-08 10:12:30   run 0 days 1 hours   
load average: 20.51, 21.3, 21.7  event-loop:\Workerman\Events\Select
2 workers       17 processes
worker_name  exit_status      exit_count
webman       0                0
monitor      0                0
----------------------------------------------PROCESS STATUS---------------------------------------------------
pid     memory  listening            worker_name  connections send_fail timers  total_request qps    status
22898   N/A     http://0.0.0.0:16701 webman       N/A         N/A       N/A     N/A           N/A    [busy] 
22899   2.43M   http://0.0.0.0:16701 webman       0           0         2       1401          0      [idle]
22900   2.43M   http://0.0.0.0:16701 webman       0           0         2       1439          0      [idle]
22901   2.43M   http://0.0.0.0:16701 webman       0           0         2       1427          0      [idle]
22902   N/A     http://0.0.0.0:16701 webman       N/A         N/A       N/A     N/A           N/A    [busy] 
22903   2.62M   http://0.0.0.0:16701 webman       0           0         2       1447          0      [idle]
22904   2.37M   http://0.0.0.0:16701 webman       0           0         2       1645          0      [idle]
22905   N/A     http://0.0.0.0:16701 webman       N/A         N/A       N/A     N/A           N/A    [busy] 
22906   2.43M   http://0.0.0.0:16701 webman       0           0         2       1660          0      [idle]
22907   2.43M   http://0.0.0.0:16701 webman       0           0         2       1526          0      [idle]
22908   N/A     http://0.0.0.0:16701 webman       N/A         N/A       N/A     N/A           N/A    [busy] 
22909   N/A     http://0.0.0.0:16701 webman       N/A         N/A       N/A     N/A           N/A    [busy] 
22910   2.44M   http://0.0.0.0:16701 webman       0           0         2       1567          0      [idle]
22911   2.43M   http://0.0.0.0:16701 webman       0           0         2       1627          0      [idle]
22912   2.44M   http://0.0.0.0:16701 webman       1           0         2       1499          0      [idle]
22913   N/A     http://0.0.0.0:16701 webman       N/A         N/A       N/A     N/A           N/A    [busy] 
22914   1.47M   none                 monitor      0           0         2       0             0      [idle]
----------------------------------------------PROCESS STATUS---------------------------------------------------
Summary 21M     -                    -            1           0         22      15238         0      [Summary] 

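Is the timeout in poll([{fd=8, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout) what causes the problem? If so, how do I find out what the problem actually is? Is it the fd=8 call sendto(8, "*2\r\n$6\r\nEXISTS\r\n$19\r\ndbdata_co"..., 42, MSG_DONTWAIT, NULL, 0) = 42? That one is a record count cached in Redis. Could slow Redis queries cause this? Querying Redis directly works fine, though. By the second day of running, every page of the site served by this start.php returns 504 and I have to restart each time. How should I find the cause?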

3 answers

walkor

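The Timeout is not a problem: poll with a timeout of 0 is a non-blocking check, and returning 0 only means nothing was ready to read at that moment.
sendto(8, "*2\r\n$6\r\nEXISTS\r\n$19\r\ndbdata_co corresponds to fd=8, which is 127.0.0.1:6379, so it is a Redis operation.

Judging from the strace output, the business code has an infinite loop or a very large loop that keeps reading from and writing to 127.0.0.1:27017 and 127.0.0.1:6379. While the process is working through that loop it cannot handle anything else, so it goes into the busy state.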
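
For anyone wondering how the strace lines map to Redis commands: the bytes on fd=8 are the Redis wire protocol (RESP). `*2` announces an array of two items, `$6` gives the byte length of the string that follows, and replies such as `:1` (integer 1) or `$7` followed by `3096842` (a 7-byte string) use the same framing. Below is a minimal PHP sketch of decoding such a request. The name decodeRespCommand and the 19-byte key are made up for illustration, since the real key is truncated in the strace output.

```php
<?php
// Hypothetical helper: decode one RESP request (an array of bulk strings).
function decodeRespCommand(string $payload): array
{
    $pos = 0;
    // Read up to the next CRLF and move the cursor past it.
    $readLine = function () use ($payload, &$pos): string {
        $end = strpos($payload, "\r\n", $pos);
        if ($end === false) {
            throw new InvalidArgumentException('truncated RESP payload');
        }
        $line = substr($payload, $pos, $end - $pos);
        $pos = $end + 2;
        return $line;
    };

    $header = $readLine();
    if ($header === '' || $header[0] !== '*') {
        throw new InvalidArgumentException('expected a RESP array ("*<count>")');
    }

    $args = [];
    for ($i = (int) substr($header, 1); $i > 0; $i--) {
        $lenLine = $readLine();
        if ($lenLine === '' || $lenLine[0] !== '$') {
            throw new InvalidArgumentException('expected a bulk string ("$<length>")');
        }
        $len = (int) substr($lenLine, 1);
        if ($pos + $len + 2 > strlen($payload)) {
            throw new InvalidArgumentException('truncated bulk string');
        }
        // Bulk strings are length-prefixed, so read exactly $len bytes.
        $args[] = substr($payload, $pos, $len);
        $pos += $len + 2; // skip the data and its trailing CRLF
    }
    return $args;
}

// Same shape as the GET seen in strace, with a placeholder 19-byte key.
$request = "*2\r\n\$3\r\nGET\r\n\$19\r\nexample_key_19bytes\r\n";
$command = decodeRespCommand($request);
```

Here $command holds the two arguments GET and the key, and strlen($request) is 39, the same byte count strace reports for the GET call on fd=8.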

  • 晚安。 2024-03-08

    Boss, this output looks like gibberish to me. What do you go by to tell which operations it is performing?

  • z 2024-03-08

    Because he's the expert, he understands it at a glance :)
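
The bytes sent on fd=7 in the strace output can be read by hand as well. The first four iov_base chunks of the sendmsg call match the 16-byte MongoDB wire protocol header: four little-endian 32-bit integers for message length, request ID, response-to and opcode. A rough PHP sketch follows; parseMongoHeader is a hypothetical helper name, not part of any driver.

```php
<?php
// Hypothetical helper: unpack the 16-byte MongoDB message header.
function parseMongoHeader(string $bytes): array
{
    if (strlen($bytes) < 16) {
        throw new InvalidArgumentException('a MongoDB message header is 16 bytes');
    }
    // "V" = unsigned 32-bit little-endian integer.
    return unpack(
        'VmessageLength/VrequestID/VresponseTo/VopCode',
        substr($bytes, 0, 16)
    );
}

// Header bytes copied from the sendmsg(7, ...) line. PHP double-quoted
// strings accept the same octal escapes that strace prints.
$header = parseMongoHeader("\354\0\0\0" . "\302K\227\0" . "\0\0\0\0" . "\335\7\0\0");
```

messageLength decodes to 236, which matches the sendmsg return value, and opCode 2013 is OP_MSG, the message type that carries commands such as the find on the product collection visible in the payload. The reply read from fd=7 repeats the request ID in its response-to field, which is how request and response are paired.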

z
public static function countItemsByDb($dbname): array
    {
        // Cache key for this database's product count.
        $key = $dbname . '_countItems';
        $count = 0;
        if (Redis::exists($key)) {
            // Cache hit: read the stored count.
            $count = Redis::get($key);
        } else {
            // Cache miss: run a MongoDB count command and cache the result.
            $manager = self::createManager($dbname);
            $command = new Command(["count" => self::COLLECTION_PRODUCT]);
            $result = $manager->executeCommand($dbname, $command);
            if ($result) {
                $count = current($result->toArray())->n;
                Redis::set($key, $count);
            }
        }
        $result = [];
        $result['total'] = $count;

        return $result;
    }

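This counts once and then reads straight from Redis afterwards, so it should not cause busy. Could Redis::exists return a wrong result and cause the count to run repeatedly?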
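
One way to take the EXISTS check out of the picture is to issue a single GET and treat a missing value as a cache miss, which also halves the Redis round trips per call. The sketch below is not the original method: cachedCount and its callable parameters are hypothetical stand-ins for Redis::get, Redis::set and the MongoDB count, so the snippet runs without either server. It assumes a missing key comes back as null or false, which should cover both predis (null) and the phpredis extension (false).

```php
<?php
// Hypothetical cache-aside helper. The callables stand in for
// Redis::get, Redis::set and the MongoDB count command.
function cachedCount(string $key, callable $cacheGet, callable $cacheSet, callable $computeCount): int
{
    $cached = $cacheGet($key); // single round trip, no separate EXISTS
    if ($cached !== null && $cached !== false) {
        return (int) $cached;  // hit: Redis returns the number as a string
    }
    $count = (int) $computeCount(); // miss: run the expensive count once
    $cacheSet($key, $count);
    return $count;
}

// Demo with an in-memory array instead of Redis.
$store = [];
$computeCalls = 0;
$get = function (string $k) use (&$store) {
    return $store[$k] ?? null;
};
$set = function (string $k, $v) use (&$store): void {
    $store[$k] = (string) $v;
};
$compute = function () use (&$computeCalls): int {
    $computeCalls++;
    return 3096842; // sample value taken from the GET reply in the strace
};

$first = cachedCount('demo_countItems', $get, $set, $compute);
$second = cachedCount('demo_countItems', $get, $set, $compute);
```

After both calls $computeCalls is 1, so the expensive count runs only on the first miss. Note that this only removes one possible factor; it does not explain a loop that calls the method many times.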

  • walkor 2024-03-08

    What is port 27017?

  • z 2024-03-08

    It's MongoDB.

  • walkor 2024-03-08

    Look for the places in your code that operate on both mongo and redis.

  • z 2024-03-08

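    This is the only place. The others use Redis and files. I have changed this one to store the count in a file. Could the number of failed key lookups below mean the key was not being read?
    Redis v7.2.4
    Field                       Current value  Description
    uptime_in_days              42             Days the server has been running
    tcp_port                    6379           Port currently listened on
    connected_clients           2845           Number of connected clients
    used_memory_rss             19.43 MB       Total system memory currently occupied by Redis
    used_memory                 16.66 MB       Peak memory historically allocated by Redis
    mem_fragmentation_ratio     1.17           Memory fragmentation ratio
    total_connections_received  23851          Total number of client connections since startup
    total_commands_processed    17216116232    Total number of commands executed since startup
    instantaneous_ops_per_sec   39006          Commands executed per second
    keyspace_hits               17048828399    Number of successful key lookups
    keyspace_misses             165370811      Number of failed key lookups
    hit                         99.04          Key lookup hit rate (%)
    latest_fork_usec            1343           Duration of the most recent fork() in microseconds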

  • z 2024-03-08

    Switching to a file has eased it. A busy process now shows up only once in a while, after many queries.

z

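Closing the thread. After a few days of observation, predis turned out to be the cause. With it removed and the default redis client in use, everything works normally.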

//'client' => 'predis',
    'default' => [
        'host' => getenv('REDIS_HOST') ?: '127.0.0.1',
        'password' => getenv('REDIS_PASSWORD'),
        'port' => getenv('REDIS_PORT')?: 6379,
        'database' => getenv('REDIS_DATABASE')?: 0,
        //'maxmemory-policy' => getenv('redis_maxmemory_policy')?:'volatile-lru', // memory eviction policy
        //'compress' => true, // enable compression
    ],