Optimize the opera log storage logic through queue #750

Merged: 9 commits merged into fastapi-practices:master on Aug 7, 2025

Conversation

@IAseven (Contributor) commented Aug 4, 2025

Existing problem

The original log middleware calls the database insert interface on every incoming request; under high concurrency this can become a database write bottleneck.

Solution

Use a local in-memory message queue to perform timeout-based batch inserts.
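
As a rough illustration of the idea (names such as opera_log_queue and bulk_insert are hypothetical, and the timeout/batch-size waiting logic discussed later in the thread is omitted for brevity):

import asyncio
from contextlib import asynccontextmanager

from fastapi import FastAPI, Request

opera_log_queue: asyncio.Queue = asyncio.Queue()  # in-process buffer shared by all requests


async def consume_opera_logs() -> None:
    """Background task: collect queued entries and flush them to the database in batches."""
    while True:
        batch = [await opera_log_queue.get()]           # block until at least one entry exists
        while len(batch) < 500 and not opera_log_queue.empty():
            batch.append(opera_log_queue.get_nowait())   # drain whatever is already buffered
        await bulk_insert(batch)                         # hypothetical bulk-insert helper (DAO call)


@asynccontextmanager
async def lifespan(app: FastAPI):
    task = asyncio.create_task(consume_opera_logs())     # start the consumer once at startup
    yield
    task.cancel()


app = FastAPI(lifespan=lifespan)


@app.middleware('http')
async def opera_log_middleware(request: Request, call_next):
    response = await call_next(request)
    # enqueue instead of inserting per request; persistence happens in the consumer
    await opera_log_queue.put({'path': request.url.path, 'status': response.status_code})
    return response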

Load-test performance

Load-test script

-- login.lua
wrk.method = "POST"
wrk.headers["Content-Type"] = "application/json"
wrk.headers["User-Agent"] = "wrk"
wrk.body = '{"username":"string", "password":"string", "captcha":"string"}'

Before the change

## Server startup command and output
$ uvicorn main:app --no-access-log --log-level critical | grep 处理日志

## Load-test results
$ wrk -t10 -c10 -d20s -s login.lua http://127.0.0.1:8000/api/v1/auth/login
Running 20s test @ http://127.0.0.1:8000/api/v1/auth/login
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    71.93ms  108.58ms 864.02ms   94.49%
    Req/Sec    21.08      5.37    30.00     72.83%
  4229 requests in 21.53s, 1.29MB read
  Non-2xx or 3xx responses: 4229
Requests/sec:    196.45
Transfer/sec:     61.58KB

## Inserted log count
fba=# select count(*) from sys_opera_log;
 count
-------
  4211
(1 row)

After the change

## Server startup command and output
$ uvicorn main:app --no-access-log --log-level critical | grep 处理日志
2025-08-06 09:21:33.948 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:35.434 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:36.856 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:38.367 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:39.850 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:41.226 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:42.702 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:44.017 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:45.394 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:46.772 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:48.188 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:49.499 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:51.021 | INFO     | - | 处理日志: 500 条.
2025-08-06 09:21:53.022 | INFO     | - | 处理日志: 423 条.

## Load-test results
$ wrk -t10 -c10 -d20s -s login.lua http://127.0.0.1:8000/api/v1/auth/login
Running 20s test @ http://127.0.0.1:8000/api/v1/auth/login
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    30.11ms   13.22ms 137.71ms   94.18%
    Req/Sec    34.73      7.54    50.00     90.16%
  6913 requests in 20.03s, 2.12MB read
  Non-2xx or 3xx responses: 6913
Requests/sec:    345.13
Transfer/sec:    108.19KB

## Inserted log count
fba=# select count(*) from sys_opera_log;
 count
-------
  6923
(1 row)

@wu-clan wu-clan self-requested a review August 5, 2025 05:51
@wu-clan (Member) left a comment

Is the while True task reasonable? I'm not sure whether it will consume a lot of resources: when there is no user activity, the task keeps running indefinitely, and the polling interval depends on the timeout configuration.

@downdawn could you take a look at this?

@wu-clan changed the title from ✨ feat: 操作日志中间件添加批量插入功能 (add batch-insert support to the opera log middleware) to Optimize the opera log storage logic through queue on Aug 5, 2025
@downdawn (Collaborator) commented Aug 5, 2025

> Is the while True task reasonable? I'm not sure whether it will consume a lot of resources: when there is no user activity, the task keeps running indefinitely, and the polling interval depends on the timeout configuration.
>
> @downdawn could you take a look at this?

This while True is definitely not ideal. There are plenty of possible solutions, but they all tend to turn into one layer wrapped around another.

The most elaborate option would be a Redis cache plus a scheduled task.

My suggestion: in-memory buffering + batch updates.

@IAseven (Contributor, Author) commented Aug 5, 2025

@downdawn could you explain what is unreasonable about the while True? As I see it, this is simply a permanently running coroutine task: it waits until either the specified timeout elapses or the in-memory queue reaches the specified size, and then batch-inserts the data into the database. This will not max out the CPU; the default timeout is 1 second and a longer timeout can be configured, and I don't think waking up once per second adds any meaningful CPU overhead.
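
For reference, a sketch of the size-or-timeout wait described here (parameter names are illustrative, not the merged code); the task sleeps inside await rather than busy-polling, which is why the CPU cost stays negligible:

import asyncio


async def drain_batch(queue: asyncio.Queue, max_size: int = 500, timeout: float = 1.0) -> list:
    """Return up to max_size items, waiting at most `timeout` seconds in total."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    batch: list = []
    while len(batch) < max_size:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            # suspends the coroutine; no CPU is burned while waiting
            batch.append(await asyncio.wait_for(queue.get(), timeout=remaining))
        except asyncio.TimeoutError:
            break
    return batch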

@downdawn (Collaborator) commented Aug 5, 2025

> @downdawn could you explain what is unreasonable about the while True? As I see it, this is simply a permanently running coroutine task: it waits until either the specified timeout elapses or the in-memory queue reaches the specified size, and then batch-inserts the data into the database. This will not max out the CPU; the default timeout is 1 second and a longer timeout can be configured, and I don't think waking up once per second adds any meaningful CPU overhead.

Having looked at it more carefully, using while True here is actually fine; I am just personally cautious about while True.

The downside of the in-memory buffer + batch update approach is that some logs may be lost.

If we ever want a distributed setup, I think relying on Redis for buffering plus a distributed lock would be the better option.

@downdawn (Collaborator) commented Aug 5, 2025

Also, since we are doing this, we should probably compare the options and run load tests on them before making a choice.

@IAseven (Contributor, Author) commented Aug 6, 2025

@downdawn thanks for the reply, I will post a set of load-test numbers later.

  1. The in-memory buffer + batch update approach can indeed lose some data in certain scenarios.
  2. Why Redis was not used:
    I initially considered Redis or another distributed message middleware in place of the in-memory queue, but decided against a distributed middleware because
    • operation log records do not require strong consistency; some data loss and delayed display are tolerable
    • a distributed message queue adds network transfer overhead between components
    In the end the in-memory queue was chosen since it should give the best performance; even if a batch insert fails, the errors can be written to a local file or another message middleware, or the write can degrade to single-row inserts (a sketch of that fallback follows below).
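
A sketch of that degradation path, with opera_log_dao.bulk_create / create and log as placeholders for whatever CRUD and logging helpers the project actually exposes:

async def flush_opera_logs(batch: list) -> None:
    """Try one bulk insert; on failure, degrade to per-row inserts so the whole batch is not lost."""
    try:
        await opera_log_dao.bulk_create(batch)    # placeholder bulk-insert method
    except Exception:
        for item in batch:
            try:
                await opera_log_dao.create(item)  # placeholder single-row insert
            except Exception as exc:
                # last resort: record the failure locally instead of raising
                log.error(f'opera log insert failed: {exc}')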

@IAseven (Contributor, Author) commented Aug 6, 2025

@downdawn @wu-clan the load-test data has been updated

@wu-clan (Member) commented Aug 6, 2025

@IAseven thank you very much, I will review this tonight

@wu-clan (Member) commented Aug 6, 2025

@IAseven I have updated some of the code. If you have any questions, feel free to reply; if there are no other issues, I will merge this PR later.

@@ -177,6 +177,8 @@ class Settings(BaseSettings):
'new_password',
'confirm_password',
]
OPERA_LOG_QUEUE_MAX: int = 100
@IAseven (Contributor, Author):

  1. The name OPERA_LOG_QUEUE_MAX is a bit ambiguous: it reads like the maxsize of the whole queue, whereas it is actually the number of items fetched from the queue in a single batch.
  2. Isn't one minute a bit too long for OPERA_LOG_QUEUE_TIMEOUT?

@wu-clan (Member):

  1. Agree
  2. One minute feels fine: when idle, the task will not run too frequently, and under concurrency the one minute hardly matters because items fills up quickly (an illustrative settings sketch follows below).
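
Purely for illustration (these names and defaults are hypothetical, not taken from the PR), the distinction the two settings draw:

# how many entries are pulled from the queue per batch insert (not the queue's maxsize)
OPERA_LOG_QUEUE_BATCH_CONSUME_SIZE: int = 100

# how long to wait, in seconds, before flushing a partial batch when traffic is low
OPERA_LOG_QUEUE_TIMEOUT: int = 60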

while len(results) < max_items:
    item = await queue.get()
    results.append(item)
while len(items) < max_items:
@IAseven (Contributor, Author):

I don't quite see what the advantage of a custom timer is here: asyncio.wait_for can await any awaitable, and queue.get blocks by default until the queue has an item to pop.
Docs: https://docs.python.org/zh-cn/3.13/library/asyncio-task.html#asyncio.wait_for
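
A sketch of the collector pattern under discussion (illustrative, not the exact merged code): the inner coroutine only blocks on queue.get, and asyncio.wait_for puts one overall timeout around filling the whole batch:

import asyncio


async def collect(queue: asyncio.Queue, max_items: int, timeout: float) -> list:
    items: list = []

    async def fill() -> None:
        # queue.get() simply blocks until an item is available; no per-item timer needed
        while len(items) < max_items:
            items.append(await queue.get())

    try:
        # one overall timeout for the whole batch, instead of a custom timer or extra tasks
        await asyncio.wait_for(fill(), timeout)
    except asyncio.TimeoutError:
        pass  # flush whatever was collected before the deadline
    return items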

@wu-clan (Member):

Having thought it over, the collector implementation does fit the goal better: one overall timeout, rather than spawning extra coroutines.

@wu-clan wu-clan merged commit e09062e into fastapi-practices:master Aug 7, 2025
4 checks passed