
[Bug] Article.get_all() raises KeyError: 'readInfo'; the returned data structure has changed #994

@jishux2

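Python version: 3.13.11 (CI environment) / 3.11.3 (local environment)

Module version: Upstream main branch (commit 3dab7ad) / 17.4.1

Runtime environment: Windows / Linux (GitHub Actions)

Module path: bilibili_api.article

Interpreter: cpython

Network request library used: httpx (CI environment) / aiohttp (local environment)

Error message: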

Traceback (most recent call last):
  File "D:\bilibili-api\test.py", line 87, in <module>
    asyncio.run(main())
  File "D:\Python\Lib\asyncio\runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\asyncio\base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "D:\bilibili-api\test.py", line 71, in main
    info = await fetch_article_info(cvid)  # fetch the article info asynchronously
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\bilibili-api\test.py", line 34, in fetch_article_info
    info = await a.get_all()  # fetch all of the article's info asynchronously
           ^^^^^^^^^^^^^^^^^
  File "D:\bilibili-api\bilibili_api\article.py", line 618, in get_all
    cache_pool.article2dynamic[self.__cvid] = self.__get_all_data["readInfo"][
                                              ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: 'readInfo'

Code that triggers the error:

import asyncio
import bilibili_api.article

async def fetch_article_info(cvid):
    # Core reproduction logic only; cookie handling is omitted
    credential = bilibili_api.Credential()
    a = bilibili_api.article.Article(cvid=cvid, credential=credential)
    info = await a.get_all()
    return info

async def main():
    # Example article ID: 17152176
    await fetch_article_info("17152176")

if __name__ == "__main__":
    asyncio.run(main())

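Problem description:

My GitHub Actions workflow runs the script above every three days to scrape column (article) data. It ran normally three days ago and in all earlier runs, but today's scheduled run failed.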
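For a scheduled job like this, one option is to wrap the call so that a failed fetch is logged and reported as `None` and does not abort the whole run. The following is only a minimal sketch: `fetch_or_none` and `_broken_get_all` are hypothetical names introduced here, and the stub stands in for `Article.get_all()` so that the example runs without the library installed.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


async def fetch_or_none(
    fetch: Callable[[], Awaitable[Any]],
    handled: tuple[type[BaseException], ...] = (KeyError,),
) -> Optional[Any]:
    """Await fetch(); return None (after logging) if a handled error is raised."""
    try:
        return await fetch()
    except handled as exc:
        # Log the failure and let the caller decide what to do with None.
        print(f"fetch failed: {type(exc).__name__}: {exc}")
        return None


async def _broken_get_all() -> dict:
    # Hypothetical stand-in for Article.get_all(); it mimics the failure
    # in this report by indexing a key that the new payload lacks.
    state: dict = {"detail": {}}
    return state["readInfo"]


result = asyncio.run(fetch_or_none(_broken_get_all))
```

With the real library, the bound method `a.get_all` could be passed as `fetch`, and the `ApiException` class named in the first CI error could be added to `handled`. This only keeps the job alive; it does not recover the article data.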

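Details:

  1. How the error changed
    When the CI job first ran automatically, the error was bilibili_api.exceptions.ApiException.ApiException: 未找到相关信息 ("No relevant information found"). After that, whether I manually re-triggered the CI workflow or reproduced the problem locally (with both an older version and a version synced to the latest upstream code), the error was consistently KeyError: 'readInfo'.

  2. Inspecting the data structure
    On inspection, the dict currently returned by get_initial_state does not contain a readInfo key. Dumping the fetched self.__get_all_data shows the top-level structure below, which appears to have become an Opus-related structure: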

{
    "bmgDefDomain": "",
    "isMac": true,
    "modern": true,
    "features": [
        "onlyfansVote",
        "onlyfansAssetsV2",
        "decorationCard",
        "htmlNewStyle",
        "ugcDelete",
        "editable",
        "opusPrivateVisible",
        "tribeeEdit",
        "avatarAutoTheme",
        "avatarTypeOpus",
        "sunflowerStyle",
        "articleEnhance",
        "cardsEnhance",
        "eva3CardOpus",
        "eva3CardVideo",
        "eva3CardComment",
        "eva3CardVote"
    ],
    "id": "672744835385393175",
    "detail": {
        "basic": {
            "collection_id": 577852,
            "comment_id_str": "17152176",
            "comment_type": 12,
            "title": "[高中数学知识点汇总] 调和点列(2) - 哔哩哔哩",
            "uid": 179921772
        },
        "id_str": "672744835385393175",
        "modules": [
            // ... module contents omitted ...
        ],
        "type": 1
    },
    "isClient": false,
    "isPreview": false,
    // ... other fields
}

As a result, the logic in article.py that depends on the readInfo field can no longer run.
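To make the difference between the two layouts concrete, here is a rough sketch of how the returned dict could be classified before anything indexes `readInfo`. The helper names `detect_state_layout` and `basic_info` are hypothetical and are not part of bilibili_api. The "opus" check is an assumption based only on the single sample in this report.

```python
from typing import Any


def detect_state_layout(state: dict[str, Any]) -> str:
    """Classify an initial-state dict as "legacy", "opus", or "unknown"."""
    if "readInfo" in state:
        # The layout that article.py currently expects.
        return "legacy"
    detail = state.get("detail")
    if isinstance(detail, dict) and "basic" in detail and "modules" in detail:
        # The layout observed in this report (assumed to be Opus-style).
        return "opus"
    return "unknown"


def basic_info(state: dict[str, Any]) -> dict[str, Any]:
    """Return the detail.basic fields, or an empty dict if they are absent."""
    detail = state.get("detail")
    if not isinstance(detail, dict):
        return {}
    basic = detail.get("basic")
    return basic if isinstance(basic, dict) else {}


# Trimmed copy of the structure shown in this report.
sample = {
    "id": "672744835385393175",
    "detail": {
        "basic": {
            "collection_id": 577852,
            "comment_id_str": "17152176",
            "comment_type": 12,
            "uid": 179921772,
        },
        "id_str": "672744835385393175",
        "modules": [],
        "type": 1,
    },
}

print(detect_state_layout(sample))
print(basic_info(sample).get("comment_id_str"))
```

In this one sample, `comment_id_str` equals the cvid used in the reproduction (17152176). Whether that holds for other articles is not verified here.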
