WeChat, the most popular mobile IM app in China, doesn't provide any method to read its message history. Users cannot analyze their own chat messages or use them with other data analysis tools.
This tool parses WeChat messages from a rooted Android phone, which is necessary for interoperability between WeChat messages and other message analysis tools. As examples, we provide sample scripts that compute statistics over the message history and render the messages into self-contained html files, including voice messages, images, emojis, videos, etc. Users can also write custom programs based on this tool to manage their chat messages.
The tool was last verified to work with the latest version of WeChat on 2025/01/01. If the tool works for you, please take a moment to add your phone/OS to the wiki.
- adb and a rooted Android phone connected to a Linux/Mac OSX/Win10+Bash machine.
- Python >= 3.8
- sox (command line tools)
- Silk audio decoder (included; build it with `./third-party/compile_silk.sh`)
- Other python dependencies: `pip install -r requirements.txt`
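The requirements above can be checked before running anything. The following is a minimal sketch, assuming `adb` and `sox` are installed under exactly those executable names and are expected on `PATH`. The helpers `missing_tools` and `python_ok` are hypothetical and not part of this repository.

```python
import shutil
import sys

# Command-line tools named in the requirements list. That the executables
# are called exactly "adb" and "sox" is an assumption.
REQUIRED_TOOLS = ("adb", "sox")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the interpreter satisfies Python >= 3.8."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python >= 3.8 is required")
    for tool in missing_tools():
        print(f"missing required tool: {tool}")
```

The sketch does not check the Silk decoder or the packages in `requirements.txt`; those still need the build script and `pip` commands listed above.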
- Pull database file and (for older WeChat versions) avatar index:
  - Automatic: `./android-interact.sh db`. It may use an incorrect userid.
  - Manual:
    - Figure out your `${userid}` by inspecting the contents of `/data/data/com.tencent.mm/MicroMsg` on the root filesystem of the device. It should be a 32-character-long name consisting of hexadecimal digits.
    - Get `/data/data/com.tencent.mm/MicroMsg/${userid}/EnMicroMsg.db` from the device.
- Decode `EnMicroMsg.db`. We do not provide instructions to do that.
- Copy the unencrypted WeChat user resource directory `/data/data/com.tencent.mm/MicroMsg/${userid}/{avatar,emoji,image2,sfs,video,voice2}` from the phone to the `resource` directory: `./android-interact.sh res`
  - Change `RES_DIR` in the script if the location of these directories is different on your phone. For older versions of WeChat, the directory may be `/mnt/sdcard/tencent/MicroMsg/`.
  - This can take a while. It can be faster to first archive it with `tar`, with or without compression, and then copy the archive. `busybox tar` is recommended, as the Android system's `tar` may choke on long paths.
  - In the end, we need a `resource` directory with the following subdirectories: `avatar`, `emoji`, `image2`, `sfs`, `video`, `voice2`.
- (Optional) Install and start a WXGF decoder server on an Android device. Without this, certain WXGF images will not be rendered or will be rendered in low resolution. See WXGFDecoder for instructions.
- (Optional) Download the emoji cache and decompress it under `wechat-dump`. This will avoid downloading too many emojis during rendering. Run `wget -c https://github.com/ppwwyyxx/wechat-dump/releases/download/0.1/emoji.cache.tar.bz2` followed by `tar xf emoji.cache.tar.bz2`.
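Two of the manual steps above are easy to get wrong: picking the right `${userid}` directory and ending up with a complete `resource` directory. The sketch below shows one way to check both. The helper names are hypothetical and not part of this repository, and accepting both upper- and lower-case hexadecimal digits is an assumption.

```python
import re
from pathlib import Path

# A userid directory name is described as 32 hexadecimal digits.
# Accepting both upper and lower case is an assumption.
USERID_RE = re.compile(r"[0-9a-fA-F]{32}")

# Subdirectories the resource directory should end up with.
RESOURCE_SUBDIRS = ("avatar", "emoji", "image2", "sfs", "video", "voice2")


def candidate_userids(names):
    """Filter a directory listing down to names that look like a userid."""
    return [name for name in names if USERID_RE.fullmatch(name)]


def missing_resource_subdirs(resource_dir):
    """Return the expected subdirectories absent from `resource_dir`."""
    root = Path(resource_dir)
    return [sub for sub in RESOURCE_SUBDIRS if not (root / sub).is_dir()]
```

Pass `candidate_userids` the names found under `/data/data/com.tencent.mm/MicroMsg`. If it returns more than one name, the phone probably holds several accounts and the right one has to be chosen by hand. After copying, `missing_resource_subdirs("resource")` should return an empty list.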
- Parse and dump text messages of every chat (requires decoded database): `./dump-msg.py decoded.db output_dir`
- List all chats (requires decoded database): `./list-chats.py decoded.db`
- Generate statistics report on text messages (requires `output_dir` from `./dump-msg.py`): `./count-message.sh output_dir`
- Dump messages of one contact to html, containing voice messages, emojis, and images (requires decoded database and `resource`): `./dump-html.py "<contact_display_name>"`
  - The output file is `output.html`. Check `./dump-html.py -h` to use different input/output paths.
  - Add `--wxgf-server ws://xx.xx.xx.xx:xxxx` to use a WXGF decoder server.
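The html dump can also be driven from a Python script. The following is a rough sketch, assuming it runs from the repository root. It uses only the contact name argument and the `--wxgf-server` option described above; the helpers `dump_html_command` and `run_dump_html` are hypothetical and not part of this repository.

```python
import subprocess


def dump_html_command(contact, wxgf_server=None):
    """Build the argv list for ./dump-html.py for one contact.

    Only the options mentioned above are used; anything else should be
    looked up with `./dump-html.py -h`.
    """
    cmd = ["./dump-html.py", contact]
    if wxgf_server is not None:
        # Address of a WXGF decoder server, e.g. "ws://xx.xx.xx.xx:xxxx".
        cmd += ["--wxgf-server", wxgf_server]
    return cmd


def run_dump_html(contact, wxgf_server=None, run=subprocess.run):
    """Run ./dump-html.py for one contact and raise if it fails."""
    return run(dump_html_command(contact, wxgf_server), check=True)
```

Passing the arguments as a list avoids shell quoting problems with display names that contain spaces or non-ASCII characters. Because the default output file is `output.html`, repeated calls overwrite each other unless a different output path is set with the options shown by `./dump-html.py -h`.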
Screenshots of generated html:
See here for an example html.
- After chat history migration, some emojis in the `EmojiInfo` table don't have corresponding URLs but only an md5; WeChat does not download them until the message needs to be displayed. We don't know how to manually download these emojis.
- Decoding WXGF images using an Android app is too complex. Looking for an easier way (e.g. qemu).
- Fix rare unhandled message types: > 10000 and < 0
- Better user experience; see `grep 'TODO' wechat -R`
