Description
Hello Andris,
We've been happy users of mailparser for about 8 years now (time flies!), running a large-scale production email ingestion system on it.
Two days ago, we started experiencing recurring event loop blockages lasting 2 to 6 minutes in each microservice running mailparser. We've managed to extract the email that keeps causing this issue, in EML format.
The email has an enormous text body: about 2MB of text in the following part:
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8;
    format=flowed
The library does eventually parse the email, but it takes a tremendous amount of time.
Could I suggest adding some configurable, opt-in safety limits on the size of the text/HTML parts that get parsed?
Unfortunately, I cannot enforce those limits myself based on the total size of the raw mail Buffer before passing it to the library, since most emails are several MB in size due to attachments. I believe those limits would have to be enforced within the mailparser library itself.
Let me know if I should send you the original EML causing this hang over email for reproduction (I cannot share it here since it comes from one of our users, and is thus a private email).
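To illustrate the kind of limit I have in mind, here is a rough sketch only. `PartSizeLimiter`, its `maxBytes` argument and the `ETEXTPARTTOOLARGE` error code are hypothetical names I made up for this example; none of them exist in mailparser today, and where such a check would hook into the parser is an open question.

```javascript
'use strict';

// Hypothetical helper, NOT part of mailparser: counts the bytes seen for a
// single text/HTML part and throws once an opt-in limit is exceeded.
class PartSizeLimiter {
    constructor(maxBytes) {
        // No limit unless one is explicitly provided, so the check is opt-in.
        this.maxBytes = typeof maxBytes === 'number' ? maxBytes : Infinity;
        this.seenBytes = 0;
    }

    // Call with each chunk (Buffer or string) of the part body before
    // any expensive decoding or text processing happens.
    push(chunk) {
        this.seenBytes += Buffer.byteLength(chunk);
        if (this.seenBytes > this.maxBytes) {
            const err = new Error(`Text part exceeds ${this.maxBytes} bytes`);
            err.code = 'ETEXTPARTTOOLARGE'; // hypothetical error code
            throw err;
        }
        return chunk;
    }
}
```

With something like this applied per text/HTML part, a caller could choose to reject or truncate an oversized body while attachments remain unaffected by the limit.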
For more context, parsing this EML was measured to take:
- 15s on 1x Apple M1 Pro core
- 2 minutes on 1x loaded Intel Xeon core (circa 2019)
Valerian.