Commit 4ff8d9e

Update README.md
1 parent d4b4fcd commit 4ff8d9e

File tree

1 file changed

+17
-0
lines changed

README.md

Lines changed: 17 additions & 0 deletions
@@ -1,4 +1,21 @@
MHTML-to-HTML-Decoding-in-C-
============================

MHTML (short for MIME HTML) is a web archive format that stores a web page’s HTML and its (normally remote) resources in one file. It is structured much like an HTML email, using the content type ‘multipart/related’. Each resource is stored as a separate part, and the part data is base64 encoded.

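To make the base64 step concrete, here is a minimal sketch of decoding the body of one part. The class and method names (`Base64PartSketch`, `DecodePartBody`) are hypothetical and are not taken from this repository; the sketch only assumes that the part body has already been separated from its headers.

```csharp
using System;
using System.Text;

public static class Base64PartSketch
{
    // Hypothetical helper: decodes the body of one base64-encoded part into raw bytes.
    // Part bodies are usually wrapped across several lines, so line breaks are
    // stripped defensively before decoding.
    public static byte[] DecodePartBody(string body)
    {
        string compact = body.Replace("\r", "").Replace("\n", "").Trim();
        return Convert.FromBase64String(compact);
    }
}
```

For a text part, the decoded bytes can then be turned into a string, for example with `Encoding.UTF8.GetString(Base64PartSketch.DecodePartBody(body))`, assuming the part is UTF-8 text.
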
Although this code will read .mht and .mhtml files, in its current state it only decodes the base64 content-transfer encoding. It has been tested on .mhtml files exported from SQL Server Reporting Services (SSRS). It features its own logging and a way to return valid HTML (with images).

The decoded result is returned as a List<string[]>. Each List element is one section of the MHTML, and the contents of each element are as follows:

- string[0] is the Content-Type
- string[1] is the Content-Name
- string[2] is the converted data

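The layout above can be walked with an ordinary loop. The following is a rough sketch only: `SectionListSketch` and `FindByContentType` are hypothetical names, and the list is assumed to have exactly the three-slot shape described above.

```csharp
using System;
using System.Collections.Generic;

public static class SectionListSketch
{
    // Hypothetical helper: returns the converted data of the first section whose
    // Content-Type starts with the given prefix, or null if no section matches.
    public static string FindByContentType(List<string[]> sections, string typePrefix)
    {
        foreach (string[] section in sections)
        {
            // section[0] = Content-Type, section[1] = Content-Name, section[2] = converted data
            if (section[0].StartsWith(typePrefix, StringComparison.OrdinalIgnoreCase))
            {
                return section[2];
            }
        }
        return null;
    }
}
```

Calling `SectionListSketch.FindByContentType(sections, "text/html")` would then pick out the HTML section, if one is present.
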
The getHTMLText() method returns the full HTML, using each cid: reference to insert the corresponding base64 image data inline (supported by newer browsers).

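One way such a cid: substitution could work is sketched below. This is not the repository's implementation: `CidInlineSketch` and `InlineImages` are hypothetical names, and the sketch assumes that string[1] matches the identifier used after `cid:` in the HTML and that string[2] still holds the base64 text for image sections.

```csharp
using System;
using System.Collections.Generic;

public static class CidInlineSketch
{
    // Hypothetical helper: replaces each "cid:<name>" reference in the HTML with a
    // data URI built from the image section that has the same name.
    public static string InlineImages(string html, List<string[]> sections)
    {
        foreach (string[] section in sections)
        {
            // Only image sections are inlined; other sections are left alone.
            if (!section[0].StartsWith("image/", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }
            // Assumption: section[2] is base64 text, section[1] is the cid name.
            string dataUri = "data:" + section[0] + ";base64," + section[2];
            html = html.Replace("cid:" + section[1], dataUri);
        }
        return html;
    }
}
```

A plain string replacement like this can mis-match when one name is a prefix of another (for example `img1` and `img10`), so a real implementation would probably want to match the whole attribute value.
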
And here is how to use it:

    string mhtml = "This is your MHTML string"; // The MHTML content; read the source file as UTF-8
    MHTMLParser parser = new MHTMLParser(mhtml);
    string html = parser.getHTMLText(); // This is the converted HTML

0 commit comments
