From b3d3f523f243bc4f93ececf6307aab7fcb3ffe04 Mon Sep 17 00:00:00 2001
From: Julia
Date: Sun, 23 Apr 2017 17:06:06 -0700
Subject: [PATCH] update readme with instructions to get wayback capture url

---
 wayback-cdx-server/README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/wayback-cdx-server/README.md b/wayback-cdx-server/README.md
index 3e20a1809f..1224ed288d 100644
--- a/wayback-cdx-server/README.md
+++ b/wayback-cdx-server/README.md
@@ -82,6 +82,10 @@
 At this time, the following cdx fields are publicly available:
 `["urlkey","timestamp","original","mimetype","statuscode","digest","length"]`
 
+To get the HTML of the capture, the URL is formatted as follows: `http://web.archive.org/web/<timestamp>/<original>`
+
+To get the original page back (without the Wayback Machine rewriting URLs on the page to point into the Archive), you can suffix the timestamp with `id_`, as follows: `http://web.archive.org/web/<timestamp>id_/<original>`
+
 It is possible to customize the [Field Order](#field-order) as well.
 
 The the **url=** value should be [url encoded](http://en.wikipedia.org/wiki/Percent-encoding) if the url itself contains a query.
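
As a rough illustration of the two URL forms this patch documents, the following Python sketch builds a capture URL from the `timestamp` and `original` cdx fields. The helper name `capture_url`, the `raw` parameter, and the sample row values are hypothetical and not part of the CDX server. The sketch assumes the timestamp and the original URL are placed, in that order, after `http://web.archive.org/web/`, with `id_` appended directly to the timestamp for the unrewritten page.

```python
# Minimal sketch, not part of the CDX server: build Wayback capture URLs
# from the `timestamp` and `original` fields of a CDX row.

WAYBACK_PREFIX = "http://web.archive.org/web"


def capture_url(timestamp: str, original: str, raw: bool = False) -> str:
    """Return the capture URL for one CDX row.

    With raw=True the timestamp is suffixed with `id_`, which asks for the
    page as originally archived, without links rewritten into the Archive.
    """
    modifier = "id_" if raw else ""
    return f"{WAYBACK_PREFIX}/{timestamp}{modifier}/{original}"


# Field names in the publicly available order listed in the README.
FIELDS = ["urlkey", "timestamp", "original", "mimetype",
          "statuscode", "digest", "length"]

# Hypothetical CDX row; the values are made up for illustration only.
row = ["com,example)/", "20170423170606", "http://example.com/",
       "text/html", "200", "HYPOTHETICALDIGEST", "1234"]
record = dict(zip(FIELDS, row))

rewritten = capture_url(record["timestamp"], record["original"])
unrewritten = capture_url(record["timestamp"], record["original"], raw=True)
print(rewritten)
print(unrewritten)
```

Keeping the `id_` modifier behind a boolean flag means callers choose between the rewritten and original page without assembling the URL by hand.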