diff --git a/readme.textile b/readme.textile
index 7264c76..29e7392 100644
--- a/readme.textile
+++ b/readme.textile
@@ -285,7 +285,7 @@ Here we know from looking at the markup of the page that headlines might match a
 (zipmap [:headline :byline :summary] (map #(str/replace % #"\n" "") result))))
-Here we take a node and extract the match. Note that we have to call first on the result of html/select because html/select always returns a sequence of nodes and not a single node. zipmap is a handy function, it allows us to take two sequences and zip them up into a hash-map. So here we take only the text nodes from the matches and remove any newline characters before we finally zip it up into a tidy hash-map.
+Here we take a node and extract the match. Note that we have to call first on the result of html/select because html/select always returns a sequence of nodes and not a single node. zipmap is a handy function, it allows us to take two sequences and zip them up into a hash-map. So here we take only the text nodes from the matches and remove any newline characters before we finally zip it up into a tidy hash-map.
 Because this scrape is not comprehensive we might match empty stories, so we define a function empty-story? that checks for that. We use this to filter out any empty stories: