Kouhei Sutou
null+****@clear*****
Thu Mar 2 00:10:27 JST 2017
Kouhei Sutou	2017-03-02 00:10:27 +0900 (Thu, 02 Mar 2017)

  New Revision: 3bc8173a513a102005b3761fa4da5816ff2bc865
  https://github.com/ranguba/chupa-text-decomposer-html/commit/3bc8173a513a102005b3761fa4da5816ff2bc865

  Message:
    Scrub invalid characters

  Modified files:
    lib/chupa-text/decomposers/html.rb

  Modified: lib/chupa-text/decomposers/html.rb (+1 -1)
===================================================================
--- lib/chupa-text/decomposers/html.rb    2017-03-02 00:03:41 +0900 (42d1bef)
+++ lib/chupa-text/decomposers/html.rb    2017-03-02 00:10:27 +0900 (3b0095c)
@@ -37,7 +37,7 @@ module ChupaText
         doc = Nokogiri::HTML.parse(html, nil, guess_encoding(html))
         body_element = (doc % "body")
         if body_element
-          body = body_element.text.gsub(/^\s+|\s+$/, '')
+          body = body_element.text.scrub.gsub(/^\s+|\s+$/, '')
         else
           body = ""
         end
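
For readers unfamiliar with String#scrub, here is a minimal sketch of why the change likely matters. It is not taken from the repository: the strip_body helper and the sample input are hypothetical, and the assumption is that the extracted body text can contain bytes that are invalid in its encoding. In that case a regexp-based gsub raises ArgumentError, while scrub first replaces each invalid byte sequence (with U+FFFD by default for UTF-8 strings).

```ruby
# Hypothetical helper mirroring the changed line: scrub invalid bytes first,
# then trim leading and trailing whitespace on each line.
def strip_body(text)
  text.scrub.gsub(/^\s+|\s+$/, '')
end

# Hypothetical input: a UTF-8 string containing one invalid byte (\xFF).
broken = "  Hello \xFF world  \n"

puts broken.valid_encoding?   # false

# Without scrub, matching a regexp against the broken string raises.
begin
  broken.gsub(/^\s+|\s+$/, '')
rescue ArgumentError => e
  puts "gsub without scrub failed: #{e.message}"
end

# With scrub, the invalid byte becomes U+FFFD and gsub succeeds.
puts strip_body(broken)
```

scrub also accepts a replacement string or a block, so a caller that prefers to drop invalid bytes rather than mark them could use text.scrub('') instead.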