[Groonga-commit] ranguba/chupa-text at d2e0aae [master] Add XML decomposer

Kouhei Sutou null+****@clear*****
Sat Jan 4 20:52:15 JST 2014


Kouhei Sutou	2014-01-04 20:52:15 +0900 (Sat, 04 Jan 2014)

  New Revision: d2e0aae1e0c62a01b9c8b55c2d3a98fac6a7ed1c
  https://github.com/ranguba/chupa-text/commit/d2e0aae1e0c62a01b9c8b55c2d3a98fac6a7ed1c

  Message:
    Add XML decomposer

  Added files:
    lib/chupa-text/decomposers/xml.rb
    test/decomposers/test-xml.rb

  Added: lib/chupa-text/decomposers/xml.rb (+55 -0) 100644
===================================================================
--- /dev/null
+++ lib/chupa-text/decomposers/xml.rb    2014-01-04 20:52:15 +0900 (537806c)
@@ -0,0 +1,55 @@
+# Copyright (C) 2013  Kouhei Sutou <kou@clear-code.com>
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+
+require "rexml/document"
+require "rexml/streamlistener"
+
+module ChupaText
+  module Decomposers
+    class XML < Decomposer
+      registry.register("xml", self)
+
+      def target?(data)
+        data.extension == "xml" or
+          data.mime_type == "text/xml"
+      end
+
+      def decompose(data)
+        text = ""
+        listener = Listener.new(text)
+        data.open do |input|
+          parser = REXML::Parsers::StreamParser.new(input, listener)
+          parser.parse
+        end
+        text_data = TextData.new(text)
+        text_data.uri = data.uri
+        yield(text_data)
+      end
+
+      class Listener
+        include REXML::StreamListener
+
+        def initialize(output)
+          @output = output
+        end
+
+        def text(text)
+          @output << text
+        end
+      end
+    end
+  end
+end
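
  Note: the decomposer above relies on REXML's stream parser calling the
  listener's text callback for each run of character data, with entity
  references such as &amp; already expanded. The following is a minimal
  standalone sketch of that technique, not part of the commit. The names
  TextCollector and extract_text are hypothetical and chosen only for
  illustration; it assumes the rexml library is installed.

```ruby
require "rexml/document"
require "rexml/parsers/streamparser"
require "rexml/streamlistener"

# Hypothetical listener: appends every text node to the given string.
# Element names and attribute values are ignored because only the
# text callback is overridden.
class TextCollector
  include REXML::StreamListener

  def initialize(output)
    @output = output
  end

  def text(text)
    @output << text
  end
end

# Hypothetical helper: returns the concatenated character data of an
# XML string, in document order.
def extract_text(xml)
  output = +""
  parser = REXML::Parsers::StreamParser.new(xml, TextCollector.new(output))
  parser.parse
  output
end

puts extract_text('<root>Hello <b attr="v">&amp;</b> World</root>')
```

  Because the listener appends into a string owned by the caller, the
  result is available once parse returns, which mirrors how decompose
  builds its TextData body.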

  Added: test/decomposers/test-xml.rb (+58 -0) 100644
===================================================================
--- /dev/null
+++ test/decomposers/test-xml.rb    2014-01-04 20:52:15 +0900 (bf53e64)
@@ -0,0 +1,58 @@
+# Copyright (C) 2013  Kouhei Sutou <kou@clear-code.com>
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+
+class TestDecomposersXML < Test::Unit::TestCase
+  include Helper
+
+  def setup
+    @decomposer = ChupaText::Decomposers::XML.new({})
+  end
+
+  sub_test_case("decompose") do
+    def test_body
+      xml = <<-XML
+<root>
+  Hello
+  <sub-element attribute="value">&amp;</sub-element>
+  World
+</root>
+      XML
+      text = <<-TEXT
+
+  Hello
+  &
+  World
+
+      TEXT
+      assert_equal([text],
+                   decompose(xml).collect(&:body))
+    end
+
+    private
+    def decompose(xml)
+      data = ChupaText::Data.new
+      data.path = "hello.xml"
+      data.mime_type = "text/xml"
+      data.body = xml
+
+      decomposed = []
+      @decomposer.decompose(data) do |decomposed_data|
+        decomposed << decomposed_data
+      end
+      decomposed
+    end
+  end
+end