[Groonga-commit] groonga/groonga at dd11f39 [master] doc: document new regular expression behavior

Kouhei Sutou null+****@clear*****
Tue Sep 8 12:38:27 JST 2015


Kouhei Sutou	2015-09-08 12:38:27 +0900 (Tue, 08 Sep 2015)

  New Revision: dd11f396c9b86728ef2a84d6be0dd5e9ede049ee
  https://github.com/groonga/groonga/commit/dd11f396c9b86728ef2a84d6be0dd5e9ede049ee

  Message:
    doc: document new regular expression behavior

  Modified files:
    doc/locale/ja/LC_MESSAGES/reference.po
    doc/source/example/reference/regular_expression/character_class_characters.log
    doc/source/example/reference/regular_expression/usage_filter.log
    doc/source/example/reference/regular_expression/usage_query.log
    doc/source/reference/regular_expression.rst

  Modified: doc/locale/ja/LC_MESSAGES/reference.po (+45 -3)
===================================================================
--- doc/locale/ja/LC_MESSAGES/reference.po    2015-09-08 12:06:36 +0900 (7417083)
+++ doc/locale/ja/LC_MESSAGES/reference.po    2015-09-08 12:38:27 +0900 (12a41b2)
@@ -7,7 +7,7 @@ msgid ""
 msgstr ""
 "Project-Id-Version: 1.2.1\n"
 "Report-Msgid-Bugs-To: \n"
-"PO-Revision-Date: 2015-08-30 23:37+0900\n"
+"PO-Revision-Date: 2015-09-08 12:38+0900\n"
 "Last-Translator: Masafumi Yokoyama <yokoyama at clear-code.com>\n"
 "Language-Team: Japanese\n"
 "Language: ja\n"
@@ -17261,6 +17261,41 @@ msgstr ""
 "します。これは逐次検索よりも非常に高速です。インデックスを使って評価できるパ"
 "ターンについては後述します。"
 
+msgid ""
+"Groonga normalizes match target text by :ref:`normalizer-auto` normalizer "
+"when Groonga doesn't use index for regular expression search. It means that "
+"a regular expression that has upper case such as ``Groonga`` never matches, "
+"because :ref:`normalizer-auto` normalizer normalizes all letters to lower "
+"case. ``groonga`` matches both ``Groonga`` and ``groonga``."
+msgstr "Groongaは正規表現検索にインデックスを使わないときは、 :ref:`normalizer-auto` ノーマライザーでマッチ対象のテキストを正規化します。これは、 ``Groonga`` というような大文字を使った正規表現は必ずマッチに失敗するということです。なぜなら、 :ref:`normalizer-auto` ノーマライザーはすべてのアルファベットを小文字に正規化するからです。 ``groonga`` は ``Groonga`` にも ``groonga`` にも両方にマッチします。"
+
+msgid ""
+"Why is match target text normalized? It's for increasing index search-able "
+"patterns. If Groonga didn't normalize match target text, you would need to "
+"write a complex regular expression such as ``[Dd][Ii][Ss][Kk]`` or "
+"``(?i)disk`` for case-insensitive match. Groonga can't use index against "
+"complex regular expressions."
+msgstr ""
+"なぜマッチ対象のテキストを正規化するのでしょうか?それは、インデックスを使っ"
+"て検索できるパターンを増やすためです。もし、Groongaがマッチ対象のテキストを正"
+"規化しなかった場合、大文字小文字を区別しないマッチをするために、 ``[Dd][Ii]"
+"[Ss][Kk]`` や ``(?i)disk`` のような複雑な正規表現を書く必要があります。"
+"Groongaは複雑な正規表現に対してはインデックスを使うことができません。"
+
+msgid ""
+"If you write ``disk`` regular expression for case-insensitive match, Groonga "
+"can search the pattern with index. It's fast."
+msgstr ""
+"もし、大文字小文字を区別しないマッチに ``disk`` という正規表現を使うなら、"
+"Groongaはインデックスを使ってこのパターンを検索できます。これは高速です。"
+
+msgid ""
+"You may feel the behavior is strange. But fast search based on this behavior "
+"will help you."
+msgstr ""
+"この挙動を奇妙に思うかもしれません。しかし、この挙動のおかげで高速に検索でき"
+"ることはきっとあなたの役に立つはずです。"
+
 # 1d4514e7a6904e77802262bf59e4d4c1
 msgid ""
 "There are many regular expression syntaxes. Groonga uses the same syntax in "
@@ -17290,6 +17325,13 @@ msgstr ""
 "\\n`` にマッチするということです。"
 
 msgid ""
+"But it's meaningless because ``\\n`` is removed by :ref:`normalizer-auto` "
+"normalizer."
+msgstr ""
+"しかし、この挙動は意味がありません。なぜなら、 ``\\n`` は :ref:`normalizer-"
+"auto` ノーマライザーが削除するからです。"
+
+msgid ""
 "You can use regular expression in :ref:`select-query` and :ref:`select-"
 "filter` options of :doc:`/reference/commands/select` command."
 msgstr ""
@@ -17716,8 +17758,8 @@ msgstr ""
 "きに便利です。"
 
 # c6b7f6856d7249ba996db3e06687031a
-msgid "For example, ``[Dd]`` matches ``D`` or ``d``."
-msgstr "たとえば、 ``[Dd]`` は ``D`` または ``d`` にマッチします。"
+msgid "For example, ``[12]`` matches ``1`` or ``2``."
+msgstr "たとえば、 ``[12]`` は ``1`` または ``2`` にマッチします。"
 
 # f68244a7b7d84930b12fabd2e18f5c44
 msgid ""

  Modified: doc/source/example/reference/regular_expression/character_class_characters.log (+14 -2)
===================================================================
--- doc/source/example/reference/regular_expression/character_class_characters.log    2015-09-08 12:06:36 +0900 (e23d5b2)
+++ doc/source/example/reference/regular_expression/character_class_characters.log    2015-09-08 12:38:27 +0900 (a5cd522)
@@ -1,6 +1,6 @@
 Execution example::
 
-  select Logs --filter 'message @~ "[Dd]isk"'
+  select Logs --filter 'message @~ "host[12]"'
   # [
   #   [
   #     0, 
@@ -10,7 +10,7 @@ Execution example::
   #   [
   #     [
   #       [
-  #         2
+  #         5
   #       ], 
   #       [
   #         [
@@ -23,12 +23,24 @@ Execution example::
   #         ]
   #       ], 
   #       [
+  #         1, 
+  #         "host1:[error]: No memory"
+  #       ], 
+  #       [
   #         2, 
   #         "host1:[warning]: Remained disk space is less than 30%"
   #       ], 
   #       [
   #         3, 
   #         "host1:[error]: Disk full"
+  #       ], 
+  #       [
+  #         4, 
+  #         "host2:[error]: No memory"
+  #       ], 
+  #       [
+  #         5, 
+  #         "host2:[info]: Shutdown"
   #       ]
   #     ]
   #   ]

  Modified: doc/source/example/reference/regular_expression/usage_filter.log (+1 -1)
===================================================================
--- doc/source/example/reference/regular_expression/usage_filter.log    2015-09-08 12:06:36 +0900 (e23d5b2)
+++ doc/source/example/reference/regular_expression/usage_filter.log    2015-09-08 12:38:27 +0900 (72df348)
@@ -1,6 +1,6 @@
 Execution example::
 
-  select Logs --filter 'message @~ "[Dd]isk"'
+  select Logs --filter 'message @~ "disk (space|full)"'
   # [
   #   [
   #     0, 

  Modified: doc/source/example/reference/regular_expression/usage_query.log (+1 -1)
===================================================================
--- doc/source/example/reference/regular_expression/usage_query.log    2015-09-08 12:06:36 +0900 (e63185b)
+++ doc/source/example/reference/regular_expression/usage_query.log    2015-09-08 12:38:27 +0900 (49121cf)
@@ -1,6 +1,6 @@
 Execution example::
 
-  select Logs --query 'message:~"[Dd]isk"'
+  select Logs --query 'message:~"disk (space|full)"'
   # [
   #   [
   #     0, 

  Modified: doc/source/reference/regular_expression.rst (+32 -6)
===================================================================
--- doc/source/reference/regular_expression.rst    2015-09-08 12:06:36 +0900 (7ac2c38)
+++ doc/source/reference/regular_expression.rst    2015-09-08 12:38:27 +0900 (5064017)
@@ -28,6 +28,27 @@ In some cases, pattern match by regular expression can be evaluated
 by index. It's very fast rather than sequential search. Patterns
 that can be evaluated by index are described later.
 
+.. versionadded:: 5.0.7
+
+   Groonga normalizes match target text by :ref:`normalizer-auto`
+   normalizer when Groonga doesn't use index for regular expression
+   search. It means that a regular expression that has upper case
+   such as ``Groonga`` never matches, because :ref:`normalizer-auto`
+   normalizer normalizes all letters to lower case. ``groonga``
+   matches both ``Groonga`` and ``groonga``.
+
+   Why is match target text normalized? It's for increasing index
+   search-able patterns. If Groonga didn't normalize match target
+   text, you would need to write a complex regular expression such as
+   ``[Dd][Ii][Ss][Kk]`` or ``(?i)disk`` for case-insensitive match.
+   Groonga can't use index against complex regular expressions.
+
+   If you write ``disk`` regular expression for case-insensitive
+   match, Groonga can search the pattern with index. It's fast.
+
+   You may feel the behavior is strange. But fast search based on this
+   behavior will help you.
+
 There are many regular expression syntaxes. Groonga uses the same
 syntax in Ruby. Because Groonga uses the same regular expression
 engine as Ruby. The regular expression engine is `Onigmo
@@ -39,8 +60,13 @@ means the end of text in other most regular expression syntaxes. The regular
 expression syntax in Ruby uses ``\A`` for the beginning of text and
 ``\z`` for the end of text.
 
-Groonga uses multiline mode since 5.0.6. It means that ``.`` matches
-on ``\n``.
+.. versionadded:: 5.0.6
+
+   Groonga uses multiline mode since 5.0.6. It means that ``.``
+   matches on ``\n``.
+
+   But it's meaningless because ``\n`` is removed by
+   :ref:`normalizer-auto` normalizer.
 
 You can use regular expression in :ref:`select-query` and
 :ref:`select-filter` options of :doc:`/reference/commands/select`
@@ -73,7 +99,7 @@ Here is an example that uses regular expression in
 
 .. groonga-command
 .. include:: ../example/reference/regular_expression/usage_query.log
-.. select Logs --query 'message:~"[Dd]isk"'
+.. select Logs --query 'message:~"disk (space|full)"'
 
 Here is an example that uses regular expression in
 :ref:`select-filter`. You need to use
@@ -81,7 +107,7 @@ Here is an example that uses regular expression in
 
 .. groonga-command
 .. include:: ../example/reference/regular_expression/usage_filter.log
-.. select Logs --filter 'message @~ "[Dd]isk"'
+.. select Logs --filter 'message @~ "disk (space|full)"'
 
 Index
 -----
@@ -295,11 +321,11 @@ Character class
 Character class syntax is ``[...]``. Character class is useful to
 specify multiple characters to be matched.
 
-For example, ``[Dd]`` matches ``D`` or ``d``.
+For example, ``[12]`` matches ``1`` or ``2``.
 
 .. groonga-command
 .. include:: ../example/reference/regular_expression/character_class_characters.log
-.. select Logs --filter 'message @~ "[Dd]isk"'
+.. select Logs --filter 'message @~ "host[12]"'
 
 You can specify characters by range. For example, ``[0-9]`` matches
 one digit.
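
Illustration (not part of the commit): the behavior documented in this change can be approximated outside Groonga. The Python sketch below assumes that normalization amounts to lower-casing the match target and removing ``\n``, which are the only two effects this commit describes; the real normalizer-auto may do more. The names `normalize_like_auto` and `regexp_match` are hypothetical, and Python's `re` module stands in for Onigmo.

```python
import re


def normalize_like_auto(text: str) -> str:
    """Rough stand-in for normalizer-auto (assumption): lower-case the
    text and drop newlines. The real normalizer is not reproduced here."""
    return text.lower().replace("\n", "")


def regexp_match(pattern: str, target: str) -> bool:
    """Match `pattern` against the normalized target.

    Only the target is normalized, not the pattern, so a pattern that
    contains upper-case letters can never match. re.DOTALL mimics the
    multiline mode mentioned in the docs ('.' matches a newline).
    """
    normalized = normalize_like_auto(target)
    return re.search(pattern, normalized, flags=re.DOTALL) is not None


# Sample messages taken from the execution examples in this commit.
messages = [
    "host1:[error]: No memory",
    "host1:[warning]: Remained disk space is less than 30%",
    "host1:[error]: Disk full",
    "host2:[error]: No memory",
    "host2:[info]: Shutdown",
]

# Messages matched by the lower-case pattern used in the updated examples.
disk_hits = [m for m in messages if regexp_match("disk (space|full)", m)]

# Messages matched by the character class example.
host_hits = [m for m in messages if regexp_match("host[12]", m)]
```

Under these assumptions, `regexp_match("groonga", "Groonga")` is true while `regexp_match("Groonga", "Groonga")` is false, which is why the examples in this commit switch from ``[Dd]isk`` to lower-case patterns.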