Kouhei Sutou
null+****@clear*****
Tue Sep 8 12:38:27 JST 2015
Kouhei Sutou    2015-09-08 12:38:27 +0900 (Tue, 08 Sep 2015)

  New Revision: dd11f396c9b86728ef2a84d6be0dd5e9ede049ee
  https://github.com/groonga/groonga/commit/dd11f396c9b86728ef2a84d6be0dd5e9ede049ee

  Message:
    doc: document new regular expression behavior

  Modified files:
    doc/locale/ja/LC_MESSAGES/reference.po
    doc/source/example/reference/regular_expression/character_class_characters.log
    doc/source/example/reference/regular_expression/usage_filter.log
    doc/source/example/reference/regular_expression/usage_query.log
    doc/source/reference/regular_expression.rst

  Modified: doc/locale/ja/LC_MESSAGES/reference.po (+45 -3)
===================================================================
--- doc/locale/ja/LC_MESSAGES/reference.po    2015-09-08 12:06:36 +0900 (7417083)
+++ doc/locale/ja/LC_MESSAGES/reference.po    2015-09-08 12:38:27 +0900 (12a41b2)
@@ -7,7 +7,7 @@ msgid ""
 msgstr ""
 "Project-Id-Version: 1.2.1\n"
 "Report-Msgid-Bugs-To: \n"
-"PO-Revision-Date: 2015-08-30 23:37+0900\n"
+"PO-Revision-Date: 2015-09-08 12:38+0900\n"
 "Last-Translator: Masafumi Yokoyama <yokoyama at clear-code.com>\n"
 "Language-Team: Japanese\n"
 "Language: ja\n"
@@ -17261,6 +17261,41 @@ msgstr ""
 "します。これは逐次検索よりも非常に高速です。インデックスを使って評価できるパ"
 "ターンについては後述します。"
 
+msgid ""
+"Groonga normalizes match target text by :ref:`normalizer-auto` normalizer "
+"when Groonga doesn't use index for regular expression search. It means that "
+"regular expression that has upper case such as ``Groonga`` never match. "
+"Because :ref:`normalizer-auto` normalizer normalizes all alphabets to lower "
+"case. ``groonga`` matches to both ``Groonga`` and ``groonga``."
+msgstr "Groongaは正規表現検索にインデックスを使わないときは、 :ref:`normalizer-auto` ノーまライザーでマッチ対象のテキストを正規化します。これは、 ``Groonga`` というような大文字を使った正規表現は必ずマッチに失敗するということです。なぜなら、 :ref:`normalizer-auto` ノーマライザーはすべてのアルファベットを小文字に正規化するからです。 ``groonga`` は ``Groonga`` にも ``groonga`` にも両方にマッチします。"
+
+msgid ""
+"Why is match target text normalizered? It's for increasing index search-able "
+"patterns. If Groonga doesn't normalize match target text, you need to write "
+"complex regular expression such as ``[Dd][Ii][Ss][Kk]`` and ``(?i)disk`` for "
+"case-insensitive match. Groonga can't use index against complex regular "
+"expression."
+msgstr ""
+"なぜマッチ対象のテキストを正規化するのでしょうか?それは、インデックスを使っ"
+"て検索できるパターンを増やすためです。もし、Groongaがマッチ対象のテキストを正"
+"規化しなかった場合、大文字小文字を区別しないマッチをするために、 ``[Dd][Ii]"
+"[Ss][Kk]`` や ``(?i)disk`` のような複雑な正規表現を書く必要があります。"
+"Groonga複雑な正規表現に対してはインデックスを使うことができません。"
+
+msgid ""
+"If you write ``disk`` regular expression for case-insensitive match, Groonga "
+"can search the pattern with index. It's fast."
+msgstr ""
+"もし、大文字小文字を区別しないマッチに ``disk`` という正規表現を使うなら、"
+"Groongaはインデックスを使ってこのパターンを検索できます。これは高速です。"
+
+msgid ""
+"You may feel the behavior is strange. But fast search based on this behavior "
+"will help you."
+msgstr ""
+"この挙動を奇妙に思うかもしれません。しかし、この挙動のおかげで高速に検索でき"
+"ることはきっとあなたの役に立つはずです。"
+
 # 1d4514e7a6904e77802262bf59e4d4c1
 msgid ""
 "There are many regular expression syntaxes. Groonga uses the same syntax in "
@@ -17290,6 +17325,13 @@ msgstr ""
 "\\n`` にマッチするということです。"
 
 msgid ""
+"But it's meaningless. Because ``\\n`` is removed by :ref:`normalizer-auto` "
+"normalizer."
+msgstr ""
+"しかし、この挙動は意味がありません。なぜなら、 ``\\n`` は :ref:`normalizer-"
+"auto` ノーマライザーが削除するからです。"
+
+msgid ""
 "You can use regular expression in :ref:`select-query` and :ref:`select-"
 "filter` options of :doc:`/reference/commands/select` command."
 msgstr ""
@@ -17716,8 +17758,8 @@ msgstr ""
 "きに便利です。"
 
 # c6b7f6856d7249ba996db3e06687031a
-msgid "For example, ``[Dd]`` matches ``D`` or ``d``."
-msgstr "たとえば、 ``[Dd]`` は ``D`` または ``d`` にマッチします。"
+msgid "For example, ``[12]`` matches ``1`` or ``2``."
+msgstr "たとえば、 ``[12]`` は ``1`` または ``2`` にマッチします。"
 
 # f68244a7b7d84930b12fabd2e18f5c44
 msgid ""

  Modified: doc/source/example/reference/regular_expression/character_class_characters.log (+14 -2)
===================================================================
--- doc/source/example/reference/regular_expression/character_class_characters.log    2015-09-08 12:06:36 +0900 (e23d5b2)
+++ doc/source/example/reference/regular_expression/character_class_characters.log    2015-09-08 12:38:27 +0900 (a5cd522)
@@ -1,6 +1,6 @@
 Execution example::
 
-  select Logs --filter 'message @~ "[Dd]isk"'
+  select Logs --filter 'message @~ "host[12]"'
   # [
   #   [
   #     0,
@@ -10,7 +10,7 @@ Execution example::
   #   [
   #     [
   #       [
-  #         2
+  #         5
   #       ],
   #       [
   #         [
@@ -23,12 +23,24 @@ Execution example::
   #         ]
   #       ],
   #       [
+  #         1,
+  #         "host1:[error]: No memory"
+  #       ],
+  #       [
   #         2,
   #         "host1:[warning]: Remained disk space is less than 30%"
   #       ],
   #       [
   #         3,
   #         "host1:[error]: Disk full"
+  #       ],
+  #       [
+  #         4,
+  #         "host2:[error]: No memory"
+  #       ],
+  #       [
+  #         5,
+  #         "host2:[info]: Shutdown"
   #       ]
   #     ]
   #   ]

  Modified: doc/source/example/reference/regular_expression/usage_filter.log (+1 -1)
===================================================================
--- doc/source/example/reference/regular_expression/usage_filter.log    2015-09-08 12:06:36 +0900 (e23d5b2)
+++ doc/source/example/reference/regular_expression/usage_filter.log    2015-09-08 12:38:27 +0900 (72df348)
@@ -1,6 +1,6 @@
 Execution example::
 
-  select Logs --filter 'message @~ "[Dd]isk"'
+  select Logs --filter 'message @~ "disk (space|full)"'
   # [
   #   [
   #     0,

  Modified: doc/source/example/reference/regular_expression/usage_query.log (+1 -1)
===================================================================
--- doc/source/example/reference/regular_expression/usage_query.log    2015-09-08 12:06:36 +0900 (e63185b)
+++ doc/source/example/reference/regular_expression/usage_query.log    2015-09-08 12:38:27 +0900 (49121cf)
@@ -1,6 +1,6 @@
 Execution example::
 
-  select Logs --query 'message:~"[Dd]isk"'
+  select Logs --query 'message:~"disk (space|full)"'
   # [
   #   [
   #     0,

  Modified: doc/source/reference/regular_expression.rst (+32 -6)
===================================================================
--- doc/source/reference/regular_expression.rst    2015-09-08 12:06:36 +0900 (7ac2c38)
+++ doc/source/reference/regular_expression.rst    2015-09-08 12:38:27 +0900 (5064017)
@@ -28,6 +28,27 @@ In some cases, pattern match by regular expression can be evaluated by
 index. It's very fast rather than sequential search. Patterns that can
 be evaluated by index are described later.
 
+.. versionadded:: 5.0.7
+
+   Groonga normalizes match target text by :ref:`normalizer-auto`
+   normalizer when Groonga doesn't use index for regular expression
+   search. It means that regular expression that has upper case such
+   as ``Groonga`` never match. Because :ref:`normalizer-auto`
+   normalizer normalizes all alphabets to lower case. ``groonga``
+   matches to both ``Groonga`` and ``groonga``.
+
+   Why is match target text normalizered? It's for increasing index
+   search-able patterns. If Groonga doesn't normalize match target
+   text, you need to write complex regular expression such as
+   ``[Dd][Ii][Ss][Kk]`` and ``(?i)disk`` for case-insensitive match.
+   Groonga can't use index against complex regular expression.
+
+   If you write ``disk`` regular expression for case-insensitive
+   match, Groonga can search the pattern with index. It's fast.
+
+   You may feel the behavior is strange. But fast search based on this
+   behavior will help you.
+
 There are many regular expression syntaxes. Groonga uses the same
 syntax in Ruby. Because Groonga uses the same regular expression
 engine as Ruby. The regular expression engine is `Onigmo
@@ -39,8 +60,13 @@ means the end of text in other most regular expression syntaxes. The
 regular expression syntax in Ruby uses ``\A`` for the beginning of
 text and ``\z`` for the end of text.
 
-Groonga uses multiline mode since 5.0.6. It means that ``.`` matches
-on ``\n``.
+.. versionadded:: 5.0.6
+
+   Groonga uses multiline mode since 5.0.6. It means that ``.``
+   matches on ``\n``.
+
+   But it's meaningless. Because ``\n`` is removed by
+   :ref:`normalizer-auto` normalizer.
 
 You can use regular expression in :ref:`select-query` and
 :ref:`select-filter` options of :doc:`/reference/commands/select`
@@ -73,7 +99,7 @@ Here is an example that uses regular expression in
 
 .. groonga-command
 .. include:: ../example/reference/regular_expression/usage_query.log
-.. select Logs --query 'message:~"[Dd]isk"'
+.. select Logs --query 'message:~"disk (space|full)"'
 
 Here is an example that uses regular expression in
 :ref:`select-filter`. You need to use
@@ -81,7 +107,7 @@ Here is an example that uses regular expression in
 
 .. groonga-command
 .. include:: ../example/reference/regular_expression/usage_filter.log
-.. select Logs --filter 'message @~ "[Dd]isk"'
+.. select Logs --filter 'message @~ "disk (space|full)"'
 
 Index
 -----
@@ -295,11 +321,11 @@ Character class
 Character class syntax is ``[...]``. Character class is useful to
 specify multiple characters to be matched.
 
-For example, ``[Dd]`` matches ``D`` or ``d``.
+For example, ``[12]`` matches ``1`` or ``2``.
 
 .. groonga-command
 .. include:: ../example/reference/regular_expression/character_class_characters.log
-.. select Logs --filter 'message @~ "[Dd]isk"'
+.. select Logs --filter 'message @~ "host[12]"'
 
 You can specify characters by range. For example, ``[0-9]`` matches
 one digit.
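
Illustration of the documented behavior: the commit above says that, when no index is used, the match target text is normalized (lower-cased, ``\n`` removed) before the regular expression is applied. The Python sketch below only mimics that description. It is not Groonga code: `normalize_auto_sketch` and `sequential_match` are hypothetical names, the real NormalizerAuto may do more than these two steps, and Python's `re` module stands in for Onigmo.

```python
import re


def normalize_auto_sketch(text: str) -> str:
    """Hypothetical stand-in for NormalizerAuto.

    Assumption: only the two effects named in the commit are modeled,
    lower-casing and removal of newline characters.
    """
    return text.replace("\n", "").lower()


def sequential_match(pattern: str, target: str) -> bool:
    """Apply the pattern to the normalized target, not to the raw text."""
    return re.search(pattern, normalize_auto_sketch(target)) is not None


# An upper-case pattern cannot match, because the target is lower-cased first.
print(sequential_match("Groonga", "Groonga"))
# A lower-case pattern matches both spellings of the target.
print(sequential_match("groonga", "Groonga"))
print(sequential_match("groonga", "groonga"))
# The pattern used in the updated usage examples.
print(sequential_match("disk (space|full)", "host1:[error]: Disk full"))
```

Under these assumptions, writing the pattern in lower case is what gives a case-insensitive match, which is why the examples in the diff move from ``[Dd]isk`` to ``disk (space|full)``.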