[Groonga-commit] groonga/groonga at 01a5414 [master] doc: Separate from tokenizers page

Yasuhiro Horimoto null+****@clear*****
Fri Jan 4 12:28:55 JST 2019


Yasuhiro Horimoto	2019-01-04 12:28:55 +0900 (Fri, 04 Jan 2019)

  Revision: 01a541474b06e62bd022b135a8e40a7a644770e9
  https://github.com/groonga/groonga/commit/01a541474b06e62bd022b135a8e40a7a644770e9

  Message:
    doc: Separate from tokenizers page

  Added files:
    doc/source/reference/tokenizers/token_bigram_ignore_blank_split_symbol_alpha.rst
  Modified files:
    doc/locale/ja/LC_MESSAGES/reference.po
    doc/source/reference/tokenizers.rst

  Modified: doc/locale/ja/LC_MESSAGES/reference.po (+114 -111)
===================================================================
--- doc/locale/ja/LC_MESSAGES/reference.po    2019-01-04 12:04:44 +0900 (193a481a7)
+++ doc/locale/ja/LC_MESSAGES/reference.po    2019-01-04 12:28:55 +0900 (bfdbe7b16)
@@ -26964,27 +26964,6 @@ msgstr "組み込みトークナイザー"
 msgid "Here is a list of built-in tokenizers:"
 msgstr "以下は組み込みのトークナイザーのリストです。"
 
-msgid "``TokenBigram``"
-msgstr ""
-
-msgid "``TokenBigramSplitSymbol``"
-msgstr ""
-
-msgid "``TokenBigramSplitSymbolAlpha``"
-msgstr ""
-
-msgid "``TokenBigramSplitSymbolAlphaDigit``"
-msgstr ""
-
-msgid "``TokenBigramIgnoreBlank``"
-msgstr ""
-
-msgid "``TokenBigramIgnoreBlankSplitSymbol``"
-msgstr ""
-
-msgid "``TokenBigramIgnoreBlankSplitSymbolAlpha``"
-msgstr ""
-
 msgid "``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``"
 msgstr ""
 
@@ -27007,86 +26986,15 @@ msgid "``TokenRegexp``"
 msgstr ""
 
 msgid ""
-"``TokenBigramIgnoreBlankSplitSymbol`` is similar to :ref:`token-bigram`. The "
-"differences between them are the followings:"
-msgstr ""
-"``TokenBigramIgnoreBlankSplitSymbol`` は :ref:`token-bigram` と似ています。違"
-"いは次の通りです。"
-
-msgid "Blank handling"
-msgstr "空白文字の扱い"
-
-msgid "Symbol handling"
-msgstr "記号の扱い"
-
-msgid ""
-"``TokenBigramIgnoreBlankSplitSymbol`` ignores white-spaces in continuous "
-"symbols and non-ASCII characters."
-msgstr ""
-"``TokenBigramIgnoreBlankSplitSymbol`` は連続した記号と非ASCII文字の間の空白文"
-"字を無視します。"
-
-msgid ""
-"``TokenBigramIgnoreBlankSplitSymbol`` tokenizes symbols by bigram tokenize "
-"method."
-msgstr ""
-"``TokenBigramIgnoreBlankSplitSymbol`` は記号をバイグラムでトークナイズしま"
-"す。"
-
-msgid ""
-"You can find difference of them by ``日 本 語 ! ! !`` text because it has "
-"symbols and non-ASCII characters."
-msgstr ""
-"``日 本 語 ! ! !`` というテキストを使うと違いがわかります。なぜならこのテキス"
-"トは記号と非ASCII文字を両方含んでいるからです。"
-
-msgid "Here is a result by :ref:`token-bigram` :"
-msgstr ":ref:`token-bigram` での実行結果です。"
-
-msgid "Here is a result by ``TokenBigramIgnoreBlankSplitSymbol``:"
-msgstr "``TokenBigramIgnoreBlankSplitSymbol`` の実行結果です。"
-
-msgid ""
-"``TokenBigramIgnoreBlankSplitSymbolAlpha`` is similar to :ref:`token-"
-"bigram`. The differences between them are the followings:"
-msgstr ""
-"``TokenBigramIgnoreBlankSplitSymbolAlpha`` は :ref:`token-bigram` と似ていま"
-"す。違いは次の通りです。"
-
-msgid "Symbol and alphabet handling"
-msgstr "記号とアルファベットの扱い"
-
-msgid ""
-"``TokenBigramIgnoreBlankSplitSymbolAlpha`` ignores white-spaces in "
-"continuous symbols and non-ASCII characters."
-msgstr ""
-"``TokenBigramIgnoreBlankSplitSymbolAlpha`` は連続した記号と非ASCII文字の間の"
-"空白文字を無視します。"
-
-msgid ""
-"``TokenBigramIgnoreBlankSplitSymbolAlpha`` tokenizes symbols and alphabets "
-"by bigram tokenize method."
-msgstr ""
-"``TokenBigramIgnoreBlankSplitSymbolAlpha`` は記号とアルファベットをバイグラム"
-"でトークナイズします。"
-
-msgid ""
-"You can find difference of them by ``Hello 日 本 語 ! ! !`` text because it "
-"has symbols and non-ASCII characters with white spaces and alphabets."
-msgstr ""
-"``Hello 日 本 語 ! ! !`` というテキストを使うと違いがわかります。なぜなら空白"
-"文字入りの記号と非ASCII文字だけでなく、アルファベットも含んでいるからです。"
-
-msgid "Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlpha``:"
-msgstr "``TokenBigramIgnoreBlankSplitSymbolAlpha`` の実行結果です。"
-
-msgid ""
 "``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` is similar to :ref:`token-"
 "bigram`. The differences between them are the followings:"
 msgstr ""
 "``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` は :ref:`token-bigram` と似て"
 "います。違いは次の通りです。"
 
+msgid "Blank handling"
+msgstr "空白文字の扱い"
+
 msgid "Symbol, alphabet and digit handling"
 msgstr "記号とアルファベットと数字の扱い"
 
@@ -27115,6 +27023,9 @@ msgstr ""
 "ら、このテキストは空白文字入りの記号と非ASCII文字だけでなく、アルファベットと"
 "数字も含んでいるからです。"
 
+msgid "Here is a result by :ref:`token-bigram` :"
+msgstr ":ref:`token-bigram` での実行結果です。"
+
 msgid "Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``:"
 msgstr "``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` の実行結果です。"
 
@@ -27200,6 +27111,9 @@ msgstr ""
 "入れ、テキストの最後にテキストの最後であるというマーク( ``U+FFF0`` )を入れ"
 "ます。"
 
+msgid "``TokenBigram``"
+msgstr ""
+
 msgid ""
 "``TokenBigram`` is a bigram based tokenizer. It's recommended to use this "
 "tokenizer for most cases."
@@ -27364,6 +27278,9 @@ msgstr ""
 "以下は ``TokenBigram`` が非ASCII文字にはトークナイズ方法としてバイグラムを使"
 "う例です。"
 
+msgid "``TokenBigramIgnoreBlank``"
+msgstr ""
+
 msgid ""
 "``TokenBigramIgnoreBlank`` is similar to :ref:`token-bigram`. The difference "
 "between them is blank handling. ``TokenBigramIgnoreBlank`` ignores white-"
@@ -27376,13 +27293,93 @@ msgstr ""
 msgid "``TokenBigramIgnoreBlank`` hasn't parameter::"
 msgstr "``TokenBigramIgnoreBlank`` には、引数がありません。"
 
+msgid ""
+"You can find difference of them by ``日 本 語 ! ! !`` text because it has "
+"symbols and non-ASCII characters."
+msgstr ""
+"``日 本 語 ! ! !`` というテキストを使うと違いがわかります。なぜならこのテキス"
+"トは記号と非ASCII文字を両方含んでいるからです。"
+
 msgid "Here is a result by ``TokenBigramIgnoreBlank``:"
 msgstr "``TokenBigramIgnoreBlank`` での実行結果です。"
 
+msgid "``TokenBigramIgnoreBlankSplitSymbol``"
+msgstr ""
+
+msgid ""
+"``TokenBigramIgnoreBlankSplitSymbol`` is similar to :ref:`token-bigram`. The "
+"differences between them are the followings:"
+msgstr ""
+"``TokenBigramIgnoreBlankSplitSymbol`` は :ref:`token-bigram` と似ています。違"
+"いは次の通りです。"
+
+msgid "Symbol handling"
+msgstr "記号の扱い"
+
 msgid "``TokenBigramIgnoreBlankSplitSymbol`` hasn't parameter::"
 msgstr "``TokenBigramIgnoreBlankSplitSymbol`` には、引数がありません。"
 
 msgid ""
+"``TokenBigramIgnoreBlankSplitSymbol`` ignores white-spaces in continuous "
+"symbols and non-ASCII characters."
+msgstr ""
+"``TokenBigramIgnoreBlankSplitSymbol`` は連続した記号と非ASCII文字の間の空白文"
+"字を無視します。"
+
+msgid ""
+"``TokenBigramIgnoreBlankSplitSymbol`` tokenizes symbols by bigram tokenize "
+"method."
+msgstr ""
+"``TokenBigramIgnoreBlankSplitSymbol`` は記号をバイグラムでトークナイズしま"
+"す。"
+
+msgid "Here is a result by ``TokenBigramIgnoreBlankSplitSymbol``:"
+msgstr "``TokenBigramIgnoreBlankSplitSymbol`` の実行結果です。"
+
+msgid "``TokenBigramIgnoreBlankSplitSymbolAlpha``"
+msgstr ""
+
+msgid ""
+"``TokenBigramIgnoreBlankSplitSymbolAlpha`` is similar to :ref:`token-"
+"bigram`. The differences between them are the followings:"
+msgstr ""
+"``TokenBigramIgnoreBlankSplitSymbolAlpha`` は :ref:`token-bigram` と似ていま"
+"す。違いは次の通りです。"
+
+msgid "Symbol and alphabet handling"
+msgstr "記号とアルファベットの扱い"
+
+msgid "``TokenBigramIgnoreBlankSplitSymbolAlpha`` hasn't parameter::"
+msgstr "``TokenBigramIgnoreBlankSplitSymbolAlpha`` には、引数がありません。"
+
+msgid ""
+"``TokenBigramIgnoreBlankSplitSymbolAlpha`` ignores white-spaces in "
+"continuous symbols and non-ASCII characters."
+msgstr ""
+"``TokenBigramIgnoreBlankSplitSymbolAlpha`` は連続した記号と非ASCII文字の間の"
+"空白文字を無視します。"
+
+msgid ""
+"``TokenBigramIgnoreBlankSplitSymbolAlpha`` tokenizes symbols and alphabets "
+"by bigram tokenize method."
+msgstr ""
+"``TokenBigramIgnoreBlankSplitSymbolAlpha`` は記号とアルファベットをバイグラム"
+"でトークナイズします。"
+
+msgid ""
+"You can find difference of them by ``Hello 日 本 語 ! ! !`` text because it "
+"has symbols and non-ASCII characters with white spaces and alphabets."
+msgstr ""
+"``Hello 日 本 語 ! ! !`` というテキストを使うと違いがわかります。なぜなら空白"
+"文字入りの記号と非ASCII文字だけでなく、アルファベットも含んでいるからです。"
+
+msgid "Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlpha``:"
+msgstr "``TokenBigramIgnoreBlankSplitSymbolAlpha`` の実行結果です。"
+
+msgid "``TokenBigramSplitSymbol``"
+msgstr ""
+
+msgid ""
 "``TokenBigramSplitSymbol`` is similar to :ref:`token-bigram`. The difference "
 "between them is symbol handling."
 msgstr ""
@@ -27396,6 +27393,9 @@ msgid "``TokenBigramSplitSymbol`` tokenizes symbols by bigram tokenize method:"
 msgstr ""
 "``TokenBigramSplitSymbol`` は記号のトークナイズ方法にバイグラムを使います。"
 
+msgid "``TokenBigramSplitSymbolAlpha``"
+msgstr ""
+
 msgid ""
 "``TokenBigramSplitSymbolAlpha`` is similar to :ref:`token-bigram`. The "
 "difference between them is symbol and alphabet handling."
@@ -27413,6 +27413,9 @@ msgstr ""
 "``TokenBigramSplitSymbolAlpha`` は記号とアルファベットのトークナイズ方法にバ"
 "イグラムを使います。"
 
+msgid "``TokenBigramSplitSymbolAlphaDigit``"
+msgstr ""
+
 msgid ""
 "``TokenBigramSplitSymbolAlphaDigit`` is similar to :ref:`token-bigram`. The "
 "difference between them is symbol, alphabet and digit handling."
@@ -28264,13 +28267,17 @@ msgid "``window_sum``"
 msgstr ""
 
 #~ msgid ""
-#~ "``TokenBigramSplitSymbol`` is similar to :ref:`token-bigram`. The "
-#~ "difference between them is symbol handling. ``TokenBigramSplitSymbol`` "
-#~ "tokenizes symbols by bigram tokenize method:"
+#~ "``TokenBigramSplitSymbolAlphaDigit`` is similar to :ref:`token-bigram`. "
+#~ "The difference between them is symbol, alphabet and digit handling. "
+#~ "``TokenBigramSplitSymbolAlphaDigit`` tokenizes symbols, alphabets and "
+#~ "digits by bigram tokenize method. It means that all characters are "
+#~ "tokenized by bigram tokenize method:"
 #~ msgstr ""
-#~ "``TokenBigramSplitSymbol`` は :ref:`token-bigram` と似ています。違いは記号"
-#~ "の扱いです。 ``TokenBigramSplitSymbol`` は記号のトークナイズ方法にバイグラ"
-#~ "ムを使います。"
+#~ "``TokenBigramSplitSymbolAlphaDigit`` は :ref:`token-bigram` と似ています。"
+#~ "違いは記号とアルファベットと数字の扱いです。 "
+#~ "``TokenBigramSplitSymbolAlphaDigit`` は記号とアルファベット数字のトークナ"
+#~ "イズ方法にバイグラムを使います。つまり、すべての文字をバイグラムでトークナ"
+#~ "イズします。"
 
 #~ msgid ""
 #~ "``TokenBigramSplitSymbolAlpha`` is similar to :ref:`token-bigram`. The "
@@ -28283,14 +28290,10 @@ msgstr ""
 #~ "とアルファベットのトークナイズ方法にバイグラムを使います。"
 
 #~ msgid ""
-#~ "``TokenBigramSplitSymbolAlphaDigit`` is similar to :ref:`token-bigram`. "
-#~ "The difference between them is symbol, alphabet and digit handling. "
-#~ "``TokenBigramSplitSymbolAlphaDigit`` tokenizes symbols, alphabets and "
-#~ "digits by bigram tokenize method. It means that all characters are "
-#~ "tokenized by bigram tokenize method:"
+#~ "``TokenBigramSplitSymbol`` is similar to :ref:`token-bigram`. The "
+#~ "difference between them is symbol handling. ``TokenBigramSplitSymbol`` "
+#~ "tokenizes symbols by bigram tokenize method:"
 #~ msgstr ""
-#~ "``TokenBigramSplitSymbolAlphaDigit`` は :ref:`token-bigram` と似ています。"
-#~ "違いは記号とアルファベットと数字の扱いです。 "
-#~ "``TokenBigramSplitSymbolAlphaDigit`` は記号とアルファベット数字のトークナ"
-#~ "イズ方法にバイグラムを使います。つまり、すべての文字をバイグラムでトークナ"
-#~ "イズします。"
+#~ "``TokenBigramSplitSymbol`` は :ref:`token-bigram` と似ています。違いは記号"
+#~ "の扱いです。 ``TokenBigramSplitSymbol`` は記号のトークナイズ方法にバイグラ"
+#~ "ムを使います。"

  Modified: doc/source/reference/tokenizers.rst (+0 -33)
===================================================================
--- doc/source/reference/tokenizers.rst    2019-01-04 12:04:44 +0900 (35887342e)
+++ doc/source/reference/tokenizers.rst    2019-01-04 12:28:55 +0900 (38dbc6c05)
@@ -107,7 +107,6 @@ Built-in tokenizsers
 
 Here is a list of built-in tokenizers:
 
-  * ``TokenBigramIgnoreBlankSplitSymbolAlpha``
   * ``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``
   * ``TokenUnigram``
   * ``TokenTrigram``
@@ -122,38 +121,6 @@ Here is a list of built-in tokenizers:
 
    tokenizers/*
 
-.. _token-bigram-ignore-blank-split-symbol-alpha:
-
-``TokenBigramIgnoreBlankSplitSymbolAlpha``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-``TokenBigramIgnoreBlankSplitSymbolAlpha`` is similar to
-:ref:`token-bigram`. The differences between them are the followings:
-
-  * Blank handling
-  * Symbol and alphabet handling
-
-``TokenBigramIgnoreBlankSplitSymbolAlpha`` ignores white-spaces in
-continuous symbols and non-ASCII characters.
-
-``TokenBigramIgnoreBlankSplitSymbolAlpha`` tokenizes symbols and
-alphabets by bigram tokenize method.
-
-You can find difference of them by ``Hello 日 本 語 ! ! !`` text because it
-has symbols and non-ASCII characters with white spaces and alphabets.
-
-Here is a result by :ref:`token-bigram` :
-
-.. groonga-command
-.. include:: ../example/reference/tokenizers/token-bigram-with-white-spaces-and-symbol-and-alphabet.log
-.. tokenize TokenBigram "Hello 日 本 語 ! ! !" NormalizerAuto
-
-Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlpha``:
-
-.. groonga-command
-.. include:: ../example/reference/tokenizers/token-bigram-ignore-blank-split-symbol-with-white-spaces-and-symbol-and-alphabet.log
-.. tokenize TokenBigramIgnoreBlankSplitSymbolAlpha "Hello 日 本 語 ! ! !" NormalizerAuto
-
 .. _token-bigram-ignore-blank-split-symbol-alpha-digit:
 
 ``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``

  Added: doc/source/reference/tokenizers/token_bigram_ignore_blank_split_symbol_alpha.rst (+51 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/reference/tokenizers/token_bigram_ignore_blank_split_symbol_alpha.rst    2019-01-04 12:28:55 +0900 (ec662ef5c)
@@ -0,0 +1,51 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+.. groonga-command
+.. database: tokenizers
+
+.. _token-bigram-ignore-blank-split-symbol-alpha:
+
+``TokenBigramIgnoreBlankSplitSymbolAlpha``
+==========================================
+
+Summary
+-------
+
+``TokenBigramIgnoreBlankSplitSymbolAlpha`` is similar to
+:ref:`token-bigram`. The differences between them are the followings:
+
+  * Blank handling
+  * Symbol and alphabet handling
+
+Syntax
+------
+
+``TokenBigramIgnoreBlankSplitSymbolAlpha`` hasn't parameter::
+
+  TokenBigramIgnoreBlankSplitSymbolAlpha
+
+Usage
+-----
+
+``TokenBigramIgnoreBlankSplitSymbolAlpha`` ignores white-spaces in
+continuous symbols and non-ASCII characters.
+
+``TokenBigramIgnoreBlankSplitSymbolAlpha`` tokenizes symbols and
+alphabets by bigram tokenize method.
+
+You can find difference of them by ``Hello 日 本 語 ! ! !`` text because it
+has symbols and non-ASCII characters with white spaces and alphabets.
+
+Here is a result by :ref:`token-bigram` :
+
+.. groonga-command
+.. include:: ../../example/reference/tokenizers/token-bigram-with-white-spaces-and-symbol-and-alphabet.log
+.. tokenize TokenBigram "Hello 日 本 語 ! ! !" NormalizerAuto
+
+Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlpha``:
+
+.. groonga-command
+.. include:: ../../example/reference/tokenizers/token-bigram-ignore-blank-split-symbol-with-white-spaces-and-symbol-and-alphabet.log
+.. tokenize TokenBigramIgnoreBlankSplitSymbolAlpha "Hello 日 本 語 ! ! !" NormalizerAuto