[Groonga-commit] groonga/groonga at 72be225 [master] doc: Separate from tokenizers page

Yasuhiro Horimoto null+****@clear*****
Fri Jan 4 12:43:09 JST 2019


Yasuhiro Horimoto	2019-01-04 12:43:09 +0900 (Fri, 04 Jan 2019)

  Revision: 72be2250bedb7b315e6542b3faf8728ece09f311
  https://github.com/groonga/groonga/commit/72be2250bedb7b315e6542b3faf8728ece09f311

  Message:
    doc: Separate from tokenizers page

  Added files:
    doc/source/reference/tokenizers/token_bigram_ignore_blank_split_symbol_alpha_digit.rst
  Modified files:
    doc/locale/ja/LC_MESSAGES/reference.po
    doc/source/reference/tokenizers.rst

  Modified: doc/locale/ja/LC_MESSAGES/reference.po (+4 -32)
===================================================================
--- doc/locale/ja/LC_MESSAGES/reference.po    2019-01-04 12:28:55 +0900 (bfdbe7b16)
+++ doc/locale/ja/LC_MESSAGES/reference.po    2019-01-04 12:43:09 +0900 (841cd0d66)
@@ -27376,6 +27376,10 @@ msgstr ""
 msgid "Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlpha``:"
 msgstr "``TokenBigramIgnoreBlankSplitSymbolAlpha`` の実行結果です。"
 
+msgid "``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` hasn't parameter::"
+msgstr ""
+"``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` には、引数がありません。"
+
 msgid "``TokenBigramSplitSymbol``"
 msgstr ""
 
@@ -28265,35 +28269,3 @@ msgstr ""
 
 msgid "``window_sum``"
 msgstr ""
-
-#~ msgid ""
-#~ "``TokenBigramSplitSymbolAlphaDigit`` is similar to :ref:`token-bigram`. "
-#~ "The difference between them is symbol, alphabet and digit handling. "
-#~ "``TokenBigramSplitSymbolAlphaDigit`` tokenizes symbols, alphabets and "
-#~ "digits by bigram tokenize method. It means that all characters are "
-#~ "tokenized by bigram tokenize method:"
-#~ msgstr ""
-#~ "``TokenBigramSplitSymbolAlphaDigit`` は :ref:`token-bigram` と似ています。"
-#~ "違いは記号とアルファベットと数字の扱いです。 "
-#~ "``TokenBigramSplitSymbolAlphaDigit`` は記号とアルファベット数字のトークナ"
-#~ "イズ方法にバイグラムを使います。つまり、すべての文字をバイグラムでトークナ"
-#~ "イズします。"
-
-#~ msgid ""
-#~ "``TokenBigramSplitSymbolAlpha`` is similar to :ref:`token-bigram`. The "
-#~ "difference between them is symbol and alphabet handling. "
-#~ "``TokenBigramSplitSymbolAlpha`` tokenizes symbols and alphabets by bigram "
-#~ "tokenize method:"
-#~ msgstr ""
-#~ "``TokenBigramSplitSymbolAlpha`` は :ref:`token-bigram` と似ています。違い"
-#~ "は記号とアルファベットの扱いです。 ``TokenBigramSplitSymbolAlpha`` は記号"
-#~ "とアルファベットのトークナイズ方法にバイグラムを使います。"
-
-#~ msgid ""
-#~ "``TokenBigramSplitSymbol`` is similar to :ref:`token-bigram`. The "
-#~ "difference between them is symbol handling. ``TokenBigramSplitSymbol`` "
-#~ "tokenizes symbols by bigram tokenize method:"
-#~ msgstr ""
-#~ "``TokenBigramSplitSymbol`` は :ref:`token-bigram` と似ています。違いは記号"
-#~ "の扱いです。 ``TokenBigramSplitSymbol`` は記号のトークナイズ方法にバイグラ"
-#~ "ムを使います。"

  Modified: doc/source/reference/tokenizers.rst (+0 -35)
===================================================================
--- doc/source/reference/tokenizers.rst    2019-01-04 12:28:55 +0900 (38dbc6c05)
+++ doc/source/reference/tokenizers.rst    2019-01-04 12:43:09 +0900 (8be95a2f8)
@@ -107,7 +107,6 @@ Built-in tokenizsers
 
 Here is a list of built-in tokenizers:
 
-  * ``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``
   * ``TokenUnigram``
   * ``TokenTrigram``
   * ``TokenDelimit``
@@ -121,40 +120,6 @@ Here is a list of built-in tokenizers:
 
    tokenizers/*
 
-.. _token-bigram-ignore-blank-split-symbol-alpha-digit:
-
-``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` is similar to
-:ref:`token-bigram`. The differences between them are the followings:
-
-  * Blank handling
-  * Symbol, alphabet and digit handling
-
-``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` ignores white-spaces
-in continuous symbols and non-ASCII characters.
-
-``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` tokenizes symbols,
-alphabets and digits by bigram tokenize method. It means that all
-characters are tokenized by bigram tokenize method.
-
-You can find difference of them by ``Hello 日 本 語 ! ! ! 777`` text
-because it has symbols and non-ASCII characters with white spaces,
-alphabets and digits.
-
-Here is a result by :ref:`token-bigram` :
-
-.. groonga-command
-.. include:: ../example/reference/tokenizers/token-bigram-with-white-spaces-and-symbol-and-alphabet-and-digit.log
-.. tokenize TokenBigram "Hello 日 本 語 ! ! ! 777" NormalizerAuto
-
-Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``:
-
-.. groonga-command
-.. include:: ../example/reference/tokenizers/token-bigram-ignore-blank-split-symbol-with-white-spaces-and-symbol-and-alphabet-digit.log
-.. tokenize TokenBigramIgnoreBlankSplitSymbolAlphaDigit "Hello 日 本 語 ! ! ! 777" NormalizerAuto
-
 .. _token-unigram:
 
 ``TokenUnigram``

  Added: doc/source/reference/tokenizers/token_bigram_ignore_blank_split_symbol_alpha_digit.rst (+53 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/reference/tokenizers/token_bigram_ignore_blank_split_symbol_alpha_digit.rst    2019-01-04 12:43:09 +0900 (87df21eae)
@@ -0,0 +1,53 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+.. groonga-command
+.. database: tokenizers
+
+.. _token-bigram-ignore-blank-split-symbol-alpha-digit:
+
+``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``
+===============================================
+
+Summary
+-------
+
+``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` is similar to
+:ref:`token-bigram`. The differences between them are the followings:
+
+  * Blank handling
+  * Symbol, alphabet and digit handling
+
+Syntax
+------
+
+``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` hasn't parameter::
+
+  TokenBigramIgnoreBlankSplitSymbolAlphaDigit
+
+Usage
+-----
+
+``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` ignores white-spaces
+in continuous symbols and non-ASCII characters.
+
+``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` tokenizes symbols,
+alphabets and digits by bigram tokenize method. It means that all
+characters are tokenized by bigram tokenize method.
+
+You can find difference of them by ``Hello 日 本 語 ! ! ! 777`` text
+because it has symbols and non-ASCII characters with white spaces,
+alphabets and digits.
+
+Here is a result by :ref:`token-bigram` :
+
+.. groonga-command
+.. include:: ../../example/reference/tokenizers/token-bigram-with-white-spaces-and-symbol-and-alphabet-and-digit.log
+.. tokenize TokenBigram "Hello 日 本 語 ! ! ! 777" NormalizerAuto
+
+Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``:
+
+.. groonga-command
+.. include:: ../../example/reference/tokenizers/token-bigram-ignore-blank-split-symbol-with-white-spaces-and-symbol-and-alphabet-digit.log
+.. tokenize TokenBigramIgnoreBlankSplitSymbolAlphaDigit "Hello 日 本 語 ! ! ! 777" NormalizerAuto
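
Note on the behavior described in the Usage section above: the two rules (ignore white-spaces, tokenize every character class by bigram) can be modeled with a short Python sketch. This is only an illustration under stated assumptions, not Groonga's implementation. The function name ``bigram_ignore_blank`` is hypothetical, ``NormalizerAuto`` is approximated by plain lowercasing, and emitting the final single character as its own token is an assumption of this sketch. Run the ``tokenize`` command shown in the diff to see the real output.

```python
def bigram_ignore_blank(text: str) -> list[str]:
    """Simplified model: drop white-space, then emit overlapping bigrams.

    Assumptions (not taken from Groonga's source):
    - normalization is approximated by str.lower()
    - the last character is emitted alone as a trailing one-character token
    """
    # Rule 1: ignore white-spaces everywhere in the input.
    chars = [c for c in text.lower() if not c.isspace()]
    # Rule 2: treat symbols, alphabets, digits and non-ASCII characters
    # alike, sliding a two-character window one position at a time.
    return ["".join(chars[i:i + 2]) for i in range(len(chars))]


if __name__ == "__main__":
    for token in bigram_ignore_blank("Hello 日 本 語 ! ! ! 777"):
        print(token)
```

Because white-space is removed before the window slides, ``語`` and the first ``!`` end up in the same token even though the input separates them with a blank.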