Yasuhiro Horimoto	2019-01-04 14:57:23 +0900 (Fri, 04 Jan 2019)

  Revision: 9cebccd98704b10d478c8fb553c43ea906dbd16c
  https://github.com/groonga/groonga/commit/9cebccd98704b10d478c8fb553c43ea906dbd16c

  Message:
    doc: Separate from tokenizers page

  Added files:
    doc/source/reference/tokenizers/token_delimit_null.rst
  Modified files:
    doc/locale/ja/LC_MESSAGES/reference.po
    doc/source/reference/tokenizers.rst

  Modified: doc/locale/ja/LC_MESSAGES/reference.po (+21 -18)
===================================================================
--- doc/locale/ja/LC_MESSAGES/reference.po    2019-01-04 14:19:02 +0900 (b54322a95)
+++ doc/locale/ja/LC_MESSAGES/reference.po    2019-01-04 14:57:23 +0900 (c13d71c72)
@@ -26964,31 +26964,13 @@ msgstr "組み込みトークナイザー"
 msgid "Here is a list of built-in tokenizers:"
 msgstr "以下は組み込みのトークナイザーのリストです。"
 
-msgid "``TokenTrigram``"
-msgstr ""
-
-msgid "``TokenDelimit``"
-msgstr ""
-
 msgid "``TokenDelimitNull``"
 msgstr ""
 
-msgid "``TokenMecab``"
-msgstr ""
-
 msgid "``TokenRegexp``"
 msgstr ""
 
 msgid ""
-"``TokenTrigram`` is similar to :ref:`token-bigram`. The differences between "
-"them is token unit. :ref:`token-bigram` uses 2 characters per token. "
-"``TokenTrigram`` uses 3 characters per token."
-msgstr ""
-"``TokenTrigram`` は :ref:`token-bigram` に似ています。違いはトークンの単位で"
-"す。 :ref:`token-bigram` は各トークンが2文字ですが、 ``TokenTrigram`` は各"
-"トークンが3文字です。"
-
-msgid ""
 "``TokenDelimitNull`` is similar to :ref:`token-delimit`. The difference "
 "between them is separator character. :ref:`token-delimit` uses space "
 "character (``U+0020``) but ``TokenDelimitNull`` uses NUL character (``U"
@@ -27427,6 +27409,9 @@ msgstr ""
 "ズ方法にバイグラムを使います。つまり、すべての文字をバイグラムでトークナイズ"
 "します。"
 
+msgid "``TokenDelimit``"
+msgstr ""
+
 msgid ""
 "``TokenDelimit`` extracts token by splitting one or more space characters "
 "(``U+0020``). For example, ``Hello World`` is tokenized to ``Hello`` and "
@@ -27613,6 +27598,12 @@ msgstr "正規表現を使って、トークンを分割します。"
 msgid ":doc:`../commands/tokenize`"
 msgstr ""
 
+msgid "``TokenDelimitNull`` hasn't parameter::"
+msgstr "``TokenDelimitNull`` には、引数がありません。"
+
+msgid "``TokenMecab``"
+msgstr ""
+
 msgid ""
 "``TokenMecab`` is a tokenizer based on `MeCab <https://taku910.github.io/"
 "mecab/>`_ part-of-speech and morphological analyzer."
@@ -27776,6 +27767,9 @@ msgstr ""
 msgid "Outputs reading of token."
 msgstr "トークンの読みがなを出力します。"
 
+msgid "``TokenTrigram``"
+msgstr ""
+
 msgid ""
 "``TokenTrigram`` is similar to :ref:`token-bigram`. The differences between "
 "them is token unit."
@@ -28294,3 +28288,12 @@ msgstr ""
 
 msgid "``window_sum``"
 msgstr ""
+
+#~ msgid ""
+#~ "``TokenTrigram`` is similar to :ref:`token-bigram`. The differences "
+#~ "between them is token unit. :ref:`token-bigram` uses 2 characters per "
+#~ "token. ``TokenTrigram`` uses 3 characters per token."
+#~ msgstr ""
+#~ "``TokenTrigram`` は :ref:`token-bigram` に似ています。違いはトークンの単位"
+#~ "です。 :ref:`token-bigram` は各トークンが2文字ですが、 ``TokenTrigram`` は"
+#~ "各トークンが3文字です。"

  Modified: doc/source/reference/tokenizers.rst (+0 -19)
===================================================================
--- doc/source/reference/tokenizers.rst    2019-01-04 14:19:02 +0900 (3abf02d04)
+++ doc/source/reference/tokenizers.rst    2019-01-04 14:57:23 +0900 (0d92a6602)
@@ -107,7 +107,6 @@ Built-in tokenizsers
 
 Here is a list of built-in tokenizers:
 
-* ``TokenDelimitNull``
 * ``TokenRegexp``
 
 .. toctree::
@@ -116,24 +115,6 @@ Here is a list of built-in tokenizers:
 
    tokenizers/*
 
-.. _token-delimit-null:
-
-``TokenDelimitNull``
-^^^^^^^^^^^^^^^^^^^^
-
-``TokenDelimitNull`` is similar to :ref:`token-delimit`. The
-difference between them is separator character. :ref:`token-delimit`
-uses space character (``U+0020``) but ``TokenDelimitNull`` uses NUL
-character (``U+0000``).
-
-``TokenDelimitNull`` is also suitable for tag text.
-
-Here is an example of ``TokenDelimitNull``:
-
-.. groonga-command
-.. include:: ../example/reference/tokenizers/token-delimit-null.log
-.. tokenize TokenDelimitNull "Groonga\u0000full-text-search\u0000HTTP" NormalizerAuto
-
 .. _token-regexp:
 
 ``TokenRegexp``

  Added: doc/source/reference/tokenizers/token_delimit_null.rst (+40 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/reference/tokenizers/token_delimit_null.rst    2019-01-04 14:57:23 +0900 (2652cebb4)
@@ -0,0 +1,40 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+.. groonga-command
+.. database: tokenizers
+
+.. _token-delimit-null:
+
+``TokenDelimitNull``
+====================
+
+Summary
+-------
+
+``TokenDelimitNull`` is similar to :ref:`token-delimit`. The
+difference between them is separator character. :ref:`token-delimit`
+uses space character (``U+0020``) but ``TokenDelimitNull`` uses NUL
+character (``U+0000``).
+
+Syntax
+------
+
+``TokenDelimitNull`` hasn't parameter::
+
+  TokenDelimitNull
+
+Usage
+-----
+
+
+``TokenDelimitNull`` is also suitable for tag text.
+
+Here is an example of ``TokenDelimitNull``:
+
+.. groonga-command
+.. include:: ../../example/reference/tokenizers/token-delimit-null.log
+.. tokenize TokenDelimitNull "Groonga\u0000full-text-search\u0000HTTP" NormalizerAuto
+
+
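The new page documents that ``TokenDelimitNull`` behaves like ``TokenDelimit`` except that the separator is the NUL character (``U+0000``) rather than the space character (``U+0020``). As a rough illustration only (a plain Python sketch, not Groonga's actual C tokenizer, and ignoring the separate normalization step such as ``NormalizerAuto``), the splitting behavior can be modeled as:

```python
# Hypothetical sketch of TokenDelimitNull's splitting behavior.
# NOT Groonga's implementation; it only shows that the separator is
# NUL (U+0000) where TokenDelimit uses space (U+0020).

def token_delimit_null(text: str) -> list[str]:
    # Split on NUL and drop empty tokens produced by leading,
    # trailing, or consecutive separators.
    return [token for token in text.split("\u0000") if token]

# The example input from the new token_delimit_null.rst page:
print(token_delimit_null("Groonga\u0000full-text-search\u0000HTTP"))
# -> ['Groonga', 'full-text-search', 'HTTP']
```

As the page notes, a NUL separator is convenient for tag text: tag values may themselves contain spaces, but they cannot contain NUL, so no escaping is needed.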