Kouhei Sutou
null+****@clear*****
Mon Mar 16 19:07:49 JST 2015
Kouhei Sutou 2015-03-16 19:07:49 +0900 (Mon, 16 Mar 2015) New Revision: d8d41db3a5aba26e7e2767825b8899c0628fb084 https://github.com/groonga/groonga/commit/d8d41db3a5aba26e7e2767825b8899c0628fb084 Message: doc: fix markups Modified files: doc/locale/ja/LC_MESSAGES/reference.po doc/source/reference/tokenizers.rst Modified: doc/locale/ja/LC_MESSAGES/reference.po (+20 -32) =================================================================== --- doc/locale/ja/LC_MESSAGES/reference.po 2015-03-16 19:00:27 +0900 (e2e6d0e) +++ doc/locale/ja/LC_MESSAGES/reference.po 2015-03-16 19:07:49 +0900 (f6a1932) @@ -7,7 +7,7 @@ msgid "" msgstr "" "Project-Id-Version: 1.2.1\n" "Report-Msgid-Bugs-To: \n" -"PO-Revision-Date: 2015-03-16 18:59+0900\n" +"PO-Revision-Date: 2015-03-16 19:07+0900\n" "Last-Translator: Takatsugu <nokubi �� gmail.com>\n" "Language-Team: Japanese\n" "Language: ja\n" @@ -15925,14 +15925,14 @@ msgstr "" # 19310933a11f48a183a5d9ac4417afee msgid "" -"You can try a tokenizer by :doc:`/reference/commands/tokenize.rst` and :doc:" -"`/reference/commands/table_tokenize.rst`. Here is an example to try :ref:" -"`token-bigram` tokenizer by :doc:`/reference/commands/tokenize.rst`:" +"You can try a tokenizer by :doc:`/reference/commands/tokenize` and :doc:`/" +"reference/commands/table_tokenize`. Here is an example to try :ref:`token-" +"bigram` tokenizer by :doc:`/reference/commands/tokenize`:" msgstr "" -":doc:`/reference/commands/tokenize.rst` コマンドと :doc:`/reference/commands/" -"table_tokenize.rst` コマンドを使うことでトークナイザーを試すことができま" -"す。 :doc:`/reference/commands/tokenize.rst` コマンドを使って :ref:`token-" -"bigram` トークナイザーを試す例を以下に示します。" +":doc:`/reference/commands/tokenize` コマンドと :doc:`/reference/commands/" +"table_tokenize` コマンドを使うことでトークナイザーを試すことができます。 :" +"doc:`/reference/commands/tokenize` コマンドを使って :ref:`token-bigram` トー" +"クナイザーを試す例を以下に示します。" # 3f22817593594be0a270788d6c341065 msgid "What is \"tokenize\"?" @@ -16204,11 +16204,11 @@ msgstr "" # 15632a52d1d64313ae46f38e384c9e1f msgid "" -"``TokenBigram`` behavior is different when it's worked with any :doc:/" -"reference/normalizers ." +"``TokenBigram`` behavior is different when it's worked with any :doc:`/" +"reference/normalizers`." msgstr "" -"``TokenBigram`` の挙動は :doc:/reference/normalizers を使うかどうかで変わりま" -"す。" +"``TokenBigram`` の挙動は :doc:`/reference/normalizers` を使うかどうかで変わり" +"ます。" # f20ee8155a874b289cfec1ac55ede3a7 msgid "" @@ -16678,14 +16678,12 @@ msgid "Literal only case such as ``hello``" msgstr "``hello`` のようにリテラルしかないケース" # 76790620652c4dbea2d133b27efae9c0 -msgid "The beginning of text and literal case such as ``\\\\A/home/alice``" -msgstr "" -"``\\\\A/home/alice`` のようにテキストの最初でのマッチとリテラルのみのケース" +msgid "The beginning of text and literal case such as ``\\A/home/alice``" +msgstr "``\\A/home/alice`` のようにテキストの最初でのマッチとリテラルのみのケース" # 25d9e8cdaba945a19fcbc010c62f887f -msgid "The end of text and literal case such as ``\\\\.txt\\\\z``" -msgstr "" -"``\\\\.txt\\\\z`` のようにテキストの最後でのマッチとリテラルのみのケース" +msgid "The end of text and literal case such as ``\\.txt\\z``" +msgstr "``\\.txt\\z`` のようにテキストの最後でのマッチとリテラルのみのケース" # 62b97ef76dba4bdcb0ca8fa6d912e041 msgid "In most cases, index search is faster than sequential search." @@ -16705,30 +16703,20 @@ msgstr "" # 3fd096f4ca0546478dc4e6d6f2c18a70 msgid "" -"The beginning of text mark is used for the beginning of text search by ``\\" +"The beginning of text mark is used for the beginning of text search by ``" "\\A``. 
If you use ``TokenRegexp`` for tokenizing query, ``TokenRegexp`` adds " "the beginning of text mark (``U+FFEF``) as the first token. The beginning of " "text mark must be appeared at the first, you can get results of the " "beginning of text search." -msgstr "" -"``\\\\A`` で検索したとき、テキストの先頭であるというマークを使います。クエ" -"リーをトークナイズするために ``TokenRegexp`` を使うときは、 ``TokenRegexp`` " -"は最初のトークンとしてテキストの先頭であるというマーク( ``U+FFEF`` )を追加" -"します。テキストの先頭であるというマークは先頭にしか存在しないはずなので、テ" -"キストの先頭であるという検索結果を得ることができます。" +msgstr "``\\A`` で検索したとき、テキストの先頭であるというマークを使います。クエリーをトークナイズするために ``TokenRegexp`` を使うときは、 ``TokenRegexp`` は最初のトークンとしてテキストの先頭であるというマーク( ``U+FFEF`` )を追加します。テキストの先頭であるというマークは先頭にしか存在しないはずなので、テキストの先頭であるという検索結果を得ることができます。" # d83dc0889fe240d0824e46814d5ef547 msgid "" -"The end of text mark is used for the end of text search by ``\\\\z``. If you " +"The end of text mark is used for the end of text search by ``\\z``. If you " "use ``TokenRegexp`` for tokenizing query, ``TokenRegexp`` adds the end of " "text mark (``U+FFF0``) as the last token. The end of text mark must be " "appeared at the end, you can get results of the end of text search." -msgstr "" -"``\\\\z`` で検索したとき、テキストの最後であるというマークを使います。クエ" -"リーをトークナイズするために ``TokenRegexp`` を使うときは、 ``TokenRegexp`` " -"は最後のトークンとしてテキストの最後であるというマーク( ``U+FFF0`` )を追加" -"します。テキストの最後であるというマークは最後にしか存在しないはずなので、テ" -"キストの最後であるという検索結果を得ることができます。" +msgstr "``\\z`` で検索したとき、テキストの最後であるというマークを使います。クエリーをトークナイズするために ``TokenRegexp`` を使うときは、 ``TokenRegexp`` は最後のトークンとしてテキストの最後であるというマーク( ``U+FFF0`` )を追加します。テキストの最後であるというマークは最後にしか存在しないはずなので、テキストの最後であるという検索結果を得ることができます。" msgid "Tuning" msgstr "" Modified: doc/source/reference/tokenizers.rst (+17 -15) =================================================================== --- doc/source/reference/tokenizers.rst 2015-03-16 19:00:27 +0900 (392ffc8) +++ doc/source/reference/tokenizers.rst 2015-03-16 19:07:49 +0900 (b7f0768) @@ -39,10 +39,10 @@ Normally, :ref:`token-bigram` is a suitable tokenizer. If you don't know much about tokenizer, it's recommended that you choose :ref:`token-bigram`. -You can try a tokenizer by :doc:`/reference/commands/tokenize.rst` and -:doc:`/reference/commands/table_tokenize.rst`. Here is an example to +You can try a tokenizer by :doc:`/reference/commands/tokenize` and +:doc:`/reference/commands/table_tokenize`. Here is an example to try :ref:`token-bigram` tokenizer by -:doc:`/reference/commands/tokenize.rst`: +:doc:`/reference/commands/tokenize`: .. groonga-command .. include:: ../example/reference/tokenizers/tokenize-example.log @@ -154,7 +154,7 @@ non-ASCII languages. ``TokenBigram`` has solution for this problem described in the bellow. ``TokenBigram`` behavior is different when it's worked with any -:doc:/reference/normalizers . +:doc:`/reference/normalizers`. If no normalizer is used, ``TokenBigram`` uses pure bigram (all tokens except the last token have two characters) tokenize method: @@ -255,7 +255,8 @@ alphabets by bigram tokenize method: ``TokenBigramSplitSymbolAlphaDigit`` is similar to :ref:`token-bigram`. The difference between them is symbol, alphabet and digit handling. ``TokenBigramSplitSymbolAlphaDigit`` tokenizes -symbols, alphabets and digits by bigram tokenize method: +symbols, alphabets and digits by bigram tokenize method. It means that +all characters are tokenized by bigram tokenize method: .. groonga-command .. 
include:: ../example/reference/tokenizers/token-bigram-split-symbol-alpha-digit-with-normalizer.log @@ -364,7 +365,8 @@ Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlpha``: in continuous symbols and non-ASCII characters. ``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` tokenizes symbols, -alphabets and digits by bigram tokenize method. +alphabets and digits by bigram tokenize method. It means that all +characters are tokenized by bigram tokenize method. You can find difference of them by ``Hello 日 本 語 ! ! ! 777`` text because it has symbols and non-ASCII characters with white spaces, @@ -465,13 +467,13 @@ for Japanese. ``京都`` text by ``京都`` query with ``TokenMecab``. If you want to support neologisms, you need to keep updating your -MeCab dictionary. It needs maintain cost. (:ref:`token-bigrma` doesn't +MeCab dictionary. It needs maintain cost. (:ref:`token-bigram` doesn't require dictionary maintenance because :ref:`token-bigram` doesn't use dictionary.) `mecab-ipadic-NEologd : Neologism dictionary for MeCab <https://github.com/neologd/mecab-ipadic-neologd>`_ may help you. -Here is an example of ``TokenMeCab``. ``東京都`` is tokenized to ``東 -京`` and ``都``. They don't include ``京都``: +Here is an example of ``TokenMeCab``. ``東京都`` is tokenized to ``東京`` +and ``都``. They don't include ``京都``: .. groonga-command .. include:: ../example/reference/tokenizers/token-mecab.log @@ -493,15 +495,15 @@ Here is an example of ``TokenMeCab``. ``東京都`` is tokenized to ``東 This tokenizer can be used only with UTF-8. You can't use this tokenizer with EUC-JP, Shift_JIS and so on. -``TokenRegexp`` is a tokenizer for supporting subset of regular -expression search by index. +``TokenRegexp`` is a tokenizer for supporting regular expression +search by index. In general, regular expression search is evaluated as sequential search. But the following cases can be evaluated as index search: * Literal only case such as ``hello`` - * The beginning of text and literal case such as ``\\A/home/alice`` - * The end of text and literal case such as ``\\.txt\\z`` + * The beginning of text and literal case such as ``\A/home/alice`` + * The end of text and literal case such as ``\.txt\z`` In most cases, index search is faster than sequential search. @@ -515,7 +517,7 @@ index text: .. tokenize TokenRegexp "/home/alice/test.txt" NormalizerAuto --mode ADD The beginning of text mark is used for the beginning of text search by -``\\A``. If you use ``TokenRegexp`` for tokenizing query, +``\A``. If you use ``TokenRegexp`` for tokenizing query, ``TokenRegexp`` adds the beginning of text mark (``U+FFEF``) as the first token. The beginning of text mark must be appeared at the first, you can get results of the beginning of text search. @@ -524,7 +526,7 @@ you can get results of the beginning of text search. .. include:: ../example/reference/tokenizers/token-regexp-get-beginning-of-text.log .. tokenize TokenRegexp "\\A/home/alice/" NormalizerAuto --mode GET -The end of text mark is used for the end of text search by ``\\z``. +The end of text mark is used for the end of text search by ``\z``. If you use ``TokenRegexp`` for tokenizing query, ``TokenRegexp`` adds the end of text mark (``U+FFF0``) as the last token. The end of text mark must be appeared at the end, you can get results of the end of
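
For reference, the tokenize invocations whose markup this commit fixes can be tried directly from a Groonga client. A minimal sketch follows: the two TokenRegexp calls are the ones quoted in the diff above, the TokenBigram input string is only illustrative, and actual token output depends on the installed Groonga version and the normalizer in use.

  # Try the commonly recommended bigram tokenizer via the tokenize command
  tokenize TokenBigram "Fulltext Search" NormalizerAuto
  # Index-time tokenization: TokenRegexp adds the beginning/end of text marks
  # (U+FFEF / U+FFF0) so that \A and \z searches can use the index
  tokenize TokenRegexp "/home/alice/test.txt" NormalizerAuto --mode ADD
  # Query-time tokenization: \A is turned into the beginning of text mark
  tokenize TokenRegexp "\\A/home/alice/" NormalizerAuto --mode GET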