Yasuhiro Horimoto    2018-12-26 17:29:00 +0900 (Wed, 26 Dec 2018)

  Revision: 2dfa7dd4b58254a98942e65075aa677b9b4de1e2
  https://github.com/groonga/groonga/commit/2dfa7dd4b58254a98942e65075aa677b9b4de1e2

  Message:
    doc: add explain for "target_class" option of TokenMecab

  Added files:
    doc/source/example/reference/tokenizers/token-mecab-target-class-option-complex.log
    doc/source/example/reference/tokenizers/token-mecab-target-class-option.log
  Modified files:
    doc/locale/ja/LC_MESSAGES/reference.po
    doc/source/reference/tokenizers.rst

  Modified: doc/locale/ja/LC_MESSAGES/reference.po (+29 -0)
===================================================================
--- doc/locale/ja/LC_MESSAGES/reference.po    2018-12-26 10:48:10 +0900 (f5c59ed32)
+++ doc/locale/ja/LC_MESSAGES/reference.po    2018-12-26 17:29:00 +0900 (3a6b91a5e)
@@ -27455,6 +27455,35 @@ msgstr ""
 "以下は ``TokenMeCab`` の例です。 ``東京都`` は ``東京`` と ``都`` にトークナ"
 "イズされています。 ``京都`` というトークンはありません。"
 
+msgid ""
+"``TokenMecab`` can also specify options. ``TokenMecab`` has ``target_class`` "
+"option, ``include_class`` option, ``include_reading`` option, "
+"``include_form`` option and ``use_reading``."
+msgstr ""
+"``TokenMecab`` はオプションを指定することもできます。``TokenMecab`` は、"
+"``target_class`` オプション、 ``include_class`` オプション、 "
+"``include_reading`` オプション、 ``include_form`` オプションと "
+"``use_reading`` オプションがあります。"
+
+msgid ""
+"``target_class`` option searches a token of specifying a part-of-speech. For "
+"example, you can search only a noun as below."
+msgstr ""
+"``target_class`` オプションは、指定した品詞のトークンを検索します。例えば、以"
+"下のように名詞のみを検索できます。"
+
+msgid ""
+"``target_class`` option can also specify subclasses and exclude or add "
+"specific part-of-speech of specific using + or -. So, you can also search a "
+"noun with excluding non-independent word and suffix of person name as below."
+msgstr ""
+"``target_class`` オプションは、サブクラスを指定することや、 + や - を使って、"
+"特定の品詞を追加または、除外することもできます。したがって、以下のように人名"
+"の接尾語と非自立語を除いた名詞を検索することもできます。"
+
+msgid "In this way you can search exclude the noise of token."
+msgstr "このようにして、ノイズとなるトークンを除外して検索できます。"
+
 msgid "This tokenizer is experimental. Specification may be changed."
 msgstr "このトークナイザーは実験的です。仕様が変わる可能性があります。"
 

  Added: doc/source/example/reference/tokenizers/token-mecab-target-class-option-complex.log (+30 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/tokenizers/token-mecab-target-class-option-complex.log    2018-12-26 17:29:00 +0900 (e5d0ed51c)
@@ -0,0 +1,30 @@
+Execution example::
+
+  tokenize 'TokenMecab("target_class", "-名詞/非自立", "target_class", "-名詞/接尾/人名", "target_class", "名詞")' '彼の名前は山田さんのはずです。'
+  # [
+  #   [
+  #     0,
+  #     1545810363.771334,
+  #     0.0003197193145751953
+  #   ],
+  #   [
+  #     {
+  #       "value": "彼",
+  #       "position": 0,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     },
+  #     {
+  #       "value": "名前",
+  #       "position": 1,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     },
+  #     {
+  #       "value": "山田",
+  #       "position": 2,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     }
+  #   ]
+  # ]

  Added: doc/source/example/reference/tokenizers/token-mecab-target-class-option.log (+42 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/tokenizers/token-mecab-target-class-option.log    2018-12-26 17:29:00 +0900 (6dbf2a6e5)
@@ -0,0 +1,42 @@
+Execution example::
+
+  tokenize 'TokenMecab("target_class", "名詞")' '彼の名前は山田さんのはずです。'
+  # [
+  #   [
+  #     0,
+  #     1545810238.195525,
+  #     0.0003066062927246094
+  #   ],
+  #   [
+  #     {
+  #       "value": "彼",
+  #       "position": 0,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     },
+  #     {
+  #       "value": "名前",
+  #       "position": 1,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     },
+  #     {
+  #       "value": "山田",
+  #       "position": 2,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     },
+  #     {
+  #       "value": "さん",
+  #       "position": 3,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     },
+  #     {
+  #       "value": "はず",
+  #       "position": 4,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     }
+  #   ]
+  # ]

  Modified: doc/source/reference/tokenizers.rst (+22 -0)
===================================================================
--- doc/source/reference/tokenizers.rst    2018-12-26 10:48:10 +0900 (78a132c1f)
+++ doc/source/reference/tokenizers.rst    2018-12-26 17:29:00 +0900 (81870f89e)
@@ -547,6 +547,28 @@ and ``都``. They don't include ``京都``:
 .. include:: ../example/reference/tokenizers/token-mecab.log
 .. tokenize TokenMecab "東京都"
 
+``TokenMecab`` can also specify options.
+``TokenMecab`` has ``target_class`` option, ``include_class`` option,
+``include_reading`` option, ``include_form`` option and ``use_reading``.
+
+``target_class`` option searches a token of specifying a part-of-speech.
+For example, you can search only a noun as below.
+
+.. groonga-command
+.. include:: ../example/reference/tokenizers/token-mecab-target-class-option.log
+.. tokenize 'TokenMecab("target_class", "名詞")' '彼の名前は山田さんのはずです。'
+
+``target_class`` option can also specify subclasses and exclude or add specific
+part-of-speech of specific using + or -.
+So, you can also search a noun with excluding non-independent word and suffix of
+person name as below.
+
+In this way you can search exclude the noise of token.
+
+.. groonga-command
+.. include:: ../example/reference/tokenizers/token-mecab-target-class-option-complex.log
+.. tokenize 'TokenMecab("target_class", "-名詞/非自立", "target_class", "-名詞/接尾/人名", "target_class", "名詞")' '彼の名前は山田さんのはずず。'
+
 .. _token-regexp:
 
 ``TokenRegexp``
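
The filtering behind the two `tokenize` examples in this commit can be sketched in Python. This is only an illustration, not Groonga's implementation. It assumes that `target_class` rules are checked in the order given, that the first matching rule decides, and that a token matching no rule is dropped; the commit does not state these semantics. The names `TOKENS`, `is_target` and `select` are hypothetical, and the part-of-speech paths are hand-written IPAdic-style guesses rather than real MeCab output.

```python
# Hypothetical (surface, part-of-speech path) pairs for
# '彼の名前は山田さんのはずです。'. Hand-written for illustration;
# they are assumptions, not output captured from MeCab.
TOKENS = [
    ("彼", ("名詞", "代名詞", "一般")),
    ("の", ("助詞", "連体化")),
    ("名前", ("名詞", "一般")),
    ("は", ("助詞", "係助詞")),
    ("山田", ("名詞", "固有名詞", "人名", "姓")),
    ("さん", ("名詞", "接尾", "人名")),
    ("の", ("助詞", "連体化")),
    ("はず", ("名詞", "非自立", "一般")),
    ("です", ("助動詞",)),
    ("。", ("記号", "句点")),
]


def is_target(pos, rules):
    """Return True if a token with part-of-speech path `pos` is kept.

    Assumed semantics: rules are tried in order, a leading '-' excludes,
    a leading '+' or no prefix includes, and 'class/subclass' must match
    the leading elements of `pos`. No matching rule means the token is
    dropped.
    """
    for rule in rules:
        include = not rule.startswith("-")
        path = tuple(rule.lstrip("+-").split("/"))
        if pos[:len(path)] == path:
            return include
    return False


def select(tokens, rules):
    """Keep the surfaces of tokens accepted by `rules`."""
    return [surface for surface, pos in tokens if is_target(pos, rules)]


print(select(TOKENS, ["名詞"]))
# ['彼', '名前', '山田', 'さん', 'はず']
print(select(TOKENS, ["-名詞/非自立", "-名詞/接尾/人名", "名詞"]))
# ['彼', '名前', '山田']
```

Under these assumptions the exclusion rules have to come before the general `名詞` rule, which matches the argument order used in the commit's second example.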