Kouhei Sutou
null+****@clear*****
Thu Oct 25 17:37:20 JST 2012
Kouhei Sutou	2012-10-25 17:37:20 +0900 (Thu, 25 Oct 2012)

  New Revision: 9cf082bc33db144c2018064ab5939aeb755bc06e
  https://github.com/groonga/groonga/commit/9cf082bc33db144c2018064ab5939aeb755bc06e

  Log:
    doc: add description about QueryExpanderTSV

  Added files:
    doc/source/reference/query_expanders.txt
    doc/source/reference/query_expanders/tsv.txt
  Modified files:
    doc/source/reference.txt
    doc/source/reference/commands/select.txt

  Modified: doc/source/reference.txt (+1 -0)
===================================================================
--- doc/source/reference.txt    2012-10-25 17:36:07 +0900 (f1aa068)
+++ doc/source/reference.txt    2012-10-25 17:37:20 +0900 (e5af3fc)
@@ -14,6 +14,7 @@
    reference/commands
    reference/type
    reference/tokenizers
+   reference/query_expanders
    reference/pseudo_column
    reference/grn_expr
    reference/functions

  Modified: doc/source/reference/commands/select.txt (+2 -0)
===================================================================
--- doc/source/reference/commands/select.txt    2012-10-25 17:36:07 +0900 (1399357)
+++ doc/source/reference/commands/select.txt    2012-10-25 17:37:20 +0900 (7c324e0)
@@ -442,6 +442,8 @@ storategy escalation is not used because the number of matched
 records (0) is larger than ``match_escalation_threshold`` (-1). So no
 more searches aren't executed. And no records are matched.
 
+.. _query-expansion:
+
 ``query_expansion``
 """""""""""""""""""
 

  Added: doc/source/reference/query_expanders.txt (+12 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/reference/query_expanders.txt    2012-10-25 17:37:20 +0900 (94d92bc)
@@ -0,0 +1,12 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+Query expanders
+===============
+
+.. toctree::
+   :maxdepth: 1
+   :glob:
+
+   query_expanders/*

  Added: doc/source/reference/query_expanders/tsv.txt (+153 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/reference/query_expanders/tsv.txt    2012-10-25 17:37:20 +0900 (a7910df)
@@ -0,0 +1,153 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+QueryExpanderTSV
+================
+
+Summary
+-------
+
+``QueryExpanderTSV`` is a query expander plugin that reads synonyms
+from TSV (Tab Separated Value) file. This plugin provides poor feature
+than the embedded query expansion feature. For example, it doesn't
+support word normalization. But it may be easy to use because you can
+manage your synonyms by TSV. You can edit your synonyms by spreadsheet
+application such as Excel. With the embedded query expansion feature,
+you manage your synonyms by groonga's table.
+
+Install
+-------
+
+You need to register ``query_expanders/tsv`` as a plugin before you
+use ``QueryExpanderTSV``::
+
+  register query_expanders/tsv
+
+Usage
+-----
+
+You just add ``--query_expansion QueryExpanderTSV`` parameter to
+``select`` command::
+
+  select --query "QUERY" --query_expansion QueryExpanderTSV
+
+If ``QUERY`` has registered synonyms, they are expanded. For example,
+there are the following synonyms.
+
++----------------------------+------------------------+----------------------+
+| word                       | synonym 1              | synonym 2            |
++============================+========================+======================+
+| groonga                    | groonga                | Senna                |
++----------------------------+------------------------+----------------------+
+| mroonga                    | mroonga                | groonga MySQL        |
++----------------------------+------------------------+----------------------+
+
+The table means that ``synonym 1`` and ``synonym 2`` are synonyms of
+``word``. For example, ``groonga`` and ``Senna`` are synonyms of
+``groonga``. And ``mroonga`` and ``groonga MySQL`` are synonyms of
+``mroonga``.
+
+Here is an example of query expnasion that uses ``groonga`` as query::
+
+  select --query "groonga" --query_expansion QueryExpanderTSV
+
+The above command equals to the following command::
+
+  select --query "groonga OR Senna" --query_expansion QueryExpanderTSV
+
+Here is another example of query expnasion that uses ``mroonga
+search`` as query::
+
+  select --query "mroonga search" --query_expansion QueryExpanderTSV
+
+The above command equals to the following command::
+
+  select --query "(mroonga OR (groonga MySQL)) search" --query_expansion QueryExpanderTSV
+
+It is important that registered words (``groonga`` and ``mroonga``)
+are only expanded to synonyms and not registered words (``search``)
+are not expanded. Query expansion isn't occurred
+recursively. ``groonga`` is appeared in ``(mroonga OR (groonga
+MySQL))`` as query expansion result but it isn't expanded.
+
+Normally, you need to include ``word`` itself into synonyms. For
+example, ``groonga`` and ``mroonga`` are included in synonyms of
+themselves. If you want to ignore ``word`` itself, you don't include
+``word`` itself into synonyms. For example, if you want to use query
+expansion as spelling correction, you should use the following
+synonyms.
+
++----------------------------+------------------------+
+| word                       | synonym                |
++============================+========================+
+| gronga                     | groonga                |
++----------------------------+------------------------+
+
+``gronga`` in ``word`` has a typo. A ``o`` is missing. ``groonga`` in
+``synonym`` is the correct word.
+
+Here is an example of using query expnasion as spelling correction::
+
+  select --query "gronga" --query_expansion QueryExpanderTSV
+
+The above command equals to the following command::
+
+  select --query "groonga" --query_expansion QueryExpanderTSV
+
+The former command has a typo in ``--query`` value but the latter
+command doesn't have any typos.
+
+TSV File
+--------
+
+Synonyms are defined in TSV format file. This section describes about
+it.
+
+Location
+^^^^^^^^
+
+The file name should be ``synonyms.tsv`` and it is located at
+configuration directory. For example, ``/etc/groonga/synonyms.tsv`` is
+a TSV file location. The location is decided at build time.
+
+You can change the location by environment variable
+``GRN_QUERY_EXPANDER_TSV_SYNONYMS_FILE`` at run time::
+
+  % env GRN_QUERY_EXPANDER_TSV_SYNONYMS_FILE=/tmp/synonyms.tsv groonga
+
+With the above command, ``/tmp/synonyms.tsv`` file is used.
+
+Format
+^^^^^^
+
+You can define zero or more synonyms in a TSV file. You define a
+``word`` and ``synonyms`` pair by a line. ``word`` is expanded to
+``synonyms`` in ``--query`` value. ``Synonyms`` are combined by
+``OR``. For example, ``groonga`` and ``Senna`` synonyms are expanded
+as ``groonga OR Senna``.
+
+The first column is ``word`` and the rest columns are ``synonyms`` of
+the ``word``. Here is a sample line for ``word`` is ``groonga`` and
+``synonyms`` are ``groonga`` and ``Senna``. ``(TAB)`` means a tab
+character (``U+0009``)::
+
+  groonga(TAB)groonga(TAB)Senna
+
+Comment line is supported. Lines that start with ``#`` are ignored.
+Here is an example for comment line. ``groonga`` line is ignored as
+comment line::
+
+  #groonga(TAB)groonga(TAB)Senna
+  mroonga(TAB)mroonga(TAB)groonga MySQL
+
+Limitation
+----------
+
+You need to restart groonga to reload your synonyms. TSV file is
+loaded only at the plugin load time.
+
+See also
+--------
+
+  * :ref:`query-expansion`
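
The expansion rules that tsv.txt documents (first column is the word, remaining columns are its synonyms, lines starting with ``#`` are comments, synonyms are joined with ``OR``, and results are not expanded again) can be sketched in Python as follows. This is only an illustration of the documented behaviour, not groonga's implementation: the function names ``parse_synonyms`` and ``expand_query`` are hypothetical, splitting the query on whitespace is a simplifying assumption, and skipping blank lines is an assumption the document does not state.

```python
# Illustrative sketch only; this is not groonga's implementation.
# Function names and whitespace tokenization are assumptions.

def parse_synonyms(tsv_text):
    """Map each word (first column) to its synonyms (remaining columns)."""
    table = {}
    for line in tsv_text.splitlines():
        # Comment lines start with "#"; skipping blank lines is an assumption.
        if not line or line.startswith("#"):
            continue
        word, *synonyms = line.split("\t")
        if synonyms:
            table[word] = synonyms
    return table


def expand_query(query, table):
    """Expand each registered word once; results are not expanded again."""
    expanded = []
    for token in query.split():
        synonyms = table.get(token)
        if synonyms is None:
            expanded.append(token)  # unregistered words are kept as-is
            continue
        # Parenthesize multi-word synonyms so they stay grouped.
        parts = [f"({s})" if " " in s else s for s in synonyms]
        joined = " OR ".join(parts)
        expanded.append(f"({joined})" if len(parts) > 1 else joined)
    return " ".join(expanded)


SAMPLE_TSV = (
    "#senna\tsenna\tgroonga\n"
    "groonga\tgroonga\tSenna\n"
    "mroonga\tmroonga\tgroonga MySQL\n"
    "gronga\tgroonga\n"
)
table = parse_synonyms(SAMPLE_TSV)
print(expand_query("mroonga search", table))  # (mroonga OR (groonga MySQL)) search
print(expand_query("gronga", table))          # groonga
```

The sketch always wraps a multi-synonym group in parentheses, so ``groonga`` becomes ``(groonga OR Senna)``, which is logically the same query as the ``groonga OR Senna`` form shown in the document. Because each token is looked up exactly once, the ``groonga`` that appears inside the expansion of ``mroonga`` is left untouched, matching the non-recursive behaviour the document describes.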