[Groonga-commit] groonga/groonga [master] doc: add description about QueryExpanderTSV

Back to archive index

Kouhei Sutou null+****@clear*****
Thu Oct 25 17:37:20 JST 2012


Kouhei Sutou	2012-10-25 17:37:20 +0900 (Thu, 25 Oct 2012)

  New Revision: 9cf082bc33db144c2018064ab5939aeb755bc06e
  https://github.com/groonga/groonga/commit/9cf082bc33db144c2018064ab5939aeb755bc06e

  Log:
    doc: add description about QueryExpanderTSV

  Added files:
    doc/source/reference/query_expanders.txt
    doc/source/reference/query_expanders/tsv.txt
  Modified files:
    doc/source/reference.txt
    doc/source/reference/commands/select.txt

  Modified: doc/source/reference.txt (+1 -0)
===================================================================
--- doc/source/reference.txt    2012-10-25 17:36:07 +0900 (f1aa068)
+++ doc/source/reference.txt    2012-10-25 17:37:20 +0900 (e5af3fc)
@@ -14,6 +14,7 @@
    reference/commands
    reference/type
    reference/tokenizers
+   reference/query_expanders
    reference/pseudo_column
    reference/grn_expr
    reference/functions

  Modified: doc/source/reference/commands/select.txt (+2 -0)
===================================================================
--- doc/source/reference/commands/select.txt    2012-10-25 17:36:07 +0900 (1399357)
+++ doc/source/reference/commands/select.txt    2012-10-25 17:37:20 +0900 (7c324e0)
@@ -442,6 +442,8 @@ storategy escalation is not used because the number of matched
 records (0) is larger than ``match_escalation_threshold`` (-1). So no
 more searches aren't executed. And no records are matched.
 
+.. _query-expansion:
+
 ``query_expansion``
 """""""""""""""""""
 

  Added: doc/source/reference/query_expanders.txt (+12 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/reference/query_expanders.txt    2012-10-25 17:37:20 +0900 (94d92bc)
@@ -0,0 +1,12 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+Query expanders
+===============
+
+.. toctree::
+   :maxdepth: 1
+   :glob:
+
+   query_expanders/*

  Added: doc/source/reference/query_expanders/tsv.txt (+153 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/reference/query_expanders/tsv.txt    2012-10-25 17:37:20 +0900 (a7910df)
@@ -0,0 +1,153 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+QueryExpanderTSV
+================
+
+Summary
+-------
+
+``QueryExpanderTSV`` is a query expander plugin that reads synonyms
+from TSV (Tab Separated Value) file. This plugin provides poor feature
+than the embedded query expansion feature. For example, it doesn't
+support word normalization. But it may be easy to use because you can
+manage your synonyms by TSV. You can edit your synonyms by spreadsheet
+application such as Excel. With the embedded query expansion feature,
+you manage your synonyms by groonga's table.
+
+Install
+-------
+
+You need to register ``query_expanders/tsv`` as a plugin before you
+use ``QueryExpanderTSV``::
+
+  register query_expanders/tsv
+
+Usage
+-----
+
+You just add ``--query_expansion QueryExpanderTSV`` parameter to
+``select`` command::
+
+  select --query "QUERY" --query_expansion QueryExpanderTSV
+
+If ``QUERY`` has registered synonyms, they are expanded. For example,
+there are the following synonyms.
+
++----------------------------+------------------------+----------------------+
+|            word            |       synonym 1        |      synonym 2       |
++============================+========================+======================+
+|          groonga           |        groonga         |        Senna         |
++----------------------------+------------------------+----------------------+
+|          mroonga           |        mroonga         |    groonga MySQL     |
++----------------------------+------------------------+----------------------+
+
+The table means that ``synonym 1`` and ``synonym 2`` are synonyms of
+``word``. For example, ``groonga`` and ``Senna`` are synonyms of
+``groonga``. And ``mroonga`` and ``groonga MySQL`` are synonyms of
+``mroonga``.
+
+Here is an example of query expnasion that uses ``groonga`` as query::
+
+  select --query "groonga" --query_expansion QueryExpanderTSV
+
+The above command equals to the following command::
+
+  select --query "groonga OR Senna" --query_expansion QueryExpanderTSV
+
+Here is another example of query expnasion that uses ``mroonga
+search`` as query::
+
+  select --query "mroonga search" --query_expansion QueryExpanderTSV
+
+The above command equals to the following command::
+
+  select --query "(mroonga OR (groonga MySQL)) search" --query_expansion QueryExpanderTSV
+
+It is important that registered words (``groonga`` and ``mroonga``)
+are only expanded to synonyms and not registered words (``search``)
+are not expanded. Query expansion isn't occurred
+recursively. ``groonga`` is appeared in ``(mroonga OR (groonga
+MySQL))`` as query expansion result but it isn't expanded.
+
+Normally, you need to include ``word`` itself into synonyms. For
+example, ``groonga`` and ``mroonga`` are included in synonyms of
+themselves. If you want to ignore ``word`` itself, you don't include
+``word`` itself into synonyms. For example, if you want to use query
+expansion as spelling correction, you should use the following
+synonyms.
+
++----------------------------+------------------------+
+|            word            |        synonym         |
++============================+========================+
+|           gronga           |        groonga         |
++----------------------------+------------------------+
+
+``gronga`` in ``word`` has a typo. A ``o`` is missing. ``groonga`` in
+``synonym`` is the correct word.
+
+Here is an example of using query expnasion as spelling correction::
+
+  select --query "gronga" --query_expansion QueryExpanderTSV
+
+The above command equals to the following command::
+
+  select --query "groonga" --query_expansion QueryExpanderTSV
+
+The former command has a typo in ``--query`` value but the latter
+command doesn't have any typos.
+
+TSV File
+--------
+
+Synonyms are defined in TSV format file. This section describes about
+it.
+
+Location
+^^^^^^^^
+
+The file name should be ``synonyms.tsv`` and it is located at
+configuration directory. For example, ``/etc/groonga/synonyms.tsv`` is
+a TSV file location. The location is decided at build time.
+
+You can change the location by environment variable
+``GRN_QUERY_EXPANDER_TSV_SYNONYMS_FILE`` at run time::
+
+  % env GRN_QUERY_EXPANDER_TSV_SYNONYMS_FILE=/tmp/synonyms.tsv groonga
+
+With the above command, ``/tmp/synonyms.tsv`` file is used.
+
+Format
+^^^^^^
+
+You can define zero or more synonyms in a TSV file. You define a
+``word`` and ``synonyms`` pair by a line. ``word`` is expanded to
+``synonyms`` in ``--query`` value. ``Synonyms`` are combined by
+``OR``. For example, ``groonga`` and ``Senna`` synonyms are expanded
+as ``groonga OR Senna``.
+
+The first column is ``word`` and the rest columns are ``synonyms`` of
+the ``word``. Here is a sample line for ``word`` is ``groonga`` and
+``synonyms`` are ``groonga`` and ``Senna``. ``(TAB)`` means a tab
+character (``U+0009``)::
+
+  groonga(TAB)groonga(TAB)Senna
+
+Comment line is supported. Lines that start with ``#`` are ignored.
+Here is an example for comment line. ``groonga`` line is ignored as
+comment line::
+
+  #groonga(TAB)groonga(TAB)Senna
+  mroonga(TAB)mroonga(TAB)groonga MySQL
+
+Limitation
+----------
+
+You need to restart groonga to reload your synonyms. TSV file is
+loaded only at the plugin load time.
+
+See also
+--------
+
+  * :ref:`query-expansion`
-------------- next part --------------
HTML����������������������������...
Download 



More information about the Groonga-commit mailing list
Back to archive index