[Groonga-commit] groonga/groonga at 16c9b1a [master] doc: add explain for TokenDelimit options

Yasuhiro Horimoto null+****@clear*****
Tue Dec 18 16:58:47 JST 2018


Yasuhiro Horimoto	2018-12-18 16:58:47 +0900 (Tue, 18 Dec 2018)

  Revision: 16c9b1a3ae2136a1b7a18612ff81f62f8de78e23
  https://github.com/groonga/groonga/commit/16c9b1a3ae2136a1b7a18612ff81f62f8de78e23

  Message:
    doc: add explain for TokenDelimit options

  Added files:
    doc/source/example/reference/tokenizers/token-delimit-delimiter-option.log
    doc/source/example/reference/tokenizers/token-delimit-pattern-option.log
  Modified files:
    doc/source/reference/tokenizers.rst

  Added: doc/source/example/reference/tokenizers/token-delimit-delimiter-option.log (+24 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/tokenizers/token-delimit-delimiter-option.log    2018-12-18 16:58:47 +0900 (eaf79fcb1)
@@ -0,0 +1,24 @@
+Execution example::
+
+  tokenize 'TokenDelimit("delimiter", ",")' "Hello,Wold"
+  # [
+  #   [
+  #     0, 
+  #     1337566253.89858, 
+  #     0.000355720520019531
+  #   ], 
+  #   [
+  #     {
+  #       "value": "Hello",
+  #       "position": 0,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     },
+  #     {
+  #       "value": "Wold",
+  #       "position": 1,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     }
+  #   ]
+  # ]

  Added: doc/source/example/reference/tokenizers/token-delimit-pattern-option.log (+24 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/tokenizers/token-delimit-pattern-option.log    2018-12-18 16:58:47 +0900 (505c59546)
@@ -0,0 +1,24 @@
+Execution example::
+
+  tokenize 'TokenDelimit("pattern", "\\.\\s*")' "This is a pen. This is an apple."
+  # [
+  #   [
+  #     0, 
+  #     1337566253.89858, 
+  #     0.000355720520019531
+  #   ], 
+  #   [
+  #     {
+  #       "value": "This is a pen.",
+  #       "position": 0,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     },
+  #     {
+  #       "value": "This is an apple.",
+  #       "position": 1,
+  #       "force_prefix": false,
+  #       "force_prefix_search": false
+  #     }
+  #   ]
+  # ]

  Modified: doc/source/reference/tokenizers.rst (+26 -0)
===================================================================
--- doc/source/reference/tokenizers.rst    2018-12-18 16:30:19 +0900 (b46f4cce3)
+++ doc/source/reference/tokenizers.rst    2018-12-18 16:58:47 +0900 (9fc076e05)
@@ -429,6 +429,32 @@ Here is an example of ``TokenDelimit``:
 .. include:: ../example/reference/tokenizers/token-delimit.log
 .. tokenize TokenDelimit "Groonga full-text-search HTTP" NormalizerAuto
 
+``TokenDelimit`` also accepts options.
+It has a ``delimiter`` option and a ``pattern`` option.
+The ``delimiter`` option splits text into tokens at the specified characters.
+
+For example, ``Hello,Wold`` is tokenized into ``Hello`` and ``Wold``
+with the ``delimiter`` option as shown below.
+
+.. groonga-command
+.. include:: ../example/reference/tokenizers/token-delimit-delimiter-option.log
+.. tokenize 'TokenDelimit("delimiter", ",")' "Hello,Wold"
+
+The ``pattern`` option splits text into tokens with a regular expression.
+You can exclude needless spaces with the ``pattern`` option.
+
+For example, ``This is a pen. This is an apple`` is tokenized into ``This is a pen`` and
+``This is an apple`` with the ``pattern`` option as shown below.
+
+Normally, when ``This is a pen. This is an apple.`` is split by ``.``,
+needless spaces are included at the beginning of "This is an apple.".
+
+You can exclude the needless spaces with the ``pattern`` option, as in the example below.
+
+.. groonga-command
+.. include:: ../example/reference/tokenizers/token-delimit-pattern-option.log
+.. tokenize 'TokenDelimit("pattern", "\\.\\s*")' "This is a pen. This is an apple."
+
 .. _token-delimit-null:
 
 ``TokenDelimitNull``
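
The difference between the two options described in the documentation above can be illustrated outside Groonga. The following is a minimal Python sketch, not Groonga's implementation: the helper names ``split_by_delimiter`` and ``split_by_pattern`` are hypothetical, Python's ``re`` module stands in for the regular expression engine Groonga uses, and dropping empty pieces is an assumption made for the illustration.

```python
import re


def split_by_delimiter(text, delimiter):
    """Split text at each literal delimiter (hypothetical helper).

    Empty pieces are dropped; this is an assumption for the sketch.
    """
    return [piece for piece in text.split(delimiter) if piece]


def split_by_pattern(text, pattern):
    """Split text wherever the regular expression matches (hypothetical helper).

    Empty pieces are dropped; this is an assumption for the sketch.
    """
    return [piece for piece in re.split(pattern, text) if piece]


if __name__ == "__main__":
    # Literal delimiter, as in the "delimiter" option example.
    print(split_by_delimiter("Hello,Wold", ","))

    text = "This is a pen. This is an apple."
    # Splitting on "." alone leaves a leading space on the second piece.
    print(split_by_pattern(text, r"\."))
    # Consuming the trailing whitespace in the pattern removes it.
    print(split_by_pattern(text, r"\.\s*"))
```

In this sketch, splitting on ``\.`` alone keeps the space before "This is an apple", while ``\.\s*`` consumes it as part of the separator, which is the effect the ``pattern`` option paragraph describes.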