null+****@clear*****
Mon, 27 Feb 2012 19:41:58 JST
Kouhei Sutou 2012-02-27 19:41:58 +0900 (Mon, 27 Feb 2012)
New Revision: 9f8e21127a356f12d581c7807d67bda281cfa4b1
Log:
[doc] Add a document about offline index construction.
Added files:
doc/example/indexing-data.log
doc/example/indexing-offline-index-construction.log
doc/example/indexing-online-index-construction.log
doc/example/indexing-schema.log
doc/example/indexing-search-after-offline-index-construction.log
doc/example/indexing-search-after-online-index-construction.log
doc/example/indexing-search-without-index.log
doc/source/indexing.txt
Modified files:
doc/source/reference.txt
Added: doc/example/indexing-data.log (+10 -0) 100644
===================================================================
--- /dev/null
+++ doc/example/indexing-data.log 2012-02-27 19:41:58 +0900 (493f355)
@@ -0,0 +1,10 @@
+Execution example::
+
+ > load --table Tweets
+ > [
+ > {"content":"Hello!"},
+ > {"content":"I just start it!"},
+ > {"content":"I'm sleepy... Have a nice day... Good night..."}
+ > ]
+ [[0,1330339028.22155,1.00183534622192],3]
+
\ No newline at end of file
Added: doc/example/indexing-offline-index-construction.log (+5 -0) 100644
===================================================================
--- /dev/null
+++ doc/example/indexing-offline-index-construction.log 2012-02-27 19:41:58 +0900 (3d8c4ec)
@@ -0,0 +1,5 @@
+Execution example::
+
+ > column_create Lexicon tweet COLUMN_INDEX|WITH_POSITION Tweets content
+ [[0,1330339029.62682,0.00742125511169434],true]
+
\ No newline at end of file
Added: doc/example/indexing-online-index-construction.log (+9 -0) 100644
===================================================================
--- /dev/null
+++ doc/example/indexing-online-index-construction.log 2012-02-27 19:41:58 +0900 (2c625bb)
@@ -0,0 +1,9 @@
+Execution example::
+
+ > load --table Tweets
+ > [
+ > {"content":"Good morning! Nice day."},
+ > {"content":"Let's go shopping."}
+ > ]
+ [[0,1330339030.03821,0.801372528076172],2]
+
\ No newline at end of file
Added: doc/example/indexing-schema.log (+9 -0) 100644
===================================================================
--- /dev/null
+++ doc/example/indexing-schema.log 2012-02-27 19:41:58 +0900 (bd9168f)
@@ -0,0 +1,9 @@
+Execution example::
+
+ > table_create Tweets TABLE_NO_KEY
+ [[0,1330339027.61804,0.000236272811889648],true]
+ > column_create Tweets content COLUMN_SCALAR ShortText
+ [[0,1330339027.81905,0.000560760498046875],true]
+ > table_create Lexicon TABLE_HASH_KEY|KEY_NORMALIZE ShortText --default_tokenizer TokenBigram
+ [[0,1330339028.02028,0.000248432159423828],true]
+
\ No newline at end of file
Added: doc/example/indexing-search-after-offline-index-construction.log (+5 -0) 100644
===================================================================
--- /dev/null
+++ doc/example/indexing-search-after-offline-index-construction.log 2012-02-27 19:41:58 +0900 (4099164)
@@ -0,0 +1,5 @@
+Execution example::
+
+ > select Tweets --match_columns content --query 'good nice'
+ [[0,1330339029.83545,0.000765085220336914],[[[1],[["_id","UInt32"],["content","ShortText"]],[3,"I'm sleepy... Have a nice day... Good night..."]]]]
+
\ No newline at end of file
Added: doc/example/indexing-search-after-online-index-construction.log (+5 -0) 100644
===================================================================
--- /dev/null
+++ doc/example/indexing-search-after-online-index-construction.log 2012-02-27 19:41:58 +0900 (960fd8b)
@@ -0,0 +1,5 @@
+Execution example::
+
+ > select Tweets --match_columns content --query 'good nice'
+ [[0,1330339031.04064,0.000650644302368164],[[[2],[["_id","UInt32"],["content","ShortText"]],[3,"I'm sleepy... Have a nice day... Good night..."],[4,"Good morning! Nice day."]]]]
+
\ No newline at end of file
Added: doc/example/indexing-search-without-index.log (+5 -0) 100644
===================================================================
--- /dev/null
+++ doc/example/indexing-search-without-index.log 2012-02-27 19:41:58 +0900 (a101e2d)
@@ -0,0 +1,5 @@
+Execution example::
+
+ > select Tweets --match_columns content --query 'good nice'
+ [[0,1330339029.42452,0.000802278518676758],[[[0],[["_id","UInt32"],["content","ShortText"]]]]]
+
\ No newline at end of file
Added: doc/source/indexing.txt (+108 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/indexing.txt 2012-02-27 19:41:58 +0900 (61b4c62)
@@ -0,0 +1,108 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+.. groonga-command
+.. database: indexing
+
+Indexing
+========
+
+Groonga has supported both online index construction and
+offline index construction since 1.3.1.
+
+Online index construction
+-------------------------
+
+In online index construction, registered documents become
+searchable quickly, even while indexing is in progress. But
+indexing costs more than it does with offline index
+construction.
+
+Online index construction is suitable for a search system
+that values freshness. For example, a search system for
+tweets, news, blog posts and so on values freshness.
+Online index construction makes fresh documents searchable
+and keeps them searchable while indexing.
+
+Offline index construction
+--------------------------
+
+In offline index construction, the indexing cost is lower
+than with online index construction. Indexing time is
+shorter, the index is smaller, and indexing requires fewer
+resources. But a newly registered document cannot be
+searched until all registered documents have been
+indexed.
+
+Offline index construction is suitable for a search system
+that values low resource usage. If a search system doesn't
+value freshness, offline index construction is a good fit.
+For example, a reference manual search system doesn't value
+freshness because a reference manual is updated only at
+a release.
+
+How to use
+----------
+
+Groonga uses online index construction by default. Once we
+register a document, we can search it quickly.
+
+Groonga uses offline index construction when we add an
+index to a column that already has data.
+
+We create a schema:
+
+.. groonga-command
+.. include:: ../example/indexing-schema.log
+.. table_create Tweets TABLE_NO_KEY
+.. column_create Tweets content COLUMN_SCALAR ShortText
+.. table_create Lexicon TABLE_HASH_KEY|KEY_NORMALIZE ShortText --default_tokenizer TokenBigram
+
+We register data:
+
+.. groonga-command
+.. include:: ../example/indexing-data.log
+.. load --table Tweets
+.. [
+.. {"content":"Hello!"},
+.. {"content":"I just start it!"},
+.. {"content":"I'm sleepy... Have a nice day... Good night..."}
+.. ]
+
+We search without an index. We get no results:
+
+.. groonga-command
+.. include:: ../example/indexing-search-without-index.log
+.. select Tweets --match_columns content --query 'good nice'
+
+We create an index for ``Tweets.content``. Data already
+registered in ``Tweets.content`` is indexed by offline index
+construction:
+
+.. groonga-command
+.. include:: ../example/indexing-offline-index-construction.log
+.. column_create Lexicon tweet COLUMN_INDEX|WITH_POSITION Tweets content
+
+We search with the index. We get a matched record:
+
+.. groonga-command
+.. include:: ../example/indexing-search-after-offline-index-construction.log
+.. select Tweets --match_columns content --query 'good nice'
+
+We register more data. It is indexed by online index
+construction:
+
+.. groonga-command
+.. include:: ../example/indexing-online-index-construction.log
+.. load --table Tweets
+.. [
+.. {"content":"Good morning! Nice day."},
+.. {"content":"Let's go shopping."}
+.. ]
+
+Searching now also returns a newly registered record:
+
+.. groonga-command
+.. include:: ../example/indexing-search-after-online-index-construction.log
+.. select Tweets --match_columns content --query 'good nice'
Modified: doc/source/reference.txt (+1 -0)
===================================================================
--- doc/source/reference.txt 2012-02-27 19:28:31 +0900 (40fdbc6)
+++ doc/source/reference.txt 2012-02-27 19:41:58 +0900 (02d41f6)
@@ -15,4 +15,5 @@
pseudo_column
expr
functions
+ indexing
log
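
The ``select`` responses in the example logs above are nested JSON arrays, which are hard to read at a glance. The following is a minimal Python sketch, not part of this commit, that unpacks one of them. It assumes the first array holds a return code, a start time and an elapsed time, and that the body holds one result set laid out as hit count, column definitions, then rows. That reading is inferred from the logs, and the helper name ``parse_select_response`` is hypothetical.

```python
import json

# Response copied from indexing-search-after-online-index-construction.log.
AFTER_ONLINE = '''[[0,1330339031.04064,0.000650644302368164],[[[2],[["_id","UInt32"],["content","ShortText"]],[3,"I'm sleepy... Have a nice day... Good night..."],[4,"Good morning! Nice day."]]]]'''

# Response copied from indexing-search-without-index.log.
WITHOUT_INDEX = '''[[0,1330339029.42452,0.000802278518676758],[[[0],[["_id","UInt32"],["content","ShortText"]]]]]'''


def parse_select_response(raw):
    """Unpack a select response (hypothetical helper, layout assumed)."""
    header, body = json.loads(raw)
    # Assumed header layout: [return code, start time, elapsed seconds].
    return_code, started_at, elapsed = header
    # Assumed body layout: a list holding one result set.
    result_set = body[0]
    n_hits = result_set[0][0]
    columns = [name for name, _type in result_set[1]]
    # Every entry after the column definitions is one matched row.
    records = [dict(zip(columns, row)) for row in result_set[2:]]
    return {
        "return_code": return_code,
        "elapsed": elapsed,
        "n_hits": n_hits,
        "records": records,
    }


if __name__ == "__main__":
    for raw in (WITHOUT_INDEX, AFTER_ONLINE):
        parsed = parse_select_response(raw)
        print(parsed["n_hits"], [r["_id"] for r in parsed["records"]])
```

Run against the two logs, the sketch reports no hits before the index exists, and records 3 and 4 after the second ``load``, which matches the walkthrough in ``indexing.txt``.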