Kouhei Sutou
null+****@clear*****
Fri Oct 30 11:46:20 JST 2015
Kouhei Sutou	2015-10-30 11:46:20 +0900 (Fri, 30 Oct 2015)

  New Revision: e559157f7be74eea3968b091f9cae84aacefd75b
  https://github.com/groonga/groonga.org/commit/e559157f7be74eea3968b091f9cae84aacefd75b

  Message:
    Add PGroonga 1.0.0 release announce

  Added files:
    en/_posts/2015-10-29-pgroonga-1.0.0.md

  Added: en/_posts/2015-10-29-pgroonga-1.0.0.md (+107 -0) 100644
===================================================================
--- /dev/null
+++ en/_posts/2015-10-29-pgroonga-1.0.0.md    2015-10-30 11:46:20 +0900 (a5775ef)
@@ -0,0 +1,107 @@
+---
+layout: post.en
+title: PGroonga 1.0.0 has been released
+description: PGroonga 1.0.0 has been released!
+---
+
+## PGroonga 1.0.0 has been released
+
+PGroonga (píːzí:lúnɡά) 1.0.0 has been released! It's the first major release!
+
+### About PGroonga
+
+PGroonga is a PostgreSQL extension that makes PostgreSQL a fast full text search platform for all languages!
+
+There are some PostgreSQL extensions that improve the full text search feature of PostgreSQL, such as [pg_trgm](http://www.postgresql.org/docs/current/static/pgtrgm.html) and [pg_bigm](http://pgbigm.osdn.jp/index_en.html).
+
+pg\_trgm doesn't support languages that use non-alphanumeric characters, such as Japanese and Chinese.
+
+pg\_bigm supports languages that use non-alphanumeric characters, but it's slow.
+
+PGroonga supports all languages, provides rich full text search related features and is very fast, because it uses [Groonga](http://groonga.org/), a full-fledged full text search engine, as its backend.
+
+For example, PGroonga is a few times faster than pg\_bigm. In some cases, PGroonga is more than 10 times faster than pg\_bigm.
+
+Here are benchmark results comparing PGroonga and pg\_bigm. They use Japanese Wikipedia data.
+
+Here is a benchmark result for creating an index:
+
+Extension  | Index creation time
+-----------|--------------------
+PGroonga   | 25m 37s
+pg\_bigm   | 5h 56m 15s
+
+In this case, PGroonga is about 14 times faster than pg\_bigm.
+
+Here is a benchmark result for full text search:
+
+Search keywords                        | N hits   | PGroonga   | pg\_bigm
+---------------------------------------|----------|------------|---------
+"PostgreSQL" or "MySQL"                | 368      | *0.030s*   | 0.107s
+データベース (database in Japanese)      | 17172    | *0.121s*   | 1.224s
+テレビアニメ (TV animation in Japanese)  | 22885    | *0.179s*   | 2.472s
+日本 (Japan in Japanese)                | 625792   | 0.646s     | *0.556s*
+
+In the "日本" (Japan in Japanese) case, pg\_bigm is a bit faster(\*) than PGroonga, but PGroonga is 3 to 14 times faster than pg\_bigm in the other cases. The results show that PGroonga provides consistently fast full text search for all keywords.
+
+(\*) pg\_bigm performs full text search faster for keywords that have 2 or fewer characters than for keywords that have 3 or more characters.
+
+PGroonga provides the following features that aren't provided by other extensions:
+
+  * Normalize feature
+  * Custom tokenizer feature
+  * Snippet feature
+
+The normalize feature converts texts written in different notations to a unified notation.
+
+As a simple example, both "ABC" and "abc" are converted to "abc".
+
+As a more complex example, both "ﾎﾟｽｸﾞﾚ" (HALFWIDTH KATAKANA) and "ポスグレ" (FULLWIDTH KATAKANA) are converted to "ポスグレ" (FULLWIDTH KATAKANA). ("ポスグレ" is an abbreviation of PostgreSQL in Japanese.)
+
+This normalization is based on [Unicode NFKC](http://unicode.org/reports/tr15/).
+
+The custom tokenizer feature customizes the search keyword extraction process (tokenization). If you can customize tokenization, you can control the trade-off between search precision and search performance.
+
+For example, if you use a tokenizer based on a morphological analyzer, you can get better search precision and search performance but may not find some texts.
+
+FYI: There is no other extension that supports a morphological analyzer based tokenizer. PGroonga is the only extension that supports it.
+
+The snippet feature shows the text around a keyword. It's used by Web search engines, including Google: you can find it under each page title in the list of hit pages. PGroonga provides a function that implements it.
+
+There are more features:
+
+  * Query language that uses Web search engine like syntax
+  * JSON search
+    * You can use each value for condition. You can also perform full text search against all texts in JSON. No other extension, such as [JsQuery](https://github.com/postgrespro/jsquery), provides a full text search feature against JSON.
+
+Here are features that will be implemented in the future. They are already implemented in Groonga.
+
+  * Query expansion feature
+  * Weight feature
+  * Stemming feature
+
+### Usage
+
+You can use PGroonga without full text search knowledge. You just create an index and put a condition into `WHERE`:
+
+```sql
+CREATE INDEX index_name ON table USING pgroonga (column);
+
+SELECT * FROM table WHERE column @@ 'PostgreSQL';
+```
+
+You can also use PGroonga through `LIKE`. PGroonga provides a feature that performs `LIKE` with an index. `LIKE` with a PGroonga index is faster than `LIKE` without an index. It means that you can improve performance without changing an application that uses SQL like the following:
+
+```sql
+SELECT * FROM table WHERE column LIKE '%PostgreSQL%';
+```
+
+Are you interested in PGroonga? Please [install](http://pgroonga.github.io/install/) it and try the [tutorial](http://pgroonga.github.io/tutorial/) to learn about all PGroonga features.
+
+You can install PGroonga easily because it provides packages for major platforms. There are also binaries for Windows.
+
+## Conclusion
+
+A new PGroonga version has been released. PGroonga is a PostgreSQL extension that makes PostgreSQL a fast full text search platform for all languages.
+
+It's the first major release. If you want to use fast full text search for all languages on PostgreSQL, try PGroonga!