Daijiro MORI
null+****@clear*****
Thu Nov 14 16:56:14 JST 2013
Daijiro MORI 2013-11-14 16:56:14 +0900 (Thu, 14 Nov 2013)

  New Revision: 1013419acce65a7fde412af68b98aa2adc9030fd
  https://github.com/droonga/droonga.org/commit/1013419acce65a7fde412af68b98aa2adc9030fd

  Message:
    Copy a memo about catalog.json from github wiki.

  Added files:
    reference/catalog/index.md

  Added: reference/catalog/index.md (+91 -0) 100644
===================================================================
--- /dev/null
+++ reference/catalog/index.md    2013-11-14 16:56:14 +0900 (a4e9ea3)
@@ -0,0 +1,91 @@
+A droonga network consists of several resources.
+**Catalog** is a series of data which represents the resources in the network.
+**Catalog** is shared by all the nodes in the network.
+So far, a **catalog** is only a JSON file which must be written and delivered manually.
+Hopefully it will be generated by some utility program in the near future; furthermore, it would be maintained automatically and shared via the droonga network itself.
+The resources which **catalog** manages are as follows.
+
+### effective_date
+
+A date string representing the day the **catalog** becomes effective.
+
+### zones
+
+**Zone** is an array of **farms** (or other **zones**). The elements in a **zone** are expected to be close to each other, such as in the same host, on the same switch, or in the same network.
+
+### farms
+
+**Farms** correspond to fluent-plugin-droonga instances. A fluentd process may have multiple **farms** if more than one **match** entry with type **droonga** appears in the "fluentd.conf".
+Each **farm** has its own job queue.
+Each **farm** can attach to a data partition which is a part of a **dataset**.
+
+### datasets
+
+A **dataset** is a set of **tables** which virtually comprise a single logical **table**.
+Each **dataset** must have a unique name in the network.
+
+### ring
+
+**Ring** is a series of partitions which comprise a dataset. The **replica_count**, **number\_of\_partitions** and **time-slice** factors affect the number of partitions in a **ring**.
+
+### number\_of\_partitions
+
+**number\_of\_partitions** is an integer which represents the number of partitions divided by the hash function. The hash function, which determines which partition of a dataset each record resides in, is compatible with memcached.
+
+### date_range
+
+**date_range** determines when to split the dataset. If the string "infinity" is assigned, the dataset is never split by the time factor.
+
+### number\_of\_replicas
+
+**number\_of\_replicas** represents the number of replicas of the dataset maintained in the network.
+
+### examples
+
+    {
+      "effective_date": "2013-06-05T00:05:51Z",
+      "zones": ["localhost:23003/farm0", "localhost:23003/farm1"],
+      "farms": {
+        "localhost:23003/farm0": {
+          "device": "/disk0",
+          "capacity": 1024
+        },
+        "localhost:23003/farm1": {
+          "device": "/disk1",
+          "capacity": 1024
+        }
+      },
+      "datasets": {
+        "Wiki": {
+          "workers": {
+            "search": 4,
+            "update": 1
+          },
+          "number_of_replicas": 2,
+          "number_of_partitions": 2,
+          "partition_key": "_key",
+          "date_range": "infinity",
+          "ring": {
+            "localhost:23004": {
+              "weight": 10,
+              "partitions": {
+                "2013-07-24": [
+                  "localhost:23003/farm0.000",
+                  "localhost:23003/farm1.000"
+                ]
+              }
+            },
+            "localhost:23005": {
+              "weight": 10,
+              "partitions": {
+                "2013-07-24": [
+                  "localhost:23003/farm1.001",
+                  "localhost:23003/farm0.001"
+                ]
+              }
+            }
+          }
+        }
+      }
+    }
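
Note (not part of the commit): to make the nesting of the "ring" entry in the example catalog easier to follow, here is a minimal Python sketch that walks a trimmed copy of the "Wiki" dataset and lists every partition together with its ring key and date. The helper name `list_partitions` and the embedded `CATALOG_JSON` string are hypothetical and exist only for illustration; this is not how droonga itself reads catalog.json.

```python
import json

# Trimmed copy of the "Wiki" dataset from the example catalog in the diff.
# Only the keys needed for this illustration are kept.
CATALOG_JSON = """
{
  "datasets": {
    "Wiki": {
      "number_of_replicas": 2,
      "number_of_partitions": 2,
      "ring": {
        "localhost:23004": {
          "weight": 10,
          "partitions": {
            "2013-07-24": ["localhost:23003/farm0.000",
                           "localhost:23003/farm1.000"]
          }
        },
        "localhost:23005": {
          "weight": 10,
          "partitions": {
            "2013-07-24": ["localhost:23003/farm1.001",
                           "localhost:23003/farm0.001"]
          }
        }
      }
    }
  }
}
"""


def list_partitions(catalog, dataset_name):
    """Return (ring_key, date, partition) tuples for one dataset.

    Hypothetical helper: it only mirrors the JSON nesting shown in the
    example (ring -> partitions -> date -> list of partition names).
    """
    dataset = catalog["datasets"][dataset_name]
    rows = []
    for ring_key, entry in dataset["ring"].items():
        for date, partitions in entry["partitions"].items():
            for partition in partitions:
                rows.append((ring_key, date, partition))
    return rows


catalog = json.loads(CATALOG_JSON)
rows = list_partitions(catalog, "Wiki")
for ring_key, date, partition in rows:
    print(ring_key, date, partition)
```

For this example the walk yields four partitions, which happens to equal number_of_replicas times number_of_partitions. The memo does not state whether that relation holds in general, so treat it as an observation about this one catalog only.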