Contents of /misc/data/README

Revision 114
Mon Apr 21 01:56:41 2008 UTC (15 years, 11 months ago) by mir
File size: 987 byte(s)
Added random Japanese data generator.

Random Japanese data generator

*What's this?

This program generates Japanese text data for anyone who needs it,
for example for performance testing.

The data is generated from the *.csv files that are part of mecab-ipadic.

*How to use?
1. Compile datagen.c

gcc -o datagen datagen.c

2. Execute datagen in the directory that contains the *.csv files

./datagen 1000 2000

Argument #1 is the number of bytes in each generated Japanese sentence.
Argument #2 is the number of rows of generated data.

The example above generates 1000 bytes * 2000 rows = 2 MB of Japanese data in total.
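
The size arithmetic above can be checked with a few lines of C. This is
only a minimal sketch: total_bytes is a hypothetical helper that is not
taken from datagen.c, and it assumes that each row contributes exactly the
requested number of bytes, with no separators or line endings counted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of datagen.c): expected size of the
   generated data in bytes, given the two command-line arguments. */
static uint64_t total_bytes(uint64_t bytes_per_row, uint64_t rows)
{
    /* Guard against overflow of the 64-bit product. */
    assert(rows == 0 || bytes_per_row <= UINT64_MAX / rows);
    return bytes_per_row * rows;
}
```

For the invocation shown above, total_bytes(1000, 2000) is 2000000 bytes,
that is 2 MB in decimal units.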
23
*Does this program fit your needs?

It depends on whether the generated data must be valid Japanese.
Each sentence is built from randomly chosen Japanese words, so the output
is not necessarily meaningful Japanese. If you want to run performance
tests that use N-grams, this program is a good fit.
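
The idea of building a sentence from randomly chosen dictionary words can
be sketched as follows. This is an illustration only, not the code in
datagen.c: first_field and random_sentence are hypothetical names, the
sketch assumes the word is the first comma-separated field of each
mecab-ipadic *.csv line, and it stops before exceeding the requested
length, which may differ from what datagen actually does.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the first comma-separated field of a
   dictionary line into out. Returns the field length in bytes, or 0 if
   the field is empty or does not fit into out. */
static size_t first_field(const char *line, char *out, size_t cap)
{
    size_t len = strcspn(line, ",\r\n");
    if (len == 0 || len >= cap)
        return 0;
    memcpy(out, line, len);
    out[len] = '\0';
    return len;
}

/* Hypothetical helper: append randomly chosen words to out until the
   next word would exceed target bytes or the buffer. Whole words are
   copied, so multi-byte characters are never split. Returns the number
   of bytes written, not counting the terminating NUL. */
static size_t random_sentence(const char *const *words, size_t nwords,
                              size_t target, char *out, size_t cap)
{
    size_t used = 0;
    if (cap == 0)
        return 0;
    out[0] = '\0';
    if (nwords == 0)
        return 0;
    for (;;) {
        const char *w = words[(size_t)rand() % nwords];
        size_t len = strlen(w);
        if (len == 0 || used + len > target || used + len >= cap)
            break;
        memcpy(out + used, w, len);
        used += len;
    }
    out[used] = '\0';
    return used;
}
```

A caller would fill the words array by applying first_field to every line
of the *.csv files, seed the generator once with srand, and then call
random_sentence once per row. Because the loop stops at the first word
that does not fit, a sentence can be shorter than the requested length.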
29
*License
The dictionary files (*.csv) are part of mecab-ipadic, so their license
follows that of mecab-ipadic. Please see below.

http://mecab.sourceforge.net/

The Tritonn Project holds the copyright on everything else, which is distributed under LGPL v2.
