Merge pull request #42 from SeSaMe-NUS/documentation
Update documentation for release 0.9
koallen authored Jul 12, 2017
2 parents 50ed0de + afc6c0d commit d5c6008
Showing 2 changed files with 45 additions and 30 deletions.
34 changes: 20 additions & 14 deletions README.md
@@ -1,16 +1,19 @@
# GENIE

GENIE is a Generic Inverted Index on the GPU. It builds the database from a csv file or a vector of instances. Then
GENIE will construct the inverted list table and transfer it to the device. GENIE provides a simple way to
perform the similarity queries. User may define queries and their matching ranges, then directly call the matching
function. The library will parallel process all queries and save the matching result into a device_vector. A top k
search can also be simply performed. GENIE uses parallel searching to determine the top k values in a vector. It is
much faster than the CPU searching algorithm. All device methods are wrapped in host methods. Developers are not
required to configure the device function call. Please refer to the following documents:
GENIE is a Generic Inverted Index on GPU. It builds a database (inverted index) from high dimensional data, commonly
preprocessed by either Locality Sensitive Hashing or Shotgun and Assembly schemes. GENIE provides a simple way to
perform top-k similarity queries on top of such inverted index. The user may define queries as dimension and value
pairs, and optionally value ranges and weights. GENIE processes all queries in parallel on GPU using a Match Count
similarity model (number of dimensions with matching values in a query). For each query, top-k similar results and
their corresponding counts are returned. GENIE is much faster than other CPU searching algorithms due to extensive
parallelism on two levels: parallel query processing and multiple queries processed in parallel.

Please refer to the following technical report:

```
Generic Inverted Index on the GPU, Technical Report (TR 11/15), School of Computing, NUS.
Generic Inverted Index on the GPU, CoRR arXiv:1603.08390 at www.comp.nus.edu.sg/~atung/publication/gpugenie.pdf
Generic Inverted Index on the GPU, Technical Report (TR 11/15), School of Computing, NUS. <br>
CoRR arXiv:1603.08390 at www.comp.nus.edu.sg/~atung/publication/gpugenie.pdf
```


@@ -32,7 +35,8 @@ $ cd build
$ cmake ..
$ make -j8
```
Use target `$ make test` to run GENIE tests, `$ make doc` to build html code documentation, `$ make install` to install GENIE.
Use target `$ make test` to run GENIE tests, `$ make doc` to build html code documentation, `$ make install` to
install GENIE.

`CMake` build parameters can be further configured using the following options:
- `CMAKE_BUILD_TYPE:STRING` -- build type, one of `Release`, `Debug` (default `Debug`)
@@ -53,10 +57,10 @@ $ cmake -DGENIE_SIMDCAI=ON -DCMAKE_BUILD_TYPE=Release -DGENIE_DISTRIBUTED=ON -DG

## Running GENIE

There are several main parts of GENIE project. The core is a library `/lib/libgenie.a` with the main functionality.
To see how to use the library, you can check source code in either `/example` or `/test`. Tests are the simplest
applications built on top of GENIE library. Other utilities include a compression performance toolkit in `/perf` and
miscellaneous utilities in `/utility`. All of these tools are compiled into `/bin` directory.
There are several main parts of the GENIE project. The core is a library `/lib/libgenie.a` with the main functionality.
To see how to use the library, you can check the source code in either `/example` or `/test` directories. Tests are
the simplest applications built on top of GENIE library. Other utilities include a compression performance toolkit
in `/perf` and miscellaneous utilities in `/utility`. All of these tools are compiled into `/bin` directory.


### Compression performance toolkit
@@ -210,5 +214,7 @@ $ pid=$(pgrep odgenie | sed -n 2p); gdb -q --pid "${pid}"

## Documentation

The documentation is available online at http://sesame-nus.github.io/genie.

Code documentation for GENIE can be generated with `cmake` and `make`. After you configure CMake following steps in
[Compilation and Development](#compilation-and-development), just run `$ make doc`.
41 changes: 25 additions & 16 deletions doc/mainpage.dox
@@ -2,10 +2,12 @@
*
* \section overview_sec Overview
*
* GENIE is a Generic Inverted Index on the GPU. It builds the database from a csv file or a vector of instances. Then GENIE will construct the inverted list table and transfer it to the device. GENIE provides a simple way to perform the similarity queries. User may define queries and their matching ranges, then directly call the matching function. The library will parallel process all queries and save the matching result into a device_vector. A top k search can also be simply performed. GENIE uses parallel searching to determine the top k values in a vector. It is much faster than the CPU searching algorithm. All device methods are wrapped in host methods. Developers are not required to configure the device function call. Please refer to the following documents:
* GENIE is a Generic Inverted Index on GPU. It builds a database (inverted index) from high dimensional data, commonly preprocessed by either Locality Sensitive Hashing or Shotgun and Assembly schemes. GENIE provides a simple way to perform top-k similarity queries on top of such inverted index. The user may define queries as dimension and value pairs, and optionally value ranges and weights. GENIE processes all queries in parallel on GPU using a Match Count similarity model (number of dimensions with matching values in a query). For each query, top-k similar results and their corresponding counts are returned. GENIE is much faster than other CPU searching algorithms due to extensive parallelism on two levels: parallel query processing and multiple queries processed in parallel.
*
* Please refer to the following technical report:
*
* > Generic Inverted Index on the GPU, Technical Report (TR 11/15), School of Computing, NUS. <br>
* > Generic Inverted Index on the GPU, CoRR arXiv:1603.08390 at www.comp.nus.edu.sg/~atung/publication/gpugenie.pdf
* > CoRR arXiv:1603.08390 at www.comp.nus.edu.sg/~atung/publication/gpugenie.pdf
*
* \section install_sec Installation
*
@@ -55,23 +57,30 @@
* The GENIE interface consists of 4 important classes. They are
*
* - `genie::Config` for configuring GENIE
* - `genie::ExecutionPolicy` for providing implementation for building table, building query, and matching
* - `genie::table::inv_table` for the constructed tables
* - `genie::query::Query` for the constructed queries
* - `genie::ExecutionPolicy` for providing actual implementation of table and query building and matching
* - `genie::table::inv_table` for constructed tables (inverted index)
* - `genie::query::Query` for queries
*
* The interface also has several [functions](namespacegenie.html#func-members). The `genie::Search()`
* function is the 1st-level interface function for using GENIE. It accepts file paths to table and query CSV files and
* returns the matching result. There are also several 2nd-level interface functions, which are used internally by the
* `genie::Search()` function. These 2nd-level functions are meant to provide finer control of GENIE for advanced usage.
* `genie::Search()` function. These 2nd-level functions are meant to provide finer control of GENIE for advanced
* usage.
*
* These functions are
*
* - `genie::BuildTable()` for build the inverted index
* - `genie::BuildQuery()` for build the queries
* - `genie::BuildTable()` for building the inverted index
* - `genie::BuildQuery()` for building the queries
* - `genie::Match()` for matching
*
* To use GENIE, it should first be configured with the `genie::Config` class. According to the configurations, a corresponding
* execution policy could be generated with `genie::MakePolicy()`. Then the interface functions could be used to perform
* the search. Below is an example program demonstrating the usage of GENIE.
* To use GENIE, first configure it with the `genie::Config` class. According to the configurations, a corresponding
* execution policy will be generated using `genie::MakePolicy()`.
* Then use the 1st or 2nd level interface functions to perform the search.
*
* In general, everything in namespace [`genie`](namespacegenie.html) is a stable, public interface, while subnamespaces, such as
* `genie::utility`, refer to internal components of GENIE, which may change or be removed over time.
*
* Below is an example program demonstrating the usage of GENIE.
*
* ```cpp
* #include <memory>
@@ -97,10 +106,10 @@
*
* \subsection usage_executable_sec Using the GENIE executable
*
* For convenience, we have provided an executable for interfacing with GENIE. Once GENIE gets compiled, an executable named
* `genie-cli` will appear in the `bin` folder. `./genie --help` shows you all the allowed options
* For convenience, we have provided an executable `genie-cli` (compiled to /bin directory) for interfacing with GENIE.
* `./genie-cli --help` shows you all the allowed options
*
* ```
* ```plain
* Allowed options:
* --help produce help message
*
@@ -112,6 +121,6 @@
* -q [ --query ] arg query file
* ```
*
* For example, you could run `./genie-cli -k 10 -n 5 --gpu 2 -t ../static/sift_20.csv -q ../static/sift_20.csv` to match 5 queries
* with k set to 10 using gpu 2. Table and query are loaded from `../static/sift_20.csv`.
* For example, you could run `./genie-cli -k 10 -n 5 --gpu 2 -t ../static/sift_20.csv -q ../static/sift_20.csv`
* to match 5 queries with k set to 10 using gpu 2. Table and query are loaded from `../static/sift_20.csv`.
*/
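
As an illustration of the Match Count model described in the updated overview (the number of query dimensions whose values match an object), here is a minimal CPU-only sketch in C++. It does not use GENIE's API or its GPU code path. `QueryItem`, `InvertedIndex`, `BuildIndex`, and `TopKMatchCount` are hypothetical names introduced for this example, query weights are not modeled, and breaking ties by lower object id is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical query item: one dimension plus an inclusive value range.
struct QueryItem {
    int dim;
    int low;
    int high;
};

// Toy inverted index: (dimension, value) -> ids of objects with that value.
using InvertedIndex = std::map<std::pair<int, int>, std::vector<int>>;

// Builds the index from objects given as rows of per-dimension values.
InvertedIndex BuildIndex(const std::vector<std::vector<int>>& objects) {
    InvertedIndex index;
    for (std::size_t id = 0; id < objects.size(); ++id) {
        for (std::size_t dim = 0; dim < objects[id].size(); ++dim) {
            const auto key = std::make_pair(static_cast<int>(dim), objects[id][dim]);
            index[key].push_back(static_cast<int>(id));
        }
    }
    return index;
}

// Returns up to k (object id, match count) pairs, highest count first.
// Ties are broken by lower object id (an assumption of this sketch).
std::vector<std::pair<int, int>> TopKMatchCount(const InvertedIndex& index,
                                                std::size_t num_objects,
                                                const std::vector<QueryItem>& query,
                                                std::size_t k) {
    std::vector<int> counts(num_objects, 0);
    for (const QueryItem& item : query) {
        if (item.low > item.high) continue;  // skip empty ranges
        // All posting lists for this dimension with a value in [low, high].
        auto first = index.lower_bound(std::make_pair(item.dim, item.low));
        auto last = index.upper_bound(std::make_pair(item.dim, item.high));
        for (auto it = first; it != last; ++it) {
            for (int id : it->second) ++counts[static_cast<std::size_t>(id)];
        }
    }

    // Keep only objects that matched at least one query item.
    std::vector<std::pair<int, int>> ranked;
    for (std::size_t id = 0; id < num_objects; ++id) {
        if (counts[id] > 0) ranked.emplace_back(static_cast<int>(id), counts[id]);
    }

    k = std::min(k, ranked.size());
    std::partial_sort(ranked.begin(),
                      ranked.begin() + static_cast<std::ptrdiff_t>(k),
                      ranked.end(),
                      [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
                          if (a.second != b.second) return a.second > b.second;
                          return a.first < b.first;
                      });
    ranked.resize(k);
    return ranked;
}
```

For three objects `{1, 5, 3}`, `{1, 6, 3}`, and `{2, 5, 4}` and a query asking for value 1 in dimension 0, value 5 in dimension 1, and value 3 in dimension 2, the match counts are 3, 2, and 1, so the top-2 result is object 0 followed by object 1. GENIE itself performs the counting and the top-k selection in parallel on the GPU, which this sequential sketch does not attempt to reproduce.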
