Commit 004770c

Consolidated CLI code into cli module.
Also added client_dev and server_dev shims to make it easier to develop these programs. Updated the README.
1 parent 4a86cb5 commit 004770c

7 files changed

+151
-169
lines changed

README.txt

+47-67
Original file line number | Diff line number | Diff line change
@@ -1,17 +1,17 @@
1+
2+
.. image:: http://genomicsandhealth.org/files/logo_ga.png
3+
14
==============================
25
GA4GH Reference Implementation
36
==============================
47

5-
A reference implementation of the APIs defined in the schemas repository.
6-
7-
*************************
8-
Initial skeleton overview
9-
*************************
8+
This is a prototype for the GA4GH reference client and
9+
server applications. It is under heavy development, and many aspects of
10+
the layout and APIs will change as requirements are better understood.
11+
If you would like to help, please check out our list of
12+
`issues <https://github.com/ga4gh/server/issues>`_!
1013

11-
This is a proposed skeleton layout for the GA4GH reference client and
12-
server applications. As such, nothing is finalised and all aspects of
13-
the design and implementation are open for discussion and debate. The overall
14-
goals of the project are:
14+
Our aims for this implementation are:
1515

1616
Simplicity/clarity
1717
The main goal of this implementation is to provide an easy to understand
@@ -36,45 +36,15 @@ Ease of use
3636
make installing the ``ga4gh`` reference code very easy across a range of
3737
operating systems.
3838

39-
40-
41-
*************
42-
Trying it out
43-
*************
44-
45-
The project is designed to be published as a `PyPI <https://pypi.python.org/pypi>`_
46-
package, so ultimately installing the reference client and server programs
47-
should be as easy as::
48-
49-
    $ pip install ga4gh
50-
51-
However, the code is currently only a proposal, so it has not been uploaded to
52-
the Python package index. The best way to try out the code right now is to
53-
use `virtualenv <http://virtualenv.readthedocs.org/en/latest/>`_. After cloning
54-
the git repo, and changing to the project directory, do the following::
55-
56-
    $ virtualenv testenv
57-
    $ source testenv/bin/activate
58-
    $ python setup.py install
59-
60-
This should install the ``ga4gh_server`` and ``ga4gh_client`` scripts into the
61-
virtualenv and update your ``PATH`` so that they are available. When you have
62-
finished trying out the programs you can leave the virtualenv using::
63-
64-
    $ deactivate
65-
66-
The virtualenv can be restarted at any time, and can also be deleted
67-
when you no longer need it.
68-
6939
********************************
7040
Serving variants from a VCF file
7141
********************************
7242

73-
Two implementations of the variants API is available that can serve data based
74-
on existing VCF files. This backends are based on tabix and `wormtable
75-
<http://www.biomedcentral.com/1471-2105/14/356>`_, which is a Python library
76-
to handle large scale tabular data. See `Wormtable backend`_ for instructions
77-
on serving VCF data from the GA4GH API.
43+
Two implementations of the variants API are available that can serve data based
44+
on existing VCF files. These backends are based on tabix and `wormtable
45+
<http://www.biomedcentral.com/1471-2105/14/356>`_, which is a Python library to
46+
handle large scale tabular data. See `Wormtable backend`_ for instructions on
47+
serving VCF data from the GA4GH API.
7848

7949
*****************
8050
Wormtable backend
@@ -159,38 +129,41 @@ building and indexing such large tables.
159129
Tabix backend
160130
*****************
161131

162-
The tabix backend allows us to serve variants from an arbitrary VCF file.
163-
The VCF file must first be indexed with `tabix <http://samtools.sourceforge.net/tabix.shtml>`_.
164-
Many projects, including the `1000 genomes project
165-
<http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/>`_, release files with tabix
166-
indicies already precomputed. This backend can serve such datasets without any
167-
preprocessing via the command:
132+
The tabix backend allows us to serve variants from an arbitrary VCF file. The
133+
VCF file must first be indexed with `tabix
134+
<http://samtools.sourceforge.net/tabix.shtml>`_. Many projects, including the
135+
`1000 genomes project
136+
<http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/>`_, release files
137+
with tabix indices already precomputed. This backend can serve such datasets
138+
without any preprocessing via the command::
168139

169-
$ python ga4gh/scripts/server.py tabix DATADIR
140+
    $ ga4gh_server tabix DATADIR
170141

171-
where DATADIR is a directory that contains folders of tabix-indexed VCF file(s). There cannot
172-
be more than one VCF file in any subdirectory that has data for the same reference contig.
142+
where DATADIR is a directory that contains subdirectories of tabix-indexed VCF
143+
file(s). There cannot be more than one VCF file in any subdirectory that has
144+
data for the same reference contig.
173145

174146
******
175147
Layout
176148
******
177149

178-
The code for the project is held in the ``ga4gh`` package, which corresponds
179-
to the ``ga4gh`` directory in the project root. Within this package,
180-
the functionality is split between the ``client``, ``server`` and
181-
``protocol`` modules. There is also a subpackage called ``scripts``
182-
which holds the code defining the command line interfaces for the
150+
The code for the project is held in the ``ga4gh`` package, which corresponds to
151+
the ``ga4gh`` directory in the project root. Within this package, the
152+
functionality is split between the ``client``, ``server``, ``protocol`` and
153+
``cli`` modules. The ``cli`` module contains the definitions for the
183154
``ga4gh_client`` and ``ga4gh_server`` programs.
184155

185-
For development purposes, it is useful to be able to run the command
186-
line programs directly without installing them. To do this, make hard links
187-
to the files in the scripts directory to the project root and run them
188-
from there; e.g::
156+
For development purposes, it is useful to be able to run the command line
157+
programs directly without installing them. To do this, use the
158+
``server_dev.py`` and ``client_dev.py`` scripts. (These are just shims to
159+
facilitate development, and are not intended to be distributed. The
160+
distributed versions of the programs are packaged using the setuptools
161+
``entry_points`` key word; see ``setup.py`` for details). For example, to run
162+
the server command simply run::
189163

190-
    $ ln ga4gh/scripts/server.py .
191-
    $ python server.py
192-
    usage: server.py [-h] [--port PORT] [--verbose] {help,simulate} ...
193-
    server.py: error: too few arguments
164+
    $ python server_dev.py
165+
    usage: server_dev.py [-h] [--port PORT] [--verbose] {help,wormtable,tabix} ...
166+
    server_dev.py: error: too few arguments
194167

195168
++++++++++++
196169
Coding style
@@ -199,6 +172,13 @@ Coding style
199172
The code follows the guidelines of `PEP 8
200173
<http://legacy.python.org/dev/peps/pep-0008>`_ in most cases. The only notable
201174
difference is the use of camel case over underscore delimited identifiers; this
202-
is done for consistency with the GA4GH API. The code was checked for compliance
175+
is done for consistency with the GA4GH API. Code should be checked for compliance
203176
using the `pep8 <https://pypi.python.org/pypi/pep8>`_ tool.
204177

178+
179+
**********
180+
Deployment
181+
**********
182+
183+
*TODO* Give simple instructions for deploying the server on common platforms
184+
like Apache and Nginx.

client_dev.py

+7
@@ -0,0 +1,7 @@
1+
"""
2+
Simple shim for running the client program during development.
3+
"""
4+
import ga4gh.cli
5+
6+
if __name__ == "__main__":
7+
    ga4gh.cli.client_main()

ga4gh/scripts/client.py → ga4gh/cli.py

+87-5
@@ -1,17 +1,102 @@
11
"""
2-
Command line interface for the ga4gh reference implementation.
2+
Command line interface programs for the GA4GH reference implementation.
3+
4+
TODO: document how to use these for development and simple deployment.
35
"""
46
from __future__ import division
57
from __future__ import print_function
68
from __future__ import unicode_literals
79

810
import time
911
import argparse
12+
import werkzeug.serving
1013

1114
import ga4gh
15+
import ga4gh.server
1216
import ga4gh.client
1317
import ga4gh.protocol
1418

19+
##############################################################################
20+
# Server
21+
##############################################################################
22+
23+
class ServerRunner(object):
24+
    """
25+
    Superclass of server runner; takes care of functionality common to
26+
    all backends.
27+
    """
28+
    def __init__(self, args):
29+
        backend = self.getBackend(args)
30+
        self._port = args.port
31+
        self._httpHandler = ga4gh.server.HTTPHandler(backend)
32+
33+
    def run(self):
34+
        werkzeug.serving.run_simple(
35+
            '', self._port, self._httpHandler.wsgiApplication,
36+
            use_reloader=True)
37+
38+
39+
class WormtableRunner(ServerRunner):
40+
    """
41+
    Runner class to run the server using the wormtable based backend.
42+
    """
43+
    def getBackend(self, args):
44+
        backend = ga4gh.server.WormtableBackend(args.dataDir)
45+
        return backend
46+
47+
48+
class TabixRunner(ServerRunner):
49+
    """
50+
    Runner class to start the server using a tabix backend.
51+
    """
52+
    def getBackend(self, args):
53+
        backend = ga4gh.server.TabixBackend(args.dataDir)
54+
        return backend
55+
56+
57+
def server_main():
58+
    parser = argparse.ArgumentParser(description="GA4GH reference server")
59+
    # Add global options
60+
    parser.add_argument(
61+
        "--port", "-P", default=8000, type=int,
62+
        help="The port to listen on")
63+
    parser.add_argument('--verbose', '-v', action='count', default=0)
64+
    subparsers = parser.add_subparsers(title='subcommands',)
65+
66+
    # help
67+
    helpParser = subparsers.add_parser(
68+
        "help",
69+
        description="ga4gh_server help",
70+
        help="show this help message and exit")
71+
    # Wormtable backend
72+
    wtbParser = subparsers.add_parser(
73+
        "wormtable",
74+
        description="Serve the API using a wormtable based backend.",
75+
        help="Serve data from tables.")
76+
    wtbParser.add_argument(
77+
        "dataDir",
78+
        help="The directory containing the wormtables to be served.")
79+
    wtbParser.set_defaults(runner=WormtableRunner)
80+
    # Tabix
81+
    tabixParser = subparsers.add_parser(
82+
        "tabix",
83+
        description="Serve the API using a tabix based backend.",
84+
        help="Serve data from Tabix indexed VCFs")
85+
    tabixParser.add_argument(
86+
        "dataDir",
87+
        help="The directory containing VCFs")
88+
    tabixParser.set_defaults(runner=TabixRunner)
89+
90+
    args = parser.parse_args()
91+
    if "runner" not in args:
92+
        parser.print_help()
93+
    else:
94+
        runner = args.runner(args)
95+
        runner.run()
96+
97+
##############################################################################
98+
# Client
99+
##############################################################################
15100

16101
class VariantSetSearchRunner(object):
17102
    """
@@ -130,7 +215,7 @@ def addUrlArgument(parser):
130215
    parser.add_argument("baseUrl", help="The URL of the API endpoint")
131216

132217

133-
def main():
218+
def client_main():
134219
    parser = argparse.ArgumentParser(description="GA4GH reference client")
135220
    # Add global options
136221
    parser.add_argument('--verbose', '-v', action='count', default=0)
@@ -173,6 +258,3 @@ def main():
173258
    else:
174259
        runner = args.runner(args)
175260
        runner.run()
176-
177-
if __name__ == "__main__":
178-
    main()
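
The new ``server_main`` selects a backend by attaching a runner class to each subcommand with ``set_defaults(runner=...)`` and then instantiating whatever class ended up on the parsed arguments. The following is a minimal, self-contained sketch of that dispatch pattern. It is not the code from this commit: ``BaseRunner``, ``TabixLikeRunner``, ``WormtableLikeRunner``, ``buildParser`` and ``dispatch`` are hypothetical names, and the backends are replaced by plain strings so that the example runs without ``ga4gh`` or ``werkzeug``.

```python
import argparse


class BaseRunner(object):
    """Hypothetical stand-in for ServerRunner: shared setup lives here."""
    def __init__(self, args):
        self._port = args.port
        # Subclasses decide which backend to build.
        self._backend = self.getBackend(args)

    def run(self):
        # The real runner starts a web server; this sketch only reports.
        return "{} on port {}".format(self._backend, self._port)


class TabixLikeRunner(BaseRunner):
    def getBackend(self, args):
        return "tabix:" + args.dataDir


class WormtableLikeRunner(BaseRunner):
    def getBackend(self, args):
        return "wormtable:" + args.dataDir


def buildParser():
    parser = argparse.ArgumentParser(description="dispatch sketch")
    parser.add_argument("--port", "-P", default=8000, type=int)
    subparsers = parser.add_subparsers(title="subcommands")
    runners = [("tabix", TabixLikeRunner), ("wormtable", WormtableLikeRunner)]
    for name, runnerClass in runners:
        sub = subparsers.add_parser(name)
        sub.add_argument("dataDir")
        # Record which class should handle this subcommand.
        sub.set_defaults(runner=runnerClass)
    return parser


def dispatch(argv):
    parser = buildParser()
    args = parser.parse_args(argv)
    if "runner" not in args:
        # No subcommand was chosen, so no runner was recorded.
        parser.print_help()
        return None
    return args.runner(args).run()


if __name__ == "__main__":
    print(dispatch(["--port", "9000", "tabix", "DATADIR"]))
```

On Python 3 a missing subcommand is not an error by default, so the ``"runner" not in args`` branch is the one that prints the help text. Adding a new backend in this style only needs a new runner subclass and one more ``add_parser`` call.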

ga4gh/scripts/__init__.py

-3
This file was deleted.

ga4gh/scripts/server.py

-91
This file was deleted.

server_dev.py

+7
@@ -0,0 +1,7 @@
1+
"""
2+
Simple shim for running the server program during development.
3+
"""
4+
import ga4gh.cli
5+
6+
if __name__ == "__main__":
7+
    ga4gh.cli.server_main()
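
The README in this commit says the distributed programs are packaged through setuptools entry points, while the two ``*_dev.py`` shims call the same functions directly. The ``setup.py`` file is not part of this diff, so the following is only a sketch of what such a declaration might look like, assuming the console scripts ``ga4gh_client`` and ``ga4gh_server`` map onto ``ga4gh.cli:client_main`` and ``ga4gh.cli:server_main``. ``ENTRY_POINTS`` and ``parseEntryPoint`` are hypothetical names used for illustration.

```python
# Assumed entry point declarations; the real setup.py is not shown in this diff.
ENTRY_POINTS = {
    "console_scripts": [
        "ga4gh_client=ga4gh.cli:client_main",
        "ga4gh_server=ga4gh.cli:server_main",
    ]
}


def parseEntryPoint(spec):
    """Split a 'name=module:function' string into its three parts."""
    name, target = spec.split("=", 1)
    module, function = target.split(":", 1)
    return name.strip(), module.strip(), function.strip()


# In a setup.py this dictionary would be passed as
# setuptools.setup(..., entry_points=ENTRY_POINTS).
if __name__ == "__main__":
    for spec in ENTRY_POINTS["console_scripts"]:
        print(parseEntryPoint(spec))
```

Under these assumptions, the installed ``ga4gh_server`` command and ``python server_dev.py`` both end up calling ``ga4gh.cli.server_main``, which is why the shims need no logic of their own.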
