Rename to ParData (#266)
We'll go with "ParData", because:

- Homophone "Partake", emphasizing sharing and participation.
- Homophone "PyData", interaction between Python and Data.
- "Par-" means "equal in value", emphasizing that with ParData, all data
  are equally accessible.
xuhdev authored Jul 22, 2021
1 parent 65fb6f6 commit e6e5e04
Showing 52 changed files with 342 additions and 339 deletions.
4 changes: 2 additions & 2 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -142,8 +142,8 @@ dmypy.json
docs/build/
docs/source/api-references

### pydax ###
/pydax/_version.py
### pardata ###
/pardata/_version.py
/tests/datasets
/tests/doctests
/tests/tls/test_ca_bundle.pem
2 changes: 1 addition & 1 deletion AUTHORS.rst
@@ -1,7 +1,7 @@
Authors
=======

PyDAX is written and maintained by the CODAIT team at International Business Machines Corp. (IBM)
ParData is written and maintained by the CODAIT team at International Business Machines Corp. (IBM)

Primary co-maintainers:

20 changes: 10 additions & 10 deletions CONTRIBUTING.rst
@@ -12,7 +12,7 @@ Use ``tox``
~~~~~~~~~~~

To start developing, the recommended way is to use ``tox``. This way, your development environment is automatically
prepared by ``tox``, including virtual environment setup, dependency management, installing `pydax` in development mode.
prepared by ``tox``, including virtual environment setup, dependency management, installing `pardata` in development mode.

1. Install ``tox``:

@@ -21,7 +21,7 @@ prepared by ``tox``, including virtual environment setup, dependency management,
$ pip install -U -r requirements/tox.txt # If you are inside a virtual environment, conda environment
$ pip3 install --user -U -r requirements/tox.txt # If you are outside any virtual environment or conda environment and don't have tox installed
2. At the root directory of ``pydax``, run:
2. At the root directory of ``pardata``, run:

.. code-block:: console
@@ -60,13 +60,13 @@ Run All Tests

Before and after one stage of development, you may want to try whether the code would pass all tests.

To run all tests on the Python versions that are supported by PyDAX and available on your system, run:
To run all tests on the Python versions that are supported by ParData and available on your system, run:

.. code-block:: console
$ tox -s
When you are brave, to force running all tests on all Python versions that are supported by PyDAX, run:
When you are brave, to force running all tests on all Python versions that are supported by ParData, run:

.. code-block:: console
@@ -124,15 +124,15 @@ Where to Expose a Symbol (Function, Class, etc.)?

Generally speaking:

- If a symbol is likely used by a casual user regularly, it should be exposed in :file:`pydax/__init__.py`. This gives
- If a symbol is likely used by a casual user regularly, it should be exposed in :file:`pardata/__init__.py`. This gives
casual users the cleanest and the most direct access.
- If a symbol is used only by a power user, but is unlikely used by a casual user regularly, it should be exposed in a
file that does not start with an underscore, such as :file:`pydax/schema.py`; or in the :file:`__init__.py` file in a
subdirectory that does not start with an underscore, such as :file:`pydax/loaders/__init__.py`. The rationale is that
file that does not start with an underscore, such as :file:`pardata/schema.py`; or in the :file:`__init__.py` file in a
subdirectory that does not start with an underscore, such as :file:`pardata/loaders/__init__.py`. The rationale is that
the amount of such symbols is usually large and if we expose them at the root level of the package, it would be messy
and likely confuse casual users.
- If a symbol is solely used for internal purpose, it should be exposed only in files starting with a single underscore,
such as :file:`pydax/_dataset.py`.
such as :file:`pardata/_dataset.py`.

Please keep in mind that the criteria above are not meant to be rigid: They should be applied flexibly in light of
factors such as where existing symbols are placed and other potentially important considerations (if any).
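The placement guideline in this hunk can be illustrated with a minimal sketch. The helper ``exposure_level`` below is hypothetical and is not part of ParData; it only encodes the underscore convention described above, assuming POSIX-style paths that start with the package directory.

```python
from pathlib import PurePosixPath


def exposure_level(module_path: str) -> str:
    """Classify a module path by the underscore convention (hypothetical helper)."""
    # Drop the top-level package directory, e.g. "pardata".
    parts = PurePosixPath(module_path).parts[1:]
    # Strip the ".py" suffix so that "__init__.py" becomes "__init__".
    names = [PurePosixPath(part).stem for part in parts]

    def is_private(name: str) -> bool:
        # A single leading underscore marks internal code; dunder names such
        # as "__init__" are not private in this sense.
        return name.startswith("_") and not name.startswith("__")

    if any(is_private(name) for name in names):
        return "internal"  # e.g. pardata/_dataset.py
    if names == ["__init__"]:
        return "casual"    # the package root, pardata/__init__.py
    return "power"         # e.g. pardata/schema.py, pardata/loaders/__init__.py
```

As the guideline itself says, the criteria are not rigid, so a classification like this is a starting point rather than a rule.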
@@ -197,8 +197,8 @@ these packages solely for deployment of our development environment (i.e., runni
want stable packages that are used by us for these purposes. We let `Renovate`_ verify that bumping the versions won't
break anything before we actually upgrade any of these dependencies.

We should not pin the actual dependencies of PyDAX (as specified in :file:`setup.py`), because PyDAX is an intermediate
software layer -- those should be pinned only by the actual deployed application that depends on PyDAX. We should only
We should not pin the actual dependencies of ParData (as specified in :file:`setup.py`), because ParData is an intermediate
software layer -- those should be pinned only by the actual deployed application that depends on ParData. We should only
code the info of supported versions of these dependencies. If there is some regression or incompatibilities in the
latest versions of our dependencies, we should either work around them, or update :file:`setup.py` to avoid depending on
those versions.
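To make the distinction between pinning and declaring supported versions concrete, here is a minimal sketch. The dependency names and version bounds are hypothetical and are not taken from ParData's :file:`setup.py`; ``exact_pins`` is an illustrative helper, not an existing tool.

```python
# Hypothetical dependency names and version bounds, for illustration only.
install_requires = [
    "examplelib >= 1.2, < 3",  # a supported range, suitable for a library
    "otherlib != 2.0.1",       # excludes one release with a known regression
]


def exact_pins(requirements):
    """Return the requirement strings that pin an exact version with '=='."""
    return [req for req in requirements if "==" in req]
```

A check of this kind would return an empty list for the requirements above and would flag an entry such as ``"examplelib == 1.2.3"``, which belongs in the deployed application rather than in an intermediate library.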
78 changes: 39 additions & 39 deletions README.rst
@@ -3,43 +3,43 @@

.. readme-start
PyDAX
=====
ParData
=======

.. image:: https://img.shields.io/pypi/v/pydax.svg
:target: https://pypi.python.org/pypi/pydax
.. image:: https://img.shields.io/pypi/v/pardata.svg
:target: https://pypi.python.org/pypi/pardata
:alt: PyPI

.. image:: https://img.shields.io/pypi/pyversions/pydax
:target: https://pypi.python.org/pypi/pydax
.. image:: https://img.shields.io/pypi/pyversions/pardata
:target: https://pypi.python.org/pypi/pardata
:alt: PyPI - Python Version

.. image:: https://img.shields.io/pypi/implementation/pydax
:target: https://pypi.python.org/pypi/pydax
.. image:: https://img.shields.io/pypi/implementation/pardata
:target: https://pypi.python.org/pypi/pardata
:alt: PyPI - Implementation

.. image:: https://badges.gitter.im/codait/pydax.svg
:target: https://gitter.im/codait/pydax
.. image:: https://badges.gitter.im/codait/pardata.svg
:target: https://gitter.im/codait/pardata
:alt: Gitter

.. image:: https://github.com/codait/pydax/workflows/Runtime%20Tests/badge.svg
:target: https://github.com/CODAIT/pydax/commit/master
.. image:: https://github.com/codait/pardata/workflows/Runtime%20Tests/badge.svg
:target: https://github.com/CODAIT/pardata/commit/master
:alt: Runtime Tests

.. image:: https://github.com/codait/pydax/workflows/Lint/badge.svg
:target: https://github.com/CODAIT/pydax/commit/master
.. image:: https://github.com/codait/pardata/workflows/Lint/badge.svg
:target: https://github.com/CODAIT/pardata/commit/master
:alt: Lint

.. image:: https://github.com/codait/pydax/workflows/Docs/badge.svg
:target: https://github.com/CODAIT/pydax/commit/master
.. image:: https://github.com/codait/pardata/workflows/Docs/badge.svg
:target: https://github.com/CODAIT/pardata/commit/master
:alt: Docs

.. image:: https://github.com/codait/pydax/workflows/Development%20Environment/badge.svg
:target: https://github.com/CODAIT/pydax/commit/master
.. image:: https://github.com/codait/pardata/workflows/Development%20Environment/badge.svg
:target: https://github.com/CODAIT/pardata/commit/master
:alt: Development Environment

PyDAX is a Python API that enables data consumers and distributors to easily use and share datasets, and establishes a
standard for exchanging data assets. It enables:
ParData (homophone of *partake*) is a Python API that enables data consumers and distributors to easily use and share
datasets, and establishes a standard for exchanging data assets. It enables:

- a data scientist to have a simpler and more unified way to begin working with a wide range of datasets, and
- a data distributor to have a consistent, safe, and open source way to share datasets with interested communities.
@@ -48,24 +48,24 @@ standard for exchanging data assets. It enables:

.. code-block:: python
>>> import pydax
>>> pydax.list_all_datasets()
>>> import pardata
>>> pardata.list_all_datasets()
{'claim_sentences_search': ('1.0.2',),
..., 'wikitext103': ('1.0.1',)}
>>> pydax.load_dataset('wikitext103')
>>> pardata.load_dataset('wikitext103')
{...} # Content of the dataset
Install the Package & its Dependencies
--------------------------------------

To install the latest version of PyDAX, run
To install the latest version of ParData, run

.. code-block:: console
$ pip install pydax
$ pip install pardata
Alternatively, if you have downloaded the source, switch to the source directory (same directory as this README file,
``cd /path/to/pydax-source``) and run
``cd /path/to/pardata-source``) and run

.. code-block:: console
@@ -74,43 +74,43 @@ Alternatively, if you have downloaded the source, switch to the source directory
Quick Start
-----------

Import the package and load a dataset. PyDAX will download `WikiText-103
Import the package and load a dataset. ParData will download `WikiText-103
<https://developer.ibm.com/exchanges/data/all/wikitext-103/>`__ dataset (version ``1.0.1``) if it's not already
downloaded, and then load it.

.. code-block:: python
import pydax
wikitext103_data = pydax.load_dataset('wikitext103')
import pardata
wikitext103_data = pardata.load_dataset('wikitext103')
View available PyDAX datasets and their versions.
View available ParData datasets and their versions.

.. code-block:: python
>>> pydax.list_all_datasets()
>>> pardata.list_all_datasets()
{'claim_sentences_search': ('1.0.2',), ..., 'wikitext103': ('1.0.1',)}
To view your globally set configs for PyDAX, such as your default data directory, use :func:`pydax.get_config`.
To view your globally set configs for ParData, such as your default data directory, use :func:`pardata.get_config`.

.. code-block:: python
>>> pydax.get_config()
>>> pardata.get_config()
Config(DATADIR=PosixPath('dir/to/download/load/from'), ..., DATASET_SCHEMA_FILE_URL='file/to/load/datasets/from')
By default, :func:`pydax.load_dataset` downloads to and loads from
:file:`~/.pydax/data/<dataset-name>/<dataset-version>/`. To change the default data directory, use :func:`pydax.init`.
By default, :func:`pardata.load_dataset` downloads to and loads from
:file:`~/.pardata/data/<dataset-name>/<dataset-version>/`. To change the default data directory, use :func:`pardata.init`.

.. code-block:: python
pydax.init(DATADIR='new/dir/to/download/load/from')
pardata.init(DATADIR='new/dir/to/download/load/from')
Load a previously downloaded dataset using :func:`pydax.load_dataset`. With the new default data dir set, PyDAX now
Load a previously downloaded dataset using :func:`pardata.load_dataset`. With the new default data dir set, ParData now
searches for the `Groningen Meaning Bank <https://developer.ibm.com/exchanges/data/all/groningen-meaning-bank/>`__
dataset (version ``1.0.2``) in :file:`new/dir/to/download/load/from/gmb/1.0.2/`.

.. code-block:: python
gmb_data = load_dataset('gmb', version='1.0.2', download=False) # assuming GMB dataset was already downloaded
To learn more about PyDAX, check out `the documentation <https://pydax.readthedocs.io>`__ and the
`tutorial <https://pydax.readthedocs.io#tutorial>`__.
To learn more about ParData, check out `the documentation <https://pardata.readthedocs.io>`__ and the
`tutorial <https://pardata.readthedocs.io#tutorial>`__.
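The directory layout described in the Quick Start, :file:`<data-dir>/<dataset-name>/<dataset-version>/`, can be sketched as follows. ``dataset_dir`` is a hypothetical helper written for illustration; it is not ParData's implementation and only mirrors the paths quoted in the README text.

```python
from pathlib import Path


def dataset_dir(name, version, datadir=None):
    """Build the directory a dataset version would live in (hypothetical helper)."""
    # Fall back to the default location quoted in the README: ~/.pardata/data
    root = Path(datadir) if datadir is not None else Path.home() / ".pardata" / "data"
    return root / name / version
```

For example, ``dataset_dir('gmb', '1.0.2', 'new/dir/to/download/load/from')`` yields the same path the README gives for the Groningen Meaning Bank dataset after the data directory has been changed.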
12 changes: 6 additions & 6 deletions docs/notebooks/pydax-debater-claim-demo.ipynb
@@ -104,14 +104,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Download Dataset with PyDAX"
"# Download Dataset with ParData"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The PyDAX package will download the DAX dataset and load the training set as a Pandas dataframe."
"The ParData package will download the DAX dataset and load the training set as a Pandas dataframe."
]
},
{
@@ -140,8 +140,8 @@
}
],
"source": [
"import pydax\n",
"pydax.list_all_datasets()"
"import pardata\n",
"pardata.list_all_datasets()"
]
},
{
@@ -163,7 +163,7 @@
}
],
"source": [
"print(pydax.describe_dataset('claim_sentences_search'))"
"print(pardata.describe_dataset('claim_sentences_search'))"
]
},
{
@@ -172,7 +172,7 @@
"metadata": {},
"outputs": [],
"source": [
"claim_sentences_dataset = pydax.load_dataset('claim_sentences_search')\n",
"claim_sentences_dataset = pardata.load_dataset('claim_sentences_search')\n",
"train_data = claim_sentences_dataset['train']"
]
},