Refactor exofop data (#271)
* remove paper

* remove studies dir

* remove slurm job generator (moved to https://github.com/tess-atlas/tess_atlas_slurm_utils)

* move webbuilder to https://github.com/tess-atlas/tess_atlas_webbuilder

* remove slurm job test

* remove unused installs

* remove slurm cli

* remove webpage cli

* remove missing cli entrypoints

* add summary cli tool

* Change web data url

* Remove unused run_menu_page

* Remove outdated docs

* change version string format

* refactor EXOFOP interface

* refactor EXOFOP interface

* run tests
avivajpeyi committed Oct 3, 2023
1 parent f52880a commit 9ff4cc5
Showing 22 changed files with 125 additions and 7,062 deletions.
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
.pytest_cache
tmp
*.csv
test_jobgen
*.tar.gz
.virtual_documents
@@ -8,7 +9,6 @@ test_jobgen
*.log
out_*/
cached*
!src/tess_atlas/data/exofop/cached_tic_database.csv
build
docs/notebooks/*/*.html
__pycache__
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,8 @@
# All notable changes will be documented in this file

# v1.0.0 : 2023-09-27
- Refactor into [paper], [slurm], [webpage], [pipeline] repositories

# v0.2.1 : 2020-09-18
- Remove unused files
- Refactor code into ./tess_atlas/
@@ -15,3 +18,7 @@


[exoplanet-docs]: https://github.com/exoplanet-dev/case-studies/blob/main/docs/notebooks/quick-tess.ipynb
[paper]: https://github.com/tess-atlas/tess_atlas_paper
[slurm]: https://github.com/tess-atlas/tess_atlas_slurm_utils
[webpage]: https://github.com/tess-atlas/tess_atlas_webbuilder
[pipeline]: https://github.com/tess-atlas/tess_atlas_pipeline
84 changes: 14 additions & 70 deletions README.md
@@ -1,30 +1,28 @@
[![](https://img.shields.io/badge/Paper-Download-orange)](https://nightly.link/dfm/tess-atlas/workflows/build_paper/paper/main.pdf.zip)

<!-- Pytest Coverage Comment:Begin -->
<!-- Pytest Coverage Comment:End -->

# TESS-Atlas

<p align="center">
<img width = "450" src="src/tess_atlas/webbuilder/template/_static/atlas_logo.png" />
<img width = "450" src="https://raw.githubusercontent.com/tess-atlas/tess_atlas_webbuilder/main/source/_static/atlas_logo.png" />
<br>
<b>TESS Atlas</b>
</p>

The python package used to run the analyses in the TESS-Atlas catalog.

## Installation instructions
To install the necessary packages, run
To install the package:
```bash
python -m pip install tess-atlas
```
or
```bash
git clone [email protected]:tess-atlas/tess-atlas.git
cd tess-atlas
python -m pip install -e .
```

## Instructions to update TESS Atlas
1. Create and analyse some TOIs with the following: `make run`
2. Commit the completed TOIs to a branch and make a PR to main
3. Once the PR to main is merged, run `make website` to convert the notebooks to HTML, upload them to the gh-pages branch, and deploy to GitHub Pages

## How to use

### Running analyses
#### Local Run
### Analyse a TOI
To run the analysis for one TOI, you can run
```bash
run_toi <toi id number>
@@ -35,69 +33,15 @@
To only setup the notebook + data needed for the analysis for one TOI, you can run
```bash
run_toi <toi id number> --setup
```
where, for example, `<toi id number> = 724`

To run all the notebooks (in batches of 8), you can run
```bash
run_tois
```

#### Slurm Run

To make the slurm files needed to analyse a CSV of TOIs you can run:
```bash
make_slurm_job --toi_csv toi_ids.csv --module_loads "git/2.18.0 gcc/9.2.0 openmpi/4.0.2 python/3.8.5"
```

Or, if you want to make the slurm job for just one TOI:
```bash
make_slurm_job --toi_number 174 --module_loads 'git/2.18.0 gcc/9.2.0 openmpi/4.0.2 python/3.8.5'
```

### Downloading results
You can download completed analyses with
```bash
download_toi 103 --outdir analysed_tois
```

## Running tests!
Use the following to run tests (skipping slow tests)
```bash
python -m pip install -e ".[test]"
pytest tests/
```
The following only runs the slow ones
```bash
pytest -vv -k "slow" tests/test_template_notebook.py
```

## Building + deploying the catalog
### Building website
Once all your analyses are complete, you can package all the runs into a website:
```bash
make_webpages --webdir webpages --notebooks {notebook_dir} --add-api
```
Using `add-api` will copy over the data files in addition to making the webpages (but can be a bit slow!)
When this completes, you should have a zipped file with the webpages+data: `tess_atlas_pages.tar.gz`

### Deploy website + api data
We are storing the website data on a Nectar project.
Assuming you are a part of the project, the steps to deploy are
1. ssh into Nectar
2. Delete old pages
3. scp `tess_atlas_pages.tar.gz` into Nectar's webdir.
4. untar webpages
```bash
ssh -i ~/.ssh/nectarkey.pem [email protected]
cd /mnt/storage/
mv _build trash
scp [email protected]:/fred/oz200/avajpeyi/projects/atlas_runs/tess_atlas_pages.tar.gz .
tar -xvzf tess_atlas_pages.tar.gz
rm -rf trash
```


## Publishing `tess_atlas` to pypi
To publish to pypi, you will need admin access to this repo.
Then, publishing just requires you to `tag` a commit with the new version number.
The `pypi_release` github action will (hopefully) take care of the rest.
The `pypi_release` GitHub action will (hopefully) take care of the rest.
11 changes: 4 additions & 7 deletions setup.py
@@ -24,6 +24,8 @@
"Programming Language :: Python :: 3",
]
INSTALL_REQUIRES = [
"ploomber-engine>=0.0.30",
"ploomber-core==0.2.12", # https://github.com/ploomber/core/issues/74
"exoplanet>=0.5.1",
"pymc3>=3.9",
"pymc3-ext>=0.1.0",
@@ -51,10 +53,6 @@
"flake8",
"black<=21.9b0",
"isort",
"pympler",
"psutil",
"ploomber-engine>=0.0.30",
"ploomber-core==0.2.12", # https://github.com/ploomber/core/issues/74
"pretty-jupyter",
"interruptingcow",
]
@@ -88,9 +86,7 @@ def get_cli_entry_point(cmd, pkg=NAME):
setup(
name=NAME,
use_scm_version={
"write_to": os.path.join(
"src", NAME, "{0}_version.py".format(NAME)
),
"write_to": os.path.join("src", NAME, f"{NAME}_version.py"),
"write_to_template": '__version__ = "{version}"\n',
},
author=find_meta("author"),
@@ -124,6 +120,7 @@ def get_cli_entry_point(cmd, pkg=NAME):
get_cli_entry_point("download_toi"),
get_cli_entry_point("update_tic_cache"),
get_cli_entry_point("plot_run_stats"),
get_cli_entry_point("tess_atlas_summary"),
]
},
)
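The `use_scm_version` change above swaps a `str.format` call for an equivalent f-string when building the version-file path. A minimal sketch of the equivalence; `NAME` here is the package name from setup.py, and the rest is illustrative:

```python
import os

NAME = "tess_atlas"  # the package name defined in setup.py

# Old style, removed by this commit
old_path = os.path.join("src", NAME, "{0}_version.py".format(NAME))
# New style, added by this commit
new_path = os.path.join("src", NAME, f"{NAME}_version.py")

# Both build the same path: src/tess_atlas/tess_atlas_version.py
print(old_path == new_path, new_path)
```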
2 changes: 1 addition & 1 deletion src/tess_atlas/api/download_analysed_toi.py
@@ -8,7 +8,7 @@
logger = logging.getLogger(LOGGER_NAME)

COMMAND = "wget -np -r {url}"
ROOT = f"{__website__}/_sources/content/toi_notebooks"
ROOT = f"{__website__}/toi_data"
NOTEBOOK = f"{ROOT}/toi_{{TOI}}.ipynb"
FILES = f"{ROOT}/toi_{{TOI}}_files/"

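The `ROOT` change above redirects downloads from the Sphinx `_sources` tree to a dedicated `toi_data` path. A sketch of how these URL templates expand, using a stand-in `website` value (the real value comes from `__website__` in the package, which is not shown in this diff):

```python
# Stand-in for the package's __website__ constant -- hypothetical value.
website = "https://example-tess-atlas-host.org"

COMMAND = "wget -np -r {url}"
ROOT = f"{website}/toi_data"  # this commit changes the path suffix to /toi_data
NOTEBOOK = f"{ROOT}/toi_{{TOI}}.ipynb"  # doubled braces survive the f-string
FILES = f"{ROOT}/toi_{{TOI}}_files/"

# Filling in a TOI number produces the final wget command.
cmd = COMMAND.format(url=NOTEBOOK.format(TOI=103))
print(cmd)
```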
2 changes: 1 addition & 1 deletion src/tess_atlas/cli/download_toi_cli.py
@@ -15,7 +15,7 @@ def __get_cli_args():
parser.add_argument(
"toi_number",
type=int,
help="The TOI number to download data for (e.g. 103)",
)
args = parser.parse_args()
return args.toi_number
47 changes: 47 additions & 0 deletions src/tess_atlas/cli/tess_atlas_summary_cli.py
@@ -0,0 +1,47 @@
import argparse

from tess_atlas.data.analysis_summary import AnalysisSummary

PROG = "tess_atlas_summary"


def __get_cli_args():
parser = argparse.ArgumentParser(
prog=PROG,
description="""
Gets the latest TESS-Atlas catalog summary (a CSV with all the TOIs and their analysis status).
If catalog_dir is provided, it builds a new summary file.
""",
usage=f"{PROG} --catalog_dir <dir>",
)
parser.add_argument(
"--catalog_dir",
type=str,
help="The directory with all analyses that you want to summarise "
"(directory with the toi_*.ipynb and toi_*_files/). "
"If not provided, the latest summary file will be downloaded from the TESS-Atlas website.",
default=None,
)
parser.add_argument(
"--outdir",
type=str,
help="The directory to save the analysis summary to",
default=".",
)
parser.add_argument(
"--n_threads",
type=int,
help="The number of threads to use when summarising all the TOIs.",
default=1,
)
return parser.parse_args()


def main():
args = __get_cli_args()
AnalysisSummary.load(
notebook_dir=args.catalog_dir,
outdir=args.outdir,
clean=True,
n_threads=args.n_threads,
).save(args.outdir)
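A sketch of the defaults the new `tess_atlas_summary` parser produces when the tool is invoked with no flags. The parser is re-created standalone here (help text and description omitted); the defaults mirror the `add_argument` calls shown above:

```python
import argparse

# Minimal re-creation of the tess_atlas_summary argument parser,
# keeping only the flags and their defaults.
parser = argparse.ArgumentParser(prog="tess_atlas_summary")
parser.add_argument("--catalog_dir", type=str, default=None)
parser.add_argument("--outdir", type=str, default=".")
parser.add_argument("--n_threads", type=int, default=1)

args = parser.parse_args([])  # no CLI flags: fall back to defaults
print(args.catalog_dir, args.outdir, args.n_threads)
```

With no `--catalog_dir`, `main()` takes the download branch of `AnalysisSummary.load` rather than rebuilding a summary.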
2 changes: 1 addition & 1 deletion src/tess_atlas/cli/update_tic_cache_cli.py
@@ -62,6 +62,6 @@ def main():
logger.info(f"To run job:\n>>> sbatch {fn}")
else:
logger.info(f"UPDATING TIC CACHE... clean={args.clean}")
db = ExofopDatabase(update=True, clean=args.clean)
db = ExofopDatabase(clean=args.clean)
db.plot()
logger.info("UPDATE COMPLETE!!")
22 changes: 14 additions & 8 deletions src/tess_atlas/data/analysis_summary.py
@@ -39,14 +39,23 @@ def __repr__(self):

@classmethod
def load(
cls, notebook_dir: str, n_threads=1, clean=True
cls, notebook_dir: str = None, outdir=None, n_threads=1, clean=True
) -> "AnalysisSummary":
fname = AnalysisSummary.fname(notebook_dir)
if notebook_dir is None:
# download and return loaded summary
raise NotImplementedError(
"Downloading summary not implemented yet"
)

if outdir is None:
outdir = notebook_dir if notebook_dir else "."

fname = AnalysisSummary.fname(outdir)
if os.path.exists(fname) and not clean:
analysis_summary = cls.from_csv(fname)
else:
analysis_summary = cls.from_dir(notebook_dir, n_threads=n_threads)
analysis_summary.save(notebook_dir)
analysis_summary.save(outdir)
return analysis_summary

@classmethod
@@ -111,11 +120,8 @@ def save(self, notebook_dir: str):
@staticmethod
def fname(notebook_dir: str) -> str:
# make sure notebook dir does not have a file extension
my_dir, fname = os.path.splitext(notebook_dir)
if fname:
raise ValueError(
"notebook_dir should not have a file extension: {notebook_dir}"
)
if not os.path.isdir(notebook_dir):
raise ValueError(f"notebook_dir should be a dir: {notebook_dir}")
return os.path.join(notebook_dir, "analysis_summary.csv")
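The `load` refactor above follows a cache-or-rebuild pattern: reuse an existing `analysis_summary.csv` in `outdir` unless `clean=True` forces a rebuild. A simplified standalone sketch under that assumption (the file contents and return labels are illustrative, standing in for `cls.from_csv` and `cls.from_dir` + `save`):

```python
import os
import tempfile

def load_summary(outdir: str, clean: bool = True) -> str:
    """Return a label describing whether the summary was cached or rebuilt."""
    fname = os.path.join(outdir, "analysis_summary.csv")
    if os.path.exists(fname) and not clean:
        return "loaded-from-cache"   # stand-in for cls.from_csv(fname)
    with open(fname, "w") as f:      # stand-in for cls.from_dir(...) + save
        f.write("toi,status\n")
    return "rebuilt"

with tempfile.TemporaryDirectory() as d:
    first = load_summary(d)                 # no cache yet -> rebuild
    second = load_summary(d, clean=False)   # cache present -> reuse
    print(first, second)
```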


