Extend Gemini to include getting file content and logging out of GOA #2851

Open
wants to merge 2 commits into
base: main
104 changes: 98 additions & 6 deletions astroquery/gemini/core.py
@@ -1,20 +1,21 @@
# Licensed under a 3-clause BSD style license - see LICENSE.rst
"""
Search functionality for the Gemini archive of observations.
==================================================
Gemini Observatory Archive (GOA) Astroquery Module
==================================================

For questions, contact [email protected]
Query public and proprietary data from GOA.
"""

import os

from datetime import date

from astroquery import log
import astropy
from astropy import units
from astropy.table import Table, MaskedColumn

from astroquery.gemini.urlhelper import URLHelper
import numpy as np

from .urlhelper import URLHelper
from ..query import QueryWithLogin
from ..utils.class_or_instance import class_or_instance
from . import conf
@@ -433,6 +434,97 @@
        local_filepath = os.path.join(download_dir, filename)
        self._download_file(url=url, local_filepath=local_filepath, timeout=timeout)

    def _download_file_content(self, url, timeout=None, auth=None, method="GET", **kwargs):
Member

Make the optional arguments keyword-only. Also, what are the possible keys in kwargs? Explicitly list them if possible.

Suggested change
-    def _download_file_content(self, url, timeout=None, auth=None, method="GET", **kwargs):
+    def _download_file_content(self, url, *, timeout=None, auth=None, method="GET", **kwargs):

Author

The kwargs can be any keyword argument accepted by the underlying Session.request. I can make a note of them.
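
The keyword-only suggestion above can be sketched in isolation. This is a minimal illustration, not the PR's code: `RecordingSession` is a hypothetical stand-in for `requests.Session`, and `download_file_content` is written as a free function rather than a method so that it runs without astroquery installed. The docstring lists a few keywords that `requests.Session.request` is known to accept (`params`, `headers`, `stream`, `verify`).

```python
class RecordingSession:
    """Hypothetical stand-in for requests.Session; it only records the call."""

    def request(self, method, url, **kwargs):
        return {"method": method, "url": url, **kwargs}


def download_file_content(session, url, *, timeout=None, auth=None, method="GET", **kwargs):
    """Forward a request to the session.

    Everything after ``url`` is keyword-only because of the bare ``*``.
    Extra ``kwargs`` are passed through unchanged, for example ``params``,
    ``headers``, ``stream`` or ``verify``.
    """
    return session.request(method, url, timeout=timeout, auth=auth, **kwargs)


session = RecordingSession()
url = "https://archive.gemini.edu/file/example.fits"  # illustrative filename

# Keyword usage works and extra keywords are forwarded.
call = download_file_content(session, url, timeout=5, headers={"Accept": "*/*"})

# Passing the timeout positionally is rejected by the interpreter.
try:
    download_file_content(session, url, 5)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False
```

With this signature a caller cannot accidentally swap `timeout` and `auth` by position, which is the main practical benefit of the change.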

        """Download content from a URL and return it. Resembles
        `_download_file` but returns the content instead of saving it to a
        local file.

        Parameters
        ----------
        url : str
            The URL from where to download the file.
        timeout : int, optional
            Time in seconds to wait for the server response, by default
            `None`.
        auth : dict[str, Any], optional
            Authentication details, by default `None`.
        method : str, optional
            The HTTP method to use, by default "GET".

        Returns
        -------
        bytes
            The downloaded content.
        """

        response = self._session.request(method, url, timeout=timeout, auth=auth, **kwargs)
        response.raise_for_status()

        if 'content-length' in response.headers:
            length = int(response.headers['content-length'])
            if length == 0:
                log.warn(f'URL {url} has length=0')

Codecov / codecov/patch warning on line 466 in astroquery/gemini/core.py: added line #L466 was not covered by tests.
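
One way the uncovered zero-length branch could be exercised is sketched below. It is a standalone approximation under stated assumptions: `EmptyResponse`, `read_content` and `ListHandler` are hypothetical names, the standard `logging` module stands in for astroquery's `log`, and the body is assembled with `b"".join` rather than the repeated `+=` used in the diff. In the PR's own test suite the equivalent would more likely extend `MockResponse` and monkeypatch `requests.Session.request`.

```python
import logging

log = logging.getLogger("gemini_sketch")  # stand-in for astroquery's logger


class EmptyResponse:
    """Hypothetical response object whose body is empty."""

    headers = {"content-length": "0"}

    def raise_for_status(self):
        pass

    def iter_content(self, blocksize):
        return iter(())  # no blocks at all

    def close(self):
        pass


def read_content(response, url, blocksize=65536):
    """Mirror the control flow of the method under review."""
    response.raise_for_status()
    if "content-length" in response.headers:
        if int(response.headers["content-length"]) == 0:
            log.warning("URL %s has length=0", url)
    content = b"".join(response.iter_content(blocksize))
    response.close()
    return content


class ListHandler(logging.Handler):
    """Collect formatted log messages so they can be asserted on."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
log.addHandler(handler)
log.setLevel(logging.WARNING)
result = read_content(EmptyResponse(), "https://example.invalid/file/empty")
log.removeHandler(handler)
```

The point of the sketch is only that a response advertising `content-length: 0` is enough to reach the warning line.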

        blocksize = astropy.utils.data.conf.download_block_size
Member

You need to import this explicitly; importing astropy above is not enough.
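
The general Python behaviour behind this comment can be shown with a throwaway package. The sketch below builds a hypothetical `demo_pkg` with a nested `utils.data` module in a temporary directory; none of these names come from astropy. Whether a bare `import astropy` happens to load `astropy.utils.data` depends on astropy's internals, which is why an explicit import is the safer choice.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build demo_pkg/utils/data.py with empty __init__.py files (all hypothetical).
    utils_dir = Path(tmp) / "demo_pkg" / "utils"
    utils_dir.mkdir(parents=True)
    (Path(tmp) / "demo_pkg" / "__init__.py").write_text("")
    (utils_dir / "__init__.py").write_text("")
    (utils_dir / "data.py").write_text("download_block_size = 65536\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_pkg

        # Importing the top-level package did not load the subpackage.
        submodule_visible_before = hasattr(demo_pkg, "utils")

        # An explicit import makes the attribute chain reliable.
        import demo_pkg.utils.data

        block_size = demo_pkg.utils.data.download_block_size
    finally:
        sys.path.remove(tmp)
```

Applied to the diff, the analogous fix would be an explicit `import astropy.utils.data` (or importing `conf` from it) at the top of the module.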

        content = b""

        for block in response.iter_content(blocksize):
            content += block

        response.close()

        return content

    def logout(self):
        """Logout from the GOA service by deleting the specific session cookie
        and updating the authentication state.
        """
        # Delete specific cookie.
        cookie_name = "gemini_archive_session"
        if cookie_name in self._session.cookies:
            del self._session.cookies[cookie_name]

        # Update authentication state.
        self._authenticated = False

    def get_file_content(self, filename, timeout=None, auth=None, method="GET", **kwargs):
        """Wrapper around `_download_file_content`.
Member

Use double backticks for everything that is not Sphinx-linkable.


        Parameters
        ----------
        filename : str
            Name of the file whose content to download.
        timeout : int, optional
            Time in seconds to wait for the server response, by default
            `None`.
        auth : dict[str, Any], optional
            Authentication details, by default `None`.
        method : str, optional
            The HTTP method to use, by default "GET".

        Returns
        -------
        bytes
            The downloaded content.
        """
        url = self.get_file_url(filename)
        return self._download_file_content(url, timeout=timeout, auth=auth, method=method, **kwargs)

    def get_file_url(self, filename):
        """Generate the file URL based on the filename.

        Parameters
        ----------
        filename : str
            The name of the file.

        Returns
        -------
        str
            The URL where the file can be downloaded.
        """
        return f"https://archive.gemini.edu/file/{filename}"
Member

This URL is hard-wired in a lot of places. Couldn't we instead use the conf value, or store it as an instance attribute and use it here and elsewhere, too?

Author @davner, Nov 15, 2023

I agree. I have refactored the gemini module to use conf more and extend the module in a planned future PR.

I did not want to make many changes in this first PR, so that I could first confirm that the documentation and coding style match astroquery.
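
The idea discussed in this thread, deriving the file URL from a configurable base instead of a literal, might look roughly like the sketch below. `ArchiveConf` and its `server` attribute are assumptions made for illustration and are not taken from the PR; the function is a free function so the block runs on its own, and the percent-encoding via `urllib.parse.quote` is an addition that the diff does not perform.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class ArchiveConf:
    """Hypothetical stand-in for the module's configuration object."""

    server: str = "https://archive.gemini.edu"


conf = ArchiveConf()


def get_file_url(filename, *, server=None):
    """Build the download URL from the configured server instead of a literal."""
    base = (server or conf.server).rstrip("/")
    # quote() leaves letters, digits and "." untouched and escapes the rest.
    return f"{base}/file/{quote(filename)}"


default_url = get_file_url("S20231016S0018.fits.bz2")
mirror_url = get_file_url("S20231016S0018.fits.bz2", server="https://mirror.example.org/")
```

Keeping the `/file/` path segment in one function also avoids slips such as a missing slash when the URL is rebuilt by hand elsewhere.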



def _gemini_json_to_table(json):
    """
42 changes: 41 additions & 1 deletion astroquery/gemini/tests/test_gemini.py
@@ -24,18 +24,37 @@ class MockResponse:

    def __init__(self, text):
        self.text = text
        self.headers = {'content-length': str(len(text))}
        self.status_code = 200

    def json(self):
        return json.loads(self.text)

    def raise_for_status(self):
        pass

    def iter_content(self, blocksize):
        yield self.text

    def close(self):
        pass


@pytest.fixture
def patch_get(request):
    """ mock get requests so they return our canned JSON to mimic Gemini's archive website """
    mp = request.getfixturevalue("monkeypatch")

    mp.setattr(requests.Session, 'request', get_mockreturn)
    return mp


@pytest.fixture
def patch_content(monkeypatch):
    """Mock requests with encoded content."""
    def mock_request(*args, **kwargs):
        return MockResponse(b"mock_content")

    monkeypatch.setattr(requests.Session, 'request', mock_request)

# to inspect behavior, updated when the mock get call is made
@@ -171,3 +190,24 @@ def test_url_helper_eng_fail(test_arg):
    urlsplit = url.split('/')
    assert (('notengineering' in urlsplit) == should_have_noteng)
    assert (('NotFail' in urlsplit) == should_have_notfail)


def test_logout():
    """Test logout functionality."""
    gemini.Observations._session.cookies = {"gemini_archive_session": "some_value"}
    gemini.Observations._authenticated = True
    gemini.Observations.logout()
    assert "gemini_archive_session" not in gemini.Observations._session.cookies
    assert gemini.Observations._authenticated is False


def test_get_file_content(patch_content):
    """Test wrapper around _download_file_content."""
    content = gemini.Observations.get_file_content("filename", timeout=5)
    assert content == b"mock_content"


def test_get_file_url():
    """Test generating file URL based on filename."""
    url = gemini.Observations.get_file_url("filename")
    assert url == "https://archive.gemini.edu/file/filename"
16 changes: 16 additions & 0 deletions astroquery/gemini/tests/test_remote.py
@@ -8,6 +8,7 @@
https://astroquery.readthedocs.io/en/latest/testing.html
"""
import pytest
import requests

import os
import shutil
@@ -23,6 +24,10 @@
""" Coordinates to use for testing """
coords = SkyCoord(210.80242917, 54.34875, unit="deg")

# Filename and url
filename = "S20231016S0018.fits.bz2" # Small file
file_url = f"https://archive.gemini.edu/file/{filename}"


@pytest.mark.remote_data
class TestGemini:
@@ -78,3 +83,14 @@ def test_get_file(self):
os.unlink(filepath)
if os.path.exists(tempdir):
shutil.rmtree(tempdir)

    def test_get_file_content(self):
        """Test the `get_file_content` function."""
        content = gemini.Observations.get_file_content(filename)
        assert isinstance(content, bytes)
        assert len(content) > 0

    def test_get_file_content_with_timeout(self):
        """Test `get_file_content` with a timeout."""
        with pytest.raises(requests.exceptions.Timeout):
            gemini.Observations.get_file_content(filename, timeout=0.001)
21 changes: 16 additions & 5 deletions docs/gemini/gemini.rst
@@ -132,19 +132,21 @@ the *NotFail* or *notengineering* terms respectively.
Authenticated Sessions
----------------------

The Gemini module allows for authenticated sessions using your GOA account. This is the same account you login
with on the GOA homepage at `<https://archive.gemini.edu/>`__. The `astroquery.gemini.ObservationsClass.login`
method returns `True` if successful.
The Gemini module allows for authenticated sessions using your GOA account. This is the same account you
login with on the GOA homepage at `<https://archive.gemini.edu/>`__. The
`astroquery.gemini.ObservationsClass.login` method returns `True` if successful. To logout, use the
`astroquery.gemini.ObservationsClass.logout` method to remove the Gemini Observatory Archive session cookie.

.. doctest-skip::

    >>> from astroquery.gemini import Observations
    >>> Observations.login(username, password)
    >>> # do something with your elevated access
    >>> Observations.logout()


File Downloading
----------------
File Downloading and File Content Getting
-----------------------------------------

As a convenience, you can request file downloads directly from the Gemini module. This constructs the appropriate
URL and fetches the file. It will use any authenticated session you may have, so it will retrieve any
@@ -156,6 +158,15 @@ proprietary data you may be permissioned for.
    >>> Observations.get_file("GS2020AQ319-10.fits", download_dir="/tmp")  # doctest: +IGNORE_OUTPUT


To get the file content without writing to disk, you can use the method
`astroquery.gemini.ObservationsClass.get_file_content`. This constructs the appropriate URL and fetches the
file contents. It will use any authenticated session you have, so proprietary data you can access is included.

.. doctest-remote-data::

    >>> from astroquery.gemini import Observations
    >>> Observations.get_file_content("GS2020AQ319-10.fits")  # doctest: +IGNORE_OUTPUT
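
Since the method returns raw bytes, a caller usually wants to wrap them in a file-like object. The sketch below is one possible approach, not part of the PR: `open_in_memory` is a hypothetical helper, the payload is made up, and it assumes that a name ending in `.bz2` means the bytes are bzip2-compressed. No network access or astroquery import is needed to run it.

```python
import bz2
import io


def open_in_memory(content, filename):
    """Return a binary file-like object for downloaded bytes.

    Assumption: a ``.bz2`` suffix means the content is bzip2-compressed.
    """
    if filename.endswith(".bz2"):
        content = bz2.decompress(content)
    return io.BytesIO(content)


# Simulate what a download might return, using a made-up payload.
payload = b"SIMPLE  =                    T"
fake_download = bz2.compress(payload)

buffer = open_in_memory(fake_download, "example.fits.bz2")
header_start = buffer.read(6)

# Uncompressed names are passed through untouched.
plain = open_in_memory(payload, "example.fits").read()
```

A buffer like this can then be handed to any reader that accepts a binary file object, which keeps the whole workflow off disk.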


Reference/API
=============
