Skip to content
This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

[wip] signed GCS URL #81

Closed
wants to merge 5 commits into from
Closed

Conversation

honnix
Copy link
Member

@honnix honnix commented Mar 27, 2020

TL;DR

Add support of signed GCS URL.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

Right now only AWS/S3 presigned URL is support. This PR adds support for GCS URL.

The signing part looks a bit funky, mainly due to GCP lib does not support signing URL without
materialised service account key. So this PR uses https://cloud.google.com/iam/docs/reference/credentials/rest/v1/projects.serviceAccounts/signBlob directly.

SigningPrincipal in most cases refers to the service account that signs the URL. Service account associated with flyteadmin instance needs to have the role on the principal.

Some extra effort could be done default the principal to the same service account running flyteadmin instance, but would require talking to GCE metadata server (or I just haven't figured out more handy API in client lib to use).

Tracking Issue

Follow-up issue

NA

@kumare3
Copy link
Contributor

kumare3 commented Mar 30, 2020

This is great. I think no one looks at this with [wip] :).
So eventually, our goal is to move this into flytestdlib/storage
All handling of storage URI's etc should be done there, and so it can be managed much more easily

@katrogan / @EngHabu where did we land on this?

@honnix
Copy link
Member Author

honnix commented Mar 30, 2020

@kumare3 Yeah that makes sense. I'm still working on this, so in progress. It would be great if I can get a pointer where to put this in stdlib.

Also as I put in the description, I don't quite like the fact that SigningPrincipal is mandatory, and I want to make that default to Google Compute Engine default service account.

@katrogan
Copy link
Contributor

@katrogan / @EngHabu where did we land on this?

Sorry @honnix I don't think we ever made any progress on this (or at least I can't remember 😅)

Not sure if https://github.com/lyft/flytestdlib/tree/master/storage is the right place for this in stdlib - @EngHabu and other owners would probably know better

@kumare3
Copy link
Contributor

kumare3 commented Mar 30, 2020

@katrogan why do you think flytestdlib/storage is not the right place? I think it is because then we can configure the storage uniformly.
Stow probably is not the right place, so we wanted to add API's to our own storage layer to handle these functionalities

@EngHabu
Copy link
Contributor

EngHabu commented Mar 30, 2020

I agree, getting a signed URL should be an API here.

There is an open issue that isn't getting enough traction there... I think it makes sense to add the API support in Stow. or a fork of stow for faster development (or in stdlib... less ideal) stdlib storage layer should ideally not have "logic" and should remain as thin as possible...

@katrogan
Copy link
Contributor

katrogan commented Mar 30, 2020

@kumare3 I don't work too frequently on flytestdlib so I wasn't sure if my suggestion was correct or not. thanks @EngHabu for the link!

@honnix
Copy link
Member Author

honnix commented Mar 30, 2020

I will take a look at stow then. I can leave this PR open for a couple of days to get more comments. Thank you all folks!

@kumare3
Copy link
Contributor

kumare3 commented Jul 22, 2020

@honnix maybe we can get this moving and then work on the alternate.
I think Flyteadmin has a bunch of code that should go into flytestdlib first and eventually to even more downstream layer

@honnix
Copy link
Member Author

honnix commented Jul 23, 2020

Sure. I will continue this after I'm back to work.

@kumare3
Copy link
Contributor

kumare3 commented Aug 28, 2020

@honnix do you think this is good?

@honnix
Copy link
Member Author

honnix commented Aug 29, 2020

I should be able to check this out on Monday or Tuesday.

@kumare3
Copy link
Contributor

kumare3 commented Aug 29, 2020

Thank you @honnix

@codecov-commenter
Copy link

codecov-commenter commented Sep 1, 2020

Codecov Report

Merging #81 into master will decrease coverage by 0.45%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #81      +/-   ##
==========================================
- Coverage   62.39%   61.94%   -0.46%     
==========================================
  Files         104      105       +1     
  Lines        7752     7809      +57     
==========================================
  Hits         4837     4837              
- Misses       2345     2402      +57     
  Partials      570      570              
Flag Coverage Δ
#unittests 61.94% <0.00%> (-0.46%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/data/implementations/gcp_remote_url.go 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bf159e5...c787862. Read the comment docs.

Although this doesn't seem to be required, it is better to do things according
to API doc.
@honnix
Copy link
Member Author

honnix commented Sep 1, 2020

Not quite sure of those docker tagging failures. Maybe I should start a new branch in this repo instead of my fork?

@honnix honnix mentioned this pull request Sep 1, 2020
8 tasks
@honnix
Copy link
Member Author

honnix commented Sep 1, 2020

Continuing in #121

@honnix honnix closed this Sep 1, 2020
@honnix honnix deleted the gcs-remote-data branch September 1, 2020 08:53
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants