fix(provider-maxcompute): fixed and refactored maxcompute and oss client caching flow #203

Merged 9 commits on Mar 10, 2025

ayushi0014

No description provided.

Member

@rahmatrhd left a comment

do we need to do the same changes in alicloud_ram provider?

"github.com/aliyun/aliyun-odps-go-sdk/odps/account"
)

const assumeRoleDurationHours int64 = 1
Member

Suggested change
- const assumeRoleDurationHours int64 = 1
+ var assumeRoleDefaultDuration = time.Hour

can we directly set the value into duration here?

Author

sure


var durationSeconds = assumeRoleDurationHours * int64(time.Hour.Seconds())

type AliAuthAccount struct {
Member

let's make it internal

Suggested change
- type AliAuthAccount struct {
+ type aliAuthAccount struct {

return nil, time.Time{}, fmt.Errorf("failed to assume role: %w", err)
}

expiryTimeStamp := time.Now().Add(time.Hour * time.Duration(assumeRoleDurationHours))
Member

can we check if the token TTL is also present in the assume role response, and use it for expiry instead of using the default duration?

Author

TTL is not present in the assume role response😕

Member

let's change this to pkg/aliauth/aliauth.go

if odpsClient, ok := p.getCachedOdpsClient(ramRole, stsClientID, pc.URN); ok {
return odpsClient, nil
}
ramRole := p.getRamRole(creds, ramRoleFromAppeal)
Member

i think let's rename ramRoleFromAppeal to overrideRamRole for easier understanding 😀

}
delete(p.odpsClients, cachedClientKey)
Member

let's do the mutex lock & unlock above and below this line only

Suggested change
- delete(p.odpsClients, cachedClientKey)
+ p.mu.Lock()
+ delete(p.odpsClients, cachedClientKey)
+ p.mu.Unlock()

Author

I think we should take the mutex lock at the time we're retrieving the value from the map and release it once the delete operation is done. WDYT?

Member

reading the value doesn't need to lock the mutex but updating/deleting does

Author

I was considering a race condition where a value could be deleted from the map while another instance is reading it. To prevent multiple deletions, I used the lock. But since deleting the same key multiple times doesn't cause issues in this case, it's not strictly necessary. We should ideally use a read lock (RLock) while reading to avoid potential race conditions, but we can defer that for now.

Author

I'll change the implementation as you suggested.

@ayushi0014 force-pushed the fix-odps-client-cache branch 2 times, most recently from 8c62095 to 14b1118 on March 7, 2025 05:25
@ayushi0014 force-pushed the fix-odps-client-cache branch from 14b1118 to a9d6778 on March 7, 2025 05:42
@bearaujus self-requested a review on March 10, 2025 04:35
@rahmatrhd merged commit ca1561c into main on Mar 10, 2025
6 checks passed
@rahmatrhd deleted the fix-odps-client-cache branch on March 10, 2025 04:40

3 participants