Skipping the timestamp check for permission failures #877
base: main
@pallavia7 you could run |
9eb30a1 to b9e9f72
if (!skipTimestampCheck) {
  val gbTables = ListBuffer[String]()
  joinConf.joinParts.toScala.foreach { part =>
// join parts
val joinPart = Builders.GroupBy(
  sources = Seq(getTestGBSourceWithTs(namespace=namespace)),
I am not quite sure how the access denial is simulated here. Could you explain? Thanks.
I used Mockito.spy on the tableUtils object. when(tableUtils.checkTablePermission(any(), any())).thenReturn(false) overrides the behavior of checkTablePermission, so whenever tableUtils.checkTablePermission is invoked it returns false, which is what happens when there is no access to the table.
b9e9f72 to d582acf
  runTablePermissionValidation((gbTables.toList ++ List(joinConf.left.table)).toSet)
} else Set()

if (!skipTimestampCheck && noAccessTables.isEmpty) {
Can we print a warning if we are skipping the timestamp check because there are tables with permission issues?
I have added a warning message.
def testJoinAnalyzerInvalidTablePermissions(): Unit = {
  val spark: SparkSession = SparkSessionBuilder.build("AnalyzerTest" + "_" + Random.alphanumeric.take(6).mkString, local = true)
  val tableUtils = spy(TableUtils(spark))
  when(tableUtils.checkTablePermission(any(), any())).thenReturn(false)
Consider mocking tableUtils.sql to throw a runtime exception to mimic a table permission issue, since the timestamp check logic will try to access the table data using tableUtils.sql, and we want to ensure that this is not triggered and is gated by the permission check first.
I did not find a way to mock tableUtils.sql, because we want some SQL statements (such as create) to keep their existing behavior and only certain statements to be mocked. Instead, I made sure the timestamp check function is not called, by spying on the analyzer and asserting verify(analyzer, never()).runTimestampChecks(any(), any()).
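For reference, one possible way to fail only some SQL statements while leaving the rest untouched is to override the method and delegate by default. The sketch below is dependency-free and purely illustrative: SqlRunner, PermissionDeniedRunner, and the String return type are hypothetical stand-ins, not the project's TableUtils API. With Mockito, a similar effect could probably be reached with doAnswer on a spy, but that is not shown or tested here.

```scala
object SelectiveSqlStubSketch {
  // Hypothetical minimal interface standing in for the real SQL entry point.
  class SqlRunner {
    def sql(query: String): String = s"ran: $query"
  }

  // Fails only for SELECT statements that mention a restricted table;
  // every other statement (for example CREATE) keeps the default behavior.
  class PermissionDeniedRunner(restrictedTables: Set[String]) extends SqlRunner {
    override def sql(query: String): String = {
      val normalized = query.trim.toLowerCase
      val readsRestricted =
        normalized.startsWith("select") &&
          restrictedTables.exists(t => normalized.contains(t.toLowerCase))
      if (readsRestricted) {
        // Mimics a permission failure on data access.
        throw new RuntimeException(s"Permission denied for query: $query")
      } else {
        super.sql(query)
      }
    }
  }
}
```

A test built on such a stub would fail loudly if the timestamp check ever tried to read a restricted table, which complements the verify(..., never()) assertion rather than replacing it.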
d582acf to d202219
Ran scalafmt reformat
7ec9bdb to bd4a585
Summary
Moved the permission check before the timestamp check. If the permission check fails, the timestamp check is skipped.
Why / Goal
A user reported an issue in the analyzer: the job failed open when users did not have permission to access certain Hive tables. The correct behavior is for any table permission issue to be caught in the analyzer step.
The root cause is that the timestamp check is enabled by default and runs before the table permission check. Because the timestamp check requires reading table data, it failed open.
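The intended ordering can be sketched as follows. This is a simplified, self-contained illustration, not the analyzer's actual code: AnalyzerGateSketch, GateResult, validate, and the function-typed parameters are hypothetical, and only the names skipTimestampCheck, noAccessTables, and runTimestampChecks are borrowed from the diff.

```scala
object AnalyzerGateSketch {
  // Hypothetical result type: inaccessible tables and whether the timestamp check ran.
  final case class GateResult(noAccessTables: Set[String], timestampCheckRan: Boolean)

  def validate(
      tables: Set[String],
      skipTimestampCheck: Boolean,
      hasPermission: String => Boolean,       // stand-in for a permission probe
      runTimestampChecks: Set[String] => Unit // stand-in for the data-reading check
  ): GateResult = {
    // The permission validation runs first and does not read table data.
    val noAccessTables = tables.filterNot(hasPermission)

    if (!skipTimestampCheck && noAccessTables.isEmpty) {
      // Safe to read data: every table is accessible.
      runTimestampChecks(tables)
      GateResult(noAccessTables, timestampCheckRan = true)
    } else {
      if (!skipTimestampCheck) {
        // Warn when the check is skipped because of permission problems.
        println(s"WARN: skipping timestamp check; no access to: ${noAccessTables.toSeq.sorted.mkString(", ")}")
      }
      GateResult(noAccessTables, timestampCheckRan = false)
    }
  }
}
```

Passing the two checks in as functions keeps the sketch testable without Spark; a caller can count invocations of the timestamp check to confirm it is gated by the permission result.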
Test Plan
[x] Added Unit Tests
Checklist