SPARKC-577: Removal of Driver Duplicate Classes #1245

Closed
absurdfarce wants to merge 12 commits

Conversation

absurdfarce commented:

Description

How did the Spark Cassandra Connector Work or Not Work Before this Patch

The connector was using internal serializable representations of keyspace/table metadata because the corresponding Java driver classes weren't serializable. This changed in Java driver v4.6.0.
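
To make that concrete, here is a minimal sketch of the round trip that driver v4.6.0 enables; it assumes a reachable local Cassandra node with default contact points, and system.peers is just a convenient built-in table:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

import com.datastax.oss.driver.api.core.CqlSession
import com.datastax.oss.driver.api.core.metadata.schema.TableMetadata

// Round-trip a driver TableMetadata through standard Java serialization.
// Before driver 4.6.0 this would fail with NotSerializableException, which
// is why the connector kept its own serializable copies.
val session = CqlSession.builder().build() // assumes a local node on 9042

val table: TableMetadata = session.getMetadata
  .getKeyspace("system").get()
  .getTable("peers").get()

val bytes = new ByteArrayOutputStream()
new ObjectOutputStream(bytes).writeObject(table)

val copy = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
  .readObject()
  .asInstanceOf[TableMetadata]

assert(copy.getName == table.getName)
session.close()
```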

General Design of the patch

Replaced internal references with Java driver types directly wherever possible. The internal classes within the connector were used for several different functions:

  1. Metadata retrieval/access
  2. As a definition/descriptor for future table creation

In some cases (1) could be handled by direct replacement, but the connector was also storing multiple layers of metadata within a single object (e.g. some table-level information was stored in ColumnDef). Unfortunately the 4.x driver doesn't allow traversing the metadata tree in this way. To make this information available without too much breakage to the existing API, some of the old classes are preserved in a new role: containers for all metadata types on a "branch" of this tree. Thus TableDef stores keyspace + table metadata, ColumnDef stores keyspace + table + column metadata, etc. A sketch of the idea follows.
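
As a minimal sketch of that container role, assuming the 4.x driver metadata interfaces (these are illustrative shapes, not the patch's actual class definitions):

```scala
import com.datastax.oss.driver.api.core.metadata.schema.{
  ColumnMetadata, KeyspaceMetadata, TableMetadata
}

// Each level of the tree bundles the driver metadata for itself plus
// everything above it, so callers can still reach "up" the branch.
case class TableDef(keyspace: KeyspaceMetadata, table: TableMetadata)

case class ColumnDef(
    keyspace: KeyspaceMetadata,
    table: TableMetadata,
    column: ColumnMetadata) {
  // Table-level information stays reachable from a column-level object;
  // the driver's ColumnMetadata alone carries no back-reference to its
  // enclosing TableMetadata.
  def tableName: String = table.getName.asInternal
}
```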

Fixes: SPARKC-577

How Has This Been Tested?

This is still a WIP and hasn't been tested meaningfully yet.

Checklist:

  • I have a ticket in the OSS JIRA
  • I have performed a self-review of my own code
  • Locally all tests pass (make sure tests fail without your patch)

- val maxIndex = maxCol.componentIndex.get
- val requiredColumns = tableDef.clusteringColumns.takeWhile(_.componentIndex.get <= maxIndex)
+ val maxIndex = tableDef.clusteringColumns.indexOf(maxCol)
+ val requiredColumns = tableDef.clusteringColumns.take(maxIndex + 1)

absurdfarce (Author):

I wanted to highlight this for review. I'm fairly sure the logic I have in there now mirrors what was being done before, but I wanted to make sure this was looked at more closely. A toy demonstration of the intended equivalence follows.
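
A minimal sketch of why the two forms should agree, assuming the clustering columns are kept in clustering order; ClusteringCol here is a toy stand-in for the old internal column representation, not a connector class:

```scala
// Toy stand-in for the old internal column type; componentIndex is the
// column's position within the clustering key.
case class ClusteringCol(name: String, componentIndex: Option[Int])

val clusteringColumns = Seq(
  ClusteringCol("c0", Some(0)),
  ClusteringCol("c1", Some(1)),
  ClusteringCol("c2", Some(2)))

val maxCol = clusteringColumns(1)

// Old form: keep columns while their stored componentIndex is <= maxCol's.
val oldRequired =
  clusteringColumns.takeWhile(_.componentIndex.get <= maxCol.componentIndex.get)

// New form: rely on the sequence itself being in clustering order.
val newRequired =
  clusteringColumns.take(clusteringColumns.indexOf(maxCol) + 1)

// The two agree exactly when the sequence is ordered by componentIndex.
assert(oldRequired == newRequired)
```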

* ColumnSelector modified to work with both TableDef and TableDescriptor
  * Need an integration test for the TableDef case, as it isn't really covered anymore
* DatasetFunctions.createCassandraTable() modified to take table options + a per-clustering-column ordering value (see the sketch below)
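
As a rough illustration of that last bullet, the extended signature might look something like the sketch below; the parameter names and types are assumptions for illustration, not the patch's actual code:

```scala
// Hypothetical sketch only; names and types are assumptions.
def createCassandraTable(
    keyspaceName: String,
    tableName: String,
    partitionKeyColumns: Option[Seq[String]] = None,
    // Per-clustering-column sort order rather than a bare column list:
    clusteringKeyColumns: Option[Seq[(String, String)]] = None, // (name, "ASC" | "DESC")
    // New: table options folded into the generated CREATE TABLE statement:
    tableOptions: Map[String, String] = Map.empty): Unit = ???
```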

absurdfarce (Author):

I will likely wind up closing this in favor of #1250.
