SPARKC-403: Add CLUSTERING ORDER in cql statement #981

skp33 · 2016-05-17T11:44:37Z

I added code for Clustering Order support for table creation. Please review the code.

RussellSpitzer · 2016-07-18T20:15:12Z

I think this is a good addition but I have two major requests.

1. We need a Jira for tracking.
2. I don't like just having a String "option". I think we are slowly approaching fully redoing the Java Driver TableMetadata, so I think we should have it be a copy of that object. We don't necessarily have to expose all the parameters right now, but I'd feel more comfortable with TableDef just having a list of ClusteringOrders.

https://docs.datastax.com/en/drivers/java/2.0/com/datastax/driver/core/TableMetadata.html

ClusteringOrder is represented by a list of Orderings so very close to what you currently have.

skp33 · 2016-07-27T13:03:17Z

@RussellSpitzer, I changed the code based on your suggestion. Do I need to add or change anything else?

RussellSpitzer · 2016-07-27T16:39:57Z

Make a Jira at https://datastax-oss.atlassian.net/projects/SPARKC/ for tracking (this is how we catch everything up in release notes and track versioning).

RussellSpitzer · 2016-07-27T16:42:00Z

spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/cql/Schema.scala

@@ -138,7 +149,9 @@ case class TableDef(
    clusteringColumns: Seq[ColumnDef],
    regularColumns: Seq[ColumnDef],
    indexes: Seq[IndexDef] = Seq.empty,
-    isView: Boolean = false) extends StructDef {
+    isView: Boolean = false,
+    clusteringOrder: Option[Seq[ClusteringOrder]] = None,


Shouldn't this be a property of the ColumnDef now? I.e., do you really need to specify it separately?

skp33 · 2016-07-28T13:36:00Z

@RussellSpitzer, the Jira already exists; this is the link.

Per the discussion, we need to have a list of ClusteringOrders inside TableDef; correct me if I misunderstood.

As for making it a property, that depends on the use case. I don't know the project well enough, so please let me know what should be there.

RussellSpitzer · 2016-08-01T01:20:50Z

It looks to me like the information about ClusteringOrders is in two places within the TableDef.

1. Available as an indexed sequence: clusteringOrder: Option[Seq[ClusteringOrder]] = None,
2. Available within the clusteringColumn objects themselves: + clusteringOrder: ClusteringOrder = ClusteringOrder.ASC) extends FieldDef {

I think only 2 is necessary. Having both 1 and 2 makes it possible for them not to match, which seems like a bit of a hole in the implementation. I think it would be fine to have a function on TableDef:

def clusterOrder: Seq[ClusteringOrder] = clusteringColumns.map(_.clusteringOrder)

If we want it exposed at that level.
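
A minimal sketch of that idea, using simplified stand-ins rather than the connector's real classes (the `ClusteringOrderSketch` object, the local `ClusteringOrder` ADT, and the two-field `ColumnDef`/`TableDef` are assumptions made for illustration; the PR itself uses `com.datastax.driver.core.ClusteringOrder`). The ordering lives only on the column, and the table-level view is derived, so the two can never disagree:

```scala
object ClusteringOrderSketch {
  // Local stand-in for the driver's ClusteringOrder enum (assumption).
  sealed trait ClusteringOrder
  case object ASC extends ClusteringOrder
  case object DESC extends ClusteringOrder

  // Each clustering column carries its own ordering, defaulting to ASC.
  final case class ColumnDef(columnName: String, clusteringOrder: ClusteringOrder = ASC)

  final case class TableDef(tableName: String, clusteringColumns: Seq[ColumnDef]) {
    // Derived on demand from the columns; nothing is stored twice.
    def clusterOrder: Seq[ClusteringOrder] = clusteringColumns.map(_.clusteringOrder)
  }

  // Hypothetical table with one descending and one default (ascending) column.
  val table: TableDef = TableDef("events", Seq(ColumnDef("day", DESC), ColumnDef("seq")))
}
```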

skp33 · 2016-08-01T10:11:47Z

I changed it as you suggested; please verify. Do we also need to add this functionality to DataFrameFunctions.createCassandraTable?

RussellSpitzer · 2016-08-09T17:31:42Z

spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/cql/Schema.scala

@@ -138,7 +150,8 @@ case class TableDef(
    clusteringColumns: Seq[ColumnDef],
    regularColumns: Seq[ColumnDef],
    indexes: Seq[IndexDef] = Seq.empty,
-    isView: Boolean = false) extends StructDef {
+    isView: Boolean = false,
+    options: String = "") extends StructDef {


I still don't like being able to just pass a string here. If you really think we need this, I think it should at least be Seq[String], and we should just require that the entries not contain "AND" or "WITH". Then we can convert the append code below into a string join instead of having the more complicated logic.

This removes the need for the "appendOptions" function and replaces it with:

require(options.forall(option => !option.toLowerCase.contains("and") && !option.toLowerCase.contains("with")), "Table options must not contain WITH or AND")
(options +: clusteringOptions).mkString("WITH ", " AND ", "")
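
As a rough illustration of the string-join approach, here is a small sketch. The `OptionsJoinSketch` object and the `withClause` helper are hypothetical names, not part of the connector, and the clustering clause is assumed to arrive as an already-rendered string:

```scala
object OptionsJoinSketch {
  // Joins an optional, pre-rendered clustering clause with free-form options.
  // Returns an empty string when there is nothing to append.
  def withClause(clusteringClause: Option[String], options: Seq[String]): String = {
    val all = clusteringClause.toSeq ++ options
    if (all.isEmpty) "" else all.mkString("WITH ", " AND ", "")
  }
}
```

Using the three-argument `mkString(start, sep, end)` keeps the `WITH` prefix and the `AND` separators in one place, which is what makes a separate append function unnecessary.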

RussellSpitzer

I still have a few thoughts on the tableOptions and on the CQL creation statement. Let me know if you want to discuss. Thanks again for your hard work on this!

RussellSpitzer · 2016-10-18T23:57:54Z

spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/cql/Schema.scala

-    clusteringColumns.map { col => s"${quote(col.columnName)} ${col.clusteringOrder}"}.mkString(", ")
-
-  def clusterOrder: Seq[ClusteringOrder] = clusteringColumns.map(_.clusteringOrder)
+    val ordered = clusteringColumns.map( col => s"${quote(col.columnName)} ${col.clusteringOrder}")


This is a bit of an overloaded variable name. Perhaps clusterOrderingClause?

RussellSpitzer · 2016-10-19T00:00:18Z

spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/cql/Schema.scala

@@ -151,11 +151,12 @@ case class TableDef(
    regularColumns: Seq[ColumnDef],
    indexes: Seq[IndexDef] = Seq.empty,
    isView: Boolean = false,
-    options: String = "") extends StructDef {
+    options: Seq[String] = Seq.empty) extends StructDef {


Perhaps this should be tableOptions (matching the CQL docs). Also, now that I think about it, perhaps it fits a Map better than a sequence? That would make it much clearer that we are looking for a set of key-value pairs.
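
A small sketch of what a Map-based option list could look like when rendered. `TableOptionsMapSketch` and `render` are hypothetical names, and the sketch assumes values are supplied as ready-made CQL literals (quoted strings, numbers, or map literals) and passed through verbatim:

```scala
object TableOptionsMapSketch {
  // Renders each key-value pair as "key = value".
  // Sorting by key only makes the output deterministic; CQL does not require it.
  def render(tableOptions: Map[String, String]): Seq[String] =
    tableOptions.toSeq.sortBy(_._1).map { case (key, value) => s"$key = $value" }
}
```

With this shape, any validation can look at the keys alone and never has to inspect free-form values.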

RussellSpitzer · 2016-10-19T00:02:20Z

spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/cql/Schema.scala


  require(partitionKey.forall(_.isPartitionKeyColumn), "All partition key columns must have role PartitionKeyColumn")
  require(clusteringColumns.forall(_.isClusteringColumn), "All clustering columns must have role ClusteringColumn")
  require(regularColumns.forall(!_.isPrimaryKeyColumn), "Regular columns cannot have role PrimaryKeyColumn")
+  require(options.forall( option => !(option.toLowerCase.contains("and") && !(option.toLowerCase.contains("with")))), "Table options must not contain WITH OR AND")


What if I want to set my comment on a table to "Sand Castles with Judge's Rankings"? Basically, I think we should let the driver validate the tableOptions. If we want to test them here, we should probably only test keys (once we change the tableOptions to a map). I think we are probably best off without the requires here.
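
To make the concern concrete, here is a sketch of the intended substring check (reject any option that contains "and" or "with"), not the exact expression from the diff. `SubstringCheckSketch`, `rejected`, and the sample option strings are hypothetical:

```scala
object SubstringCheckSketch {
  // Intended rule: an option is rejected if it contains "and" or "with" anywhere.
  def rejected(option: String): Boolean = {
    val lower = option.toLowerCase
    lower.contains("and") || lower.contains("with")
  }

  // A legitimate comment option; the apostrophe is doubled as CQL string literals require.
  val comment: String = "comment = 'Sand Castles with Judge''s Rankings'"
}
```

Here `rejected(comment)` is true because "Sand" contains "and" and the text contains "with", even though the option is perfectly valid CQL.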

skp33 · 2016-10-19

I changed it accordingly. What would be a better way to validate the tableOptions keys?

RussellSpitzer · 2016-10-19T00:10:39Z

spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/cql/Schema.scala

-    else if (!stmt.contains("WITH") && opts.startsWith("AND")) s"WITH ${opts.substring(3)}"
-    else if (opts == "") s"$stmt"
-    else s"$stmt${Properties.lineSeparator}$opts"
+    val orderWithOptions:Seq[String] = if (clusteringColumns.size > 0) options.+:(ordered) else options


Let's just go straight to the triple quote here; I think it may be a bit clearer, like:

val tableOptionsClause = s"WITH $clusteringOrderingClause ${(tableOptions...).mkString(AND)}"
"""$stmt $tableOptionsClause"""
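
Putting the pieces of this thread together, a self-contained sketch of how the final statement might be assembled. Everything here is an assumption made for illustration: `CreateTableSketch`, the `cql` signature, and the use of plain tuples in place of the connector's ColumnDef are not the connector's API.

```scala
object CreateTableSketch {
  // Wraps an identifier in double quotes, as CQL does for case-sensitive names.
  def quote(name: String): String = "\"" + name + "\""

  def cql(
      keyspace: String,
      table: String,
      columns: Seq[(String, String)],      // (column name, CQL type)
      partitionKey: Seq[String],
      clustering: Seq[(String, String)],   // (column name, "ASC" or "DESC")
      tableOptions: Map[String, String]    // values are ready-made CQL literals
  ): String = {
    val columnList = columns.map { case (n, t) => s"${quote(n)} $t" }.mkString(", ")
    val partition = partitionKey.map(quote).mkString("(", ", ", ")")
    val primaryKey = (partition +: clustering.map(c => quote(c._1))).mkString(", ")

    // Only emit CLUSTERING ORDER BY when there are clustering columns.
    val clusteringOrderingClause: Option[String] =
      if (clustering.isEmpty) None
      else Some(clustering.map { case (n, o) => s"${quote(n)} $o" }
        .mkString("CLUSTERING ORDER BY (", ", ", ")"))

    val clauses = clusteringOrderingClause.toSeq ++
      tableOptions.toSeq.sortBy(_._1).map { case (k, v) => s"$k = $v" }
    val tableOptionsClause =
      if (clauses.isEmpty) "" else clauses.mkString(" WITH ", " AND ", "")

    s"CREATE TABLE ${quote(keyspace)}.${quote(table)} ($columnList, PRIMARY KEY ($primaryKey))$tableOptionsClause"
  }
}
```

Treating the clustering clause as just another entry in the list of `WITH` clauses means there is no special case for whether `WITH` or `AND` comes first.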

kaushal added 2 commits May 17, 2016 14:57

Add CLUSTERING ORDER in cql statement

4ae3ff9

Add CLUSTERING ORDER in cql statement

5bc7cc6

Changed custom ClusteringOrder class to com.datastax.driver.core.Clus…

4dd085c

…teringOrder and Added list of Clustering order to TableDef

RussellSpitzer reviewed Jul 27, 2016

Removed list of Clustering order from TableDef

bae67de

RussellSpitzer reviewed Aug 9, 2016

Changed Options String to Seq[String]

697dd0e

bcantoni changed the title ~~Add CLUSTERING ORDER in cql statement~~ SPARKC-403: Add CLUSTERING ORDER in cql statement Oct 17, 2016

RussellSpitzer suggested changes Oct 19, 2016

kaushal added 2 commits October 19, 2016 20:02

Changed Options to tableOptions and its type to Map[String,String].

0aa10be

Changed Options to tableOptions and its type to Map[String,String].

dfa0382
