Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPARKC-403: Add CLUSTERING ORDER in cql statement #981

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

skp33
Copy link
Contributor

@skp33 skp33 commented May 17, 2016

I added code for Clustering Order support for table creation. Please review the code.

@RussellSpitzer
Copy link
Contributor

I think this is a good addition but I have two major requests.

  1. We need a Jira for tracking
  2. I don't like just having a String "option". I think we are slowly approaching fully redoing the Java Driver TableMetadata so I think we should have it be a copy of that object. We don't have to necessarily expose all the parameters right now but I'd feel more comfortable with TableDef just having a list of ClusteringOrders.

https://docs.datastax.com/en/drivers/java/2.0/com/datastax/driver/core/TableMetadata.html

ClusteringOrder is represented by a list of Orderings so very close to what you currently have.

…teringOrder and Added list of Clustering order to TableDef
@skp33
Copy link
Contributor Author

skp33 commented Jul 27, 2016

@RussellSpitzer based on your suggestion i changed the code. Do i need to add & change anything else?

@RussellSpitzer
Copy link
Contributor

Make a jira https://datastax-oss.atlassian.net/projects/SPARKC/ For tracking (this is how we catch everything up in release notes and track versioning)

@@ -138,7 +149,9 @@ case class TableDef(
clusteringColumns: Seq[ColumnDef],
regularColumns: Seq[ColumnDef],
indexes: Seq[IndexDef] = Seq.empty,
isView: Boolean = false) extends StructDef {
isView: Boolean = false,
clusteringOrder: Option[Seq[ClusteringOrder]] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a property of the ColunDef now? Ie do you really need to specify it separately?

@skp33
Copy link
Contributor Author

skp33 commented Jul 28, 2016

@RussellSpitzer , Jira is already there, this is the Link.

According to discussion, we need to have list of ClusteringOrders inside TableDef, correct me if i miss understood.

And for property part, it depends on use case, i don't have better project knowledge, so let me know what should be there?

@RussellSpitzer
Copy link
Contributor

It looks to me like the information about ClusteringOrders is in two places within the TableDef.

  1. Available as an indexedSequence clusteringOrder: Option[Seq[ClusteringOrder]] = None,
  2. Available within the clusteringColumn objects themselves + clusteringOrder: ClusteringOrder = ClusteringOrder.ASC) extends FieldDef {

I think only 2 is neccessary? Having both 1 and 2 gives us the possibility of having them not matching which seems like a bit of a hole in the implementation. I think it would be find to have a function on tableDef

def clusterOrder: Seq[ClusteringOrder] = clusteringColumns.map(_.clusteringOrder

If we want it exposed at that level.

@skp33
Copy link
Contributor Author

skp33 commented Aug 1, 2016

I changed according to you, Please verify. And do we need to add some functionality for this method also DataFrameFunctions.createCassandraTable?

@@ -138,7 +150,8 @@ case class TableDef(
clusteringColumns: Seq[ColumnDef],
regularColumns: Seq[ColumnDef],
indexes: Seq[IndexDef] = Seq.empty,
isView: Boolean = false) extends StructDef {
isView: Boolean = false,
options: String = "") extends StructDef {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't like being able to just pass a string here, If you really think we need this I think it should at least be Seq[String] and we should just require that they not contain "AND" or "WITH". Then we can convert the append code below into a string join instead of having the more complicated logic.

This removes the need for the "appendOptions" function and replaces it with

require(options.forAll( option => !(option.toLowerCase.contains("and") && !(option.toLowerCase.contains("WITH")), "Table options must not contain "WITH OR AND"
(options +: clusteringOptions).mkString("WITH", "AND")```

@bcantoni bcantoni changed the title Add CLUSTERING ORDER in cql statement SPARKC-403: Add CLUSTERING ORDER in cql statement Oct 17, 2016
Copy link
Contributor

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have a few thoughts on the tableOptions and some of the CQL Creation statement statement. Let me know if you want to discuss. Thanks again for your hard work on this!

clusteringColumns.map { col => s"${quote(col.columnName)} ${col.clusteringOrder}"}.mkString(", ")

def clusterOrder: Seq[ClusteringOrder] = clusteringColumns.map(_.clusteringOrder)
val ordered = clusteringColumns.map( col => s"${quote(col.columnName)} ${col.clusteringOrder}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of an overloaded variable name. Perhaps clusterOrderingClause ?

@@ -151,11 +151,12 @@ case class TableDef(
regularColumns: Seq[ColumnDef],
indexes: Seq[IndexDef] = Seq.empty,
isView: Boolean = false,
options: String = "") extends StructDef {
options: Seq[String] = Seq.empty) extends StructDef {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should be tableOptions (matching the cql docs), and also now that I think about this perhaps it fits a Map better than a sequence? This would make it much clearer that we are looking for a set of key-value pairs.


require(partitionKey.forall(_.isPartitionKeyColumn), "All partition key columns must have role PartitionKeyColumn")
require(clusteringColumns.forall(_.isClusteringColumn), "All clustering columns must have role ClusteringColumn")
require(regularColumns.forall(!_.isPrimaryKeyColumn), "Regular columns cannot have role PrimaryKeyColumn")
require(options.forall( option => !(option.toLowerCase.contains("and") && !(option.toLowerCase.contains("with")))), "Table options must not contain WITH OR AND")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if I want to set my comment on a table to Sand Castles with Judge's Rankings. Basically I think we should let the driver validate the tableOptions. If we want to test them here we should probably only test keys (once we change the tableOptions to a map). I think we are probably best off without the requires here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed accordingly. What will be the better option to validate tableOptions keys?

else if (!stmt.contains("WITH") && opts.startsWith("AND")) s"WITH ${opts.substring(3)}"
else if (opts == "") s"$stmt"
else s"$stmt${Properties.lineSeparator}$opts"
val orderWithOptions:Seq[String] = if (clusteringColumns.size > 0) options.+:(ordered) else options
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets just go straight to the treble quote here, I think it may be a bit clearer like

val tableOptionsClause = s"WITH $clusteringOrderingClause ${(tableOptions...).mkString(AND)}"
"""$stmt $tableOptionsClause"""

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants