User defined type support #159

Open
felipefzdz opened this issue May 18, 2016 · 3 comments
@felipefzdz
Contributor

Before getting into the details I'd like to share some ideas about this issue. The example that I'll use is this one:

CREATE TYPE address (
  street text,
  phones set<text>
);

CREATE TABLE user (
  name text PRIMARY KEY,
  address frozen<address>
);

Priming UDTs: Java Types and Antlr

A UDT is basically a set of UDT fields. Client usage would be something like this:

UdtField streetField = new UdtField("street", "Oxford St.");
UdtField phonesField = new UdtField("phones", Sets.newHashSet("076456253645", "076456253646"));

UdtType address = new UdtType(ImmutableSet.of(streetField, phonesField));

Map<String, ?> row = ImmutableMap.of(
       "name", "Peter",
       "address", address
);

PrimingRequest preparedStatementPrime = PrimingRequest.preparedStatementBuilder()
        .withQuery("select * from user")
        .withColumnTypes(column("address", udt(TEXT, set(TEXT))))
        .withRows(row)
        .build();
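
To make the client-side idea a bit more concrete, here is a minimal sketch of what the two value classes could look like. UdtField and UdtType do not exist yet, so everything below (constructors, getters, asMap) is an assumption for discussion. It uses plain JDK collections instead of Guava to stay self-contained.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical value classes: the names follow the proposal, not an existing scassandra API.
final class UdtField {
    private final String name;
    private final Object value;

    UdtField(String name, Object value) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("field name is required");
        }
        this.name = name;
        this.value = value;
    }

    String getName() { return name; }
    Object getValue() { return value; }
}

final class UdtType {
    // Insertion order is kept so fields come out in the order they were declared.
    private final Map<String, Object> fields = new LinkedHashMap<>();

    UdtType(Iterable<UdtField> udtFields) {
        for (UdtField field : udtFields) {
            if (fields.containsKey(field.getName())) {
                throw new IllegalArgumentException("duplicate field: " + field.getName());
            }
            fields.put(field.getName(), field.getValue());
        }
    }

    // Read-only view of field name to value, e.g. for serialising the prime to JSON.
    Map<String, Object> asMap() {
        return Collections.unmodifiableMap(fields);
    }
}
```

Rejecting duplicate field names up front is a design choice in this sketch; a real implementation might prefer to validate against the type definition instead.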

Let's jump now into cql-antlr. One possible representation of a UDT might be:

"udt<text set<text>>"

Antlr grammar would incorporate something like this (pseudo antlr code):

data_type
    : native_type
    | list_type
    | set_type
    | map_type
    | udt_type
    ;

udt_type
    : 'udt' '<' (data_type)+ '>'
    ;

This grammar presents a recursion problem: data_type includes udt_type, so it would accept a UDT nested inside another UDT, which should not be allowed. Not sure if that would be a massive problem.

A possible test case in CqlTypeFactoryTest would be:

{"udt<text set<text>>", new UdtType(
        ImmutableSet.of(
                new UdtField(PrimitiveType.TEXT),
                new UdtField(new SetType(PrimitiveType.TEXT))
        )
)}
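
To illustrate the grammar and the nesting concern, below is a small hand-written recursive-descent sketch in plain Java. It is not the ANTLR-generated parser and UdtNotationParser is a made-up name. It only mirrors the pseudo-grammar above and adds a semantic check that rejects a udt inside a udt. The comma-separated normalised output is also an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written parser for the proposed "udt<...>" notation.
// It mirrors the pseudo-grammar only; it is not the ANTLR-generated parser.
final class UdtNotationParser {
    private final String src;
    private int pos = 0;

    private UdtNotationParser(String src) { this.src = src; }

    /** Parses a type string and returns a normalised form with comma-separated UDT fields. */
    static String parse(String input) {
        UdtNotationParser parser = new UdtNotationParser(input);
        String result = parser.dataType(false);
        parser.skipSpaces();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + parser.pos);
        }
        return result;
    }

    // data_type : native_type | list_type | set_type | map_type | udt_type
    private String dataType(boolean insideUdt) {
        skipSpaces();
        String name = identifier();
        switch (name) {
            case "udt":
                // Semantic check done here rather than in the grammar: no udt inside a udt.
                if (insideUdt) throw new IllegalArgumentException("nested udt is not allowed");
                expect('<');
                List<String> fields = new ArrayList<>();
                do {
                    fields.add(dataType(true));
                    skipSpaces();
                } while (peek() != '>');
                expect('>');
                return "udt<" + String.join(",", fields) + ">";
            case "list":
            case "set":
                expect('<');
                String element = dataType(insideUdt);
                expect('>');
                return name + "<" + element + ">";
            case "map":
                expect('<');
                String key = dataType(insideUdt);
                expect(',');
                String value = dataType(insideUdt);
                expect('>');
                return "map<" + key + "," + value + ">";
            default:
                return name; // anything else is treated as a native type such as text
        }
    }

    private String identifier() {
        int start = pos;
        while (pos < src.length() && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) {
            pos++;
        }
        if (start == pos) throw new IllegalArgumentException("type name expected at position " + pos);
        return src.substring(start, pos).toLowerCase();
    }

    private void expect(char expected) {
        skipSpaces();
        if (peek() != expected) {
            throw new IllegalArgumentException("'" + expected + "' expected at position " + pos);
        }
        pos++;
    }

    // Returns the current character, or the NUL character at end of input.
    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
}
```

With this sketch, UdtNotationParser.parse("udt<text set<text>>") is accepted, while an input such as "udt<text udt<int>>" fails with an IllegalArgumentException. So the nesting rule can live in a check after (or during) parsing even if the grammar itself stays recursive.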

Any thoughts?

@chbatey
Member

chbatey commented May 24, 2016

Thanks for this @felipefzdz

I think this looks good.

The only other way I can think of representing them in Stubbed Cassandra is to have them created independently of a regular prime and then referenced from a prime.

That way the grammar would be the same as the cassandra grammar.

E.g.

Have a create type endpoint that takes in:

TYPE address (
  street text,
  city text,
  zip_code int,
  phones set<text>
);

Then just refer to it as address in the prime as the type. This has the advantage that users can just copy their definition from the real schema parts.

A feature I have wanted for a long time but would require a lot of work is for scassandra to take your schema so that you do not need to prime variable types in prepared statements. That is by far the most common user error with scassandra (not priming them or priming them after the prepare of the statement). If you are game for starting work on that I suggest we go with the create type syntax as it'll be consistent with passing the whole schema in future versions of scassandra.

@felipefzdz
Contributor Author

felipefzdz commented May 25, 2016

That was actually my first idea, but I discarded it because I was misled by the static nature of antlr4. Correct me if I'm wrong, but the idea would be:

  • Implement support for DDL queries, UDT for now. That would involve handling a 'new' result type for SCassandra such as 0x0005 Schema_change: the result to a schema altering query. SCassandra Server should keep that schema in memory, like in PrimeQueryStore.
  • Expose an endpoint PrimingTypeRoute for registering schemas. SCassandra Server should keep that schema in memory, like in PrimeQueryStore.
  • After the type has been created, users will be able to prime queries using the address type. CqlTypeFactory.buildType will throw an exception, as the g4 grammar is static, i.e. the parser won't recognise address. The ColumnType.fromString method in the server will need to check that address has been previously registered. Not really sure yet how that method will be able to parse incoming bytes into some CqlUDT object, but I'll figure it out.
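
The lookup order described in these bullets could be sketched roughly as follows: try the statically known types first, then fall back to types registered at runtime. UserTypeRegistry, register and resolve are hypothetical names, the native type list is deliberately incomplete, and the real PrimeQueryStore or ColumnType.fromString may look quite different.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry; class and method names are illustrative only.
final class UserTypeRegistry {
    // Stand-in for what the static grammar already recognises (incomplete on purpose).
    private static final Set<String> NATIVE =
            new HashSet<>(Arrays.asList("text", "int", "bigint", "boolean", "uuid"));

    private final Map<String, Map<String, String>> userTypes = new ConcurrentHashMap<>();

    /** Registers a type, e.g. from a create type endpoint. Maps field name to CQL type string. */
    void register(String typeName, Map<String, String> fields) {
        userTypes.put(typeName.toLowerCase(), new LinkedHashMap<>(fields));
    }

    /** Describes what a type name resolves to: statically known types first, then registered ones. */
    String resolve(String typeName) {
        String key = typeName.toLowerCase();
        if (NATIVE.contains(key)) {
            return "native:" + key;
        }
        Map<String, String> fields = userTypes.get(key);
        if (fields == null) {
            throw new IllegalArgumentException("unknown type: " + typeName);
        }
        return "udt:" + key + fields;
    }
}
```

The point of the sketch is only the fallback order: an unknown name stays an error until the type has been registered, which matches the check on address described above.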

If this is correct I'll start working on the first bullet point :)

@felipefzdz
Contributor Author

felipefzdz commented Jun 3, 2016

Small question @chbatey. Assume an address type and a table with a field like fiscal_address frozen<address>. You said this: "Then just refer to it as address in the prime as the type."

        .withColumnTypes(column("fiscal_address", ???))

In place of those question marks I should pass a CqlType. Those types are generated statically from the antlr4 grammar, so I'm not sure how I can extend them dynamically with the address type.
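
One direction that might work, purely as a sketch: a type that only carries the name of a previously registered UDT, so the client never has to describe its structure. SketchCqlType, NativeType, UserTypeRef and the udt(String) factory are all hypothetical. SketchCqlType is a stand-in interface, not the generated CqlType class.

```java
// Hypothetical client-side types; SketchCqlType is a stand-in, not the real CqlType.
interface SketchCqlType {
    String serialise();
}

final class NativeType implements SketchCqlType {
    private final String name;

    NativeType(String name) { this.name = name; }

    @Override
    public String serialise() { return name; }
}

// Refers to a previously registered type by name instead of describing its fields.
final class UserTypeRef implements SketchCqlType {
    private final String typeName;

    UserTypeRef(String typeName) {
        if (typeName == null || typeName.isEmpty()) {
            throw new IllegalArgumentException("type name is required");
        }
        this.typeName = typeName;
    }

    @Override
    public String serialise() { return typeName; }

    // Factory so a prime could read column("fiscal_address", udt("address")).
    static UserTypeRef udt(String typeName) { return new UserTypeRef(typeName); }
}
```

The server would then have to resolve the name when the prime arrives, which is where the registration check would come in.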
