Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attribute with datatype of string doesn't accept a string containing a slash #21

Closed
sorsaffari opened this issue Feb 18, 2019 · 5 comments · Fixed by #22
Closed

attribute with datatype of string doesn't accept a string containing a slash #21

sorsaffari opened this issue Feb 18, 2019 · 5 comments · Fixed by #22
Assignees

Comments

@sorsaffari
Copy link

Grakn version: 1.5 SNAPSHOT

define

title sub attribute,
  datatype string;

insert $x isa title; $x "/";

Expected outcome:

the new instance of attribute title to be created with value "/".

Actual outcome

Error: syntax error at line 1:
insert $x isa title; $x "/";
                        ^
token recognition error at: '"/'
syntax error at line 1:
insert $x isa title; $x "/";
                          ^
token recognition error at: '";'
syntax error at line 1:
insert $x isa title; $x "/";
                            ^
no viable alternative at input '$x '
syntax error at line 1:
insert $x isa title; $x "/";
                     ^
extraneous input '$x' expecting {<EOF>, 'match', 'define', 'undefine', 'insert', 'compute'}
All uncommitted data is cleared
@haikalpribadi haikalpribadi transferred this issue from typedb/typedb Feb 27, 2019
@haikalpribadi haikalpribadi added this to the v1.0 milestone Feb 27, 2019
haikalpribadi added a commit that referenced this issue Feb 27, 2019
## What is the goal of this PR?

During the recent refactor of Graql grammar, we accidentally the lexer for `STRING_` to be too strict: we disallowed `/` character. The reason for this was that the unescape-and-escape behaviour over `/` was not symmetrical. `unescape('/') -> '/'` and `escape('/') -> '\/'`, which means the user reads back characters that they did not write, and that's not acceptable. 

However, we need to bring back `/` characters in strings (issue #21) because it is widely used.

During the investigation, we discovered that there is no need for Graql to be escaping/unescaping strings that are provided by the user in the first place. Given that the String that is parsed by ANTLR is already a valid String, we can store it "as is" (with the exception for Regular Expression, that still requires a special escape operation). I.e. what the user writes, is what the user reads, for any Unicode character accepted by ANTLR.

Fixing the above also resulted in fixing our tests to expect the correct behaviour from Graql.

## What are the changes implemented in this PR?

1) Relaxed Graql grammar to accept `/` (fixes #21)
2) Remove `escapeString()` and `unescapeString()` from `StringUtil`, and all it's usage.
3) Introduced `escapeRegex()` and `escapeRegex()` in `StringUtil`, and used them in parsing and printing regex.
4) Fixed tests to comply with the newly refined behaviour of Graql string and regex parsing/printing.
5) Additionally, we reimplement `StringUtils.repeat()` with a simple `spaces(int len)` in `SyntaxError`, so we can remove the dependency to `org.apache.commons.lang`
@sorsaffari
Copy link
Author

@haikalpribadi the team had a discussion on this and think this is a critical issue we need to resolve. I'm gonna reopen this for now until we come to a conclusive decision. there was more to the discussion than what's documented here. let's talk about it when you get the chance

@sorsaffari sorsaffari reopened this Apr 29, 2019
@lolski
Copy link
Member

lolski commented Apr 30, 2019

To clarify what @sorsaffari meant, we should be able to store a string ". The escape character \ should not be stored along with it.

The following snippet should insert " and not \" into the database:

grakn> define name sub attribute, datatype string;
grakn> insert $n isa name; $n "\"";
{$n "\"" isa name;}

This is the behaviour of most databases (eg., Postgres, MySQL) and therefore I think this should be how Grakn behaves as well.

@haikalpribadi
Copy link
Member

But why? I'm not sure I understand why you need to store an escaped quote? A string with a quote is not wrong. You can easily write that like this, '"', for example. This is accepted. In different languages, you can provide a string with single quotes ' too. It's just in Java you can only provide strings in double quotes " which means you need to escape your actual double quotes " in the string itself, but this only needs one backslash \ in java, which will be rendered first during the string assignment.

So I'm not following where this is an issue. If the user writes the string like \" then that should be stored as is. @sorsaffari @lolski

@lolski lolski modified the milestones: 1.0.0, 1.0.2 May 7, 2019
@flyingsilverfin
Copy link
Member

The main concern right now is that we aren't actually able to store a string containing " AND ' within a Graql string without a backslach being stored in Grakn as well. We will consider further what the best solution for this problem is.

@lolski lolski removed this from the 1.0.2 milestone Sep 5, 2019
@flyingsilverfin
Copy link
Member

This is related to and resolved/discussed in #224

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants