Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty strings shouldn't be indexed #96

Open
mlissner opened this issue Aug 14, 2015 · 2 comments
Open

Empty strings shouldn't be indexed #96

mlissner opened this issue Aug 14, 2015 · 2 comments

Comments

@mlissner
Copy link
Contributor

If you create an object with attributes that are empty strings, they'll get indexed.

So:

item.attribute = ''

Will result in an item in your index with attribute = ''.

This is very easy to fix since there's already a check to make sure that attributes aren't None. All that needs to happen is for the line to also say, and a != ''.

@tow
Copy link
Owner

tow commented Sep 1, 2015

Not obvious to me - why shouldn't an empty string be indexed?

Or rather - why shouldn't an empty string be stored - an attribute on an object may well be stored in Solr without that field being indexed, and it is a perfectly reasonable thing for an object to have an attribute which is an empty string.

@mlissner
Copy link
Contributor Author

mlissner commented Sep 1, 2015

Thanks for the reply.

I guess the way I look at this is that I want one and only one way to know if an attribute has values.

Django itself solves this problem by having the convention that CharFields are never null. So, the only value they can have is "". (As of about Django 1.7, a warning will be issued if your model has a nullable CharField.)

My idea here is along the same lines, but flips it so that blanks are normalized to None. It would also be possible, (though harder, I think?) to make None values into blanks instead.

So, when indexing:

"" None Thoughts
current "" no attr This is OK, but inconsistent
proposed no attr no attr Good, easy to implement, opposite of Django convention
alt proposal "" "" Best? Harder to implement, aligns with Django convention

Thanks again for the reply. Glad to see your involvement in the project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants