A case-insensitive MatchCI module without dependencies. #61

mikehat · 2016-07-20T22:28:01Z

Here's a version of MatchCI without the case-insensitive dependency. I kind
of dislike the HasCase and caselessEq names, but didn't take time to think
of any that didn't overlap some other lib's. Keeping them internal (aka not the
first commit on the match-ci branch) helps.

There are actually three commits on this branch. The first adds a class to the
StringLike module, which may not be desirable. The second moves all changes
into the MatchCI module. The third approach adds a new module Text.HasCase
with the option of not making it an exposed module.

Here's why I think MatchCI is better than a toLower :: Tag s -> Tag s:

Not all uses of Match are necessarily against static names that the client
code controls. Assuming that the client code can match a toLower-ed tag
stream against lower-case literal values assumes that the client code
contains the names to be matched. Of course, the client can toLower
everything, which leads to...
Client code that doesn't match against string literals will be littered with
toLower. Switching to case-insensitive matching becomes as simple as
importing the MatchCI module rather than Match.
Using toLower to make everything work means lower performance. We can be
sure that the penalty is small, but performance doesn't end at tag parsing.
It is reasonable to assume that almost every opening tag will be matched for
a tag name. While the Match module isn't part of the bench routine, any use
against real-world HTML will require case-insensitive matching. I think that
for this usage (compare once and forget), a local function is probably better
than using the case-insensitive package (allocate a lower-case instance or
a thunk).

ndmitchell · 2016-07-21T15:35:07Z

Thanks for the revised pull request. You arguments in favour of MatchCI rather than my suggestion seem good.

Final remark is why not merge HasCase into StringLike, and demand everyone providing StringLike implements caselessEq? I don't see that as particularly burdensome (there are probably no implementations of StringLike outside tagsoup), and simplifies the number of pieces.

mikehat · 2016-07-21T17:11:46Z

I originally had caselessEq inside StringLike, but didn't want to break anyone's instance. But I think you're right. How about a better name, ciEq or ^==?

ndmitchell · 2016-07-22T07:20:17Z

In fact, you can have a default instance for caselessEq which does toString and caselessEq that way - then no one who doesn't implement it gets broken.

I'm not too fussed about the name, but given it's pretty rare, I would avoid symbols or excessive/unusual abbreviations. caseInsensitiveEq is probably where I'd land, but it's up to you.

mikehat added 3 commits July 20, 2016 13:33

Added MatchCI module for case-insensitive tag and attribute matching.

00c4c5b

Moved HasCase class into MatchCI, leaving StringLike unchanged

056f5c3

Moved HasCase to its own module.

d93a538

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A case-insensitive MatchCI module without dependencies. #61

A case-insensitive MatchCI module without dependencies. #61

mikehat commented Jul 20, 2016

ndmitchell commented Jul 21, 2016

mikehat commented Jul 21, 2016

ndmitchell commented Jul 22, 2016

A case-insensitive MatchCI module without dependencies. #61

Are you sure you want to change the base?

A case-insensitive MatchCI module without dependencies. #61

Conversation

mikehat commented Jul 20, 2016

ndmitchell commented Jul 21, 2016

mikehat commented Jul 21, 2016

ndmitchell commented Jul 22, 2016