Parser is bad Alternative #122

cblp · 2016-06-21T12:54:50Z

Alternative is expected to be a monoid.

But for Parser we have:

import Prelude
import Data.Attoparsec.Text
import Control.Applicative

λ> parseOnly (fail "oops") "" == parseOnly (fail "oops" <|> empty) ""
False

λ> parseOnly (fail "oops") ""
Left "Failed reading: oops"
λ> parseOnly (fail "oops" <|> empty) ""
Left "Failed reading: empty"

Implications:

Adding empty erases useful error message.
User can't substitute p1 <|> p2 <|> p3 with asum [p1, p2, p3] (asum always appends empty at the end).

The text was updated successfully, but these errors were encountered:

bgamari · 2016-06-30T20:40:18Z

Indeed this is true but I'm afraid I don't see any way to fix this which doesn't either compromise performance or significantly complicate the implementation (and perhaps, as a result, also compromise performance).

Fixing this would in effect require that we add a variety of "weak failure" result, which <|> could then drop in favor of the desired error. To do this efficiently we'd need to carry another closure through the parser state for signaling this mode of failure which I believe would rather significantly increase allocations. I would probably say that if you care enough about errors for the current Alternative behavior to be problematic, you should probably be using another parsing library.

cblp · 2016-06-30T21:47:15Z

I want to use Alternative freely with Aeson.Parser, and I have no option in aeson's underlying parsing library.

sol · 2016-07-01T02:26:58Z

One way to fix this is to have an Ord instance for your error type and then always return the "largest" error. Doing so can give you much-needed much better error messages for backtracking parsers.

(Basically what you do is order your errors from less specific to more specific, so that you always keep the more specific one.)

See https://github.com/vimus/vimus/blob/master/src/Vimus/Command/Parser.us for an implementation.

That said, attoparsec parsers are still a monoid under the assumption that we consider all errors equal (similar to what GHC does for exceptions).

Sent from mobile

On 1 Jul 2016, at 5:47 AM, Yuriy Syrovetskiy [email protected] wrote:

I want to use Alternative freely with Aeson.Parser, and I have no option in aeson's underlying parsing library.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.

bgamari · 2016-07-01T09:19:19Z

@sol, the problem is that you end up carrying around significantly more information through the parser state. attoparsec is first-and-foremost designed with performance in mind; this is manifested in the fact that the Parser type itself is a CPS'd state monad, which allows us to produce straight-line, non-allocating code for nearly all parsers. Unfortunately I can't see a simple way to allow us to offer the "correct" Alternative behavior while preserving this property.

The simplest approach I could come up with (implemented here) appears to significantly degrade performance in uses of many and some, which I suspect is due to the book-keeping in tracking the additional errors (namely the allocation of a new failure closure at every branch point in the parse).

I don't have time to investigate this further but I suspect this will be a quite difficult issue to fix, if it is possible at all. That, of course, shouldn't stop others from taking a stab at it themselves. I'd be happy to see what others come up with.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parser is bad Alternative #122

Parser is bad Alternative #122

cblp commented Jun 21, 2016

bgamari commented Jun 30, 2016

cblp commented Jun 30, 2016

sol commented Jul 1, 2016

bgamari commented Jul 1, 2016 •

edited

Loading

Parser is bad Alternative #122

Parser is bad Alternative #122

Comments

cblp commented Jun 21, 2016

bgamari commented Jun 30, 2016

cblp commented Jun 30, 2016

sol commented Jul 1, 2016

bgamari commented Jul 1, 2016 • edited Loading

bgamari commented Jul 1, 2016 •

edited

Loading