"dhall resolve" does too much #2116

georgefst · 2020-12-21T19:47:56Z

I was under the impression that dhall resolve would just substitute in the contents of import expressions. But it sometimes performs a normalisation step that can cause a huge blowup in output size. e.g.

-- 1.dhall
 let x = ./2.dhall in [ x.A, x.B ]

-- 2.dhall
< A | B | C >

> echo ./1.dhall | dhall resolve
[ < A | B | C >.A, < A | B | C >.B ]

I would have expected let x = < A | B | C > in [ x.A, x.B ].

In my actual use case, this is resulting in an output of over 100,000 lines, instead of under 1000 (there's a 540-member sum type involved...).

The text was updated successfully, but these errors were encountered:

georgefst · 2020-12-21T19:53:36Z

Hmm. Github's syntax highlighting appears to be slightly broken - wrong colour for the opening bracket in 2.dhall.

sjakobi · 2020-12-22T01:26:29Z

dhall resolve's non-normalizing imports resolution is currently limited to its input expression – any transitive imports are resolved in the usual, normalizing way.

In your example, the output is fully normalized because the input expression is a single import: ./1.dhall. You can preserve the let-binding by letting dhall resolve work on the expression in 1.dhall directly:

$ cat 1.dhall | dhall resolve
let x = < A | B | C > in [ x.A, x.B ]
$ dhall resolve --file 1.dhall
let x = < A | B | C > in [ x.A, x.B ]

In principle, we could also extend dhall resolve so any transitive imports aren't normalized either. (The Status used during import resolution can use a custom _normalizer, which, in this case, would simply return its input.)

georgefst · 2020-12-22T18:23:56Z

Ah, thanks, that suddenly makes a lot of sense.

Is there any particular reason for it to be the way it currently is? Keeping normalisation separate seems preferable on the face of it.

Gabriella439 · 2020-12-22T19:57:16Z

@georgefst: There is a high likelihood that I will propose changing the standard to disable the behavior of pre-normalizing all imports. This is in part to fix the performance issues discussed in #1890 (comment)

georgefst · 2020-12-27T00:38:37Z

@Gabriel439 That would be great!

There's still a potential blowup from diamond dependencies. But that seems like something I can experiment with downstream. e.g. introducing a new let-binding with a fresh name.

Gabriella439 · 2021-01-02T21:08:19Z

@georgefst: Yes, I plan to fix the diamond dependency issue, too. The idea is that if an import is protected by an integrity check then we don't have to resolve the import in order to cache it. We would just cache the import separately and then resolve it from the cache on the fly when decoding anything that imports it with an integrity check. That ensures that we only ever cache one copy of an import within the entire cache.

georgefst · 2022-06-06T13:03:34Z

Has there been any progress towards this? It's still my main frustration with Dhall.

(I don't like to pester maintainers - I'm just genuinely curious whether there are any threads I've missed)

Gabriella439 · 2022-06-06T16:59:24Z

@georgefst: No worries! There hasn't been progress on this recently

georgefst · 2022-06-12T14:26:15Z

@georgefst: Yes, I plan to fix the diamond dependency issue, too. The idea is that if an import is protected by an integrity check then we don't have to resolve the import in order to cache it. We would just cache the import separately and then resolve it from the cache on the fly when decoding anything that imports it with an integrity check. That ensures that we only ever cache one copy of an import within the entire cache.

Re-reading, I realise this doesn't quite solve the issue I'm thinking of. My use case for dhall resolve is for distributing a single file which doesn't depend on any other Dhall files or on remote resources. So the blowup I'm talking about is where a single dependency might be inlined in multiple locations in the final file.

Gabriella439 · 2022-06-12T14:50:42Z

@georgefst: I believe the combination of:

Not pre-β-reducing imports
Caching unresolved imports (so long as their transitive imports were protected by an integrity check)

… would suffice to give you the performance improvement you're looking for, at least when using the Haskell implementation of Dhall. This is because the implementation is smart enough to avoid duplicating sub-expressions when they are referenced multiple times.

The exception is if the final result display to the user or conversion tool contains duplicate sub-expressions, in which case it has to actually render all of them. That's unavoidable.

arobertn mentioned this issue Jan 10, 2022

Mention of performance in comparison with Dhall? tweag/nickel#560

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

"dhall resolve" does too much #2116

"dhall resolve" does too much #2116

georgefst commented Dec 21, 2020 •

edited

Loading

georgefst commented Dec 21, 2020

sjakobi commented Dec 22, 2020

georgefst commented Dec 22, 2020

Gabriella439 commented Dec 22, 2020

georgefst commented Dec 27, 2020

Gabriella439 commented Jan 2, 2021 •

edited

Loading

georgefst commented Jun 6, 2022

Gabriella439 commented Jun 6, 2022

georgefst commented Jun 12, 2022

Gabriella439 commented Jun 12, 2022

"dhall resolve" does too much #2116

"dhall resolve" does too much #2116

Comments

georgefst commented Dec 21, 2020 • edited Loading

georgefst commented Dec 21, 2020

sjakobi commented Dec 22, 2020

georgefst commented Dec 22, 2020

Gabriella439 commented Dec 22, 2020

georgefst commented Dec 27, 2020

Gabriella439 commented Jan 2, 2021 • edited Loading

georgefst commented Jun 6, 2022

Gabriella439 commented Jun 6, 2022

georgefst commented Jun 12, 2022

Gabriella439 commented Jun 12, 2022

georgefst commented Dec 21, 2020 •

edited

Loading

Gabriella439 commented Jan 2, 2021 •

edited

Loading