editscript composition and optimization #2

EmergentBehavior · 2018-05-10T21:16:52Z

First, I think this library is pretty interesting. I was wondering about one use case though: let's say you have entity A_t0 (where t is analogous to a time step) and you have an editscript e_0->1 to describe the transformation needed to get A_t0 to A_t1. If you capture an editscript for transformations at each time step (if there is a change), you'd have a collection of e, right? Then if you want to get the present state of A you could just concatenate all those editscripts together (to describe changes between t0 and tN). Have you tried this use case?

I wonder, though, whether at some point the editscript gets large enough that the patching process slows down. In that case it would be helpful to have some sort of editscript optimizer that reduces it to the minimal editscript needed to get from A_t0 to A_tN.


huahaiy · 2018-05-10T21:44:21Z

For your first paragraph: yes, editscript is designed to do just that. (get-edits e) returns a vector, and these vectors can be concatenated to represent a larger change. BTW, I added a `combine` function.
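To make the concatenation idea concrete, here is a minimal sketch in Python rather than Clojure. It is not the editscript API: the `(path, op, value)` tuple format, `apply_edits`, and this `combine` are hypothetical stand-ins that only loosely mirror the shape of an edit vector, and the sketch assumes nested dicts only (no vectors, lists, or sets).

```python
import copy


def apply_edits(data, edits):
    """Apply (path, op, value) edits in order to a deep copy of nested dicts.

    Hypothetical format: path is a non-empty tuple of dict keys and op is
    "+" (add), "r" (replace), or "-" (delete). The value is ignored for "-".
    """
    result = copy.deepcopy(data)
    for path, op, value in edits:
        parent = result
        for key in path[:-1]:
            parent = parent[key]  # walk down to the parent of the target
        if op in ("+", "r"):
            parent[path[-1]] = copy.deepcopy(value)
        elif op == "-":
            del parent[path[-1]]
        else:
            raise ValueError(f"unknown op: {op}")
    return result


def combine(*scripts):
    """Concatenate edit lists in time order into one larger edit list."""
    combined = []
    for script in scripts:
        combined.extend(script)
    return combined


a_t0 = {"name": "A", "tags": {"x": 1}}
e_0_1 = [(("tags", "y"), "+", 2)]
e_1_2 = [(("name",), "r", "A2"), (("tags", "x"), "-", None)]

# Patching once with the concatenated script gives the same result as
# patching step by step.
a_t2 = apply_edits(a_t0, combine(e_0_1, e_1_2))
```

Under these assumptions, order matters: the scripts have to be concatenated in the same order in which the changes happened.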

For the second, it is a very interesting question. I have not encountered cases where the patching process takes too long. When such cases do appear, I will think about an optimizer.

On the other hand, editscript is designed with stream processing in mind. An editscript should be conceptualized as a chunk in a potentially endless stream of changes. So it is more meaningful to worry about data integrity, compression, windowing, etc., than about the sizes of individual editscripts. Optimizers in these contexts are indeed what I am very interested in.

Basically, I consider editscript a part of Clojure's data-oriented effort, which tries to raise the level of abstraction of data from characters or bytes to maps, sets, vectors, and lists. So instead of talking about byte streams, we can talk about change streams in terms of these data structures.

Do I make sense?

pepe · 2018-05-11T07:52:51Z

I haven't had a chance to try editscript yet, but I think it will play nicely with Specter. It seems to me they have a similar view of the data.

EmergentBehavior · 2018-05-11T12:42:11Z

@huahaiy Thanks for the answer. My latter paragraph was considering a scenario in event streaming where I rebuild the "present" version of an entity by composing all historical mutations over its entire history of existence (if checkpointing or other strategies weren't used).

huahaiy · 2018-05-12T21:29:31Z

@EmergentBehavior Your scenario sounds similar to mine.

Given an editscript, there are indeed some opportunities to optimize. For example, if a sub-tree will later be deleted, all edits that happened inside that sub-tree can be safely removed without affecting the end result.

Such optimization may require the editscript to record some kind of identifiers for internal nodes. I will think about these.
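The sub-tree pruning described above can be sketched as follows, in Python rather than Clojure. This is not editscript code: the `(path, op, value)` tuples and `prune_deleted_subtrees` are hypothetical, and the sketch assumes nested dicts only. With dicts, a key path works as a stable identifier for an internal node. With vectors, insertions and deletions shift indices, so a path alone would not be enough to tell which earlier edits fall inside a later-deleted sub-tree.

```python
def prune_deleted_subtrees(edits):
    """Drop edits that land strictly inside a sub-tree deleted later on.

    Hypothetical format: each edit is (path, op, value), where path is a
    tuple of dict keys and op is "+", "r", or "-". Only nested dicts are
    assumed, so a path identifies the same node throughout the history.
    """
    deleted = set()  # paths removed by some later "-" edit
    kept = []
    # Scan from newest to oldest so later deletions are known first.
    for path, op, value in reversed(edits):
        # Only strict prefixes count: an edit at the deleted path itself
        # may be what makes the later "-" valid, so it is kept.
        inside = any(path[:n] in deleted for n in range(1, len(path)))
        if inside:
            continue
        if op == "-":
            deleted.add(path)
        kept.append((path, op, value))
    kept.reverse()
    return kept


history = [
    (("profile", "city"), "r", "Oslo"),
    (("profile", "zip"), "+", "0150"),
    (("name",), "r", "B"),
    (("profile",), "-", None),
]
optimized = prune_deleted_subtrees(history)
```

Here the two edits under `profile` are dropped because `profile` is deleted afterwards, while the unrelated edit to `name` and the deletion itself are kept.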

Meanwhile, my current focus is to further improve the diffing speed. I am working on fingerprinting the data to avoid drilling down into sub-trees that have the same content.
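One way to picture the fingerprinting idea is the Python sketch below. It is an assumption about the general technique, not a description of how editscript implements it: `fingerprint` and `diff` are hypothetical names, the SHA-256-over-JSON scheme is chosen only for illustration, and the data is assumed to be nested dicts with string keys and JSON-serializable leaves. Equal fingerprints are treated as equal content, which ignores the small chance of a hash collision.

```python
import hashlib
import json


def fingerprint(node, cache):
    """Return a content hash for node, memoized by object identity."""
    key = id(node)
    if key not in cache:
        if isinstance(node, dict):
            # A dict's fingerprint is built from its children's fingerprints.
            parts = [(k, fingerprint(v, cache)) for k, v in sorted(node.items())]
            payload = json.dumps(["dict", parts])
        else:
            payload = json.dumps(["leaf", node])
        cache[key] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return cache[key]


def diff(a, b, path=(), cache=None):
    """Produce (path, op, value) edits turning a into b.

    Sub-trees whose fingerprints match are skipped without descending.
    """
    cache = {} if cache is None else cache
    if fingerprint(a, cache) == fingerprint(b, cache):
        return []  # same content: no need to drill down
    if not (isinstance(a, dict) and isinstance(b, dict)):
        return [(path, "r", b)]
    edits = []
    for key in sorted(a.keys() | b.keys()):
        if key not in b:
            edits.append((path + (key,), "-", None))
        elif key not in a:
            edits.append((path + (key,), "+", b[key]))
        else:
            edits.extend(diff(a[key], b[key], path + (key,), cache))
    return edits


old = {"config": {"a": 1, "b": {"c": 2}}, "count": 1}
new = {"config": {"a": 1, "b": {"c": 2}}, "count": 2}
edits = diff(old, new)
```

In this example the whole `config` sub-tree is skipped after a single fingerprint comparison, and only the change to `count` is reported. The fingerprints still have to be computed once per node, so the saving comes from not comparing matching sub-trees element by element.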

huahaiy · 2020-06-26T20:59:21Z

Implementing some obvious optimizations should be a good starting point.

huahaiy self-assigned this Jun 26, 2020

huahaiy added the enhancement New feature or request label Jun 26, 2020
