Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Filter To Persist For Warm Boot #28

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

chadhietala
Copy link
Contributor

This incorporates the ideas in
https://github.com/stefanpenner/broccoli-persistent-filter and to make optional
via a persist flag. This currently does not work on windows (See stefanpenner/hash-for-dep#8)
however it is not detrimental to the end user as it is opt in.

Todos

  • Some how test that the persistent cache is working correctly

@chadhietala chadhietala changed the title Allow Filter To Persist For Warm Initial Boot Allow Filter To Persist For Warm Boot Aug 18, 2015
@chadhietala chadhietala force-pushed the persist branch 3 times, most recently from 1fdcf92 to cb13857 Compare August 18, 2015 23:24
This incorporates the ideas in https://github.com/stefanpenner/broccoli-persistent-filter and to make optional via a persist flag. This currently does not work on windows however it is not detrimental to the end user as it is opt in.
@@ -43,6 +46,16 @@ function Filter(inputTree, options) {
this.inputEncoding = options.inputEncoding;
if (options.outputEncoding != null)
this.outputEncoding = options.outputEncoding;
if (options.persist) {
if (/^win/.test(process.platform)) {
console.log('Unfortunately persistent cache is currently not available on windows based systems.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should link to a tracking issue from this warning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@stefanpenner
Copy link
Contributor

One high level thought, I would like to be sure we minimize the code complexity moving forward.

Some ideas:

  • at best: extract all branching logic to instead rely on a strategy pattern
  • at least: extract branches to be there own methods

@stefanpenner
Copy link
Contributor

@joliss just for reference, this solution costs us a-bit more in terms of complexity and upfront cost but is invaluable for some transforms for large apps.

For example, one of our apps:

➜  dsp git:(master) cloc {app,tests}
    2679 text files.
    2622 unique files.
      62 files ignored.

http://cloc.sourceforge.net v 1.64  T=7.39 s (354.8 files/s, 22297.8 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Javascript                    1994          15572           4755         117565
Handlebars                     521            793            186          13832
LESS                           103           1601            111          10039
CSS                              1              0              0            180
HTML                             2              9              7             78
-------------------------------------------------------------------------------
SUM:                          2621          17975           5059         141694
-------------------------------------------------------------------------------

Initial build, babel alone takes nearly a minute (although babel is improving, the apps growth seems to be growing at a similar rate). With the persistent variant cold builds take 5%-10% longer, but initial builds at 95% faster.

Our Minute babel time on warm builds drops to 2-3s.

It's worth mentioning Babel isn't the only transform to benefit, HTMLBars, JSCS, JShint etc all see similar improvements.

* @return {String} this filters top-level cache key
*/
Filter.prototype.cacheKey = function() {
return hashForDep(this.baseDir());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is fairly costly, and we easily have N independent broccoli transforms. I believe this only needs to run once per subclass, not once per instance. As such I believe caching the result on the constructor should be sufficient.

@timmfin
Copy link

timmfin commented Aug 19, 2015

Nice. I totally need to dig into this and see if I can switch to it instead of what I've done in timmfin/broccoli-filter@master...timmfin:broccoli-persistent-filter (and timmfin/broccoli-filter@master...timmfin:broccoli-persistent-filter-with-symlinked-dirs).

@stefanpenner
Copy link
Contributor

Nice. I totally need to dig into this and see if I can switch to it instead of what I've done in [email protected]:broccoli-persistent-filter (and [email protected]:broccoli-persistent-filter-with-symlinked-dirs).

I'm eager to drop https://github.com/stefanpenner/broccoli-persistent-filter (which has been working well for us) in-favor of merging it back into filter proper (this PR)

@timmfin your feedback would be great. FYI we are trying to get this in in the next few days.

This removes majority of the branching and instead uses a strategy
pattern.
Prior to this commit we bled details about the support level for
persistent builds. This places those details into 1 place so that when
support for windows is there we simply can remove the strategy code.
@joliss
Copy link
Member

joliss commented Aug 20, 2015

I think we can move some of this functionality to broccoli-plugin, and the Broccoli API:

  • Plugin can provide a persistentCachePath directory (created by the Broccoli builder), if requested by the subclass. The subclass can perhaps provide a cacheKey so that the persistent cache is automatically cleared if something changes.
  • If on Windows we can't do persistent caches, Broccoli can simply fall back to the regular cache mechanism. (And so then there's no need to concern the consumer with whether to persist or not.)
  • Once we have that, Filter can switch from this.cachePath to this.persistentCachePath whenever the subclass provides a key.

I'd rather deal with the other parts of the API upgrade first though, and then tackle this. We can't add features like "persistentCache" until we rewrite the builder to use the new API, which is blocking on updating all the plugins.

@stefanpenner
Copy link
Contributor

If on Windows we can't do persistent caches, Broccoli can simply fall back to the regular cache mechanism. (And so then there's no need to concern the consumer with whether to persist or not.)

this limitation should be resolved shortly (days)

@stefanpenner
Copy link
Contributor

I'd rather deal with the other parts of the API upgrade first though,

So this effort is blocked on figuring out the solution at the broccoli-plugin level?

This work needs to land soon to mitigate serious pains, if its blocked we will fork until the core has it handled.

@timmfin
Copy link

timmfin commented Aug 21, 2015

@stefanpenner I took @joliss's comment to mean that we should wait on pulling persistent caches into broccoli-plugin until broccoli-plugins's migration is complete. And not necessarily that we should stop working on it here. (But now I'm not so sure...)

@stefanpenner
Copy link
Contributor

broccoli-plugin until broccoli-plugins's migration is complete.

unclear why that is a blocker – but w/e, babel/jshint/jscs etc will get this soon friendly fork or GM. As long as it eventually goes mainline again.

@joliss
Copy link
Member

joliss commented Aug 21, 2015

broccoli-plugin until broccoli-plugins's migration is complete.

unclear why that is a blocker

We can put the tmp handling code for persistent caches straight into Broccoli builder, but only once the migration is complete. If we try doing it before that, we're doing the work twice.

Also, for me it's a matter of time management: The most important thing for me is to finish up the API upgrade, to get us out of the current messy situation of having 2 APIs at the same time. So I don't want to spend too much time on this right now.

but w/e, babel/jshint/jscs etc will get this soon friendly fork or GM

(GM = Git merge?) I'd prefer not to have too many forks flying around. The cauliflower thing was pretty painful and didn't help that much. But if you do want do a fork, here's my suggestion for an API:

For plugins that want to use a persistent cache, simply pass Filter.call(this, ..., { persistentCacheKey: someString }) into the base class constructor. I don't think we need any other method or API.

This still isn't quite optimal because it leaves two common tasks up to plugin authors, namely collecting npm version specs and hashing over everything, but it's all I can come up with off the top of my head. :)

@stefanpenner
Copy link
Contributor

he cauliflower thing was pretty painful and didn't help that much.

This can shave multiple minutes off boot time for an app, which does provide more value then the cauliflower fork.

I can appreciate the API stability goal (big :+1) – but we will need a side-channel option in the short term either a friendly fork or a "opt in feature" in core.

@stefanpenner
Copy link
Contributor

FYI window support has landed for hash-for-dep stefanpenner/hash-for-dep@335d69b

stefanpenner added a commit to broccolijs/broccoli-persistent-filter that referenced this pull request Aug 22, 2015
@joliss
Copy link
Member

joliss commented Aug 23, 2015

Here's my perspective: I think we'll be in a place to land proper persistent-cache support in about a month or two, and then it'll be pretty straightforward from an API perspective. (There'll still be some inherent complexities like making sure old caches get cleaned out, but we have that problem either way.)

So if we fork now and try to make it work with some stop-gap code, like this PR, we'll get maybe three months of better performance (i.e. between nowish and November), but we'll all pay a price in extra engineering time: (a) Writing code that gets ultimately thrown away, and (b) having to spend more time on the migration to the new API, because we'll need to think about regressions and we'll need to clean out the code we wrote in step (a).

I'd intuitively prefer we didn't make that trade, but of course I also don't know exactly how bad the problem is on your end. Can you live with it for a while, or are there maybe some other useful things you can do to speed your app up for now, like precompiling sublibraries, or improving Babel performance some more?

Happy to talk on Skype or Slack if you want.

@stefanpenner
Copy link
Contributor

we'll get maybe three months of better performance (i.e. between nowish and November), but we'll all pay a price in extra engineering time:

Us maintainers should carry this burden, or we cause a large amplification of time waste.

Anyways, fork now exists here: https://github.com/stefanpenner/broccoli-persistent-filter

babel + jshint + jscs + htmlbars will be migrated over the next day or so.

Once this is officially supported we will migrate those core libraries back.

I'd intuitively prefer we didn't make that trade, but of course I also don't know exactly how bad the problem is on your end.

Let me share my perspective, to help fill any potential missing information:

as mentioned, on big apps this saves minutes saves per boot.
Reboots happen often, because today the following result in full restarts (hopefully overtime we continue to mitigate this)

  1. changing the brocfile (or adding addons that change it)
  2. changing package.json
  3. running a new addon/tool/app
  4. building prod/dev/tests etc.

It is not uncommon, especially when moving between feature branches to have to restart several times an hour, but lets assume once per hour.

Lets say there are 100 active users spending 5 hours a day using broccoli on large code-basis.

  • 100 users – this is very low relative to our analytics but its a nice round number
  • 1 restart per hour
  • 5 hours a day using the tool
  • 60s of babel + jshint overhead persistent filter will save in most cases on large code-basis. (our big apps are 60s - 70s for each)

(100 * 5 * 60 = 30,000s) / 86,400 === 1/3 day wasted

Actually, assuming 5h using the tool a day. We are wasting nearly 2 full dev days per 100 users per day.

Lets say 9 weeks (maybe 40 work days or so) till this is supported. Thats potentially 80 dev days of wasted time per 100 users on a large apps.

My math is likely wrong, or ratios not representative,but if we slide any of the numbers around we still end up with a fork having better cost-benefit then doing nothing.

@joliss
Copy link
Member

joliss commented Aug 23, 2015

OK, the math checks out. You got me convinced that this is important.

So I guess we can use your fork for now, and then we can try to get this landed in broccoli-filter hopefully soon (with support from Broccoli and broccoli-plugin)?

@stefanpenner
Copy link
Contributor

So I guess we can use your fork for now, and then we can try to get this landed in broccoli-filter hopefully soon (with support from Broccoli and broccoli-plugin)?

Let it be clear, I am absolutely fine with the fork. I just felt I share share why I am motivated. To make sure you where not missing pieces of information.

The fork does buy us some time, I hope to carve out some upcoming weeks to next to make merge-trees + filter utilize your new persistentOutput flag.

@JamesMGreene
Copy link

Any update here?

@stefanpenner
Copy link
Contributor

@JamesMGreene i think @joliss has some related work planned, until then feel free to use: https://github.com/stefanpenner/broccoli-persistent-filter

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants