
less and ava plugin migration + appreciation #43

Open
panjiesw opened this issue Sep 2, 2018 · 18 comments

Comments

@panjiesw

panjiesw commented Sep 2, 2018

Hey there,

Just recently found out about this task runner and I'm loving it so far! It provides a simple task-based runner and is really useful in a monorepo project. I actually prefer NPM scripts over a task runner, but in a large monorepo they inevitably become tedious to maintain and keep in sync across multiple packages.

Start provides the ability to orchestrate the whole build right from the root of the monorepo, without much complexity. Really like it!

I noticed that less and ava were originally supported as plugins but haven't been migrated to the current start monorepo yet. Is there any ETA on when they'll be migrated? Or are there any pointers if I were to implement custom plugins for them in the current version of start?

@deepsweet
Owner

Hi.

Thanks for the kind words. I agree that Start really shines when it comes to monorepos. I've been using it for 6 monorepos of different scales, and the flexibility keeps surprising me every time I need some functionality I hadn't thought of initially.

The "to be migrated" status is still here for some plugins mostly because it looks like I'm the only active user of Start, so the only plugins around are the ones I need(ed) in my own workflow.

I never tried AVA in my projects, and according to their issue it's still unclear whether there is any more or less official "API". I have no idea whether my previous attempt still works.

Regarding Less – I believe it should be a very straightforward port of the previous plugin implementation, literally the same as the current Babel one with a few tweaks for input source maps.

So... I can port it myself, but right now I have no proper way to test it properly. It would be really nice if you could try it yourself on top of a real-life project :) I'm here to help with anything.
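
To give an idea of the shape – this is just a sketch of the core transform such a port would wrap, not actual plugin code; the option names come from the Less API, everything else (the helper name, its arguments) is illustrative:

  import less from 'less'

  // sketch: compile one file's contents with Less, asking it for a
  // source map so it could later be merged with an incoming one
  const renderLess = async (source, filename) => {
    const { css, map } = await less.render(source, {
      filename,      // used by Less for error messages and the source map
      sourceMap: {}, // enable source map generation
    })

    return { css, map }
  }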

@panjiesw
Author

panjiesw commented Sep 3, 2018

Thanks @deepsweet!

For the less plugin, if you could migrate it to the current start, I'm willing to test it because I have some use cases involving less right now.

For AVA, I'm currently thinking of creating a start plugin that just spawns AVA's CLI using execa or something; the CLI accepts file/directory/glob patterns to run tests. Maybe something similar to start's plugin-lib-typescript-generate. I'll try this later this week and get back here with the progress.
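
Something along these lines – just a sketch, the helper name and the example patterns are made up, only the execa call itself is real:

  import execa from 'execa'

  // sketch: spawn AVA's CLI for the given glob patterns and stream its
  // output straight to the current terminal
  const runAva = (patterns = []) =>
    execa('ava', patterns, { stdio: 'inherit' })

  // e.g. runAva(['packages/*/test/**/*.js'])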

@deepsweet
Owner

Sure.

Sorry, I was a bit busy, I'll try to migrate the Less plugin next week.

@panjiesw
Author

So after much tinkering I ended up using a lerna run script in each sub-package for the AVA tests. I also noticed that when dealing with a lot of files (hundreds), which tends to happen if you run from the root of the monorepo, start takes a lot more RAM than just running NPM scripts in each package in parallel (e.g. lerna run --parallel).

CMIIW, but maybe it's because start holds file contents in memory by design – sequence(find -> read -> transform -> rename -> write) – and doing that for all packages at once with plugin-xargs ends up consuming more RAM than running NPM scripts in parallel. Or maybe I'm doing something wrong.
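
For reference, this is the kind of task I mean – the plugin names are as I recall them from the readme, so treat this as a rough sketch rather than working code:

  import sequence from '@start/plugin-sequence'
  import find from '@start/plugin-find'
  import read from '@start/plugin-read'
  import babel from '@start/plugin-lib-babel'
  import rename from '@start/plugin-rename'
  import write from '@start/plugin-write'

  // every matched file is read into memory, transformed, renamed and
  // written back out – and all of that is held in the process at once
  export const build = (pkg) => sequence(
    find(`packages/${pkg}/src/**/*.ts`),
    read,
    babel(/* babel options */),
    rename((file) => file.replace(/\.ts$/, '.js')),
    write(`packages/${pkg}/build/`)
  )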

But for now I have to go back to the traditional way of managing sub-packages and will come back to start to tinker more later, because I'm still interested in these awesome packages!

@deepsweet
Owner

¯\_(ツ)_/¯

I can't say much without seeing any numbers. It's not Start that holds file contents in memory, it's what Babel, ESLint and the others want by design: in order to build an AST you have to read and parse the file.

The only weak spot of start-parallel/start-xargs is Node.js' lack of normal threads/workers – we have to spawn an entire Node.js instance from the ground up, with all its startup delays and RAM usage. That is going to change relatively soon with Node.js 10 Worker Threads. So far there is only one, quite barbaric, way to run something in parallel (not to be confused with "concurrently") – child processes. You can try to tweak the maxProcesses option of the parallel/xargs plugins, which is Infinity by default, to something more realistic like Math.max(os.cpus().length - 1, 1). With a really large amount of files and that many child processes everything will choke for sure, it's not like goroutines.
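
For example – the value itself is plain Node.js, while how exactly it gets passed to the plugins depends on their signatures, so treat that part as an assumption:

  import os from 'os'

  // leave one core for the parent process, but never go below 1;
  // pass this as the maxProcesses option of the parallel/xargs plugins
  // (the exact way to pass it depends on the plugin signature)
  const maxProcesses = Math.max(os.cpus().length - 1, 1)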

I believe that lerna run --parallel does exactly the same, but maybe it feels more "sparse" and is harder to measure as a "single run".

Anyway, I'm open to proposals, thoughts and feedback.

@panjiesw
Author

panjiesw commented Sep 10, 2018

Unfortunately I can't share the project I tried start on, for now.

But I created a repro at https://github.com/panjiesw/start-playground to simulate what I experienced above :)
Maybe there's something wrong with how I'm using start, so feel free to inspect it.

From the readme in that repro: on a monorepo with 10 sub-packages, compiling all sub-packages' sources from .ts to .js using typescript:

  • Using the traditional way: lerna run with the help of npm-run-all

    time --verbose npm run build:npm
      Command being timed: "npm run build:npm"
      User time (seconds): 56.50
      System time (seconds): 2.91
      Percent of CPU this job got: 694%
      Elapsed (wall clock) time (h:mm:ss or m:ss): 0:08.55
      Average shared text size (kbytes): 0
      Average unshared data size (kbytes): 0
      Average stack size (kbytes): 0
      Average total size (kbytes): 0
      Maximum resident set size (kbytes): 90924
      Average resident set size (kbytes): 0
      Major (requiring I/O) page faults: 0
      Minor (reclaiming a frame) page faults: 720474
      Voluntary context switches: 39600
      Involuntary context switches: 44971
      Swaps: 0
      File system inputs: 0
      File system outputs: 10488
      Socket messages sent: 0
      Socket messages received: 0
      Signals delivered: 0
      Page size (bytes): 4096
      Exit status: 0
  • Using start

    time --verbose npm run build:start
      Command being timed: "npm run build:start"
      User time (seconds): 128.51
      System time (seconds): 4.78
      Percent of CPU this job got: 720%
      Elapsed (wall clock) time (h:mm:ss or m:ss): 0:18.49
      Average shared text size (kbytes): 0
      Average unshared data size (kbytes): 0
      Average stack size (kbytes): 0
      Average total size (kbytes): 0
      Maximum resident set size (kbytes): 147804
      Average resident set size (kbytes): 0
      Major (requiring I/O) page faults: 0
      Minor (reclaiming a frame) page faults: 1001942
      Voluntary context switches: 41396
      Involuntary context switches: 53945
      Swaps: 0
      File system inputs: 0
      File system outputs: 13672
      Socket messages sent: 0
      Socket messages received: 0
      Signals delivered: 0
      Page size (bytes): 4096
      Exit status: 0

@deepsweet
Owner

Hm, quite interesting. Thanks for such a detailed response.

I see 2 things:

  1. xargs + parallel. It results in child processes spawning other child processes, which I believe ends up with 20 Node.js instances in total. I can imagine a new plugin for "for each argument run these 2 tasks in parallel"; maybe xargs could receive an array of tasks as its first argument.
  2. Lerna simply doesn't apply the ESM loader (which should be very fast even on a cold run, that's why it's built into the Start CLI) and Babel register (which is, I believe, slow) on the fly. This is the price for an "ESNext" tasks file, including TypeScript, and on a larger tasks file with many advanced tasks it really shines (well, at least in my experience, but I've been using Start really heavily). There is only one thing to do here: write the tasks file using only what Node.js supports natively and remove the start.require option from package.json completely (see the sketch below).
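
Roughly, the bit to remove – the exact keys are from memory, so double-check them against the repro's package.json:

  {
    "start": {
      "require": [
        "@babel/register"
      ]
    }
  }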

And btw, I have already fixed the `as any` fault in master, but haven't released it yet.

@deepsweet
Owner

I can definitely work on 1, and we should try 2 without Babel register.

@panjiesw
Author

About the master fix: I already tried it with linked local packages and it's indeed fixed!

From what I know about case 1: lerna run --parallel accepts a --concurrency option, which defaults to 4. I tried to "simulate" it in Start using the maxProcess argument in xargs and parallel, but tbh I didn't notice any difference. With or without maxProcess the result is close to the Start result above.

Should I try a custom babel plugin that just spawns the babel command? Also, I think esm is not as heavy as @babel/register, so let's just try without the latter as you said. Having esm is nice tbh 😃

@panjiesw
Author

Here is the result without @babel/register

time --verbose npm run build:start
  Command being timed: "npm run build:start"
  User time (seconds): 102.66
  System time (seconds): 3.65
  Percent of CPU this job got: 744%
  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.28
  Average shared text size (kbytes): 0
  Average unshared data size (kbytes): 0
  Average stack size (kbytes): 0
  Average total size (kbytes): 0
  Maximum resident set size (kbytes): 142056
  Average resident set size (kbytes): 0
  Major (requiring I/O) page faults: 0
  Minor (reclaiming a frame) page faults: 885725
  Voluntary context switches: 34806
  Involuntary context switches: 46870
  Swaps: 0
  File system inputs: 0
  File system outputs: 10248
  Socket messages sent: 0
  Socket messages received: 0
  Signals delivered: 0
  Page size (bytes): 4096
  Exit status: 0

Definitely better, and it also starts up faster than before. I've updated the readme in my repro with this result.

@deepsweet
Owner

deepsweet commented Sep 10, 2018

From what I see, xargs doesn't limit the number of parallel processes 🤔 With maxProcess set to 1 I see:

compileAll.xargs: start
compileAll.xargs: done
compile.parallel: start
compile.parallel: start
compile.parallel: start
compile.parallel: start
compile.parallel: start
compile.parallel: start
compile.parallel: start
compile.parallel: start
compile.parallel: start
compile.parallel: start

Aaand I found an extremely stupid bug – I forgot to await execa! 🙈 Thanks so much for pointing me there, I really need tests. You can put a yield on that line of the compiled file in node_modules and try again.
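
Schematically, the difference – the command and its arguments here are placeholders, not the plugin's actual code:

  import execa from 'execa'

  const runChild = async (args) => {
    // the bug: without `await` the promise is dropped, the task resolves
    // immediately and the next child process is spawned right away,
    // so the maxProcesses limit never kicks in
    // execa('node', args)

    // the fix: wait for the child process to exit before resolving
    await execa('node', args)
  }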

Another idea: I'm planning to drop Node.js 6 (probably this week), so there will be no @babel/runtime/helpers/asyncToGenerator anymore, only native async/await. Hopefully that will give some performance boost.

@panjiesw
Author

😆 that's a 5-letter bug!

It's a big improvement! The time result doesn't show much, but my RAM usage isn't spiking as high as before; it's on par with lerna run. Notice the CPU usage has gone down to a percentage similar to lerna run's.

time --verbose npm run build:start
  Command being timed: "npm run build:start"
  User time (seconds): 95.28
  System time (seconds): 3.56
  Percent of CPU this job got: 657%
  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:15.02
  Average shared text size (kbytes): 0
  Average unshared data size (kbytes): 0
  Average stack size (kbytes): 0
  Average total size (kbytes): 0
  Maximum resident set size (kbytes): 143976
  Average resident set size (kbytes): 0
  Major (requiring I/O) page faults: 0
  Minor (reclaiming a frame) page faults: 905030
  Voluntary context switches: 35297
  Involuntary context switches: 34037
  Swaps: 0
  File system inputs: 0
  File system outputs: 10248
  Socket messages sent: 0
  Socket messages received: 0
  Signals delivered: 0
  Page size (bytes): 4096
  Exit status: 0

@deepsweet
Owner

Great.

I'll publish everything from master tomorrow, including that wonderful xargs fix, and then I'll think about how to squash xargs and parallel together for this case – it really spawns too much.

@panjiesw
Author

Dropping Node 6 is a good idea. Frankly, Start (the revived version) is relatively new, so its main target is also fairly new packages that are most likely running on Node 8 or higher.

One more possible improvement is providing a less verbose reporter. I suspect that reporter-verbose spewing a lot of output to stdout is also one of the causes of the high resource usage.

@deepsweet
Owner

reporter-verbose spewing a lot of output to stdout is also one of the causes of the high resource usage

That's why it's called "verbose" :) Any suggestions for something like reporter-pretty?

@panjiesw
Author

Not too pretty, but I created a reporter at https://github.com/panjiesw/start-ora-reporter. It's not for a performance gain, but I do need a reporter that doesn't output every processed file.
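
It basically builds on ora's spinner primitives, roughly like this – just the ora side, the way it hooks into Start's reporter events is omitted:

  import ora from 'ora'

  // sketch: one spinner per task, without printing every processed file
  const spinner = ora('build').start()

  // ...when the task finishes:
  spinner.succeed('build')

  // ...or, if it throws:
  // spinner.fail('build')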

@deepsweet
Owner

I'd say it's pretty enough – would you mind if I add it to the readme?

@panjiesw
Author

It's still rough though – the spinners overlap each other, and in a task with the watch plugin this becomes even more apparent. Maybe ora is too low-level.

But yeah, I don't mind.
