Cylc Flow - Profile Battery #38

Open
oliver-sanders opened this issue Jul 11, 2019 · 1 comment
The old Cylc profile battery was removed at Cylc 8; it was decided that it would be moved to a top-level project outside of Cylc Flow.

The old profile battery was designed to provide support back to Cylc 6, which made its logic quite messy; as a result it is creaking at the seams and desperate for a re-write. Cylc installation has also changed, meaning the profile battery does not work with Cylc 8 anyway.

Thanks to changes made at Cylc 7, the re-write should be pretty simple and can reuse some code from the old implementation.

Proposal:

  • The profile battery should generate a Cylc suite to run the experiments.
  • To install different Cylc versions:
    • It will archive cylc code at different versions into the suite directory.
    • Create and activate an environment based on a user provided command template (so as to support virtualenv / conda).
    • Then pip install in-place.
  • Results stored in an sqlite3 DB in ~/.cylc.
  • The battery will be able to profile against:
    • Experiment.
    • Cylc Flow Version.
    • Platform.
    • Python Version.
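
The proposed results storage could be sketched roughly as follows. Note this is a hypothetical sketch, not an agreed design: the schema, table name, and file path below are assumptions, keyed on the four axes listed above plus the two measurements discussed later in this thread (CPU time and max RSS).

```python
import sqlite3
from pathlib import Path

# Assumed location under ~/.cylc (the issue only says "an sqlite3 DB in ~/.cylc").
DB_PATH = Path.home() / '.cylc' / 'profile-results.db'

def open_results_db(path=DB_PATH):
    """Open (creating if necessary) the profile results database."""
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per profiling run.
    conn.execute('''
        CREATE TABLE IF NOT EXISTS results (
            experiment TEXT,        -- e.g. "complex"
            cylc_version TEXT,      -- Cylc Flow version under test
            platform TEXT,
            python_version TEXT,
            cpu_time REAL,          -- user + system CPU seconds
            max_rss INTEGER         -- peak resident set size, kB
        )
    ''')
    return conn
```

A battery run would then insert one row per (experiment, version, platform, Python version) combination, making regressions queryable with plain SQL.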

Priority: low


oliver-sanders commented Aug 23, 2021

Dumping some notes.

How do I profile Cylc Flow?

There are multiple approaches; which is best largely comes down to what you want to know.

It's a bit of a rabbit warren: Cylc scales in many different ways, so it depends which of them you are interested in.

Scaling factors:

  • Number of task definitions.
  • Number of dependencies.
  • Number of recurrences.
  • Length of queues.
  • Churn (task state changes per second).
  • Number of workflows per server.

Other metrics to consider:

  • Idle efficiency (CPU usage when there are no task state changes).
  • Latency (time between task state change happening and the main loop responding).
  • Startup time.

Reference suites

The reference suites are pretty useful and their performance characteristics are well understood; you can find them in Cylc 7 under etc/dev-suites. The most important reference suite is complex, which is an old operational MO suite with the functional bits stripped out. It's good for finding the kind of performance issues you only hit with real-world workflows. We have data for this going back to Cylc 6!

tl;dr

For most purposes the best bet is probably:

$ /usr/bin/time -v cylc play complex

Look at the RSS memory figure and add together the User and System CPU times.

The Methods

/usr/bin/time -v

This lovely command gives you CPU, memory, and more for any shell execution (e.g. cylc run <flow>), and you get an "outer measurement" which includes Cylc as well as its dependencies and Python itself. These measurements are closer to real-world usage.

Note: For GNU time use -v, for BSD time use -lp

Note: time and /usr/bin/time may differ in your shell, use /usr/bin/time.

Cylc's on-board profiling / cProfile

Cylc has an on-board profiler. It uses cProfile under the hood, which gives you a breakdown of all the method calls and lets you know which methods are taking the most time.

Note: {method 'poll' of 'select.epoll' objects} is sleep time and can be ignored.

cProfile gives you an "inner measurement" (from within Python itself) which is good for performance investigations but not so good for measuring the system as a whole, largely because cProfile itself weighs down the program it is measuring.

For cylc play you can find the cProfile results in log/scheduler/. For cylc validate they will go to $PWD.
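
Outside of Cylc, the same kind of measurement can be taken on any callable with the standard library alone. This is a generic sketch of the cProfile + pstats pattern (the `work` function is just a stand-in, not anything from Cylc):

```python
import cProfile
import io
import pstats

def work():
    # Stand-in workload to profile.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Sort by cumulative time and print the five most expensive entries.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```

The same `.prof` dump format that snakeviz reads can be produced with `profiler.dump_stats('profile.prof')`.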

You can visualise these results with snakeviz:

$ pip install snakeviz
$ snakeviz ~/cylc-run/<workflow>/log/scheduler/profile.prof

/proc/meminfo

If the question you are trying to answer is "how many flows can I run on my server before I run out of memory", then the memory reported by cProfile is no use (it doesn't include Python itself), and the number reported by /usr/bin/time is no use either (it counts all shared libraries). /proc/meminfo contains RSS (resident set size) measurements, but also USS (unique set size, i.e. unshared) memory and more. This gives you the ability to make all kinds of "outer measurements". I've used awk to sum over the fields I want every so many seconds, which is a pretty good way to get relevant data.
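The awk approach can equally be done in Python. Here is a minimal sketch of a parser for the `Key:   12345 kB` line format used by /proc/meminfo (and by per-process files such as /proc/&lt;pid&gt;/status); the sample text and the choice of fields to sum are illustrative only:

```python
import re

def parse_meminfo(text):
    """Parse "Key:   12345 kB" lines into a dict of kB values."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r'(\w+):\s+(\d+)\s*kB', line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields

# Illustrative sample; in practice read open('/proc/meminfo').read().
sample = """MemTotal:       16308908 kB
MemFree:         1798064 kB
Buffers:          421244 kB
Cached:          5428720 kB"""

info = parse_meminfo(sample)
# Sum whichever fields you care about, e.g. a rough "available" figure:
print(info['MemFree'] + info['Buffers'] + info['Cached'])  # → 7648028
```

Run in a loop with a sleep, this gives the same periodic sampling as the awk one-liner.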

cylc profile-battery

The cylc profile-battery command existed from late Cylc 6 until the Python 3 upgrade at Cylc 8. It ran standard suites through a set of tests, focusing on real-world usage and scaling. It could check out branches from Git repos for you to automate testing, and we ran it weekly for over a year to spot any unwanted performance regressions. Under the hood it used either /usr/bin/time or --profile, and had the scope for other measurement tools.

It's due for reincarnation in a new Cylc repo and will probably support both Cylc 7 and Cylc 8 when reinstated. I did make a start on it, but other things seemed more urgent and it is a fair bit of work, especially now that installing Cylc means creating and managing environments.

memory-profiler

Hillary and Bruno have had good success using this wonderful tool for tracking memory usage. It uses psutil, which in turn uses /proc/* I think. It's another "outer measurement" which includes Python itself; it's a nicer alternative to summing the fields in /proc/meminfo yourself, but you may need to read up on what the numbers actually mean.

--main-loop 'log memory' --main-loop 'log data store'

Another on-board profiler, which tracks the number of items in the data store and the memory usage of the different components of the Scheduler. It's an inner measurement that's good for memory investigations.

Note: You may see one attribute randomly increasing in memory and another randomly falling at the same time. This can happen when two objects share common objects: Pympler will only associate that memory with one of the parent objects, and sometimes it may swap which parent gets lumbered with it, causing these strange swaps.

Pympler

The backend of the main loop plugins is Pympler.

This is a Python project with memory measurement tools.

from pympler.asizeof import asized

# view the memory usage of an object:
print(asized(scheduler).format())

# view the memory usage of the attributes of an object recursively:
print(asized(scheduler, detail=5).format(depth=5))

Note: For dictionaries [K] and [V] mark keys and values.

Note: You will see size and flat measurements. The size is the overall size of the object, including everything it contains, whereas flat excludes the contained objects.

>>> print(asized(['a'], detail=2).format(depth=2))
['a'] size=136 flat=80
    'a' size=56 flat=56

So here the string 'a' weighs 56 bytes, while the list containing it weighs 80 bytes flat, making for a total size of 80 + 56 = 136 bytes. Note that the "flat" memory usage of the list increases as it grows.
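
The same flat-vs-total distinction can be seen with the standard library's sys.getsizeof, which reports only the flat size (the container's own bookkeeping, not its elements). A quick illustration, with a hand-rolled total for comparison (exact byte counts vary by Python version and platform, so none are shown):

```python
import sys

# sys.getsizeof reports the "flat" size: the list object plus its
# pointer array, excluding the elements themselves.
one = ['a']
two = ['a', 'b']

print(sys.getsizeof(one))  # flat size of a one-element list
print(sys.getsizeof(two))  # larger: the pointer array grew

# A pympler-style "total" adds the (non-shared) elements on top:
total = sys.getsizeof(one) + sum(sys.getsizeof(item) for item in one)
print(total)
```

Unlike Pympler's asized, this naive sum doesn't recurse into nested containers or account for shared objects, which is exactly the accounting Pympler handles for you.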

New age asyncio tooling.

Bruno demonstrated a really nice asyncio profiler in Wellington (I can't remember the name) that's built into PyCharm but can be used standalone. It shows the timings of async calls and is good for identifying bottlenecks in async algorithms.

Async usage in Cylc is currently minimal, however, we expect this to change in the future.
