Resource allocation of firmament #47

cxxly · 2016-07-10T08:36:35Z

Hi:

As far as I know, there are two ways to allocate resources:

Coarse-grained: partition each machine into fixed-size slots, where every slot can run one task, as in Hadoop.
Fine-grained: allocate resources in small units, as in Borg. (Borg users request CPU in units of milli-cores, and memory and disk space in bytes.)

I have seen that both your work and Quincy use a constant integer K to represent the capacity of a machine, which resembles coarse-grained allocation. But there is also some fine-grained resource information in the cost model.

I want to know

How does Firmament represent the resources requested by a task and the resources owned by a machine?
What is the physical meaning of capacity, and how do you get the value of K?

ms705 · 2016-07-14T07:53:08Z

Hi @cxxly,

Sorry for the delayed response -- I'm currently travelling. I'll respond in more detail a bit later.

The bottom line is this: Firmament does use "slots" in the sense that each running task "uses" a leaf of the resource topology (= a PU/CPU core). This makes it easy to implement slot-based allocation policies, but does not mean that you must use slots.

Instead, you can see the leaves as an upper limit on the number of tasks that can run on a machine, which can be greater than the number of CPU cores (just add another level, or make up some "fake" cores, or increase the per-leaf capacity K). A multi-dimensional resource fit model can then be implemented by connecting tasks appropriately to places where they can fit -- as done in CoCo.
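As a rough illustration of what a multi-dimensional fit check could look like, here is a minimal Python sketch. It is not CoCo's or Firmament's actual code: the `ResourceVector` class, its two dimensions, and the `candidate_machines` helper are hypothetical names chosen for this example. The idea is only that a task gets connected to a machine when its request fits in every dimension at once.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ResourceVector:
    """Hypothetical two-dimensional resource vector (not Firmament's protobuf)."""
    cpu_millicores: int
    ram_bytes: int

    def fits_in(self, available: "ResourceVector") -> bool:
        # A request fits only if every dimension is covered.
        return (self.cpu_millicores <= available.cpu_millicores
                and self.ram_bytes <= available.ram_bytes)


def candidate_machines(request: ResourceVector,
                       free: Dict[str, ResourceVector]) -> List[str]:
    """Return the machines whose free resources cover the request."""
    return [name for name, avail in free.items() if request.fits_in(avail)]


GIB = 1 << 30
free_resources = {
    "m1": ResourceVector(cpu_millicores=4000, ram_bytes=8 * GIB),
    "m2": ResourceVector(cpu_millicores=500, ram_bytes=16 * GIB),
}
task_request = ResourceVector(cpu_millicores=1000, ram_bytes=4 * GIB)
print(candidate_machines(task_request, free_resources))
```

In a flow-based scheduler, the returned machines would be the ones the task gets an arc to; "m2" is excluded here because it lacks CPU even though it has plenty of memory.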

To address your questions quickly:

Firmament's job submission protobuf contains a resource reservation vector (see here), so it is currently user-specified. @joshbambrick did some excellent work to make Firmament automatically estimate resource requirements (using machine learning techniques to predict the initial reservation, and dynamic adaptation to tighten it), but this work isn't yet upstreamed.
See above -- the number of slots (K) sets an upper limit on the number of tasks per machine (since K * num_cores gives the aggregate outgoing flow capacity for the machine). We use K = 1 in most cost models, but you can set it higher if you want to allow time-sharing of CPU cores.
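The arithmetic behind the second answer can be sketched in a few lines of Python. This is only an illustration of the K * num_cores relationship described above; the function name `machine_task_limit` is made up for this example and does not exist in Firmament.

```python
def machine_task_limit(num_cores: int, k: int = 1) -> int:
    """Upper limit on concurrently placed tasks for one machine.

    Each core (leaf of the resource topology) has capacity k, so the
    aggregate outgoing flow capacity of the machine is k * num_cores.
    """
    if num_cores < 1 or k < 1:
        raise ValueError("num_cores and k must be positive integers")
    return k * num_cores


# K = 1: one task per core, i.e. classic slot-style allocation.
print(machine_task_limit(num_cores=8))
# K = 2: up to two tasks may time-share each core.
print(machine_task_limit(num_cores=8, k=2))
```

With K = 1 an 8-core machine accepts at most 8 tasks; raising K to 2 doubles that limit to 16 without changing the topology.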

Hope that makes sense!

cxxly · 2016-07-21T11:18:19Z

Thanks! @ms705

I'm interested in @joshbambrick's work. Is there a published paper I can learn from?

I also have some questions about admission control in the CoCo cost model; I will open a new issue for them.

ms705 · 2016-07-28T16:50:20Z

Hi @cxxly,

We're going to have a blog post on @joshbambrick's work soon; if you're interested in a longer writeup, his BA dissertation is available here.

Did you end up opening a new issue about the CoCo admission control questions? I don't see any, but I may have missed it while travelling.

cxxly · 2016-10-17T08:26:02Z

Hi @ms705

I'm really sorry for the delayed response. I've been very busy with my graduation lately.

I will open a new issue later.

ms705 added the documentation label Jul 28, 2016
