Resource allocation of firmament #47

Open
cxxly opened this issue Jul 10, 2016 · 4 comments

Comments

@cxxly

cxxly commented Jul 10, 2016

Hi:

As far as I know, there are two ways to allocate resources:

  1. Coarse-grained allocation: partition each machine into fixed-size slots, where every slot can run one task, as in Hadoop.
  2. Fine-grained resource allocation, as in Borg. (Borg users request CPU in units of milli-cores, and memory and disk space in bytes.)

I have seen that both your work and Quincy use a constant integer K to represent the capacity of a machine, which looks like coarse-grained allocation. But the cost models also contain some fine-grained resource information.

I would like to know:

  1. How does Firmament represent the resources requested by a task and the resources owned by a machine?
  2. What is the physical meaning of capacity, and how do you get the value of K?
@ms705
Collaborator

ms705 commented Jul 14, 2016

Hi @cxxly,

Sorry for the delayed response -- I'm currently travelling. I'll respond in more detail a bit later.

The bottom line is this: Firmament does use "slots" in the sense that each running task "uses" a leaf of the resource topology (= a PU/CPU core). This makes it easy to implement slot-based allocation policies, but does not mean that you must use slots.

Instead, you can see the leaves as an upper limit on the number of tasks that can run on a machine, which can be greater than the number of CPU cores (just add another level, or make up some "fake" cores, or increase the per-leaf capacity K). A multi-dimensional resource fit model can then be implemented by connecting tasks appropriately to places where they can fit -- as done in CoCo.
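To make the "connect tasks to places where they can fit" idea concrete, here is a minimal sketch. It is not Firmament's or CoCo's actual code: the `ResourceVector` type, its two fields, and `candidate_machines` are hypothetical names chosen for illustration. The assumption is simply that a task only gets an arc to a machine if its reservation fits into that machine's free resources in every dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceVector:
    """Hypothetical reservation vector; the two dimensions are illustrative."""
    cpu_millicores: int
    ram_bytes: int

    def fits_in(self, available: "ResourceVector") -> bool:
        # A request fits only if it fits in every dimension at once.
        return (self.cpu_millicores <= available.cpu_millicores
                and self.ram_bytes <= available.ram_bytes)


def candidate_machines(request, free_by_machine):
    """Return the names of machines the task could be connected to, sorted."""
    return sorted(name for name, free in free_by_machine.items()
                  if request.fits_in(free))


GIB = 2 ** 30
free = {
    "m1": ResourceVector(cpu_millicores=4000, ram_bytes=8 * GIB),
    "m2": ResourceVector(cpu_millicores=500, ram_bytes=16 * GIB),
}
task = ResourceVector(cpu_millicores=1000, ram_bytes=2 * GIB)

# m2 has plenty of RAM but too little free CPU, so only m1 qualifies.
print(candidate_machines(task, free))
```

In a flow-based scheduler the result of such a check would decide which arcs exist in the graph; the leaf count and K then only cap how many tasks can be placed in total.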

To address your questions quickly:

  1. Firmament's job submission protobuf contains a resource reservation vector (see here), so it is currently user-specified. @joshbambrick did some excellent work to make Firmament automatically estimate resource requirements (using machine learning techniques to predict the initial reservation, and dynamic adaptation to tighten it), but this work isn't yet upstreamed.
  2. See above -- the number of slots (K) sets an upper limit on the number of tasks per machine (since K * num_cores gives the aggregate outgoing flow capacity for the machine). We use K = 1 in most cost models, but you can set it higher if you want to allow time-sharing of CPU cores.
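As a quick worked example of the K * num_cores relationship, the sketch below computes the per-machine task limit. The function names are hypothetical and this is only the arithmetic described above, not code from the Firmament source tree.

```python
def machine_task_capacity(num_cores: int, k: int = 1) -> int:
    """Upper bound on concurrently placed tasks: K * num_cores."""
    if num_cores < 0 or k < 0:
        raise ValueError("num_cores and k must be non-negative")
    return k * num_cores


def placeable_tasks(num_tasks: int, num_cores: int, k: int = 1) -> int:
    """How many of num_tasks can be placed on one machine at once."""
    return min(num_tasks, machine_task_capacity(num_cores, k))


# An 8-core machine with K = 1 admits at most 8 tasks.
print(machine_task_capacity(8))
# With K = 2, each core can be time-shared by two tasks.
print(machine_task_capacity(8, k=2))
# 12 runnable tasks on the K = 1 machine: only 8 are placed.
print(placeable_tasks(12, 8))
```

Raising K loosens only the task-count limit; whether the extra tasks actually fit is still up to the cost model.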

Hope that makes sense!

@cxxly
Author

cxxly commented Jul 21, 2016

Thanks! @ms705

I'm interested in @joshbambrick's work. Is there a published paper I can read?

I also have some questions about admission control in the CoCo cost model; I will open a new issue for them.

@ms705
Collaborator

ms705 commented Jul 28, 2016

Hi @cxxly,

We're going to have a blog post on @joshbambrick's work soon; if you're interested in a longer writeup, his BA dissertation is available here.

Did you end up opening a new issue about the CoCo admission control questions? I don't see any, but I may have missed it while travelling.

@cxxly
Author

cxxly commented Oct 17, 2016

Hi @ms705

I'm really sorry for the delayed response. I have been very busy with my graduation lately.

I will open a new issue later.
