-
Notifications
You must be signed in to change notification settings - Fork 140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compound API interactions for cookbooks included by fb_init
are difficult to implement
#233
Comments
Internally, compound API interactions are fairly rare for us. We have a few workarounds we employ, some of which are basically the options you discussed:
My favorite option is what we internally call a "settings" cookbook (e.g. "fb_cron_settings"), which has a recipe which goes immediately before the "implementation" cookbook (e.g. "fb_cron") in the This might not be a great answer if the Certainly a more general solution using |
I read this situation as: there are two (or more) recipes which are late in the run_list, and they're both including some third recipe (e.g. "fb_apache"), in a way that makes the order of when fb_apache will be included unclear / hard to reason about. That makes API handling (which depends on the ordering) tough. The best answer here (though perhaps not the most convenient), is to put that third recipe into This reduces the likelihood of compound API complications, and creates the natural hook for a future "fb_apache_settings " (to handle compound APIs interactions if needed). Arguably more importantly, it codifies the order of when fb_apache runs, which means you can explicitly handle dependencies. While dependencies weren't part of your question, it inevitably becomes a big problem in a dynamic and/or sufficiently complicated environment, and the more codified they are, the easier your life will be. Put into more concrete example terms: fb_apache will need to happen after some things are set up already (like configuring the mount / filesystem it will run on), and some things will need to happen after fb_apache has been done. |
Thanks for your insight (and speedy reply) @joshuamiller01!
This makes sense, but I think our biggest reservation is that we'd probably need to make changes in many Facebook cookbooks, and we felt like that might be a lot of effort to upstream, nor would it necessarily be desirable for you all to merge (especially if compound API interactions are rare for you).
This is pretty interesting! To make sure that I understand this correctly, do you mean something like this? In whyrun_safe_ruby_block 'set foo cron' do
block do
if node['fb_init']['foo_cron']['enabled']
# ...
end
end
end I think you're right that we might want to avoid this pattern as-is since we might have a large number of these, considering that compound API interactions may be more common for us.
If you have a chance, I'd be curious to get a more specific example of this! I'm having a hard time imagining a scenario for this at the moment.
That's good to know, and if you're interested at all in talking more about this in the future, you can reach me in this issue or at my first initial and last name @ the company I work for!
Yeah, and that these also present the same compound API interaction issue as one with a recipe included in Your I'm not criticizing the As I write this and think about our own patterns, I also wonder if giving up the "last write wins" mindset for
...so maybe it's okay for us to read Anyhow, I deeply appreciate you taking the time to answer; I realize I've written a great deal and this isn't exactly an issue with this repository. Feel free to close this issue, and like I said above I'm open to chatting more here, or via email. Thanks again! |
Basically, yeah; we make a cookbook named
Validation can be things like "the API value for some directory said 777 permissions, but here at company X we don't like that cause we care about our security, so we
Yeah - at the end of the day, if you carefully handle the ordering, you don't have to play by the 'write-at-compile, read-at-converge' rules. It really only matters to write before the implementation, and read after all the writes (if you're chef-savvy enough and have a small enough team writing the code and who understand which bits are run when). |
That makes a ton of sense, thank you.
Right, we've discussed this a bit internally and will think more about whether we could statically enforce this for our Back to original suggestions though, I'm curious what you all think of implementing something like how the # in cookbooks/fb_init_sample/libraries/node.rb
class Chef
class Node
def compound_api_interaction(&block)
default['fb_init']['compound_api_interaction'] << block
end
end
end # in cookbooks/fb_init_sample/recipes/default.rb
# *before anything else*
whyrun_safe_ruby_block 'process compound API interactions' do
block do
node['fb_init']['compound_api_interactions'].each(&:call)
end
end
# now it is okay to include_recipe 'fb_cron', etc. ...and then when necessary, a recipe could do something like: # in cookbooks/fb_foo/recipes/default.rb
node.compound_api_interaction do
node.default['fb_cron']['jobs']['foo_cron'] = {
# ...
}
end There are many different ways this could be implemented, however, so I'm not particularly attached to the literal code above - it could be something like calling If this idea sounded at all reasonable to you and the team, I'd be happy to submit a PR where we could nitpick the implementation details. We'd prefer to be aligned with this repository and the patterns inside where possible, but if this isn't something you'd like to see upstreamed that's fine too. |
This is clever and I think it'd be a nice option; I support the endeavor. But to be clear; it wouldn't solve for the validation / mixin use case, because there's a big gap between the very start of converge phase and the point when the actual implementation recipe runs. But the general case you're pursuing - "drive an API write based on reading another API value" should still work. Another gotcha while I'm thinking about it; if one of the @dafyddcrosby @jaymzh feel free to weigh-in. |
Thanks!
Ah, yeah, apologies for not being explicit in my intentions but you are entirely correct; I would still expect that use case to need the
That sounds right to me, and I agree that it's a corner case that would be worth documenting - if I'm thinking about this correctly, compound compound API interactions are unsafe, or at least depend on careful ordering. Maybe more explicitly, it's okay for a cookbook to read from its own API, but not necessarily another's (e.g. it can read from
Thanks for tagging others in! I already have a rough working copy, so I may file a PR before hearing back from them; I'll link back to this issue if I do so. |
Assuming i understand this, I'm a 👎 on this, as it currently stands. In the Consider today this case:
Lets assume that Now, lets say that we implement this. Now In the event that But, adding this API allows anyone anywhere to be the last-writer even after the real-last writer. It's re-creating I might be missing something, and if I am, my apologies. That said, adding more |
I think the intent behind But I'm mostly thinking about this as a a relief valve for the one-offs API complications which come up. If compound API interactions were actually commonplace in your environment, and you were building layers on top of that, I can definitely imagine it turning into a Still - AFAICT, the available general-purpose options are either to make exceptions to last-writer-wins (which is effectively what all these are discussing), or live without compound API interactions. |
I don't think it's that black and white. I think complex API interactions need to be handled EITHER BY the underlying API (by making it |
Thanks for taking a look @jaymzh! I wonder if your example might be addressed by having Even if we slightly tweak your example by dropping the While that does work today, the tradeoff is that @joshuamiller01 noted that way to solve this would be to have the Chef team define a Which I think is the answer to your point about
I am for this approach as well, but there's two things worth noting about this:
I agree that if this was becoming any more complex than the examples I've given, we've gone too far.
I agree, and we really don't want to live without compound API interactions.
I hopefully made a case for why making the underlying API even lazier (by expecting I'd argue that we kind-of have a pseudo- Anyhow, apologies for the wall of text. I think that the |
Re: lazy stuffI think only leaf-node things can/should ever be lazy. Then, much like an old_oi = node['fb_cron']['jobs']['foo']['only_if']
node['fb_cron']['jobs']['foo']['only_if'] = proc { old_oi.call && some_test } I don't think making an entire job a proc makes much else, it's like allowing something to be a hash or an array, things go bonkers. But I think it would be pretty trivial to encapsulate our fb_init OrderingWe've generally found there's a handful of core APIs we expect other APIs to be able to write to. Cron is one of them. Swap is a special case, we expect if you're using It IS possible to build encapsulations of compound APIs. compound_interactionsThe problem with this, is it provides a generalized override mechanism. There already is one - it's But I do see the problem you're trying to solve, so allow me to spitball a bit. I wonder if we could have some sort of white-list in the Chef::Config, for each cookbook, which cookbooks were allowed to be compounding APIs. This would of course mean it's a bit less flexible for you, but it would at least add some safety to it. Like maybe: fb_pattern.compound_apis = {
'fb_cron' = > [
# list of cookbooks that can override it
],
...
} node.compound_api would then need to look at the call stack. Which is a ugly in its own right. And yes, I'm aware that We could even have an Just thinking outloud here. One last thought.... in many cases, every place someone uses And finally, while I stay heavily involved in this as the original author of most of this... I am not at FB/Meta anymore and don't have final say. |
I am curious about how to handle things like teaching the Sure,
I'm also interested in how this would apply to APIs where they just take a
I think there's an interesting thing here. In your first In my modified example - with just
One concern I have - and it may not apply in your environment - is that I can imagine a situation where it is "impossible" to have a working configuration with just ordering alone. That could be like the situation I mentioned in the original post of having two stack cookbooks that both include One answer would be to move
I think I understand this problem generally, but I'm also having a hard time re-articulating it. Going back to your example again, the concern is that On the other hand, sometimes it's okay that something can't be overridden, like with In that world, one way that So I'd summarize the possible bad scenarios as:
I appreciate that you can understand our situation! I'd like to give just a bit more context around how we got here, if it helps: we've taken an approach of categorizing our cookbook implementations into "Facebook-style, un-opinionated, potentially upstream-able", and then building on those in "Etsy-opinionated wrappers" (which we also write in the Facebook style, but we never expect to upstream them). We find that this categorization is really useful to make sure stuff is fully configurable, rather than being codified to whatever it was in our old Chef repository. So we would have an Since I might be repeating myself from the original post, but I thought it was worth emphasizing that we've been approaching development in this way.
I'm not familiar with using On the other hand, I'm wondering if there's a more direct way to prevent the "bad" things I tried to summarize above, but it almost seems like a catchable-in-code-review-only thing. This isn't a new problem, either, because - as much as I wish this wasn't true - a lot of writing a "good" Facebook pattern cookbook is not something you can statically analyze. I'm somewhat partial to the idea that we could come up with a rule for this rather than rely on an allow list, or to just rely on code reviews, but I don't think I feel overly strong about this.
I agree, and after thinking more about your point, I think you could also express the compound API issue as a result of not having the ability to say "when If In my earlier research on this topic I did find https://github.com/Annih/chef-updatable-attributes, which essentially is what I described above, but I'd have to think more about the approach and consider if there was a better way to implement it. Maybe having a good implementation of something like this makes our discussion so far moot? Does this address a generic compound API interaction pattern and avoid the concerns you brought up?
I appreciate the humility, but I definitely value your opinion! So thanks for giving this your consideration. (Also, apologies for this absolute unit of a comment.) |
Today every We could make all the attrs lazy, and I'm cool with that, but also, you could work around the lack of that but having a few entries with the appropriate
This is a good point. One of the impetuses behind my design of the FB pattern was that we didn't want to have to think of all possible combinations. Resources, today, often do. They have awful 1-1 mappings of properties to item-in-config-file which constantly have to be kept in sync with the upstream software, and I just think that's The Wrong Way(TM). So you're right, we don't want to go too far down that path. That's a super convincing argument. I'll come back to this.
Welllllllll I don't think these are quite the same. I'm not saying "swap owns this fstab entry" - I'm saying "swap owns swap, which has nothing to do with fstab, but fbstab happens to be the way it's configured, which is this weird side-effect of unix history" When I wrote But ultimately since we owned init and could own the order, it made sense to just have one build on the other since I could control the runlist order and be done with it. In a systemd world, I might use systemd_unit directly and bypass fb_fstab, but this was all written way before that. Or usr the generator thing I mention below. You're example is different. You're not trying to own, say, Actually I have a better (maybe) solution for that that we have often used:
We do that a lot, actually, and it works quite well (for many cases).
The model states this should only ever be true with core cookbooks, and never ever outside of it. At least... that's the theory. :)
Almost every time I've EVER made an array in an API I've regretted it. Here's another.
Precisely. Well summarized! Much better than my summary, thanks.
This is exactly what we use In fact, at one point I thought of making a run-list generator. You'd pass in something like:
And it would do, roughly: def gen_runlist(prefix, runlist)
runlist.each do |item|
_, name = item.split('_', 2)
wrapper = "#{prefix}_#{name}_wrapper"
if cookbook_exists?(wrapper)
include_recipe wrapper
end
include_recipe item
end
end I actually think that's the right way to do (most of) this. That solves the
Hehe, you want React, but for Chef. :) OK coming back to your point about not having to know everything that someone might ever want to do, I agree. My take is roughly:
If we still have holes after that, we should do compound_api, but with handles to turn it off/whitelist it. |
No apologies necessary, I much prefer it, it helps ensure a productive discussion, rather than just "well no I need it!" - the more detail the easier it is to build solutions that work well in the long run. |
Yeah, I see your point, and that is why this whole pattern is pretty attractive - you can make all sorts of decisions but they don't end up getting set in stone, because you can always change it later without having to jump through hoops (like overrides or rewriting core cookbooks, etc.).
I was this close to suggesting method calls that write to node attributes as a solution in my last message, and now I wish I had, haha.
I think I get the
Ha, fair point. There's two reasons why I didn't end up suggesting a method call, which kinda ties together with how I arrived at a React-like suggestion:
There's also a property of writing to a node attribute API that I think we could put aside for now but I want to mention anyways - if you write to an Back to the original point: since it feels most natural to write a stack as a cookbook following the Facebook pattern, it sticks out that configuring the stack would require calling a method, instead of (or in addition to) changing an attribute. And so in the spirit of keeping attribute writes as the only API, being able to define some attributes in such a way that they immediately compute downstream attribute changes would allow for both writing to other APIs and to keep the last-write-wins mindset, without needing "overrides". Entertaining this point just a little bit more, I could imagine something like: # in cookbooks/stack_foo/attributes/default.rb
default['stack_foo'] = {
'foo' => 'bar',
'baz' => CompoundAPI.new(StackFoo::handleBaz, {
'baz-sub-attribute' => 42,
# ...
}
} ...and so setting I can understand that this is getting into decently complicated territory, but I can't think of any other way to maintain the consistency of a last-write-wins, compile-time node attribute API. Open to other ideas though!
I think I agree (putting aside my conviction about React for Chef ™️), and one thing that I'd like to add is: we should have some way of helping devs write correct wrapper cookbooks with guardrails, e.g. static analysis. We've attempted to do that at Etsy by using Rubocop to enforce that node attribute reads in cookbooks must be lazy, which works for "core" cookbooks when we can re-arrange the ordering, but unfortunately as this issue reveals this doesn't work in "wrapper" cookbooks. Otherwise though it's been great, since it helps to drill in the Facebook pattern to developers who aren't familiar with it, and of course even core maintainers can make mistakes. So if there was any way to guide people to writing wrapper cookbooks correctly, whether the solution is moving stuff to method calls or using a magic reactive API, I think we'd be set. Maybe this can be done purely with Rubocop, or maybe there can be some sort of on-rails wrapper class / cookbook / whatever that a dev can extend?
I haven't fully thought this out yet, but if wrapper cookbooks have their own solution, it may be that the only time we need It is kind of a shame though; |
What a great discussion! Some lightly-organized thoughts on my part. Some of the discussion here is conflating 'settings' cookbooks with 'wrapper' cookbooks. In this context - dealing with compound API problems - 'settings' cookbooks are an opportunity to tactically plug a particular compound API gap. This works for Meta internally - because we don't do 'wrapper' cookbooks. In this discussion, a 'wrapper' cookbook will write to a bunch of APIs, but which exposes its own API. Since this new API will be the thing driving all API settings, it turns every API attribute into a compound interaction. 'settings' cookbooks don't help here (unless you're rewriting every API value, in which case we're back where we started). If you imagine lots of wrapper cookbooks, after a bit you're writing Eric's original question: how do we generically make this work? One answer is "don't make wrapper cookbooks", because they create this problem. But that will probably create lots of duplication / boilerplate / copy-paste, which also isn't great. Some of the discussion here is about using a library call to initialize node API values vs
Of course, if any of the values you want to set are based on other APIs, you're back to the original problem. But (in our experience) this is something that can usually be one-off addressed by passing a Proc, or a settings cookbook, or one of the solutions already discussed. Maybe this might work for you and be a way to try to avoid generally solving compound APIs. Back on the topic of generally solving it through.
Every solution on the table here involves a block of code which runs at converge time; they all are a generalized override mechanism. We have to assume good intent, and this is chef code; an uncaring / hostile chef author can always clobber their way into getting what they want. This discussion is about building a process end users can use to deal with compound API interactions (rather than a hammer to break last-writer-wins). That said ... having an allowlist or some element of code review that forces somebody who actually understands chef and the API model to green-light these sounds reasonable to me. With enough users and time, uninformed copy-paste will creep in for no good reason, and if you had a lot of this, last-writer-wins would be severely degraded.
This touches on some of the other problems with this API model (unrelated to compound APIs) - and is its own huge discourse, so - not touching that here.
There are technically hooks in the event framework which could allow you to do this, but I advise against, for a couple reasons. There's a gotcha in here, unrelated to the FB API model, but due to chef and how it does Immutable attributes. In short, any time you write to a node attribute top-level hash (say Finally, the events hooks are a spot which is brittle and something like the attributes code is called real early, such that there's a very good chance you'll eventually break chef in ways chef can't fix itself. |
I appreciate this framing, thank you!
In your example, what if
Somewhat repeating my last question, but I just want to confirm that if this service had a compound API, it would look like this to the caller: node.default['company_service']['foo'][API] = ...
include 'service_foo'
# either:
Company::Service::Foo.setup('different-parameters')
# ...or:
Company::Service::Foo.set_up_one_part_of_foo('different-parameters') So the Again, not rejecting this, I just want to be extra clear about what this will look like for when I explain this to others.
A thought that keeps sticking out to me is that it's almost like I want a two-pass model for writing node attributes, e.g. the reference I was making to a "post-compile, pre-any-cookbook-convergence" phase in my original post. That is, I suppose I want to be able to go through all of the If that were possible, I believe this would correctly handle one layer of compound API writes, while still allowing for last-write wins, and without a convergence-time "override" - though of course you could still implement one for core cookbooks to enforce invariants. This is because the underlying problem is that in a compound API I want to read my node attributes lazily so I can pick up on attribute writes that happen later in compile time, and if I move that to the convergence phase we have the problems we've discussed already - that it either doesn't work because it's too late, or it doesn't let someone else change what I write unless they too defer their writes until the convergence phase - but if you were to run through all of the I think the flow would look like this for one of the examples we've discussed:
This seems to me like this has all of the properties we want? Though maybe this would suffer from the performance problem you noted later of reading and writing top-level attributes.
That makes sense to me too; I think ideally we'd be able to enforce "good" usage of a compound API with static analysis, but that may be impossible, and so failing that, requiring a code review would suffice. It'd be important to have good written documentation on when to reach for this though. I'll note that we have a self-imposed incentive to avoid a centralized allow list since we rebuild disk images for every affected host, so e.g. making a change to
Haha, yeah I totally understand, I just didn't want to let it go unstated :)
This is new to me, thank you! I had seen a bit of the deep merge stuff when I was going through Chef internals, but I wasn't aware of the performance implications. |
I wouldn't have them write to the node. I'd have them return the data. node.default['fb_cron']['jobs']['etsy_job'] = Etsy::Bar.gen_cronjobs() Keep the write-to-the-node obvious. It's just a helper to generate it the right way.
Because it always makes sure the wrapper is before it, so it can define a This has a few advantages: It's just anyone who can come along and add one of these "overriding" APIs, it has to be something specifically wrapping a core-cookbook, specifically named as such, and specially accepted by the core team. etsy_randomdude can't start overwriting fb_apache. There's a strict (implied) contract here that
I think you're misunderstanding my suggestion. Maybe. For a random service, ## Customizing the refresher cron
The default recipe in this cookbook populates a cronjob to refresh metadata every few days. You may modify this by modifying `node['fb_cron']['jobs']['etsy_coolservice_refresher']` to update the `time` attribute.
However, since there are some other refresher options you may want to change, we have provided a method to generate the job for you with different options. You may do:
node.default['fb_cron']['jobs']['etsy_coolservice_refresher'] = Etsy::CoolServices.gen_cron(options)
Where `options` are:
* `time` which can by `:hourly`, `:daily`, `:default`, `:weekly`, or a time string
* `overwrite` - defaults to true, but set to false if you want to merge new data
I WILL point out that this contrived example is bad because there's a WAY better way to do this. The CORRECT way to do this is to have a much simpler cron job that reads a CONFIG file, where that config file is a template that is populated by the But it's a contrived example to show how one would use such a method.
In this case a compound API is probably the wrong call. It should write this to a config file, using a template, and then reference it there. The config will be correct because it's only written at runtime. |
Thank you both for your help so far. I've been thinking about our discussion over the past few days, and I wrote up a rough draft of what I think are the set of "rules" that are necessary to follow when a developer wants to dynamically modify a node attribute, so that I could share them with my colleagues. It reads somewhat like a flowchart currently, and I may tweak the presentation a bit, but it currently is as follows:
The "make this a method call" suggestion will need some examples fleshed out, like the When I discussed these rules internally, one of my colleagues pointed out that the "computed compound API" idea a.k.a. React for Chef, is much more attractive. I hate to keep bringing it up, but after thinking more about the code example I gave, I am not quite sure I've been convinced to reject it outright. Going back to @joshuamiller01's comments about that:
I think that, in my example, it would be a read from hash A, and a write to hash B. That is, you'd define your compound API in hash
In the example I gave, Chef event hooks weren't needed. It's instead using a custom class as a node attribute, and so it can directly intercept writes via the ... The rules that I have above, if they're even correct, are arguably pretty complicated for the average developer at Etsy to work with. It may be that this comes from a fundamentally different motivation that we have: our company is smaller and we expect a larger number of employees to write and maintain their own systems in Chef as a part of their responsibilities instead of having a core team that is large enough to handle writing That said, it's possible that the answer is that we just have to follow these rules, and that's fine. Moving on to a few points from the last comment.
Ah, yeah, that makes sense.
I see, that also makes sense to me. That said, I think there's a slight miscommunication about our wrapper cookbooks on my part, and it may be that In the case of Furthermore, while we'd typically run only one stack on a host, we've also been thinking about our developer VM setup, where we might run multiple "stacks" on a single host. Since each stack originally was going to be self-contained, some of them may include Regardless, we've been discussing your comments a great deal over the past few days and it does seem like it will be possible to do the above without compound APIs, but again, it feels somewhat less straightforward than we feel like it would be with just using compound APIs - if it were possible - in the first place.
That's a good point, and it does address one need for having a compound API, so I've added that to the rules I shared in the beginning of this comment. ... I suppose I could summarize this comment as: it still seems like it would be nice to tell cookbook API developers that they don't have to avoid compound APIs, and to tell cookbook API consumers that they only ever have to use Is there some way to think about the attribute read/write dependency graph that makes this possible, and avoids the downsides of just blindly overriding things at runtime? |
We do that a lot. But
Once again, there's an implicit, but specific, agreement here: I'm using the Service cookbooks should not be included in The more I have this discussion the less I'm seeing a good usecase for compound APIs and the more I'm growing convinced there's some documentation missing on the basic structuring of how stuff should work together. In the cases I've run into that need this, usually I've found a few things to be true, that once I admitted, made me go another way:
To be clear, this is exactly how FB worked. The whole POINT of jamming the complexity into core cookbooks was because we expected every developer at FB to write their own cookbooks. The core team was 4 people and ONLY wrote core cookbooks, every SWE team wrote their own cookbooks for their own service (or PE/SRE team if they had one, but that was a small percentage). |
Ah, I'm somewhat confused here - I don't think I was saying that it should be a core cookbook. Your following
...and so that leads us to the "rules" I discussed in my post above. That is, if you want to change other APIs, you can't, and you instead need to move the logic elsewhere, like method calls or config files.
This is exactly the kind of mindset we wanted to have, and that's kind of why (2) could be an issue - if I'm running two services on one host, it should (if you agree that running two services on one host is defensible) "just work", but it turns out it wouldn't due to the cookbook's internals, which breaks the abstraction a bit.
I agree with this, and I'd be happy to help with that! I also realize that a lot of this discussion probably feels like you're having to help us organize our code, and I want to stress that I appreciate it a great deal. I also think that it may illustrate my point that if we could just write things "naturally", like if we could write a cookbook that has an API and it may change other APIs and it actually worked in all situations, then the existing documentation would suffice; these cookbooks would be just like any other cookbook. I'm sure if I gave you our web monolith's Chef code you could sort it out, and I'm sure we will be able to as well, but it's likely going to be a somewhat bespoke implementation rather than being just another API-driven cookbook, if that makes sense. And this is the thing I think I'm having a hard time succinctly describing - that in our monolith we have many different vhosts that we need to turn on or off, or vhosts that we need to flavor slightly differently on some hosts, and many different crons that we may or may not run or will run with slightly different parameters, etc., and figuring out the right way to slice and dice these into config-readers or method calls, while still trying to present a high-level "include this cookbook to get a batteries-included monolith tweaked for you", is more difficult and "free-form" than just writing it as an API-driven cookbook. |
Do you have an example that's not cron? I think 95% of cases are solved by rendering a template, which the cron reads (and becomes an authoritative source, other things can read it too). Most problems are solved by "let chef resolve the proper config, write it out, and let everything consume it from there."
This is an interesting thought. So like, two services under different vhosts? I think this "just works" if they're both well behaved, but you have to pick which one wins. If your runlist is:
And they both only touch their own vhosts, then this all "just works". If they touch some core configs, bar will win. Which is expected, it's later in the runlist. That should be fine. Maybe we should suss out a more specific example?
Not at all!! I'm stoked more people are using this. One problem at FB was a lot of this stuff was in my head for a long time, and people would present these problems and I'd go "ah here make a template" or "oh here, do X"... Naomi and Josh also became experts and together we tried to write it all down, but there's a lot more to document, I'm sure. |
Ha, that's a good question, and I had to think about this a bit. I think I've seen this with:
While each of the above seems like it could be solved by either (1) adding laziness to core cookbooks, or (2) moving to a config file, or - for "core" cookbooks only - (3) ordering, I don't want to lose the point that this is hard to explain to developers, and it'd be nice if there was a way to have things "just work" with normal
If I understand correctly, it's not about "which one wins", but which one works. In the following example:
I believe that That's essentially the other problem I'm grappling with; stacks are meant to be self-contained, but stacks with compound APIs don't play well with other stacks that use the same underlying API because of this. In the above example, we could solve this by forcing the runlist to look like:
and it would work, but we'd be giving up the self-contained nature of the cookbook, and it'd be yet another thing to explain to developers ("don't include the cookbook you're relying on, ask the caller to add it to their run list after your cookbook"). In the "rules" I posted this is solved by never having If we can confidently rule that out, then I think the approach of adding laziness to a few places in some of the core cookbooks, documenting that core cookbooks should expect laziness in a few "standard" ways (like accepting |
Oh, I like that idea a lot. (
Ugh, sorry you're right, the right way to do this is for these cookbooks to specify they expect to be included before fb_apache and that the user should include fb_apache after it and any other cookbooks that require it. This is what happens when you don't touch Chef for a few months. Again, the point here is to put as much onus on the API-cookbook authors and less on the leaf cookbook authors so that $random_developer doesn't need to understand too any of this to setup an instance of
I think that's a pretty straight-forward thing to document, it's not "in this case this, in this case this, in this case this.." The FB model is fundamentally backwards from the standard model. In the standard model, if you want to wrap a cookbook you set attrs, then you include it, our model is you include it, then you specify attrs. This isn't really different, it's just a different model and needs documentation. We had a lot of such wrappers, but it was incredibly rare that people wanted to double-wrap a cookbook in a single use-case. In the very, very few cases we did, we'd end up with something like a ::base and a ::default recipe and ::base would be just the API , and ::default would include ::base and fb_apache, so that if people doing weird stuff had a mechanism, and people doing the standard stuff could use the default recipe. Had we had a setup more like yours we might have made a standard practice that "when wrapping an API, you should always split things into ::api ::dependents" and told everyone to do things like: %w{api dependencies} |recipe|
%w{etsy_service_foo etsy_service_bar}.do each |svc|
include_recipe "#{svc}::#{recipe}"
end
end And in fact, this coalesces pretty well with the fact that already today cookbooks don't |
Context
At Etsy, we've gone all-in on using this repository and the philosophy it represents (which we call the "Facebook pattern"), and have been working on refactoring our Chef code repository in a greenfield project.
As such, we've written an
etsy_init
(i.e.fb_init
) cookbook that follows the same pattern asfb_init_sample
- we have a bunch of settings we configure for our "base", and then a number ofinclude_recipe 'fb_foo'
statements to match.We also apply our cookbooks with the run list as suggested by the README -
recipe[fb_init], recipe[$SERVICE]
- where in our case$SERVICE
ishostgroup_foo
, i.e. a cookbook that configures everything necessary for afoo
cluster.These "hostgroup" cookbooks often configure and include a cookbook that wraps some "Facebook-style" (our term) cookbooks, like
fb_apache
, into a reusable building block that can be shared by nearly similar hostgroups. We refer to these asstack
cookbooks, but the naming isn't particularly important - what is important is that these cookbooks will almost always have "compound API interactions".To make this more concrete, as an example we may have a
stack_etsyweb
cookbook that:fb_apache
fb_php
cookbook)...all for running "Etsyweb", our monolithic PHP application.
While the application itself is a monolith, we do run the application on separate clusters, and these clusters may be configured slightly differently. Due to this, our
stack_etsyweb
cookbook will expose an API that ahostgroup_foo
can configure differently than ahostgroup_bar
would, but otherwise they are exactly the same.The API exposed by
stack_etsyweb
is, as mentioned above, a compound API interaction, and so it will setfb
attributes at runtime, like so:The Problem
We've run into an issue where we cannot configure Facebook cookbooks (using the compound API interaction pattern above) that have been included in
fb_init
, becausefb_init
converges before our "tier" cookbooks converge.Building on the previous example, if our
fb_init
cookbook looked like:...and our
stack_etsyweb
cookbook looked like:...and our
hostgroup_bar
cookbook looked like:...and finally, our run list looks like (as described in the README):
...then we would unfortunately not see a
foo
cron in the resulting crontab, because again,fb_cron
has already converged by the time that thestack_etsyweb
cookbook is converging.While the example above demonstrates it with a cron, I'd like to emphasize that this happens for any compound API interaction involving a cookbook included in
fb_init
.This would also happen if we were to have two
stack
cookbooks with a common dependency included in a singlehostgroup
, like ifhostgroup_foobar
includedstack_foo
andstack_bar
, and ifstack_foo
andstack_bar
both usedfb_apache
. In this case,stack_bar
's compound API interactions withfb_apache
would never show up, sincestack_foo
would convergefb_apache
beforestack_bar
converged.Discussion
First off, I realize the above has a lot of our own terminology in it, but hopefully the underlying problem is clear: compound API interactions require careful ordering to ensure that attributes written the converge phase are converged before the cookbooks that use them are converged.
It seems that you all have considered this issue on some level, as there are comments in the
fb_init_sample
that mention ordering:chef-cookbooks/cookbooks/fb_init_sample/recipes/default.rb
Lines 143 to 148 in e17c840
We've brainstormed this a bit internally, so I'd like to share some of the ideas we've had so far about how to handle this, in no particular order:
Somehow implement a "post-compile, pre-any-cookbook-convergence" phase, and set attributes there.
Since the issue boils down to wanting to set an attribute in the convergence phase so that it happens after the compile phase, but that causes ordering issues with the things that are being converged, we could implement some way of setting the attributes before any resources actually converge.
This could be something like a
whyrun_safe_ruby_block
-esque custom resource that accumulates blocks at compile time and then have something at the top offb_init
that executes all of them, or something like a pattern where cookbooks can register arecipes/converged-attributes.rb
file that we dynamically execute at the top offb_init
somehow, etc.Make an "
fb_init
sandwich", and change the run list to something like['fb_init::part_one', 'hostgroup_foo', 'fb_init::part_two']
.I'm not a huge fan of this as we could have situations where we want something to live in "part one" and "part two", and so we'd have to pick whether we wanted to have it be set up early, or if we wanted to allow compound API interactions with it. It also doesn't solve the issue of having a
hostgroup_foobar
that includesstack_foo
andstack_bar
.Use
lazy
node attributes viaChef::DelayedEvaluator
.This one may require more explanation, but since this issue is already pretty long, I'll try to keep it brief. As of chef/chef#10345, Chef has some support for
Chef::DelayedEvaluator
instances inside the node attribute tree. This means that instead of using awhyrun_safe_ruby_block
for a compound API interaction, you could write:This reads a lot better than the
whyrun_safe_ruby_block
approach in my opinion, and also has the nice side-effect of still allowing for a "last-write wins" scenario where a later cookbook decides to overwrite thenode.default['fb_cron']['jobs']['foo_cron']
entry entirely at compile time.Unfortunately, this only works if the
fb
cookbook in question accesses the node attributes directly via[]
, and not via something likemap
,each_pair
,to_h
, etc. As far as I can tell, most or allfb
cookbooks, like the aforementionedfb_cron
, do not access the attributes via[]
, so this won't work.I've documented this as a bug in the Chef repository in chef/chef#14159, because I believe that accessing the node attributes without
[]
should still evaluate anyChef::DelayedEvaluator
s anyway, but alternatively Facebook cookbooks could be changed to expectChef::DelayedEvaluator
instances until the behavior is changed (if ever). This would potentially be something like walking the node attribute tree to evaluateChef::DelayedEvaluator
s manually before iterating over the node attributes, or always using[]
, or something else.We've considered doing that, but it means that we'd need to make a lot of changes to our fork of this repository. We wanted to get your opinion on the matter first before we more seriously considered this approach, since we'd ideally like to contribute these changes here.
Closing
Off the cuff, I think that being able to use
Chef::DelayedEvaluator
instances in the attribute tree is pretty attractive, but the "post-compile, pre-any-cookbook-convergence" phase idea is probably more tractable.In any case, we're interested in getting your thoughts on this issue, and we're curious if you all have implemented anything or follow any patterns internally in order to better allow for compound API interactions in all situations.
@jaymzh in particular, I'm curious what your take is on either this issue or on the one I filed with Chef at chef/chef#14159, since I saw you participated in chef/chef#10861.
Thanks for giving this a read, and let me know if I can clarify anything! I'm also happy to take this conversation elsewhere.
The text was updated successfully, but these errors were encountered: