Skip to content
slerner edited this page Jul 2, 2013 · 52 revisions

Tue, July 2, 2013

Comparison to PEC/XCert

  • PEC/XCert still has a formally gap

  • PEC/XCert had only a single property

  • Coq is great!

  • But Coq requires expertise and effort

  • [As outlined more in related work, previous efforts still require proving properties in coq and registering hints in databases, etc]

  • We present a DSL for reactive systems that offers push-button verification for a set of user-defined properties and for a novel formulation of non-interference well suited for non-deterministic reactive systems

  • We report on the key techniques, eg policy-language co-design, where we structure our policy language and DSL hand in hand

Wed, June 5, 2013

Main contribution: pick narrow domain for DSL and properties, and then get full proof automation.

List of technical points to make:

  • Deeply embedded dependent DSL where dependent types are inferred.

  • Reactive Non-interference in the face of non-determinism. Many things were tricky.

  • Dependently typed interpreter, symbolic evaluation, have to quantify over all the possible inputs in a type safe way.

  • To consider further: Bedrock reasons about idealized assembly, we reason about real code

Thu, May 30, 2013

Decided to focus on non-interference. Specify high components with a function from forall c : COMPT -> [[comp_conf_desc c]] -> bool. A simpler alternative would have been COMPT -> bool. This would preclude things like domain noninterference in Quark. However, it would also allow to treat entire handlers as high, since ever component of that type is high.

Definition non_interference :=
  forall st1 st2,
  Reach st1 ->
  Reach st2 ->
  no_env_interference st1.tr ->
  no_env_interference st2.tr ->
  high_recvs st1 = high_recvs st2 ->
  high_sends st2 = high_sends st2.

"mid" level environment non-interference:

Definition no_env_interference :=
  forall tr1 tr2 ARGS,
  high_calls tr1 = high_calls tr2 ->
  Call_oracle ARGS tr1 = Call_oracle ARGS tr2.

Next step: write these out in Coq and prove non-interference for simple examples.

Wed, May 27, 2013

Met with Sorin to discuss Reflex vs. Bedrock. Bedrock focuses on full functional correctness, we only care about some safety properties. We will exploit the domain for a higher degree of automation (no annotations?) and to lower programming effort (motherhood and apple pie). Our verification leverages the DSL. Bedrock is super low level (asm).

Also decided to switch focus from invariants to non-interference.

Tue, May 28, 2013

reflex module dependencies

Thu, May 23, 2013

We already found heuristic proof search is very difficult to reason about, but actually worse, it's incredibly brittle -- any change to the language or policy breaks the automation. We end up with a similar set of problems if we try to apply the foo approach: the choice of how to decompose the proof for any non-trivial property gets stuck when we need to find inductive invariants. So: why not just start inductive invariants? We shall.

Idea: provide mechanism for the user to specify inductive invariants and then proof machinery to verify the invariants. Next step will be heuristics to infer invariants.

Examples:


init:
  x = 1

handler:
  send 0
  send x

invariant:
  x = 1 /\ ImmBefore (Send 1) (Send 0)
init:
  x = 0

handler:
  A
  x++

handler:
  x--

handler:
  if x == 1:
    B

invariant:
  (x > 0 -> AlreadySent A) /\ Enables A B

TODO:

  1. Extend policy language with inequalities over kernel state and implications.
  2. Build a set of examples (start with those above).
  3. Figure out which real examples we want for the paper: Quark? SSH?

Branches are working, component patterns are better, and we added a send command back into the language.

Thu, May 16, 2013

Even without branches, state, and sendall, subtleties arise when developing a proof procedure for imm_before. We had hope to decompose the problem into showing that the policy holds on each handler (and init) and then get that it holds for the whole program. This strategy may still work, but we run into problems with component configuration, or in general, any policy involving the sender/receiver of messages.

init:
  Spawn T(0)

hdlr for T:
  A

policy:
  (Recv T(0) _) ImmBefore A

For now abandoning the purely local proof search strategy (sort of invariant free) and exploring weakest precondition based approach.

Symbolic evaluation doesn't give us anything -- basically the same as doing induction over Reach.

Exploring the formal sufficient condition approach next for imm_before.

Wed, May 15, 2013

Figured out design branches. Moving forward on axiomatic for now. Basically, assume there exists a "pure" version of spawn and call and add post conditions to those primitives that requires their return values to be equal to the pure version. Then we can use the pure version in Reach where necessary to get the non-det values that arise when the user's program is run.

Tue, May 14, 2013

Starting reviewing the big picture and bringing Sorin up to date on recent developments.

Going to push on imm_before completeness hard till the end of the week.

CONTRIBUTION : Important to prove tactics complete to provide user automation. If you don't prove completeness you will inevitably miss corner cases.

Ptival: The branch subtlety is the following: In the presence of branching, the non-deterministic input required by an execution of a handler depends on the path of execution. Furthermore, it is not obvious that this will never depend on the actual execution of the interpreter step by step. That is, we might not be able to compute the input type of the interpreter without running the interpreter... Consider the sequence:

A; if cond then B else C

If cond may depend on A's runtime behavior, then the input type of this code depends on A's execution.

Mon, May 13, 2013

Zach still needs to email Bill and work on infrastructure (fighting some stupid VM issues).

Dan wrote a decision procedure for imm_before that works on concrete cases:

forall tr A B,
decide (ImmBefore A B tr)

In an ideal world, we'd eventually get the general case:

forall A B,
decide (forall tr, Reach tr -> ImmBefore A B tr)

We have a tactic for imm_before that can prove the property sometimes, but we're not clear on when it works / what restrictions on the program it requires.

What we want: some strategy for deciding imm_before when there are no branches or kernel state used in the program. Make this strategy work on our examples. Once we have this we want a proof that the proof strategy is complete.

Dan proposed the following line attack: intro an intermediate program representation which restricts the code to the subset we're considering. Then write a decision in Gallina to handle that.

We discussed and decided to talk Ltac + paper proof path as rapid prototyping for eventual Gallina decision procedure. This strategy has added benefit of a safer failure mode.

Ptival continues on branches. There is some subtlety due to sequencing commands. The 2nd command's input depends on the computation done by the 1st.

Fri, May 10, 2013

Dan started on imm_before but it's more subtle than we thought. Even without conditions in the language, you may need to some programs invariants to prove anything. e.g.

Policy:
  ImmBefore (Send _ 1) (Send _ 0)

Init:
  x := 1

Hdlr:
  Send _ 0
  Send _ x

Possible solution: have programmer annotate w/ invariants. Perhaps we should do some program analysis instead. We'll have to think about this. May be able to compute weakest precondition of a policy with respect to the handlers. This should handle state and branches. The conjunction of the weakest preconditions of all the handlers should give us the loop invariant we need to prove.

The next simplification that Dan will focus on: in addition to branches in handlers, also assume no state.

In addition we should consider how to compute WP of our policies w.r.t. handler code.

Ptival had some new challenges with the binders for non-det values in the face of branches. Turns out they need to access state in their definition to be defined. Local var environments also cause a bit more pain, but may be easy to tweak.

Thu, May 9, 2013

Updated Sorin on current status.

Next step showing completeness of imm_before:

  1. assume no branches
  2. sketch proof strategy
  3. prove that if prop holds, proof strategy succeeds

After that will be to add branches and figure out phrasing of relative completeness.

Had good discussion about what "oracle" does for relative completeness: takes suffix of path and decides if branches are feasible. Turns out our view of the interface to the oracle is insufficient: actually need to do some program analysis or something offline to establish loop invariants ahead of time. Need to know which branches are feasible. Consider case where we need to know branches are dead.

TODO (Ptival) SUBTLETY interface to oracle: can you briefly describe challenge here, just that we can't only look at suffixes?)

@ztatlock will ask Bill about "relative" completeness related work.

Dan's non-interference hypothesis: "If there's no interference through the real world, then there's no interference through the kernel." There is another page in the wiki that describes in more detail all the problems and proposed solutions that we've had for non-interference: Non Interference.

Ptival will start adding branches to language.

@ztatlock will clean up repo and build system in particular.

Wed, May 8, 2013

We have been stuck on completeness of proof on tactics for primitives. imm_before and imm_after should be "easy" to do proofs for. Since we don't have completeness proofs for anything else, we'll do those first. Getting these ironed out should help us figure out the "relative" completeness: you have dead code in a handler that will be included in Reach relation, need to proof procedure to show it's infeasible. This has to do with the expressions used in branches in the handlers.

Proof by reflection is hard for Reflex because the goal is difficult to reify (?!).

Concretely, next steps: do completeness proofs in this "reflection-ish" style: computing over the spec and policy. Start with imm_before and once that works move on to enables. Develop successively stronger sufficient conditions on the spec for the policy to hold (e.g. start with "B never occurs" check for proving imm_before A B).

Ptival will merge everything into master. @ztatlock will help clean up infrastructure. We'll resume doing code reviews for merges into master.

@ztatlock will follow up with Ranjit on finding the reference monitor paper. In general, we should we keep expressiveness in mind as we move forward.

Reference monitor references:

Ulfar Erlingsson PhD Thesis

IRM for Java stack (Oakland 00)

Certified Reference Monitors in .NET (PLAS 06)

Non-interference in the face of non-determinism: solutions exist in the literature. World solution is there. If we use a bi-simulation style argument between the full system and a version of the system will all low components removed, then we need Reach to be tight. In particular Reach_spec_tr should imply some concrete execution can produce tr.

Non-interference references:

Information-Flow Security for Interactive Programs (High/Low sources of non-determinism)

Reactive Non-interference

TODO (Dan) Add reference to possible traces definition of non-interference

Add hypothesis when proving non-interference property: calls are deterministic w.r.t. history of high components. More generally, any user-provided filter: could filter nothing (strongest, implies no interfence through system or internet -- probably too strict) or could filter everything (weakest, implies that primitives are constant functions -- probably too loose).

The thing about non-interference in terms of wget over domains: the filter is not all or nothing, but more precise.

Apr 29, 2013

There are three directions that we talked about today. Feel free to add anything that I've missed.

  1. Expressiveness of the policy language. There are a number of ways we could show that our property language is expressive. For example, we could extend our language to a known language like LTL or regular expressions. We could also compare our language to an existing formalism (reference monitors) to show that we can express anything expressible in that formalism.

  2. Completeness of proof automation for the policy language. We want to show that our decision procedure(s) are complete for our property language, relative to the completeness of the decision procedure that handles whatever expressions we allow in “if then else” statements. The first step is to prove this by hand (excluding the broadcast primitive). Then we should include the broadcast primitive. After that, we can attempt proof by reflection if we feel like it.

  3. Non-interference in the presence of nondeterminism. We want to adapt the standard definition of noninterference (comparing pairs of traces) to our setting, in which there is nondeterministic input from the outside world (e.g. wget). We discussed several options for this, most of which we rejected (to be summarized in another document). One option that we have not yet rejected is to represent the world with some sort of “random” generator. This would allow us to have two (or more) definitions of non-interference. One definition might express the fact that there is no interference through any channel while another (weaker) definition could express the fact that there is no interference through the kernel.

April 19, 2013

This is a short description of the non-interference stuff we talked about yesterday with the purpose of filling Zach and Sorin in. Let me know if I've messed anything up.

We want to express the property that component A does not interfere with component B. We want to express this by saying that there exists a subset S of the components that contains B but does not contain A such that on any two reachable traces, if the kernel receives the same list of messages from S then it sends the same list of messages to S. Intuitively, S is isolated from the components not in S. Note that this is just the definition given in "Reactive Non-interference" with only two labels.

However, it's tricky to make this definition concrete because the set of components is dynamic. Two different traces may spawn a different set of components, so it's not entirely clear where S should come from. We have two proposed resolutions to this issue:

  1. Suppose there is a way to statically assign components to S. This would mean that one could write a function that takes any arbitrary component and returns a bool indicating whether that component is in S. Then A does not interfere with B if there exists some function (f:comp->bool) such that for any two reachable traces, if the kernel receives the same list of messages from all components c such that (f c = true), then the kernel sends the same list of messages to all components c such that (f c = true). In other words, there are two different sets of components S1 and S2 which are subsets of the components of tr1 and tr2 respectively. If the inputs from S1 on tr1 equal the inputs from S2 on tr2, then the outputs to S1 on tr1 must equal the outputs to S2 on tr2.

  2. For any reachable trace tr, there exists a subset S of the components of tr such that if we construct tr' by removing all handler iterations corresponding to messages from components not in S, then the output to S on tr equals the output to S on tr'.

April 15, 2013

(from Ptival)

I just pushed major changes to the dsl-sendall branches.

We now have a two-layered type of descriptions/payloads. The old ones are just like they used to be. The new ones are also allowed to contain entire components, restricted to a particular type of component.

This allows us to have certain fields of the kernel state and local environments be restricted to hold components of a certain type. I extended the language so that most commands now accept extended expressions/payloads.

Notice that Spawn now returns an entire component (actually, a dependent pair of a component and a proof that it has the right type), and puts it in the environment (so the slot types have changed).

Also notice that components are not allowed to go over the network: the SendAll command explicitly requires a component-less payload expression.

I extended the frontend functor with additional niceties, and updated all three examples.

Disclaimer: the naming is terrible, some stuff is redundant, some stuff isn't as nice as it could be...

(from Dan)

I tried to prove our two definitions of the "enables" primitive equivalent. As it turns out they aren't equivalent. They differ for the property "A enables A". The definition that I wrote always rejects "A enables A" while the definition you wrote always accepts "A enables A".

It's easy to modify either definition to either accept or reject "A enables A", so this isn't a fundamental limitation of either approach; we just happened to handle this case differently. Nevertheless, this is sort of a win for formal verification. I doubt I would have discovered this subtle difference if I had tried to prove the two definitions equivalent on paper.