Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IR Changes #188

Merged
merged 12 commits into from
Jun 11, 2024
Merged

IR Changes #188

merged 12 commits into from
Jun 11, 2024

Conversation

l-kent
Copy link
Contributor

@l-kent l-kent commented Feb 23, 2024

This is partially in response to discussions regarding #185 and #184 and does the following:

  • merges MemoryStore into MemoryAssign, so that stores cannot be contained in other types of statements (which never made any sense to begin with)
  • make BProcedure and BProcedure have default parameter values - this is just cleanup
  • make Register take an Int as parameter instead of BitVecType - this is just nicer
  • splits Memory into StackMemory and SharedMemory - this means we can match memory accesses on type to determine whether shared memory is being accessed, instead of matching the name "mem" vs. "stack". This will allow there to be multiple stack region variables in the future, for instance
  • rename LocalAssign to Assign to avoid confusion about what 'local' means in this context
  • a little bit of commenting to provide clarity on some of the more confusing parts of the IR

…ave default parameter values, make Register take an Int as parameter instead of BitVecType, split Memory into StackMemory and SharedMemory, rename LocalAssign to Assign to avoid confusion about what 'local' means in this context
@l-kent
Copy link
Contributor Author

l-kent commented Feb 23, 2024

Thinking about it a little further, Memory probably shouldn't be an Expr either

@l-kent l-kent requested a review from ailrst February 23, 2024 03:00
@l-kent
Copy link
Contributor Author

l-kent commented Feb 23, 2024

These were generally relics of the IR hewing closely to BAP and/or Boogie for convenience

@l-kent
Copy link
Contributor Author

l-kent commented Feb 23, 2024

The state register flags (ZF, NF, etc.) should probably be treated as globals and not locals because it probably isn't impossible the state of them in one procedure is relevant for the state in another procedure due to compiler optimisations?

@ailrst
Copy link
Contributor

ailrst commented Feb 23, 2024

This is a good start, some further ideas:

  • MemoryAssign(var mem: Memory, var index: Expr, var value: Expr assumes that the type of index and value is BitVecType(size) with the size matching mem.addressSize and mem.valueSize respectively, this could be typechecked (as in with conditional throw) or we could make all of them more concrete.
    • A more out-there idea is to have a pointer expression which carries the memory region with it, and have memory region be a parameter of the pointer/address rather than of the store/load operation, but this would mean describing pointer arithmetic.
  • Now the only difference between Assign and MemoryAssign is whether the LHS is a scalar object or a map object which is clearer but it would help analyses if we could syntactically distinguish shared assigns from unshared assigns. For example the formalism has the three cases (unshared := unshared, shared := unshared, unshared := shared). With this modification we still need to do a full AST walk of the RHS expression to find variables and memory accesses and check whether they are shared or not.
  • There are still the implicit invariants below which would be nice to validate or have a comment
    1. That the RHS of MemoryAssign does not contain a (shared?) load expression
    2. That the RHS of Assign contains at most one (shared?) load expression.

The state register flags (ZF, NF, etc.) should probably be treated as globals and not locals because it probably isn't impossible the state of them in one procedure is relevant for the state in another procedure due to compiler optimisations?

I agree I think this makes sense

@l-kent
Copy link
Contributor Author

l-kent commented Feb 23, 2024

MemoryAssign assumes that the type of index and value is BitVecType(size) with the size matching mem.addressSize and mem.valueSize respectively, this could be typechecked (as in with conditional throw) or we could make all of them more concrete.

I am not sure how you make these types more concrete. If these do not match, obviously invalid Boogie is generated which is why I have not yet bothered to implement type checking for these sizes. That could still be done but it's hardly a priority.

Now the only difference between Assign and MemoryAssign is whether the LHS is a scalar object or a map object which is clearer but it would help analyses if we could syntactically distinguish shared assigns from unshared assigns. For example the formalism has the three cases (unshared := unshared, shared := unshared, unshared := shared). With this modification we still need to do a full AST walk of the RHS expression to find variables and memory accesses and check whether they are shared or not.

If a MemoryAssign is a write to shared state, then its mem parameter will be a SharedMemory.

Which analysis is all this important for?

A more out-there idea is to have a pointer expression which carries the memory region with it, and have memory region be a parameter of the pointer/address rather than of the store/load operation, but this would mean describing pointer arithmetic.

This sounds needlessly convoluted for no real benefit.

l-kent added 6 commits February 23, 2024 15:03
# Conflicts:
#	src/main/scala/analysis/Analysis.scala
#	src/main/scala/analysis/SteensgaardAnalysis.scala
#	src/main/scala/ir/Statement.scala
#	src/main/scala/ir/Visitor.scala
#	src/main/scala/util/RunUtils.scala
# Conflicts:
#	src/test/scala/LiveVarsAnalysisTests.scala
#	src/test/scala/PointsToTest.scala
#	src/test/scala/ir/IRTest.scala
@l-kent
Copy link
Contributor Author

l-kent commented Jun 6, 2024

This is ready to merge @ailrst

# Conflicts:
#	src/main/scala/translating/GTIRBToIR.scala
#	src/main/scala/translating/SemanticsLoader.scala
Copy link
Contributor

@ailrst ailrst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is good you can ignore my comments if you want.

private val framePointer = Register("R29", BitVecType(64))
private val stackPointer = Register("R31", 64)
private val linkRegister = Register("R30", 64)
private val framePointer = Register("R29", 64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this change is worth it, it might cause problems if we want to change back to integers e.g. if we are interested in something like the SMACK int selection with cegar paper. Trying to make register an alias for global bitvector variable using inheritance is a bit too tricky imo. It seems cleaner to me to only have regular variables where scope is a parameter/field rather than a trait mixin. We could approach variables being shared similarly, but I do think it makes sense to have shared and unshared be type-incompatible. E.g.

enum VariableKind(name, type, scope, shared):
    case Memory(n,keytype,valtype) extends Variable(n,MapType(keytype,valtype), Global, Shared)
    case Variable(n,t, scope) extends Variable(n,t, scope, Unshared)
    case Register(n, sz) extends Variable(n, BitVecType(sz), Global, Unsharedl)

(as implemented as a trait/case class heirachy this just means moving name,type,scope,shared to the parent trait's constructor?)

Copy link
Contributor Author

@l-kent l-kent Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't make any changes to the scope. If you want to propose changes to how the scope works, you can do that separately. What you're suggesting would make the type system weaker though. What this PR does change is make it so Memory is not an Expr, and so (correctly) is no longer allowed to appear arbitrarily inside other Exprs.

I did consider the implications for supporting an integer representation (as in SMACK) but that would require wider changes anyway, as everything else already assumes everything is bitvectors - the Memory case class already does for instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree, if we want to lift registers to variables then we can lift registers to variables.

case localAssign: LocalAssign =>
m = m.diff(localAssign.rhs.variables)
if ignoreRegions.contains(localAssign.lhs) then m else m + localAssign.lhs
case assign: Assign =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still in favour of having a separate load instruction so that assign is clearly pure computation, but its not essential.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we've discussed before. this means creating a new temporary variable for nearly every load, adding a lot of bloat to the IR without actually improving anything.

# Conflicts:
#	src/main/scala/analysis/Analysis.scala
#	src/main/scala/analysis/InterprocSteensgaardAnalysis.scala
#	src/main/scala/analysis/RegToMemAnalysis.scala
#	src/main/scala/analysis/SSAForm.scala
@l-kent l-kent merged commit 7b27efd into main Jun 11, 2024
2 checks passed
@l-kent l-kent deleted the ir-cleanup branch July 16, 2024 00:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants