Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StackMapTable support #1

Draft
wants to merge 41 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
41 commits
Select commit Hold shift + click to select a range
963ae07
Rename identifier
igstan May 2, 2017
ccc7e57
Remove unused code
igstan May 3, 2017
8f22adb
Add Instr.toString and LabeledInstr.toString
igstan May 3, 2017
a13d48b
First take at generating the StackMapTable attribute; incomplete and …
igstan May 3, 2017
1535232
Extend List structure with foldMapState
igstan May 4, 2017
6a73246
Remove unused as-pattern
igstan May 4, 2017
99cf07f
Reorder members in structure
igstan May 4, 2017
767f6f7
Compute a method's maximum locals
igstan May 4, 2017
a7281ca
Fix bug in the max stack calculation
igstan May 4, 2017
c8a3383
Complete TODO marker
igstan May 4, 2017
91e30ca
Use pattern matching instead of record accessors
igstan May 4, 2017
d89a10e
Reorder match cases from simple to complex
igstan May 4, 2017
b91f177
Typo
igstan May 4, 2017
24e2467
Formatting
igstan May 4, 2017
df0d984
Reduce line lengths using this weird trick
igstan May 4, 2017
5f18606
Convert match cases into helper functions
igstan May 4, 2017
a012ff8
Inline record
igstan May 4, 2017
9bda2fe
Convert match cases into helper functions
igstan May 4, 2017
ebf5957
Fix label handling bug in LabeledInstr
igstan May 5, 2017
7409abb
Add VerificationType.toString
igstan May 6, 2017
9fe9135
Merge frames on branch target instructions
igstan May 6, 2017
358f9e0
Add StackMapTable attribute only if not already present
igstan May 6, 2017
cca211f
Create a small framework for composing fold functions
igstan May 6, 2017
f4ab7e8
Reorder lines
igstan May 6, 2017
327bb62
Use optional pattern bars; supported by SuccessorML (sML)
igstan Apr 24, 2023
fd55fa4
Remove symbolic structure names, which are prohibited by the standard
igstan Apr 24, 2023
77aecec
Add scaffolding for tests using the sml-test library
igstan Apr 24, 2023
54f9b92
Use nested directory access syntax in lib.cm
igstan Apr 24, 2023
8ea44fb
Add the Work Log file
igstan Apr 24, 2023
497d802
Remove Fn extensions which are now part of sml-foundation
igstan Apr 24, 2023
eb924d4
Add notes about the handling of `_extended` frames
igstan Apr 24, 2023
0423db7
Leave some reference notes
igstan Apr 24, 2023
f3bb6ba
Add more detail to exception message
igstan Apr 24, 2023
5ec69a2
Support `ireturn` in the bytecode verifier
igstan Apr 24, 2023
5bef6aa
Create directory for test cases and first one: factorial
igstan Apr 24, 2023
12e9d93
Leave reminder of where I left off work
igstan May 20, 2024
06b2f04
Mass commit in case my laptop bricks again...
igstan May 20, 2024
1405607
Working `factorial` with StackMapTable attribute
igstan Oct 26, 2024
a289c58
Properly initialize locals for verification
igstan Oct 26, 2024
e19f921
Adjust exception message
igstan Oct 26, 2024
b8e7993
Add TODO item
igstan Oct 26, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions dev.cm
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
library (0.1.0)
library (lib.cm)
source (-)
is
$BUCHAREST-ML/sml-test/lib.cm

lib.cm

test (
all-suites.sml
frame-suite.sml
)

test/test-cases (
factorial/factorial.sml
)
56 changes: 36 additions & 20 deletions lib.cm
Original file line number Diff line number Diff line change
@@ -1,25 +1,41 @@
library (0.1.0)
library ($BUCHAREST-ML/sml-foundation/lib.cm)
source (-)
is
$/basis.cm
$/smlnj-lib.cm

src/attr.sml
src/class.sml
src/class-name.sml
src/const.sml
src/const-pool.sig
src/const-pool.sml
src/descriptor.sml
src/field.sml
src/instr.sml
src/labeled-instr.sml
src/main.sml
src/member.fun
src/method.sml
src/method-handle.sml
src/prim.sml
src/text.sig
src/text.sml
src/util.sig
src/util.sml
$BUCHAREST-ML/sml-foundation/lib.cm

src (
attr.sml
class.sml
class-name.sml
const.sml
const-pool.sig
const-pool.sml
descriptor.sml
field.sml
instr.sml
java.sml
labeled-instr.sml
main.sml
member.fun
method.sml
method-handle.sml
prim.sml
text.sig
text.sml
util.sig
util.sml

extensions/list.sig
extensions/list.sml
extensions/word8-vector.sml

stack-map/frame.sml
stack-map/stack-lang.sml
stack-map/stack-map.sml
stack-map/verification-type.sml
stack-map/verifier.sig
stack-map/verifier.sml
)
8 changes: 8 additions & 0 deletions lib/core/lib.cm
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
library (0.1.0)
signature LIST
structure List
is
$/basis.cm

src/list.sig
src/list.sml
8 changes: 8 additions & 0 deletions lib/core/src/list.sig
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
signature LIST =
sig
include LIST

val countWhere : ('a -> bool) -> 'a list -> int

val foldl : 'a list -> { seed : 'b, step : 'a * 'b -> 'b } -> 'b
end
31 changes: 31 additions & 0 deletions lib/core/src/list.sml
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
structure List : LIST =
struct
open LIST

fun foldl list { seed, step } = List.foldl step seed list
fun foldr list { seed, step } = List.foldr step seed list

fun takeWhile _ [] = []
| takeWhile p (x :: xs) = if p x then x :: takeWhile p xs else []

fun takeUntil p = takeWhile (not o p)

fun countWhere predicate list =
raise Fail "not implemented"

fun bound xs ys a b =
case (xs, ys) of
(x, []) => a
| ([], y) => b
| (_ :: xs, _ :: ys) => recur xs ys a b

(**
* Returns the list with more elements.
*)
fun max a b = bound a b a b

(**
* Returns the list with fewer elements.
*)
fun min a b = bound a b b a
end
11 changes: 11 additions & 0 deletions lib/core/src/other.sml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
signature FOLDABLE =
sig
type 'a t
type ('a, 'b) arrow = 'a -> 'b
type 'a monoid = {
zero : 'a,
plus : 'a * 'a -> 'a
}

val foldMap : 'm monoid -> ('a -> 'm) -> 'a t -> 'm
end
175 changes: 175 additions & 0 deletions log.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,175 @@
# Work Log

## 2024-05-20 11:19:12

https://github.com/GaloisInc/jvm-verifier

Continue here:

```
- CM.make "dev.cm"; Factorial.main ();
[scanning dev.cm]
[scanning $BUCHAREST-ML/sml-test/lib.cm]
[scanning $BUCHAREST-ML/sml-foundation/lib.cm]
[scanning (dev.cm):lib.cm]
[New bindings added.]
val it = true : bool

uncaught exception Fail [Fail: not implemented: leastUpperBound: Reference =/= Integer]
raised at: src/stack-map/verification-type.sml:47.19-47.98
```

## 2023-04-24 09:56:20

> Each stack map frame described in the entries table relies on the previous
> frame for some of its semantics. The first stack map frame of a method is
> implicit, and computed from the method descriptor by the type checker
> (§4.10.1.6). The stack_map_frame structure at entries[0] therefore describes
> the second stack map frame of the method.

— JVMS20, §4.7.4

Okay... so I'll have to generate all the stack frames and just after that drop
the ones that are not required by the specification:

> It is illegal to have code after an **unconditional branch** without a stack
> map frame being provided for it.

— JVMS20, §4.10.1.6

> • Conditional branch: ifeq, ifne, iflt, ifle, ifgt, ifge, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmplt, if_icmple, if_icmpgt if_icmpge, if_acmpeq, if_acmpne.
> • Compound conditional branch: tableswitch, lookupswitch.
> • **Unconditional branch**: goto, goto_w, jsr, jsr_w, ret.

— JVMS20, §2.11.7

Steps:

1. Build a control-flow graph of the instructions
2. Basic blocks

Could we generate the StackMap frames without building the CFG? And only use
a CFG when we want to keep just the required frames?

---

From: From Stack Maps to Software Certificates [slides], slide 6

> [Java bytecode verification] is formalized as a data flow problem.

---

Slide 27:

> BCV typing with interfaces
> Two kind of reference types : classes and interfaces.
> Interfaces introduces a form of intersection types. that must be represented in the type hierarchy.
> The byte code verifier opts for an alternative solution :
> Treat interfaces as java.lang.Object and defer type checking to run-time.
> Type checking rule :
> isJavaAssignable(class(_,_), class(To, L)) :- loadedClass(To, L, ToClass), classIsInterface(ToClass).

## Efficient Bytecode Verification and Compilation in a Virtual Machine

Figure 2.1, page 5

```
todo ← true
while todo = true do
todo ← false
for all i in all instructions of a method do
if i was changed then
todo ← true
check whether stack and local variable types match definition of i
calculate new state after i
for all s in all successor instructions of i do
if current state for s ̸= new state derived from i then
assume state after i as new entry state for s
mark s as changed
end if
end for
end if
end for
end while
```

## 2023-04-22 22:06:59

Taken from a draft in igstan.ro:

---
title: JVM Bytecode Verification
author: Ionuț G. Stan
---

I'll briefly describe the process of bytecode verification that the JVM
performs during the loading of .class files.

## Motivation

Why was verification needed?

## Complications

There's a hierarchy of verification types that induces a notion of subtyping.
Store and load instructions need to check subtyping conformance and this
requires loading external classes into the system. I'm hoping to avoid this
somehow, as it requires the side-effect of reading files form disk. In
addition, it means that I might need to write a parser for .class files and a
JAR (which are ZIP archives) reader.

## 2022-04-16 11:37:19

> An additional problem with compile-time checking is **version skew**. A user
> may have successfully compiled a class, say PurchaseStockOptions, to be a
> subclass of TradingClass. But the definition of TradingClass might have
> changed since the time the class was compiled in a way that is not compatible
> with pre-existing binaries.

— JVMS8, §4.10

> The intent is that a **stack map frame** must appear at the beginning of each
> **basic block** in a method. The stack map frame specifies the verification
> type of each operand stack entry and of each local variable at the start of
> each basic block.

— JVMS8, §4.10.1

### Read

- Lightweight Bytecode Verification

## 2022-04-17 14:09:19

### Java Bytecode Verification — An Overview

For every instruction `i`:

```
i : in(i) → out(i)
in(i) = lub { out(j) | j predecessor i }
```

i₀ = first instruction

```
in(i₀) = (ε, (P₀ ... Pₙ₋₁, ⊤ ... ⊤))
```

Pₖ are the types of the parameter methods

> The dataflow framework presented above requires that the type algebra,
> ordered by the subtyping relation, constitutes a semi-lattice.
— §3.3

### Dataflow Analysis [slides]

> Available expressions is a **forward must** analysis
>
> • **Forward** = Data flow from in to out
> • **Must** = At join points, only keep facts that hold on all paths that are joined

> Liveness is a **backwards may** analysis
> • To know if a variable is live, we need to look at the future uses of it. We
> propagate facts backwards, from Out to In.
> • Variable is live if it is used on some path
22 changes: 22 additions & 0 deletions readme.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
# JVM bytecode assembler in Standard ML

## Similar Projects

- BiteScript: https://github.com/headius/bitescript

## Running Main

```
$ sml
Standard ML of New Jersey v110.80 [built: Sun Aug 28 21:15:09 2016]
Expand Down Expand Up @@ -36,3 +42,19 @@ Hello, World!
val it = () : unit
-
```

## Running Tests

```
- CM.make "dev.cm"; AllSuites.run ();
[scanning dev.cm]
[scanning $BUCHAREST-ML/sml-test/lib.cm]
[scanning $BUCHAREST-ML/sml-foundation/lib.cm]
[scanning (dev.cm):lib.cm]
[New bindings added.]
val it = true : bool
[+] 1 < 2
[-] 1 = 1 — NotEqual
val it = () : unit
-
```
Loading