Special thanks to Zerumi Coder, whose purest soul helped me on this long journey! The Guy who supported me at every tricky junction of the long night road. The Guy who lit the way for me, who has enlightened me!
Thank you, Zerumi!
Thanks to Lannee the Great, who supported the Magnificent Order of Rust!
Thanks to Local Piper the Funniest, who didn't let me give up! His majesty supported me with marvelous jokes!
Thanks to all other fellow comrades who stood with me, who fought bravely, who have won this battle!
- Тернавский Константин Евгеньевич. P3206
asm | acc | neum | mc -> hw | tick -> instr | struct | stream | port | pstr | prob2 | cache
- Basic variant
- assembler
- accumulator
- Von Neumann (same memory for commands and data)
- Microcode
- accurate up to tick
- code is stored as High-Level structure
- Stream IO (no interrupts)
- IO devices addressed by ports. Separate IO instruction
- Pascal strings (length + content)
- Prob 2. Even Fibonacci numbers
- Cache (not implemented)
`number` is a 16-bit number
`number(n)` is an n-bit number
Assembly:
program ::= lines
lines ::= line | line lines
line ::= new_line | statement new_line
statement ::= item | label item
label ::= word ":"
item ::= empty | command | directive
// command names are matched case-insensitive
command ::= command_none
| command_address
| command_immediate
| command_port
command_none ::= "inc"
| "shift_left"
| "shift_right"
| "nop"
| "halt"
command_address ::= opcode_address address
opcode_address ::= "load"
| "store"
| "add"
| "and"
| "cmp"
| "jzc"
| "jzs"
| "jz" // alias for jzs
| "jcc"
| "jcs"
| "jc" // alias for jcs
| "jump"
address ::= address_relative
| address_absolute
| address_indirect
address_relative ::= actual_address
address_absolute ::= "!" actual_address
address_indirect ::= "(" actual_address ")"
actual_address ::= word | number
command_immediate ::= "andi" number
command_port ::= opcode_port port
opcode_port ::= "in" | "out"
port ::= number(8)
// supports numbers:
// decimal: 145, 0001
// hex: 0xaf
// bin: 0b010101
number ::= "^(?P<prefix>0[xb])?(?P<number>[\dabcdef_]+)"
directive ::= directive_word | directive_org
directive_word ::= "word" word_arguments
word_arguments ::= word_argument | word_argument word_arguments
word_argument ::= number(32) | label
directive_org ::= "org" number
Note: space symbols are not significant and are skipped. Space symbols are defined as the following set: " "
- strategy of computation: sequential
- label's scope: global
For number endianness, see the Memory section.
We use a notion of number types similar to Rust's.
The first letter denotes the type and affects sign extension:
- `u` - no sign extension happens. The value is treated as unsigned.

The letter is followed by a number, which signifies the number of bits available to represent the value.
Examples:
- `u32` - 32-bit unsigned number
- `u16` - 16-bit unsigned number
All literals are numbers with different ranges of values. All numbers share the same syntax. The range is determined by usage context: the command's argument type defines the literal's meaning. For details, see the table of command argument types.
Number literals support the following prefixes:
- `0x` - for hex numbers
- `0b` - for binary numbers
Example:
decimal: 145, 0001
hex: 0xaf
bin: 0b010101
Numbers without prefix are treated as decimals.
Numbers may contain any number of `_` at any point after the prefix (or anywhere if there is no prefix).
Example:
4000
4_000
4_____000
___4_000___
0x____fa0
all mean the decimal value 4000.
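The literal rules above can be sketched in a few lines of Python. This is only an illustration, not the assembler's actual code: the function name `parse_number` is hypothetical, the pattern is the one from the grammar with an end anchor added, and range checking against the argument type is left out.

```python
import re

# Pattern from the grammar, with "$" added so the whole literal must match.
NUMBER_RE = re.compile(r"^(?P<prefix>0[xb])?(?P<number>[\dabcdef_]+)$")

# No prefix means decimal.
BASES = {None: 10, "0x": 16, "0b": 2}


def parse_number(text: str) -> int:
    """Parse a number literal; parse_number is a hypothetical helper name."""
    match = NUMBER_RE.match(text)
    if match is None:
        raise ValueError(f"not a number literal: {text!r}")
    # Underscores are purely visual separators, so drop them before converting.
    digits = match.group("number").replace("_", "")
    if not digits:
        raise ValueError(f"literal has no digits: {text!r}")
    return int(digits, BASES[match.group("prefix")])


for literal in ["4000", "4_000", "4_____000", "___4_000___", "0x____fa0"]:
    print(literal, "->", parse_number(literal))
```

Every literal in the loop yields 4000, matching the example above.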
- `none` - command requires no arguments. Placing anything results in an error.
- `port` - command requires a single number which denotes the IO device's address. The number is treated as `u8`.
- `immediate` - command requires a single number. The number is treated as `u16`.
- `label` - special argument type. It requires a single word. A word is a sequence of Unicode letters. It may contain any number of `_` in any position. Used within the composite type `address` and with the `word` directive.
- `address` - composite type. See the notes and table below.
- There are other types. They are special and used in conjunction with Assembly directives.
The `address` type is either a number treated as `u16` or a label. An address allows modifiers that switch the addressing mode.
In the following table, strings enclosed in `""` denote literal characters present in source code, `|` denotes an alternative, and `()` are used to group items.
That is, `"!" (u16 | label)` means an exclamation mark followed by either a number literal or a label, where the number literal is interpreted as a 16-bit unsigned number.
Each syntax in this table corresponds to the matching addressing mode of the CPU.
Mode | Syntax | Example |
---|---|---|
Relative | `u16 \| label` | `load 0x55` |
Absolute | `"!" (u16 \| label)` | `store !0x55` |
Indirect | `"(" (u16 \| label) ")"` | `load (some_label_ptr)` |
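As a rough illustration of how the three syntaxes map onto addressing modes, here is a minimal Python sketch. The function name `parse_address` and the mode strings are assumptions made for this example; the real assembler may classify operands differently.

```python
def parse_address(text: str) -> tuple[str, str]:
    """Return (mode, target); target is the raw number literal or label.

    parse_address and the mode names are hypothetical, for illustration only.
    """
    text = text.strip()
    if text.startswith("!"):
        # "!" prefix selects absolute addressing.
        return "absolute", text[1:]
    if text.startswith("(") and text.endswith(")"):
        # Parentheses select indirect addressing.
        return "indirect", text[1:-1]
    # No modifier: relative addressing.
    return "relative", text


for example in ["0x55", "!0x55", "(some_label_ptr)"]:
    print(example, "->", parse_address(example))
```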
- `data` - special argument type used with the assembler directive `word`. It requires one or more numbers, each separated by at least one space. Each number is treated as `u32`. That is, although the value `0xff` fits perfectly into one byte, `word 0xff` occupies a whole memory cell. Notice that no sign extension takes place! The value is placed as is: `0x00_00_00_ff`
Assembler directives are not represented in the CPU's memory. They are special commands intended to help you write assembly code.
Directive | Allowed argument types | Comment |
---|---|---|
`word` | `data \| label` | Places numbers into memory as is. Can be used with labels to make pointers, i.e. `word some_label` places the address of `some_label` and NOT the content stored at that address. |
`org` | `u16` | Instructs the assembler to place the next code item (be it a raw value or a command) into the cell with address ADDRESS. Subsequent code items are placed after ADDRESS one by one. |
Example:
org 0x04f
VAR1: word 0x45a9 0xff
add 0xf
cmp VAR1
Here `0x0000_45a9` will be placed in memory as is at address `0x04f`; `0x0000_00ff` at `0x0050`; `add 0xf` at `0x0051`, and so on.
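The placement rule for `org` and `word` can be sketched as a small Python simulation. This is a sketch under stated assumptions: the `place` function and the tuple encoding of program items are invented for the example, commands are kept as plain strings, and starting at address 0 when no `org` is given is a guess rather than documented behaviour.

```python
def place(items):
    """Lay out code items one cell at a time.

    items is a list of ("org", address), ("word", [values]) or ("cmd", text).
    Both place and this item encoding are hypothetical.
    """
    memory = {}
    cursor = 0  # assumption: placement starts at 0 if no org is given
    for kind, payload in items:
        if kind == "org":
            cursor = payload  # next item goes to this address
        elif kind == "word":
            for value in payload:  # each number takes a whole cell
                memory[cursor] = value
                cursor += 1
        elif kind == "cmd":
            memory[cursor] = payload  # each command takes exactly one cell
            cursor += 1
        else:
            raise ValueError(f"unknown item kind: {kind!r}")
    return memory


program = [
    ("org", 0x04F),
    ("word", [0x45A9, 0xFF]),
    ("cmd", "add 0xf"),
    ("cmd", "cmp VAR1"),
]
for address, content in sorted(place(program).items()):
    print(hex(address), content)
```

The resulting layout puts the two words at `0x4f` and `0x50` and the commands at `0x51` and `0x52`, as in the example above.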
This section describes opcodes and the operand types suggested for use with them.
Notice that every command may theoretically work with every operand type except the `none` operand type, though that has not been tested and may lead to undefined behaviour.
Providing the `none` operand type for a command which requires any other type results in the CPU panicking.
Commands which suggest the `none` operand type simply ignore any operand, although the operand fetch is still executed. So you can, but should not, specify any operand type other than `none` for commands which expect `none`.
For this table, let's introduce the notion of a special operand type, `operand`. It requires the operand type to be any of `Relative|Indirect|Immediate|Absolute`. Notice that `none` is forbidden for this special type.
IN immediate - read data from IO device
OUT immediate - write data to IO device
LOAD operand - load value into accumulator
STORE operand - store value from accumulator into memory cell
ADD operand - well... add a number?
INC none - add 1 to accumulator
// (to check for even values by applying 0x1 mask)
AND operand
CMP operand - subtract number from accumulator without
storing result anywhere. Sets status flags.
Useful for branching
SHIFT_LEFT none
SHIFT_RIGHT none
JZC operand - Jump if Zero Clear
JZS operand - Jump if Zero Set
JCC operand - Jump if Carry Clear
JCS operand - Jump if Carry Set
JUMP operand - Unconditional jump
NOP none - does nothing
HALT none - Stops the simulation
Assembler supports variants of some instructions with immediate argument. Namely:
AND -> ANDI // useful for masking
For more, please see the syntax section.
Every instruction occupies exactly one memory cell.
If represented in binary, every instruction would have the following format:
opcode | argument type | argument |
---|---|---|
1 byte | 1 byte | 2 bytes |
4 bytes |
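To make the 4-byte layout concrete, here is a hedged Python sketch that packs one instruction into bytes. The numeric codes in `OPCODES` and `ARG_TYPES` are invented for the example, and big-endian byte order for the argument is an assumption borrowed from the Memory section. The simulator itself stores code as a high-level structure, so this shows only the hypothetical binary form.

```python
import struct

# Hypothetical numeric codes; the README does not define actual values.
OPCODES = {"load": 0x01, "store": 0x02, "add": 0x03}
ARG_TYPES = {"none": 0x00, "relative": 0x01, "absolute": 0x02}


def encode(opcode: str, arg_type: str, argument: int) -> bytes:
    """Pack opcode (1 byte), argument type (1 byte), argument (2 bytes)."""
    # ">" selects big-endian with no padding, so the result is exactly 4 bytes.
    return struct.pack(">BBH", OPCODES[opcode], ARG_TYPES[arg_type], argument)


cell = encode("load", "relative", 0x55)
print(cell.hex(), len(cell))
```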
- Instruction fetch
  - fetches the instruction from memory into the cmd register
- Operand decode
  - determines the type of operand
  - loads the operand
- Execution
  - determines the microinstruction number by opcode
  - executes the instruction
- None: jumps straight to command execution
- Absolute: operand -> address -> [mem] -> data
- Relative: pc + operand -> address -> [mem] -> data
- Indirect: pc + operand -> address -> [mem] -> data -> address -> [mem] -> data
- Immediate: operand -> data
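The operand fetch paths listed above can be sketched as one Python function. This is an illustrative model, not the microcode: `fetch_operand` and the mode strings are hypothetical names, memory is modelled as a dict, addresses are wrapped to 16 bits because the address register is `u16`, and the last step of Indirect is assumed to be a second memory read.

```python
MASK16 = 0xFFFF  # the address register holds u16 values


def fetch_operand(mode, operand, pc, mem):
    """Return the data value for a given addressing mode (hypothetical model)."""
    if mode == "none":
        return None  # nothing to fetch, go straight to execution
    if mode == "immediate":
        return operand  # operand -> data
    if mode == "absolute":
        return mem[operand & MASK16]  # operand -> address -> [mem] -> data
    if mode == "relative":
        return mem[(pc + operand) & MASK16]  # pc + operand -> address -> [mem]
    if mode == "indirect":
        pointer = mem[(pc + operand) & MASK16]  # first read yields a pointer
        return mem[pointer & MASK16]  # assumed second read yields the data
    raise ValueError(f"unknown addressing mode: {mode!r}")


mem = {0x10: 7, 0x15: 0x10}
print(fetch_operand("relative", 5, 0x10, mem))
print(fetch_operand("indirect", 5, 0x10, mem))
```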
This CPU uses von Neumann memory model: both data and code are stored in the same memory.
However, you can neither interpret data as an instruction nor an instruction as data: this is a limitation of the current implementation.
Memory consists of `2**16` memory cells. Each memory cell holds either a single 32-bit big-endian number without sign extension or an instruction.
The whole memory is addressable by a `u16` address on a per-memory-cell basis. That is, by referring to `0x0000` you can fetch a `u32` number or an instruction.
Address space starts from zero.
address | content |
---|---|
0x0000 | 0x0000_00ff |
0x0001 | 0xdead_beaf |
... | ... |
0xffff | 0x0000_0000 |
Registers support either `u32` or `u16` values. If you attempt to write a `u32` value into a `u16`-capable destination (that is, either a register or the ALU), then the sixteen most-significant bits are silently discarded.
If you attempt to write a `u16` value into a `u32`-capable destination, then the value is zero-extended to 32 bits.
Notice that NO sign-extension takes place!
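The two width conversions amount to simple masking. A minimal Python sketch, with hypothetical helper names `to_u16` and `to_u32`:

```python
def to_u16(value: int) -> int:
    """Write into a u16 destination: the upper sixteen bits are discarded."""
    return value & 0xFFFF


def to_u32(value: int) -> int:
    """Write into a u32 destination: zero-extended, never sign-extended."""
    return value & 0xFFFF_FFFF


print(hex(to_u16(0xDEAD_BEAF)))
print(hex(to_u32(0xFFFF)))
```

Note that `0xFFFF` stays `0x0000_ffff` when widened; with sign extension it would have become `0xffff_ffff`.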
- accumulator (`u32`) (least-significant byte is connected to IO)
- data (connected to memory) (MemoryItem: `u32` | Command)
- status (zero, carry)
- address (`u16`)
- program counter (`u16`)
- cmd (opcode, opcode_type, arg: `u16`)
The ALU operates on two `u32` values, outputting a `u32` result and optionally setting the `zero` and `carry` flags of the `status` register.
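As a sketch of how an ALU addition could produce its result and flags, consider the Python model below. The function name `alu_add` is hypothetical, and the flag definitions are assumptions: zero is set when the 32-bit result is 0, and carry is set when the unsigned sum does not fit into 32 bits. The README does not spell these out, and subtraction (as used by CMP) is not modelled here.

```python
MASK32 = 0xFFFF_FFFF


def alu_add(left: int, right: int):
    """Return (result, zero, carry) for a 32-bit unsigned addition."""
    total = (left & MASK32) + (right & MASK32)
    result = total & MASK32  # the output is always a u32
    zero = result == 0  # assumed meaning of the zero flag
    carry = total > MASK32  # assumed meaning of the carry flag
    return result, zero, carry


print(alu_add(1, 2))
print(alu_add(0xFFFF_FFFF, 1))
```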
| Full name | alg | loc | bytes | instr | exec_instr | tick | variant |
|Тернавский Константин Евгеньевич | hello_world | 35 | - | 13 | 105 | 647 | asm | acc | neum | mc -> hw | tick -> instr | struct | stream | port | pstr | prob2 | cache|
|Тернавский Константин Евгеньевич | hello_username | 168 | - | 81 | 434 | 2706 | asm | acc | neum | mc -> hw | tick -> instr | struct | stream | port | pstr | prob2 | cache|
|Тернавский Константин Евгеньевич | prob2 | 137 | - | 48 | 630 | 3896 | asm | acc | neum | mc -> hw | tick -> instr | struct | stream | port | pstr | prob2 | cache|