A reduced instruction set designed to be the first compiler target. Opcodes and behaviors might change in future releases.
To compile the project, you will need the Zig compiler.
Then, in the project folder, run `zig build run`, which should compile everything, including the tests; they are small, so this is fine.
- diagnostic options
  - `-h` or `--help` - print help to the screen
  - `-v` or `--version` - print the version of the compiled project to the screen
  - `-p` or `--properties` - print the properties of the current vm to the screen in a human readable format
- configuration options
  - `--thread-count=[count]` (no effect) - set the number of concurrent threads to spawn for a process, defaults to one
  - `--start-thread=[id]` (no effect) - set on which thread the code should start
  - `--add-search-path [path]` (no effect) - add paths to search for modules
  - `--add-module [name]` - add modules to the vm
- sandbox flags
  - `--memory-limit=[size][prefix]` - define the biggest size the vm memory pointer can handle, prefix needed
    - `b` or `B` for bytes, `size * 1`
    - `k` for kilobytes, `size * 1000`
    - `K` for kibibytes, `size * 1024`
    - `m` for megabytes, `size * 1000000`
    - `M` for mebibytes, `size * 1048576`
    - `g` for gigabytes, `size * 1000000000`
    - `G` for gibibytes, `size * 1073741824`
    - `t` for terabytes, `size * 1000000000000`
    - `T` for tebibytes, `size * 1099511627776`
    - not setting this value can cause errors if `main_header.memory_size` is corrupted or set to a value greater than needed
  - `--load-modules=(true|false)` - if `false`, no modules are loaded, the entire running code is sandboxed, and very few features are available (only modules with i/o interfaces) - default is `true`
  - `--enable-[instr]=(true|false)` - enable certain instruction groups; disabling them can be used to emulate even more reduced instruction sets, usually defaults to `true`
    - `--enable-div`: enable integer division instructions `udivr`, `udivi`, `sdivr`, `sdivi`
    - `--enable-int`: enable interrupt instructions
    - `--enable-float`: enable all floating point instructions
    - `--enable-ioint`: enable interrupts triggered by i/o ports
    - `--enable-stack`: enable stack instructions (defaults to `false`)
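
As an illustration of the `--memory-limit` prefixes listed above, here is a minimal Python sketch of how a `[size][prefix]` value could be turned into a byte count. The function name `parse_memory_limit` is hypothetical and is not part of the vm; the project's actual parser may behave differently, for example on malformed input.

```python
# Multipliers copied from the --memory-limit prefix list.
# parse_memory_limit is a hypothetical helper, not part of the project.
MEMORY_LIMIT_MULTIPLIERS = {
    "b": 1, "B": 1,
    "k": 1000, "K": 1024,
    "m": 1000**2, "M": 1024**2,
    "g": 1000**3, "G": 1024**3,
    "t": 1000**4, "T": 1024**4,
}


def parse_memory_limit(text: str) -> int:
    """Convert a `[size][prefix]` string such as "16K" into a byte count."""
    # The prefix is mandatory, so a bare number is rejected.
    if len(text) < 2 or text[-1] not in MEMORY_LIMIT_MULTIPLIERS:
        raise ValueError("a size followed by a prefix letter is needed")
    size = int(text[:-1])  # raises ValueError for a non-numeric size
    return size * MEMORY_LIMIT_MULTIPLIERS[text[-1]]
```

With this sketch, `parse_memory_limit("16K")` gives the kibibyte interpretation and `parse_memory_limit("16k")` the kilobyte one, which is the only difference between the lower and upper case prefixes (except for `b`/`B`).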
The tables below define how the instruction types are laid out, bit by bit, with the most significant bit first.

R type instructions (R standing for register) are instructions that
use 3 registers and normally operate as `rd <- r1 ○ r2`, with `r1`
referring to the first argument (and not register `r01`) and `r2`
referring to the second argument (and not register `r02`). Register
`rd` is the destination register; nothing prevents `rd` from being the
same register as `r1` or `r2`.

| instruction type | bits 63-20 | bits 19-16 | bits 15-12 | bits 11-8 | bits 7-0 |
|---|---|---|---|---|---|
| R type | unused | rd | r2 | r1 | op |

| instruction type | bits 63-16 | bits 15-12 | bits 11-8 | bits 7-0 |
|---|---|---|---|---|
| S type | immediate | rd | r1 | op |

| instruction type | bits 63-12 | bits 11-8 | bits 7-0 |
|---|---|---|---|
| L type | immediate | r1 | op |
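
The bit layouts in the tables above can be sketched as plain shifts and masks. The following Python is only an illustration: the helper names are hypothetical, and an instruction is treated as a single 64-bit integer with bit 0 as the least significant bit.

```python
# Field positions follow the R, S and L type tables.
# All function names here are hypothetical helpers for illustration.
MASK64 = (1 << 64) - 1


def encode_r(op: int, rd: int, r1: int, r2: int) -> int:
    """R type: op in bits 7-0, r1 in 11-8, r2 in 15-12, rd in 19-16."""
    return (op & 0xFF) | ((r1 & 0xF) << 8) | ((r2 & 0xF) << 12) | ((rd & 0xF) << 16)


def encode_s(op: int, rd: int, r1: int, imm: int) -> int:
    """S type: op in bits 7-0, r1 in 11-8, rd in 15-12, immediate in 63-16."""
    imm48 = imm & ((1 << 48) - 1)  # keep only the 48 immediate bits
    return (op & 0xFF) | ((r1 & 0xF) << 8) | ((rd & 0xF) << 12) | (imm48 << 16)


def encode_l(op: int, r1: int, imm: int) -> int:
    """L type: op in bits 7-0, register in 11-8, immediate in 63-12."""
    imm52 = imm & ((1 << 52) - 1)  # keep only the 52 immediate bits
    return (op & 0xFF) | ((r1 & 0xF) << 8) | (imm52 << 12)


def decode_r(word: int) -> tuple:
    """Return (op, rd, r1, r2) from an R type instruction word."""
    return (word & 0xFF, (word >> 16) & 0xF, (word >> 8) & 0xF, (word >> 12) & 0xF)
```

For example, `encode_r(0x00, rd=3, r1=1, r2=2)` packs an R type word with opcode `0x00`, and `decode_r` recovers the same four fields from it.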

| register index | used as | requirements |
|---|---|---|
| r00 | zero register | none, read-"only" because writing is discarded |
| r01 - r12 | general use | none |
| r13 | pcall return | save before pcall, general use otherwise |
| r14 | pcall return | save before pcall, general use otherwise |
| r15 | pcall parameter | pcall switch, general use otherwise |

- group zero: bitwise instruction group [opcodes `0x00` - `0x0F`]
  - `andr`: [opcode `0x00`, R type]
    - executes a bitwise AND between `r1` and `r2`, result on `rd`
    - executes: `rd <- r1 & r2`
  - `andi`: [opcode `0x01`, S type]
    - executes a bitwise AND between `r1` and a mask immediate, result on `rd`
    - executes: `rd <- r1 & imm`
  - `xorr`: [opcode `0x02`, R type]
    - executes a bitwise XOR between `r1` and `r2`, result on `rd`
    - executes: `rd <- r1 ^ r2`
  - `xori`: [opcode `0x03`, S type]
    - executes a bitwise XOR between `r1` and a mask immediate, result on `rd`
    - executes: `rd <- r1 ^ imm`
  - `orr`: [opcode `0x04`, R type]
    - executes a bitwise OR between `r1` and `r2`, result on `rd`
    - executes: `rd <- r1 | r2`
  - `ori`: [opcode `0x05`, S type]
    - executes a bitwise OR between `r1` and a mask immediate, result on `rd`
    - executes: `rd <- r1 | imm`
  - `not`: [opcode `0x06`, R type]
    - executes a one's complement on `r1`, `r2` is discarded, result on `rd`
    - executes: `rd <- ~r1`
  - `cnt` [opcode `0x07`, S type]
    - executes a population count on register `r1`, excluding the highest `imm` bits, result on `rd`
    - executes: `rd <- popcnt(r1 & ((1 << imm) - 1))`
    - edge case:
      - if `imm >= 64`, clear `rd`
  - `llsr` [opcode `0x08`, R type]
    - executes a logical left shift on register `r1` for `r2` bits, result on `rd`
    - executes: `rd <- r1 << r2`
  - `llsi` [opcode `0x09`, S type]
    - executes a logical left shift on register `r1` for `imm` bits, result on `rd`
    - executes: `rd <- r1 << imm`
    - edge case:
      - if `imm >= 64`, clear `rd`
  - `lrsr` [opcode `0x0A`, R type]
    - executes a logical right shift on register `r1` for `r2` bits, result on `rd`
    - executes: `rd <- r1 >> r2`
    - edge case:
      - if `r2 >= 64`, clear `rd`
  - `lrsi` [opcode `0x0B`, S type]
    - executes a logical right shift on register `r1` for `imm` bits, result on `rd`
    - executes: `rd <- r1 >> imm`
    - edge case:
      - if `imm >= 64`, clear `rd`
  - reserved instructions block 0: opcodes [`0x0C` until `0x0F`]
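
The edge cases of `cnt`, `llsi` and `lrsi` can be made concrete with a small Python sketch. It follows the `executes:` formulas and the stated `imm >= 64` rule literally; the function names mirror the mnemonics but are otherwise hypothetical, and registers are modelled as unsigned 64-bit integers.

```python
# Registers are modelled as Python ints kept within 64 bits.
MASK64 = (1 << 64) - 1


def cnt(r1: int, imm: int) -> int:
    """rd <- popcnt(r1 & ((1 << imm) - 1)), with rd cleared when imm >= 64."""
    if imm >= 64:
        return 0
    return bin(r1 & ((1 << imm) - 1)).count("1")


def llsi(r1: int, imm: int) -> int:
    """rd <- r1 << imm, with rd cleared when imm >= 64."""
    if imm >= 64:
        return 0
    return (r1 << imm) & MASK64  # bits shifted past bit 63 are lost


def lrsi(r1: int, imm: int) -> int:
    """rd <- r1 >> imm (logical), with rd cleared when imm >= 64."""
    if imm >= 64:
        return 0
    return (r1 & MASK64) >> imm
```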

- group one:
  - `addr` [opcode `0x10`, R type]
    - adds `r2` to `r1` and sets `rd` to the result
    - executes: `rd <- r1 + r2`
    - edge case:
      - overflow is discarded
  - `addi` [opcode `0x11`, S type]
    - adds `imm` to `r1` and sets `rd` to the result
    - executes: `rd <- r1 + imm`
    - edge case:
      - overflow is discarded
  - `subr` [opcode `0x12`, R type]
    - subtracts `r2` from `r1` and sets `rd` to the result
    - executes: `rd <- r1 - r2`
    - edge case:
      - overflow is discarded
  - `subi` [opcode `0x13`, S type]
    - subtracts `imm` from `r1` and sets `rd` to the result
    - executes: `rd <- r1 - imm`
    - edge case:
      - overflow is discarded
  - `umulr` [opcode `0x14`, R type]
    - sets `rd` to `r2` (unsigned) times `r1` (unsigned)
    - executes: `rd <- u64(r1) * u64(r2)`
    - edge case:
      - overflow is discarded
  - `umuli` [opcode `0x15`, S type]
    - sets `rd` to `imm` (unsigned) times `r1` (unsigned)
    - executes: `rd <- u64(r1) * u64(imm)`
    - edge case:
      - overflow is discarded
  - `smulr` [opcode `0x16`, R type]
    - sets `rd` to `r2` (signed) times `r1` (signed)
    - executes: `rd <- i64(r1) * i64(r2)`
    - edge case:
      - overflow is discarded
  - `smuli` [opcode `0x17`, S type]
    - sets `rd` to `imm` times `r1`
    - executes: `rd <- i64(r1) * i64(imm)`
    - edge case:
      - overflow is discarded
  - `udivr` [opcode `0x18`, R type]
    - sets `rd` to `r1` (unsigned) divided by `r2` (unsigned)
    - executes: `rd <- u64(r1) / u64(r2)`
    - edge case:
      - overflow is discarded
      - `r2 = 0` triggers `pcall 1`
  - `udivi` [opcode `0x19`, S type]
    - sets `rd` to `r1` (unsigned) divided by `imm` (unsigned)
    - executes: `rd <- u64(r1) / u64(imm)`
    - edge case:
      - overflow is discarded
      - `imm = 0` triggers `pcall 1`
  - `sdivr` [opcode `0x1A`, R type]
    - sets `rd` to `r1` (signed) divided by `r2` (signed)
    - executes: `rd <- i64(r1) / i64(r2)`
    - edge case:
      - overflow is discarded
      - `r2 = 0` triggers `pcall 1`
  - `sdivi` [opcode `0x1B`, S type]
    - sets `rd` to `r1` (signed) divided by `imm` (signed)
    - executes: `rd <- i64(r1) / i64(imm)`
    - edge case:
      - overflow is discarded
      - `imm = 0` triggers `pcall 1`
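
"Overflow is discarded" in the arithmetic instructions above means the result is kept modulo 2^64. A minimal Python sketch of that behavior follows, with registers modelled as unsigned 64-bit integers. The function names mirror the mnemonics, and `DivideByZeroTrap` is a hypothetical stand-in for the `pcall` that a zero divisor triggers; none of this is taken from the project's source.

```python
MASK64 = (1 << 64) - 1


class DivideByZeroTrap(Exception):
    """Hypothetical stand-in for the pcall raised on a zero divisor."""


def to_signed(value: int) -> int:
    """Reinterpret a 64-bit register value as a two's complement i64."""
    return value - (1 << 64) if value & (1 << 63) else value


def addr(r1: int, r2: int) -> int:
    return (r1 + r2) & MASK64  # carry out of bit 63 is discarded


def subr(r1: int, r2: int) -> int:
    return (r1 - r2) & MASK64  # borrow wraps around


def umulr(r1: int, r2: int) -> int:
    return (r1 * r2) & MASK64  # only the low 64 bits are kept


def smulr(r1: int, r2: int) -> int:
    return (to_signed(r1) * to_signed(r2)) & MASK64


def udivr(r1: int, r2: int) -> int:
    if r2 == 0:
        raise DivideByZeroTrap()
    return r1 // r2
```

Because only the low 64 bits are kept, `umulr` and `smulr` produce the same bit pattern in this model; the signed and unsigned variants only differ observably for division.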
  - `call` [opcode `0x1C`, R type] (deprecated)
    - change execution context to another place
    - semantic renaming: `call rd, r1, r2` -> `call addr, sp, bp`
    - executes:
      - `u64[sp + 0] <- bp`
      - `u64[sp + 8] <- pc + 8`
      - `sp <- sp + 16`
      - `bp <- sp`
      - `pc <- addr`
  - `push` [opcode `0x1D`, S type] (deprecated)
    - push a value into the given stack
    - semantic renaming: `push rd, r1, imm` -> `push rv, sp, imv`
    - executes:
      - `u64[sp] <- rv + imv`
      - `sp <- sp + 8`
  - `retn` [opcode `0x1E`, R type] (deprecated)
    - return execution to the previous context
    - semantic renaming: `retn rd, r1, r2` -> `retn x0, sp, bp`
    - executes:
      - `sp <- sp - 16`
      - `bp <- u64[sp + 0]`
      - `pc <- u64[sp + 8]`
    - `x0` is ignored
  - `pull` [opcode `0x1F`, S type] (deprecated)
    - pull a value out of a given stack
    - semantic renaming: `pull rd, r1, imm` -> `pull rv, sp, #0`
    - executes:
      - `sp <- sp - 8`
      - `rv <- u64[sp]`
    - `#0` is ignored
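
The four deprecated stack instructions form two symmetric pairs on an upward-growing stack. The Python sketch below replays their `executes:` steps on a tiny machine model. The `Machine` class, its fields and the little-endian byte order are assumptions made for illustration only; the document does not state the vm's endianness.

```python
class Machine:
    """Hypothetical minimal state: memory plus sp, bp and pc."""

    def __init__(self, size: int = 256):
        self.mem = bytearray(size)
        self.sp = 0
        self.bp = 0
        self.pc = 0

    def _store(self, addr: int, value: int) -> None:
        # Little-endian is an assumption, not taken from the document.
        self.mem[addr:addr + 8] = (value & (2**64 - 1)).to_bytes(8, "little")

    def _load(self, addr: int) -> int:
        return int.from_bytes(self.mem[addr:addr + 8], "little")

    def call(self, target: int) -> None:
        self._store(self.sp + 0, self.bp)      # u64[sp + 0] <- bp
        self._store(self.sp + 8, self.pc + 8)  # u64[sp + 8] <- pc + 8
        self.sp += 16
        self.bp = self.sp
        self.pc = target

    def retn(self) -> None:
        self.sp -= 16
        self.bp = self._load(self.sp + 0)
        self.pc = self._load(self.sp + 8)

    def push(self, rv: int, imv: int = 0) -> None:
        self._store(self.sp, rv + imv)         # u64[sp] <- rv + imv
        self.sp += 8

    def pull(self) -> int:
        self.sp -= 8
        return self._load(self.sp)             # rv <- u64[sp]
```

In this model a `call` followed by a `retn` restores `sp` and `bp` and resumes at the instruction after the call, and a `push` followed by a `pull` returns the pushed value.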

- group two:
  - `ldb` [opcode `0x20`, S type]
    - load byte from memory into a register
    - executes: `rd <- r0 | u8[r1 + imm]`
    - side effects:
      - if `r1 + imm` is bigger than memory size, `pcall 4` is triggered
  - `ldh` [opcode `0x21`, S type]
    - load half word from memory into a register
    - executes: `rd <- r0 | u16[r1 + imm]`
    - side effects:
      - if `r1 + imm` is bigger than memory size, `pcall 4` is triggered
  - `ldw` [opcode `0x22`, S type]
    - load word from memory into a register
    - executes: `rd <- r0 | u32[r1 + imm]`
    - side effects:
      - if `r1 + imm` is bigger than memory size, `pcall 4` is triggered
  - `ldd` [opcode `0x23`, S type]
    - load double word from memory into a register
    - executes: `rd <- u64[r1 + imm]`
    - side effects:
      - if `r1 + imm` is bigger than memory size, `pcall 4` is triggered
  - `stb` [opcode `0x24`, S type]
    - store byte from register into memory
    - executes: `u8[rd + imm] <- u8(r1)`
    - side effects:
      - if `rd + imm` is bigger than memory size, `pcall 4` is triggered
  - `sth` [opcode `0x25`, S type]
    - store half word from register into memory
    - executes: `u16[rd + imm] <- u16(r1)`
    - side effects:
      - if `rd + imm` is bigger than memory size, `pcall 4` is triggered
  - `stw` [opcode `0x26`, S type]
    - store word from register into memory
    - executes: `u32[rd + imm] <- u32(r1)`
    - side effects:
      - if `rd + imm` is bigger than memory size, `pcall 4` is triggered
  - `std` [opcode `0x27`, S type]
    - store double word from register into memory
    - executes: `u64[rd + imm] <- r1`
    - side effects:
      - if `rd + imm` is bigger than memory size, `pcall 4` is triggered
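
The load and store instructions above differ only in access width, so they can be sketched with two generic helpers. This Python is an illustration under stated assumptions: `MemoryTrap` is a hypothetical stand-in for the `pcall` triggered on an out-of-range access, the byte order is assumed little-endian, and the bounds check rejects any access whose last byte falls outside memory, which is one possible reading of "bigger than memory size".

```python
class MemoryTrap(Exception):
    """Hypothetical stand-in for the pcall raised on a bad address."""


def load(mem: bytearray, base: int, imm: int, width: int) -> int:
    """ldb/ldh/ldw/ldd with width 1/2/4/8: zero-extended read at base + imm."""
    addr = base + imm
    if addr < 0 or addr + width > len(mem):
        raise MemoryTrap(addr)
    # Little-endian is an assumption, not stated in the document.
    return int.from_bytes(mem[addr:addr + width], "little")


def store(mem: bytearray, base: int, imm: int, value: int, width: int) -> None:
    """stb/sth/stw/std with width 1/2/4/8: write the low `width` bytes of value."""
    addr = base + imm
    if addr < 0 or addr + width > len(mem):
        raise MemoryTrap(addr)
    truncated = value & ((1 << (8 * width)) - 1)  # e.g. u16(r1) for sth
    mem[addr:addr + width] = truncated.to_bytes(width, "little")
```

Since the narrow loads zero-extend, a value stored with `std` and read back with `ldb` yields only its lowest byte in this model.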
  - `jal` [opcode `0x28`, L type]
    - jump to a place in memory
    - executes:
      - `rd <- pc + 8`
      - `pc <- pc + i50[imm]`
  - `jalr` [opcode `0x29`, S type]
    - jump to a place in memory
    - executes:
      - `rd <- pc + 8`
      - `pc <- pc + r1 + imm`
  - `je` [opcode `0x2A`, S type]
    - jump to a place in memory when `rd == r1`
    - executes:
      - if `rd ^ r1 == 0`: `pc <- pc + imm * 8`
      - else: `pc <- pc + 8`
  - `jne` [opcode `0x2B`, S type]
    - jump to a place in memory when `rd != r1`
    - executes:
      - if `rd ^ r1 != 0`: `pc <- pc + imm * 8`
      - else: `pc <- pc + 8`
  - `jgu` [opcode `0x2C`, S type]
    - jump to a place in memory when `rd > r1`, both unsigned
    - executes:
      - if `(u64(rd) - u64(r1)) & (sign bit)`: `pc <- pc + imm * 8`
      - else: `pc <- pc + 8`
  - `jgs` [opcode `0x2D`, S type]
    - jump to a place in memory when `rd > r1`, both signed
    - executes:
      - if `(i64(rd) - i64(r1)) & (sign bit)`: `pc <- pc + imm * 8`
      - else: `pc <- pc + 8`
  - `jleu` [opcode `0x2E`, S type]
    - jump to a place in memory when `rd <= r1`, both unsigned
    - executes:
      - if `(u64(rd) - u64(r1)) & (sign bit) == 0`: `pc <- pc + imm * 8`
      - else: `pc <- pc + 8`
  - `jles` [opcode `0x2F`, S type]
    - jump to a place in memory when `rd <= r1`, both signed
    - executes:
      - if `(i64(rd) - i64(r1)) & (sign bit) == 0`: `pc <- pc + imm * 8`
      - else: `pc <- pc + 8`
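
For the conditional jumps above, the immediate counts instructions rather than bytes: a taken branch moves `pc` by `imm * 8`, while a branch that is not taken simply advances to the next 8-byte instruction. A minimal Python sketch of `je` and `jne` follows; the function names are hypothetical, and `imm` is assumed to arrive already sign-extended as a Python integer.

```python
MASK64 = (1 << 64) - 1


def next_pc_je(pc: int, rd: int, r1: int, imm: int) -> int:
    """je: branch by imm instructions when rd == r1, else fall through."""
    if (rd ^ r1) == 0:
        return (pc + imm * 8) & MASK64
    return (pc + 8) & MASK64


def next_pc_jne(pc: int, rd: int, r1: int, imm: int) -> int:
    """jne: branch by imm instructions when rd != r1, else fall through."""
    if (rd ^ r1) != 0:
        return (pc + imm * 8) & MASK64
    return (pc + 8) & MASK64
```

A negative `imm` produces a backward branch, which is how a loop would be expressed with these instructions.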

- group three:
  - `setgur` [opcode `0x30`, R type]
    - set `rd` to `1` in case `u64(r1) > u64(r2)`, else `0`
    - executes: `rd <- u64(r1) > u64(r2) ? 1 : 0`
  - `setgui` [opcode `0x31`, S type]
    - set `rd` to `1` in case `u64(r1) > u64(imm)`, else `0`
    - executes: `rd <- u64(r1) > u64(imm) ? 1 : 0`
  - `setgsr` [opcode `0x32`, R type]
    - set `rd` to `1` in case `i64(r1) > i64(r2)`, else `0`
    - executes: `rd <- i64(r1) > i64(r2) ? 1 : 0`
  - `setgsi` [opcode `0x33`, S type]
    - set `rd` to `1` in case `i64(r1) > i64(imm)`, else `0`
    - executes: `rd <- i64(r1) > i64(imm) ? 1 : 0`
  - `setgur` [opcode `0x34`, R type]
    - set `rd` to `1` in case `u64(r1) > u64(r2)`, else `0`
    - executes: `rd <- u64(r1) > u64(r2) ? 1 : 0`
  - `setgui` [opcode `0x35`, S type]
    - set `rd` to `1` in case `u64(r1) > u64(imm)`, else `0`
    - executes: `rd <- u64(r1) > u64(imm) ? 1 : 0`
  - `setgsr` [opcode `0x36`, R type]
    - set `rd` to `1` in case `i64(r1) > i64(r2)`, else `0`
    - executes: `rd <- i64(r1) > i64(r2) ? 1 : 0`
  - `setgsi` [opcode `0x37`, S type]
    - set `rd` to `1` in case `i64(r1) > i64(imm)`, else `0`
    - executes: `rd <- i64(r1) > i64(imm) ? 1 : 0`
  - `lui` [opcode `0x38`, L type]
    - set the highest bits of `rd` to the value of `imm`
    - executes: `rd <- u64(imm) << 12`
  - `auipc` [opcode `0x39`, L type]
    - set `rd` to the sum of the address where the `auipc` instruction is located (`pc`) and an `imm` on the highest bits
    - executes: `rd <- pc + (u64(imm) << 12)`
  - `pcall` [opcode `0x3A`, L type]
    - call the processor to execute certain subroutines; execution is left to the implementation
  - `pret` [opcode `0x3B`, L type]
    - return from a `pcall` subroutine
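
The difference between the unsigned and signed comparisons, and the way `lui` and `auipc` place the immediate, can be sketched in Python as follows. The function names mirror the mnemonics but are hypothetical helpers; registers are modelled as unsigned 64-bit integers, and results are masked to 64 bits as an assumption about how wider values are handled.

```python
MASK64 = (1 << 64) - 1


def to_signed(value: int) -> int:
    """Reinterpret a 64-bit register value as a two's complement i64."""
    return value - (1 << 64) if value & (1 << 63) else value


def setgur(r1: int, r2: int) -> int:
    """rd <- u64(r1) > u64(r2) ? 1 : 0"""
    return 1 if (r1 & MASK64) > (r2 & MASK64) else 0


def setgsr(r1: int, r2: int) -> int:
    """rd <- i64(r1) > i64(r2) ? 1 : 0"""
    return 1 if to_signed(r1 & MASK64) > to_signed(r2 & MASK64) else 0


def lui(imm: int) -> int:
    """rd <- u64(imm) << 12"""
    return (imm << 12) & MASK64


def auipc(pc: int, imm: int) -> int:
    """rd <- pc + (u64(imm) << 12)"""
    return (pc + (imm << 12)) & MASK64
```

The all-ones bit pattern shows the difference: it is the largest value for `setgur` but `-1` for `setgsr`, so the two instructions give opposite answers when it is compared against `1`.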
- `pcall -1`: Processor interface
- `pcall 0`: Division by zero
- `pcall 1`: General fault
- `pcall 2`: Double fault
- `pcall 3`: Triple fault
- `pcall 4`: Invalid instruction
- `pcall 5`: Page fault
- `pcall 6`: Invalid IO

Everything after these is programmable (in theory), but numbers up to
`pcall 0x1F` are reserved for other virtual machines to implement.

As the name suggests, the division by zero pcall is triggered every time a division by zero occurs in the program. A compiler can simply put a divide-by-zero instruction in a program and use it as a breakpoint.

General faults are caused by any kind of unhandled exception that the processor is not able to detect or recognize.

A double fault occurs when any interrupt is called/triggered by a general fault.

A triple fault is one of the fatal faults inside the processor. There is no way to handle a triple fault, as it suggests a fault in the error handling system itself. Not letting software handle a triple fault prevents crash loops. When, if ever, it is triggered, the implementation can choose between a reset and a shutdown.

The invalid instruction pcall is triggered every time the instruction decoder cannot find a reasonable instruction to execute; it sets `r15` to the value of the instruction it tried to parse.

The page fault pcall is triggered when there is any "wrong" access to memory, either an access to an unmapped area or an access without sufficient permissions for a memory region; it sets `r15` to the unusable address.

Only pcall -1 is hardware/vm defined, all the other pcall -1.

The interface splits `r15` into two 32-bit areas, `space:switch`, used
as the interrupt space and the functionality switch, while other registers are
used as each function needs.
Normally, `switch = 0` is an implementation check for the space, behaving
as an `orr r14, r0, r0` in case its features are not implemented.
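
The `space:switch` split of `r15` can be sketched as two 32-bit fields. In the Python below, placing `space` in the upper half and `switch` in the lower half is an assumption read off the `space:switch` notation; the document does not state which half is which, and the helper names are hypothetical.

```python
MASK32 = (1 << 32) - 1


def make_r15(space: int, switch: int) -> int:
    """Pack space (assumed upper 32 bits) and switch (assumed lower 32 bits)."""
    return ((space & MASK32) << 32) | (switch & MASK32)


def split_r15(r15: int) -> tuple:
    """Return (space, switch) under the same assumed layout."""
    return ((r15 >> 32) & MASK32, r15 & MASK32)
```

Under this assumed layout, a request with space 3 and switch 1 would be built with `make_r15(3, 1)` before executing `pcall -1`.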

`pcall -1` functions

- `intspace = 0`: interrupt vector functions
  - `fswitch = 0`: interrupt vector check
  - `fswitch = 1`: interrupt vector enable
- `intspace = 1`: paging functions
  - `fswitch = 0`: paging check
  - `fswitch = 1`: paging enable
- `intspace = 2`: model information
  - `fswitch = 0`: model check
- `intspace = 3`: hyper functions
  - `fswitch = 0`: is hosted
  - `fswitch = 1`: return to host

- interrupt vector check (`intspace = 0`, `fswitch = 0`)
  - Input: none
  - Output:
    - `r14`: `0` if no interrupts are possible, making `pcall 0:0` shadow `orr r14, r0, r0`; `1` means interrupts are possible, but only at the address specified by `r12`; `2` means they are possible anywhere defined by the program
    - `r13`: in case `r14 == 1`, sets bit flags for which hardware interrupts are supported; in case `r14 == 2`, defines the number of interrupts the processor is able to handle
  - trashed registers: none
- interrupt vector enable (`intspace = 0`, `fswitch = 1`)
  - input registers:
    - `r14` (possibly): if `pcall 0:0` returned `2`, set the interrupt vector register to the specified pointer, ignored if not
  - output registers: none
  - trashed registers: none
- paging check (`intspace = 1`, `fswitch = 0`)
  - input registers: none
  - output registers:
    - `r14`: set to the processor's amount of page level reach: `0` is unimplemented, `1` is linear paging, or `≥ 2` for multiple levels
    - `r13`: in case `r31` is not zero, returns the processor's page size
  - trashed registers: none
- model check (`intspace = 2`, `fswitch = 0`)
  - input registers: none
  - output registers:
    - `r15`: set if the processor is able to give more information about itself
  - trashed registers: none
- is hosted (`intspace = 3`, `fswitch = 0`)
  - Input: none
  - Output:
    - `r14`: the boolean value indicating if the current processor is emulated
- return to host (`intspace = 3`, `fswitch = 1`)
  - Input:
    - `r1`: exit code
  - Side Effects:
    - if this code is being run from a virtual machine, it sends a program end signal to stop execution
    - if this code is in user mode, it sends control back to the kernel
    - this function does not yet have defined behavior for kernel mode