Titanite bytecode compiler

Following along the book Crafting interpreters.

Requirements

GCC

Install gcc on ubuntu 20.04

sudo apt install build-essential
gcc --version # check if the installation succeeded

Usage

Go to the Autoconfig explanation for more info.

If you are not interested run the following:

sudo apt install autoconf
autoreconf -i
./configure
make
./titanite_compiler ./example.ti

Done

Chunk array (implement dynamicly growing/shrinking array for our compiler)

Contributing

Look at the list of TODOs below to find something you are interested in or just go through the code to find something.

Guideline

Invariants

When adding a new statement make sure that

after a statement is finished executing the stack should be as tall as before.

Specify why adding a dependency.

We want to pollute are files as little as possible. This means, if you add a header file for the first time in the c file from the standard c library, write down why you added it.

For example:

#include <string.h> // for memcpy

TODOs

There are a lot of TODO types. Look through them and search for those in the code, if you find a specific TODO type interesting.

TODO(breaking-bug): this needs imidiate attention. Fix this first if possible.
TODO(perf): search for this if you want to improve the performance of the language
TODO(memory): search for this if you want to make the language consume less memory
TODO(typing): search for this if you want to improve the typing/error messaging of the language
TODO(needed): this indicates an important TODO, they are usually things that go wrong when you would write a lof of code (e.g. the stack size is currently limited and does not grow dynamically)
TODO(tweak): Nice to have, but not very relevant
TODO(error-handling): Improvements for better error messaging for the user. Somewhat important.
TODO(threading): Improvements that provide better thread safety (or thread safety at all). Not really important until we get threads going, if we ever do.

Important things that are missing

List of features:

const keyword specification
switch statement
else if/elif branches
continue and break statements

Before doing anything of this, it's a good idea to talk with someone about this.

Implement const keyword

Well this is very self-explanatory. you can reassign a different value to a variable which for declared with let. For example this is valid.

let a = 2;
a = 3;

But we also want a const keyword, which has to be initialized the moment it is declared. For example:

const a; // this is illegal and should throw and error

const b = 3;
b = 4 // this is illegal and should throw and error

const c = 6; // we can only use this variable later
let d = c+3; // this is fine.

Allow more that 256 variables at the same time.

Right now the number of local variable of the compiler cannot be more than 256, which is restricted by this variable in common.h.

#define UINT8_COUNT (UINT8_MAX + 1)

and this typedefinition in compiler.c which uses the previous constant as the array length of the list where the locals get saved.

typedef struct
{
    Local locals[UINT8_COUNT]; // this should start grow dynamically.
    int localCount;
    int scopeDepth;
} Compiler;

The restriction is enforced in the addLocal function. This could be a starting point to implement this feature.

/**
 * This initializes the next available Local in the compiler’s array of variables.
 * It stores the variable’s name and the depth of the scope that owns the variable.
 */
static void addLocal(Token name)
{
    // If a 257th variable would be added right now, this if branch prevents it by throwing a compiler error.
    if (current->localCount == UINT8_COUNT)
    {
        error("Too many local variables in function.");
        return;
    }
    Local *local = &current->locals[current->localCount++];
    local->name = name;
    local->depth = current->scopeDepth;
}

Line encoding

Our encoding of line information is hilariously wasteful of memory. Given that a series of instructions often correspond to the same source line, a natural solution is something akin to run-length encoding of the line numbers.

Devise an encoding that compresses the line information for a series of instructions on the same line. Change writeChunk() to write this compressed form, and implement a getLine() function that, given the index of an instruction, determines the line where the instruction occurs.

Hint: It’s not necessary for getLine() to be particularly efficient. Since it is called only when a runtime error occurs, it is well off the critical path where performance matters.

Too little space for constants

Because OP_CONSTANT uses only a single byte for its operand, a chunk may only contain up to 256 different constants. That’s small enough that people writing real-world code will hit that limit. We could use two or more bytes to store the operand, but that makes every constant instruction take up more space. Most chunks won’t need that many unique constants, so that wastes space and sacrifices some locality in the common case to support the rare case.

To balance those two competing aims, many instruction sets feature multiple instructions that perform the same operation but with operands of different sizes. Leave our existing one-byte OP_CONSTANT instruction alone, and define a second OP_CONSTANT_LONG instruction. It stores the operand as a 24-bit number, which should be plenty.

Implement this function:

void writeConstant(Chunk* chunk, Value value, int line) { // Implement me... } It adds value to chunk’s constant array and then writes an appropriate instruction to load the constant. Also add support to the disassembler for OP_CONSTANT_LONG instructions.

Defining two instructions seems to be the best of both worlds. What sacrifices, if any, does it force on us?

Reimplement or_ function

Right now the function which handles the logical or is written purely with functions which were are also used by if and else. So the implementation is not ideal. OR is slower than AND. And this should not be the case. Search for or_ in compiler and make it better.

Own realloc function.

I've already implemented my own malloc, realloc, calloc and free. But mine are not thread safe. Altough they are very memory safe and they are 8-bit-aligned.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
src		src
.gitignore		.gitignore
AUTOCONFIG.md		AUTOCONFIG.md
DOCS.md		DOCS.md
Makefile.am		Makefile.am
README.md		README.md
configure.ac		configure.ac
example.ti		example.ti
main.c		main.c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Titanite bytecode compiler

Requirements

Install gcc on ubuntu 20.04

Usage

Done

Contributing

Guideline

Invariants

Specify why adding a dependency.

TODOs

Important things that are missing

Implement const keyword

Allow more that 256 variables at the same time.

Line encoding

Too little space for constants

Reimplement or_ function

Own realloc function.

About

Releases

Packages

Languages

nachthammer/titanite-bytecode-c

Folders and files

Latest commit

History

Repository files navigation

Titanite bytecode compiler

Requirements

Install gcc on ubuntu 20.04

Usage

Done

Contributing

Guideline

Invariants

Specify why adding a dependency.

TODOs

Important things that are missing

Implement const keyword

Allow more that 256 variables at the same time.

Line encoding

Too little space for constants

Reimplement or_ function

Own realloc function.

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages