JanB1 wrote:
Grumblesaur wrote:
I nuked a lot of memory leaks in a programming language project I'm working on with a friend, and we're beginning to implement functions.
This is really cool.
I wouldn't even know where to start with developing a new programming language.
We're going the easy route. We're using GNU Bison to generate a parser and GNU Flex to create a lexical analyzer. This isn't always as fast as writing these programs by hand, but it's a lot easier to reason about.
The lexical analyzer reads a file and uses regular expressions to match string patterns, and spits out a token for each match indicating the element of code identified (a token is just a number, but for value literals in the source file it also stores the value). The parser takes these tokens and checks whether they match up with grammatical patterns. When the tokens match a pattern, the parser adds nodes to the syntax tree that correspond to the statement or expression it parsed from the tokens.
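To make the lexing step a bit more concrete, here's a rough hand-written sketch in Python of what a regex-based lexer does. This is not our Flex code; the token numbers, the patterns, and the `tokenize` name are all made up for illustration, and a Flex-generated lexer would normally give each operator its own token number instead of lumping them together.

```python
import re

# Hypothetical token kinds; a Flex/Bison setup would define these as integers too.
NUMBER, IDENT, OP = 1, 2, 3

# (kind, regex) pairs, tried in order at each position in the source.
TOKEN_PATTERNS = [
    (NUMBER, re.compile(r"\d+")),
    (IDENT, re.compile(r"[A-Za-z_]\w*")),
    (OP, re.compile(r"[-+*/=()]")),
]
WHITESPACE = re.compile(r"\s+")


def tokenize(source):
    """Return a list of (kind, value) tokens for the given source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        # Skip whitespace between tokens.
        ws = WHITESPACE.match(source, pos)
        if ws:
            pos = ws.end()
            continue
        for kind, pattern in TOKEN_PATTERNS:
            m = pattern.match(source, pos)
            if m:
                text = m.group()
                # Number literals carry their parsed value along with the token kind.
                value = int(text) if kind == NUMBER else text
                tokens.append((kind, value))
                pos = m.end()
                break
        else:
            # No pattern matched at this position.
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
    return tokens
```

Calling `tokenize("x = 42 + y")` gives a flat list of tokens that a parser could then match against grammar rules.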
Once the whole tree is constructed (and the parser didn't error out on a syntax issue), the interpreting stage takes place. The interpreter recursively walks the syntax tree and executes instructions based on the type of node present, its child nodes, and any value data contained therein.
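A tree-walking interpreter can be sketched in a few lines. Again, this is only an illustration in Python, not our actual code: the `Node` class, the node kinds ("num", "add", "mul"), and `evaluate` are assumed names, and the tree is built by hand here where a parser would normally construct it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                 # node type, e.g. "num", "add", "mul"
    value: object = None      # value data, used by literal nodes
    children: list = field(default_factory=list)


def evaluate(node):
    """Recursively walk the tree and compute a result based on each node's kind."""
    if node.kind == "num":
        return node.value
    if node.kind == "add":
        return evaluate(node.children[0]) + evaluate(node.children[1])
    if node.kind == "mul":
        return evaluate(node.children[0]) * evaluate(node.children[1])
    raise ValueError(f"unknown node kind: {node.kind}")


# Hand-built tree for the expression 2 + 3 * 4.
tree = Node("add", children=[
    Node("num", 2),
    Node("mul", children=[Node("num", 3), Node("num", 4)]),
])
result = evaluate(tree)
```

The multiplication node sits below the addition node, so it gets evaluated first and operator precedence falls out of the tree's shape.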
All of this is compiled into one executable, so that when you invoke the interpreter on a file, you're also invoking the lexer and parser. (You'd be hard pressed to find a language implementation that didn't work that way. Compilers are a little different in that they translate to assembly/binary, but the lex-parse-(compile/interpret) stages are usually bundled up as one package.)
The way we're going to handle functions is by making them objects. Their value will be a pointer to a node in the syntax tree where the definition for a function's code starts, and the function itself will be stored in a map of identifier : mg_obj. When the function's identifier is invoked after its definition, its code in the syntax tree will be called with the provided arguments, which will be placed in an identifier : object map within the function. This emulates a stack.
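Here's roughly how that plan could look, sketched in Python instead of our real implementation language. Everything named here (`Node`, `FunctionObject`, the "call" and "var" node kinds, `evaluate`) is hypothetical and only stands in for whatever we end up writing; `FunctionObject` plays the role of an mg_obj whose value points at the function body in the tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                 # "num", "var", "add", or "call"
    value: object = None      # literal value, variable name, or function name
    children: list = field(default_factory=list)


@dataclass
class FunctionObject:
    params: list              # parameter names, in order
    body: Node                # reference to the tree node where the body starts


def evaluate(node, scope, functions):
    """Evaluate a node using a local identifier map and a map of functions."""
    if node.kind == "num":
        return node.value
    if node.kind == "var":
        return scope[node.value]
    if node.kind == "add":
        left = evaluate(node.children[0], scope, functions)
        right = evaluate(node.children[1], scope, functions)
        return left + right
    if node.kind == "call":
        func = functions[node.value]
        # Arguments are evaluated in the caller's scope.
        args = [evaluate(child, scope, functions) for child in node.children]
        # Fresh identifier : object map for this call only.
        local_scope = dict(zip(func.params, args))
        return evaluate(func.body, local_scope, functions)
    raise ValueError(f"unknown node kind: {node.kind}")


# add_one(x) is defined as x + 1 and stored under its identifier.
functions = {
    "add_one": FunctionObject(
        params=["x"],
        body=Node("add", children=[Node("var", "x"), Node("num", 1)]),
    ),
}

# Equivalent to the source expression add_one(add_one(40)).
call = Node("call", "add_one", children=[
    Node("call", "add_one", children=[Node("num", 40)]),
])
result = evaluate(call, {}, functions)
```

Each call builds its own map and drops it when the call returns, so the nested maps behave like stack frames without us having to manage an explicit stack.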
At least, that's what we're thinking. The implementation details may differ significantly once we move on to it.