I like to analyze the calculator in Expressions chapter of C++ Programming Language. The calculator is a tiny compiler actually so I think it is worth digging the idea behind its implementation. Here is the grammar for the input language like "10*2+3":
program: end //end is end-of-input expr_list end expr_list: expression print // print is newline or semicolon expression print expr_list expression: expression + term expression − term term term: term / primary term ∗ primary primary primary: number //number is a floating-point literal name //name is an identifier name = expression −primary ( expression )
What is says is that we call whole input string a program. The program is consisting of end and expression_list. Now what is that: expression print [expression_list] in which the print is a character to mark the end of an expression like new line or semicolon. It says till now that a program may look like this:
"expression;expression;expression;"
It continues with defining what an expression is: a term or a term + or - an expression. A term is a primary or a primary + or - a term. What a primary is? It can be one of four things: a number, a name, a name=expression(and we know what an expression is already), -primary(negative of a primary) and an expression between parenthesis. According to this a number is a primary is a term and also an expression. If we add a semicolon it will be expr_list if our string is consisting of only this expr_list it is also a program.
So we give a string with this syntax to our program and the program calculates the results. Here is the implementation in C++:
#include #include // strings #include
So lets look how it works. It simply has a function for each expression and breaks them down as grammar dictates. Real work starts in calculate(). It ts.get() to get the first token and calls expr(false) on it. false says don't read another token and handle the current one (as one is already read). expr() calls term on current token and term calls prim on it. This way expr() breaks down all strings to primary tokens. expr() goes ahead for all + and -s term() goes ahead for all * and /s and prim() goes ahead for numbers and names. expr() calls term(), term calls prim(). prim() handles neg, parenthesis numbers, and names. It calls expr() for whatever between the parenthesis. Term handles multiplications and divisions and returns the result to expr(). Look what happens for a string like this: 10+8/4-14.
- Reads 10 and calls expr(). Our program knows the type of read token. In this case it is a number. expr() puts 10 in left an reads next token to see the next token is + or - by calling term() which consequently calls prim(). It calls term for the remaining part and adds or subtracts whatever is returned by term(). term returns numbers immediately after reading them.
- term() is called when expr() says case Kind::plus: left += term(true);
- term() reads the next token and sees it as a division so it calculates 8/4 and gives it to expr();
- expr() adds it to the previous value. So now we have 12 - 14.
- When expr() calls term() again in its loop it will get 14 and subtracts it from 12
- This time the next token is not + nor - so expr() finishes by returning the result.
Now to see it graphically we have:
exp-------term-------prim
|>---------->---------->|
|<----10---<---10----<|
10+|>------>--------->|
|<---8---<|
8/|>------->|
8/|<---4---<|
10+|<--2--<|
12-|>------->--------->|
12-|<--14-<----14---<|
-2