1. Introduction
In this tutorial, we’ll discuss the algorithm and code for converting the infix notation of a mathematical expression to a postfix notation.
2. Expressions in Java
A programming language like Java allows us to define and work with different mathematical expressions. An expression can be written through a combination of variables, constants, and operators.
Some common types of expressions in Java include arithmetic and logical expressions.
2.1. Arithmetic Expressions
Arithmetic expressions include operators such as addition(+), subtraction(-), multiplication(*), division(/) and modulus(%). These operators used in conjunction with variables or constants result in an arithmetic evaluation:
int x = 100;
int y = 50;
int sum = x + y;
int prod = x * y;
int remainder = x % y;
2.2. Logical Expressions
Logical expressions employ logical operators in place of the arithmetic operations used earlier. The most common logical operators include the logical AND, OR, NOT, and XOR:
boolean andResult = (true && false); // Logical AND
boolean orResult = (true || false); // Logical OR
boolean notResult = !false; // Logical NOT
Relational expressions are used mostly in comparison-based logic and produce boolean values true or false:
int x = 10;
int y = 8;
boolean bigger = x > y; // true
3. Notations
There are different possible ways of writing a mathematical expression. These are called notations, and they change based on the placement of the operators and the operands.
3.1. Infix Notation
In an infix notation expression, the operation sits in between the operands, making it the most common expression notation:
int sum = (a + b) + (c * d);
It should be noted here that operator precedence is a cause for ambiguity in infix expressions and plays a major role. Parenthesis is common in infix notations to enforce precedence.
3.2. Prefix Notation
Prefix, also known as Polish Notation, are expressions where the operators precede the operands:
int result = * + a b - c d;
3.3. Postfix Notation
Postfix, or Reverse Polish Notation, implies that the operators should come after the operands:
int result = a b + c d - *;
We should note here that both prefix and postfix notations of the same expression remove the obvious ambiguity with operator precedence and eliminate the need for parenthesis. They are efficient in expression evaluation for the same reason.
4. Problem Statement
Now that we have reviewed the basics of the different notations of mathematical expressions, let’s move on to the problem statement.
Given an Infix expression as an input, we should write an algorithm that converts it and returns the Postfix or the Reverse Polish Notation of the same expression.
Let’s understand through an example:
Input: (a + b) * (c - d)
Output: ab+cd-*
Input: a+b*(c^d-e)^(f+g*h)-i
Output: abcd^e-fgh*+^*+i-
The examples above show that the input is an infix expression where the operator is always between a pair of operands. The output is the corresponding postfix expression of the same. We can assume that the input is always a valid infix expression; therefore, there is no further need to validate the same.
5. Solution
Let’s build our solution by breaking down the problem into smaller steps.
5.1. Operators and Operands
The input will be a string representation of the infix expression. Before we implement the conversion logic, it is crucial to determine the operators from the operands.
Based on the input examples, operands can be lower or upper-case English alphabets:
private boolean isOperand(char ch) {
return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}
The input contains 2 parenthesis characters and 5 operators in addition to the above operands.
5.2. Precedence and Associativity
We should also define the precedence of each operator we might encounter in our input and assign them an integer value. The ^(exponentiation) operator has the highest precedence, followed by *(multiply) and /(division), which have similar precedence. Finally, the +(addition) and -(subtraction) operators have the least precedence.
Let’s write a method to mimic the above logic:
int getPrecedenceScore(char ch) {
switch (ch) {
case '^':
return 3;
case '*':
case '/':
return 2;
case '+':
case '-':
return 1;
}
return -1;
}
When unparenthesized operators of the same precedence are scanned, associativity, or the scanning order, is generally left to right. The only exception to this rule is with the exponentiation operator, where the order is assumed to be right to left:
char associativity(char ch) {
if (ch == '^') {
return 'R';
}
return 'L';
}
6. Conversion Algorithm
Each operator in an infix expression refers to the operands surrounding it. In contrast, for a postfix one, each operator refers to the two operands that come before it in the input String.
For an expression with multiple infix operations, the expressions within the innermost parenthesis must first be converted to postfix. This gives us the advantage of treating them as single operands for the outer operations. We continue this and successively eliminate parenthesis until the entire expression is converted. Within a group of parenthesis, the last parenthesis pair that is eliminated is the first operation in the group.
This Last In First Out behavior is suggestive of the use of the Stack data structure.
6.1. The Stack and Precedence Condition
We’ll use a Stack to keep track of our operators. However, we need to define a rule to determine which operator has to be added to the postfix expression and which operator needs to be kept in the stack for the future.
If the current symbol is an operator, we have two options. We can either push it onto the stack or put it directly in the postfix expression. If our stack is empty, which is the case when encountering the first operator, we can simply push the current operator onto the stack.
On the other hand, if the stack is not empty, we need to check the precedence to determine, which way the operator will go. If the current character has a higher precedence than that of the top of the stack, we need to push it onto the top. For example, encountering + after * will result in pushing the + onto the stack above *. We’ll do the same if the precedence scores are equal as well, and the associativity is the default left to right.
We can condense the above logic as:
boolean operatorPrecedenceCondition(String infix, int i, Stack<Character> stack) {
return getPrecedenceScore(infix.charAt(i)) < getPrecedenceScore(stack.peek())
|| getPrecedenceScore(infix.charAt(i)) == getPrecedenceScore(stack.peek())
&& associativity(infix.charAt(i)) == 'L';
}
6.2. Scanning the Infix Expression and Converting
Now that we have the precedence condition set up, let’s discuss how to perform step-by-step scanning of the infix operation and convert it correctly.
If the current character is an operand, we add it to our postfix result. If the current character is an operator, we use the comparison logic discussed above and determine if we should add it to the stack or pop. Finally, when we finish scanning the input, we pop everything in the stack to the postfix expression:
String infixToPostfix(String infix) {
StringBuilder result = new StringBuilder();
Stack<Character> stack = new Stack<>();
for (int i = 0; i < infix.length(); i++) {
char ch = infix.charAt(i);
if (isOperand(ch)) {
result.append(ch);
} else {
while (!stack.isEmpty() && (operatorPrecedenceCondition(infix, i, stack))) {
result.append(stack.pop());
}
stack.push(ch);
}
}
while (!stack.isEmpty()) {
result.append(stack.pop());
}
return result.toString();
}
6.3. Example Dry Run
Let’s understand the algorithm using our first example: a + b * c – d. The first character we encounter, a, can be immediately inserted into the final result as it is an operand. The + operator, however, cannot be inserted without going over the second operand associated with it, which in this case is b. As we need to store the + operator for future reference, we push it onto our stack:
Symbol Result Stack
a a []
+ a [+]
b ab [+]
As we encounter b operand, we push it onto the postfix result, which is now a b. We cannot pop the operator + from the stack just yet because we have the * operator in our input. As we mentioned in the previous section, the * operator commands a higher precedence over +, which is at the top of the stack. Therefore, this new symbol is pushed on top of the stack:
Symbol Result Stack
a a []
+ a [+]
b ab [+]
* ab [+,*]
We continue scanning the input infix expression as we encounter the next operand c, which we add to the result. When we encounter the final symbol -, which has a lower precedence over the operator *, we pop the elements from the stack and append it to the postfix expression until the stack is empty or the top of the stack has a lower precedence. The expression is now abc*+. The current operator – is pushed to the stack:
Symbol Result Stack
a a []
+ a [+]
b ab [+]
* ab [+,*]
c abc [+,*]
- abc*+ [-]
d abc*+d- []
Finally, we append the last operand, d, to the postfix operation and pop the stack. The value of the postfix operation is abc*+d-.
6.4. Conversion Algorithm With Parenthesis
While the above algorithm is correct, infix expressions use parenthesis to solve the ambiguity that arises with operator precedence. Therefore, it is crucial to handle the occurrence of parenthesis in the input string and modify the algorithm accordingly.
When we scan an opening parenthesis (, we push it onto the stack. And, when we encounter a closing parenthesis, all the operators need to be popped off from the stack into the postfix expression.
Let’s rewrite the code by adjusting for parenthesis:
String infixToPostfix(String infix) {
StringBuilder result = new StringBuilder();
Stack<Character> stack = new Stack<>();
for (int i = 0; i < infix.length(); i++) {
char ch = infix.charAt(i);
if (isOperand(ch)) {
result.append(ch);
} else if (ch == '(') {
stack.push(ch);
} else if (ch == ')') {
while (!stack.isEmpty() && stack.peek() != '(') {
result.append(stack.pop());
}
stack.pop();
} else {
while (!stack.isEmpty() && (operatorPrecedenceCondition(infix, i, stack))) {
result.append(stack.pop());
}
stack.push(ch);
}
}
while (!stack.isEmpty()) {
result.append(stack.pop());
}
return result.toString();
}
Let’s take the same example we explored above but with parenthesis and do a dry run:
Input: (a+b)*(c-d)
Symbol Result Stack
( [(]
a a [(]
+ a [(, +]
b ab [(, +]
) ab+ []
* ab+ [*]
( ab+ [*, (]
c ab+c [*, (]
- ab+c [*, (, -]
d ab+cd [*, (, -]
) ab+cd-* []
We should notice how the placement of the parenthesis changes the evaluation and also the corresponding postfix expression.
7. Additional Thoughts
While infix expressions are more common in texts, the postfix notation of an expression has many benefits. The need for parenthesis in postfix expressions is not there as the order of the operations is clear due to the sequential arrangement of operands and operators. Furthermore, the ambiguity that might arise due to operator precedence and associativity in an infix operation is eradicated in its postfix form.
This also makes them the de-facto choice in computer programs, especially in programming language implementations.
8. Conclusion
In this article, we discussed infix, prefix, and postfix notations of mathematical expressions. We focussed on the algorithm to convert an infix to a postfix operation and saw a few examples of it.
As usual, the source code from this article can be found over on GitHub.