I think that your classes do not reflect the structure of expressions right. Let me use EBNF to write down a possible syntax of such expressions (without claim of completeness):
Expression = Term { ("AND" | "OR") Term }.
Term = "(" Expression ")" | Comparison.
Comparison = name ("=" | "!=" | "<" ...) Value.
Value = stringConstant | number.
One important thing we see, is that a Term can be either an Expression or a Comparison. This means that we must be able to plug in the one or the other to both sides of an AND or OR. Example: Name = "Maulik" AND (Id = 2 OR Id = 3). To the left of the AND we have a comparison, to the right an expression.
Therefore, we must make these two things assignment compatible, e.g., by deriving them from the same base class.
public abstract class Term
{
}
public Expression : Term
{
public Term FirstTerm { get; set; }
public List<(BooleanOperator operator, Term term)> SucceedingTerms { get; } = new();
}
public Comparison : Term
{
public string Name { get; set; }
public ComparisonOperator Operator { get; set; }
public object Value { get; set; }
}
Another important fact is that the syntax is recursive. I.e., an expression contains terms and terms can contain another expression. This is necessary to be able to represent parenthesized expressions with multiple levels of nesting.
This recursive nature of the syntax has two consequences:
- The class structure must allow recursive structures. This is given by the fact that the
Expression class has a list containing Terms and those can be Expressions again.
- The parser shown later must be recursive. It contains indirect recursive calls (
ParseExpression calls ParseTerm and ParseTerm calls ParseExpression).
Note that I am using ValueTulpes as list element consisting of an operator and a term.
Now, to the conversion itself. You need a compiler. Either
- Build your own custom compiler by hand.
Niklaus Wirth's book Compiler Construction (2 PDFs) is an introduction to the theory and the techniques of compiler construction. It gives you a general idea of what a compiler is and what it does.
- Use a tool or library to generate a compiler: for example ANTLR.
Since the syntax is simple, you might dare to write the compiler yourself.
Divide the task into lexical analysis (done by a lexer) and syntax analysis (done by a parser). Besides analyzing the syntax, the parser also creates the required data structure as output. The lexer just returns a stream of symbols. E.g.
enum Symbol { EndOfInput, LeftPar, RightPar, AndOperator, OrOperator, Identifier, Number,
StringLiteral, ...}
private string _input;
private int _currentPos;
private string _identifier;
private string _stringValue;
private decimal _number;
private Symbol _symbol;
/// Does the lexical analysis.
private void GetSymbol()
{
// Get next `_symbol` (and associated variables) from `_input` at `_currentPos` and
// update `_currentPos`.
}
Using this basic infrastructure, you can create the parser. The parser consists of methods closely reflecting the structure of the syntax (as given in EBNF above).
// Expression = Term { ("AND" | "OR") Term }.
void Expression ParseExpression()
{
var expression = new Expression();
expression.FirstTerm = ParseTerm();
while (_symbol == Symbol.AndOperator || _symbol == Symbol.OrOperator) {
var op = _symbol == Symbol.AndOperator
? BooleanOperator.And
: BooleanOperator.Or;
GetSymbol();
term = ParseTerm();
expression.SucceedingTerms.Add((op, term));
}
return expression;
}
// Term = "(" Expression ")" | Comparison.
void Term ParseTerm()
{
Term term = null;
if (Symbol == Symbol.LeftPar) {
GetSymbol();
term = ParseExpression();
if (Symbol == Symbol.RightPar) {
GetSymbol();
} else {
Error("\")\" expected.");
}
} else {
term = ParseComparison();
}
return term;
}
This is not complete, but you get the idea. Since the structure of this parser depends on the syntax production (EBNF), do not start with it before you have nailed down the exact syntax.
Pay attention to always have decidable paths in the syntax given by the next symbol. First, I had it wrong. My term was given by Term = "(" Expression | Comparison ")" | Comparison.. The problem here is that an expression starts with a term and a term can be a comparison. The comparison starts with a name. Therefore, both Expression and Comparison can start with a name. When parsing Expression | Comparison and the next symbol is a name, we are unable to decide whether we must parse an expression or a comparison.
The updated syntax is Term = "(" Expression ")" | Comparison.. Now, we know that if the next symbol is a left parenthesis, we must parse an expression and otherwise a comparison.