Answer:
Before we discuss the evaluation of a query expression, let us briefly explain how a SQL query may be represented. Consider the following student and marks relations:
STUDENT (enrolno, name, phone)
MARKS (enrolno, subjectcode, grade)
To find the results of the student(s) whose phone number is ‘1129250025’, the following query may be given.
SELECT enrolno, name, subjectcode, grade
FROM STUDENT s, MARKS m
WEHRE s.enrolno=m.enrolno AND phone= ’1129250025’
The equivalent relational algebraic query for this will be:
This is a very good internal representation however, it may be a good idea to represent the relational algebraic expression to a query tree on which algorithms for query optimisation can be designed easily. In a query tree, nodes are the operators and relations represent the leaf. The query tree for the relational expression above would be:
In the previous section, we have seen the algorithms for individual operations. Now let us look at the methods for evaluating an entire expression. In general we use two methods:
- Materialisation
- Pipelining.
Materialisation
Evaluate a relational algebraic expression from the bottom-up, by explicitly generating and storing the results of each operation in the expression. For example, in Figure 1 compute and store the result of the selection operation on STUDENT relation, then take the join of this result with MARKS relation and then finally compile the projection operation. Materialised evaluation is always possible though; the cost of writing/reading results to/from disk can be quite high.
Pipelining
Evaluate operations in a multi-threaded manner, (i.e., passes tuples output from one operation to the next parent operation as input) even as the first operation is being executed. In the previous expression tree, it does not store (materialise) results instead, it passes tuples directly to the join. Similarly, does not store results of join, and passes tuples directly to the projection. Thus, there is no need to store a temporary relation on a disk for each operation. Pipelining may not always be possible or easy for sort, hash-join.
One of the pipelining execution methods may involve a buffer filled by the result tuples of lower level operation while, records may be picked up from the buffer by the higher level operation.
Complex Joins
When an expression involves three relations then we have more than one strategy for the evaluation of the expression. For example, join of relations such as:
For each tuple m in MARKS, look up the corresponding tuples in STUDENT and the corresponding tuples in SUBJECTS. Each tuple of MARKS will be examined only once.
Strategy 3 combines two operations into one special-purpose operation that may be more efficient than implementing the joins of two relations.