Relational

Educational tool for relational algebra

Optimizations

Relational provides a way to perform various optimizations on queries. It assumes that the query is correct.

An optimized query must return the same result as the original query. If you find a query for which this does not hold, please submit a bug report.

General optimizations

This class of optimizations does not need any information about the relations.

They work on the expression tree and modify it. The return value is the number of changes performed on the tree.

Duplicated select

Original	Optimized
σ _k ( σ _k(C))	σ _k (C)
σ _k ( σ _j(C))	σ _{k ⋀ j} (C)

The first rule applies only if the two expressions are exactly the same. If they are equivalent but written differently (for example, 3+2 and 2+3), the second rule is used instead.
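The two rules above can be sketched on a toy expression tree. This is only an illustration: the `Node` class and the `duplicated_select` function below are hypothetical and do not reproduce Relational's own classes. Conditions are compared as plain text, which is why `3+2` and `2+3` fall under the second rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical expression-tree node, not Relational's actual class."""
    name: str                       # operator symbol or relation name
    prop: str = ""                  # selection condition, if any
    child: Optional["Node"] = None  # operand of a unary operator


def duplicated_select(node: Node) -> int:
    """Merge nested selections and return the number of changes made."""
    changes = 0
    while node.name == "σ" and node.child is not None and node.child.name == "σ":
        inner = node.child
        if inner.prop != node.prop:
            # Different text: combine the two conditions with a logical and.
            node.prop = f"({node.prop}) and ({inner.prop})"
        # Identical text: dropping the inner selection is enough.
        node.child = inner.child
        changes += 1
    if node.child is not None:
        changes += duplicated_select(node.child)
    return changes


same = Node("σ", "a > 1", Node("σ", "a > 1", Node("C")))
different = Node("σ", "3+2 == a", Node("σ", "2+3 == a", Node("C")))
print(duplicated_select(same), same.prop)
print(duplicated_select(different), different.prop)
```

Returning the number of changes mirrors the convention described above, so a caller can keep applying optimizations until a pass reports zero changes.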

Down to unions subtractions intersections

Original	Optimized
σ _k (a ᑌ b)	σ _k(a) ᑌ σ _k(b)
σ _k (a ᑎ b)	σ _k(a) ᑎ σ _k(b)
σ _k (a - b)	σ _k(a) - σ _k(b)

This is an optimization because selection is an O(n) operation, while union, subtraction and intersection are O(n²) operations. By pushing the selection down, we hope to reduce the size of the input that the O(n²) operation has to process.
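As a quick sanity check of the three rules, the sketch below models relations as Python sets of tuples. The `select` helper, the condition `k` and the sample data are assumptions made for illustration; they are not part of Relational.

```python
def select(pred, rel):
    """σ: keep only the tuples that satisfy pred."""
    return {t for t in rel if pred(t)}


a = {(1, "x"), (2, "y"), (3, "z")}
b = {(2, "y"), (3, "z"), (4, "w")}
k = lambda t: t[0] >= 3  # hypothetical condition on the first attribute

# Selecting before or after the set operation gives the same relation.
assert select(k, a | b) == select(k, a) | select(k, b)
assert select(k, a & b) == select(k, a) & select(k, b)
assert select(k, a - b) == select(k, a) - select(k, b)
```

In the pushed-down form, each set operation receives operands that have already been filtered, which is where the hoped-for saving comes from.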

Duplicated projection

Original	Optimized
π _i ( π _j (R))	π _i (R)

Two nested projections are redundant: the inner one has no effect on the final result.
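A small check of this rule, with rows modelled as frozensets of (attribute, value) pairs. The helpers `row` and `project` and the sample relation are illustrative assumptions, not Relational's API.

```python
def row(**values):
    """Build a hashable row from attribute=value pairs."""
    return frozenset(values.items())


def project(attrs, rel):
    """π: keep only the listed attributes; the set drops duplicate rows."""
    return {frozenset((a, v) for a, v in r if a in attrs) for r in rel}


R = {
    row(id=1, name="Ann", city="Rome"),
    row(id=2, name="Bob", city="Rome"),
    row(id=3, name="Ann", city="Oslo"),
}

nested = project({"name"}, project({"name", "city"}, R))
single = project({"name"}, R)
assert nested == single == {row(name="Ann"), row(name="Bob")}
```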

Selection inside projection

Original	Optimized
σ _j (π _k (R))	π _k (σ _j (R))

Performing the selection first gives us hope that the projection, which is the more complex operation, will be performed on a smaller set.
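The equivalence can be checked on a small example. The helpers below (`row`, `project`, `select`) and the data are assumptions made for this sketch. Since the original query is assumed to be correct, the condition j can only mention attributes that the projection keeps.

```python
def row(**values):
    """Build a hashable row from attribute=value pairs."""
    return frozenset(values.items())


def project(attrs, rel):
    """π: keep only the listed attributes; the set drops duplicate rows."""
    return {frozenset((a, v) for a, v in r if a in attrs) for r in rel}


def select(pred, rel):
    """σ: pred receives each row as a dict."""
    return {r for r in rel if pred(dict(r))}


R = {
    row(id=1, name="Ann", city="Rome"),
    row(id=2, name="Bob", city="Rome"),
    row(id=3, name="Ann", city="Oslo"),
}
k = {"name", "city"}
j = lambda r: r["city"] == "Rome"

original = select(j, project(k, R))   # σ_j(π_k(R))
optimized = project(k, select(j, R))  # π_k(σ_j(R))
assert original == optimized
```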

Swap rename select

Original	Optimized
σ _k(ρ _j(R))	ρ _j(σ _k(R))

The attributes used in the selection are renamed accordingly, so the operation is still valid.
This is not a heavy optimization, since select is O(n) and rename is O(1), but it might make other optimizations possible in the next step.
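To illustrate why the condition has to be adjusted, the sketch below swaps a rename a➡b with a selection. Rows are plain dicts, and `rename`, `select` and the sample data are assumptions for this example: once the selection runs before the rename, it must refer to the old attribute name `a` instead of `b`.

```python
def rename(mapping, rel):
    """ρ: rename attributes according to a mapping of old name to new name."""
    return [{mapping.get(attr, attr): v for attr, v in r.items()} for r in rel]


def select(pred, rel):
    """σ: keep only the rows that satisfy pred."""
    return [r for r in rel if pred(r)]


R = [{"a": 1, "c": "x"}, {"a": 5, "c": "y"}]

# σ_{b > 2}(ρ_{a➡b}(R)): the condition uses the new name.
original = select(lambda r: r["b"] > 2, rename({"a": "b"}, R))
# ρ_{a➡b}(σ_{a > 2}(R)): the condition is rewritten to use the old name.
optimized = rename({"a": "b"}, select(lambda r: r["a"] > 2, R))

assert original == optimized
```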

Futile renames

Original	Optimized
ρ _k➡k(R)	R

Renames of an attribute to itself are completely removed. If some renames in the list are valid and some are not, the valid ones are kept.

This optimization is performed before the subsequent renames optimization.

Subsequent renames

Original	Optimized
ρ _k(ρ _j(R))	ρ _j,k(R)

Nested renames are merged into a single rename operation.
If j,k contains chains like a➡b,b➡c, they are replaced with a➡c.
If j,k contains pairs like a➡b,b➡a, they are removed. If all the transformations are removed, the rename itself is removed.
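One possible way to merge two nested renames is sketched below. It is an assumption-laden illustration, not Relational's implementation: each rename is represented as a dict from old name to new name, and `merge_renames` is a hypothetical helper.

```python
def merge_renames(inner: dict, outer: dict) -> dict:
    """Compose ρ_outer(ρ_inner(R)) into one mapping of old name to new name."""
    merged = {}
    for old, mid in inner.items():
        # Follow a➡b through the outer rename, if b is renamed again.
        merged[old] = outer.get(mid, mid)
    for old, new in outer.items():
        # Outer renames of attributes that the inner rename did not produce.
        if old not in inner.values():
            merged[old] = new
    # Drop futile entries such as a➡a.
    return {old: new for old, new in merged.items() if old != new}


print(merge_renames({"a": "b"}, {"b": "c"}))  # chain a➡b, b➡c
print(merge_renames({"a": "b"}, {"b": "a"}))  # a➡b, b➡a cancel out
print(merge_renames({"a": "b"}, {"c": "d"}))  # independent renames
```

An empty result corresponds to the case where all the transformations are removed, so the rename node itself can be dropped.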

Futile union intersection subtraction

A ⋈ A = A ⧑ A = A ⧒ A = A ⧓ A = A ᑌ A = A ᑎ A = A.
So this optimization locates unions, intersections and joins that share the same left and right operand, and replaces them with the operand itself.
Also, A - A = ∅.
So it locates subtractions that share the same left and right operand and replaces them with σ _False(A). This is not as fast as replacing them with an empty relation, but relational algebra has no such operator. Selection is O(n) and subtraction is O(n²), so time is saved anyway.

This function locates things like:

Original	Optimized
R ᑌ R	R
R ᑎ R	R
R - R	σ _False (R)
σ _k (R) ᑌ R	R
σ _k (R) ᑎ R	σ _k (R)
σ _k (R) - R	σ _False (R)
R - σ _k (R)	σ _¬k (R)

R doesn't have to be a relation. It can be a subtree too.
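The identities in the table can be verified on a small relation modelled as a Python set. The `select` helper, the condition `k` and the data are assumptions made for this sketch.

```python
def select(pred, rel):
    """σ: keep only the tuples that satisfy pred."""
    return {t for t in rel if pred(t)}


R = {(1,), (2,), (3,), (4,)}
k = lambda t: t[0] % 2 == 0   # hypothetical condition: even values
not_k = lambda t: not k(t)
false = lambda t: False       # σ_False yields the empty relation

assert R | R == R
assert R & R == R
assert R - R == select(false, R)
assert select(k, R) | R == R
assert select(k, R) & R == select(k, R)
assert select(k, R) - R == select(false, R)
assert R - select(k, R) == select(not_k, R)
```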

Swap union renames

Original	Optimized
ρ _a➡b(R) ᑌ ρ _a➡b(Q)	ρ _a➡b(R ᑌ Q)
ρ _a➡b(R) ᑎ ρ _a➡b(Q)	ρ _a➡b(R ᑎ Q)
ρ _a➡b(R) - ρ _a➡b(Q)	ρ _a➡b(R - Q)

This will save the space taken by an extra relation needed to perform the 2nd rename.

Swap rename projection

Original	Optimized
π _k(ρ _j(R))	ρ _j(π _k(R))

This lets the rename work on a hopefully smaller set and, more importantly, will hopefully allow further optimizations.

Union and product

Original	Optimized
A * B ∪ A * C	A * (B ∪ C)
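This rewrite relies on the product distributing over the union, so one product is computed instead of two. A minimal check, with relations modelled as sets of tuples and a hypothetical `product` helper:

```python
def product(left, right):
    """Cartesian product of two relations stored as sets of tuples."""
    return {l + r for l in left for r in right}


A = {(1,), (2,)}
B = {("x",), ("y",)}
C = {("y",), ("z",)}

two_products = product(A, B) | product(A, C)
one_product = product(A, B | C)
assert two_products == one_product
```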

Select union intersect subtract

Original	Optimized
σ_i(a) ᑌ σ_q(a)	σ_{i ∨ q}(a)

This will allow the removal of an O(n²) operation like the union.

Both selects must work on the same expression. They are merged into one, according to the following table:

original query	resulting query
σ_i(a) ᑌ σ_q(a)	σ_{i ∨ q}(a)
σ_i(a) ᑎ σ_q(a)	σ_{i ∧ q}(a)
σ_i(a) - σ_q(a)	σ_{i ∧ ¬q}(a)
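The three rows of the table can be checked with a short sketch. Relations are modelled as sets of tuples; `select`, the conditions `i` and `q`, and the data are assumptions made for illustration.

```python
def select(pred, rel):
    """σ: keep only the tuples that satisfy pred."""
    return {t for t in rel if pred(t)}


a = {(1, "x"), (2, "y"), (3, "x"), (4, "y")}
i = lambda t: t[0] > 2
q = lambda t: t[1] == "x"

# Each set operation on two selections becomes one selection on a.
assert select(i, a) | select(q, a) == select(lambda t: i(t) or q(t), a)
assert select(i, a) & select(q, a) == select(lambda t: i(t) and q(t), a)
assert select(i, a) - select(q, a) == select(lambda t: i(t) and not q(t), a)
```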

Specific optimizations

This class of optimizations requires knowledge of the specific relations used, meaning that it needs access to real instances of the relations to work.

Projection and union

Original	Optimized
π a,b,c(A) ∪ π a,b,c(B)	π a,b,c(A ∪ B)

This applies only if A and B are union compatible.

Selection and product

Original	Optimized
σ _k (R * Q)	σ _l (σ _j (R) * σ _i (Q))

Where j contains only attributes belonging to R, i contains only attributes belonging to Q, and l contains attributes belonging to both.
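The split into j, i and l shows why this optimization needs to know the attributes of the actual relations. The sketch below assumes that the condition k is a conjunction whose parts are already available as (text, attributes used) pairs; this representation and the `split_condition` helper are hypothetical, not taken from Relational.

```python
def split_condition(conjuncts, r_attrs, q_attrs):
    """Partition the and-ed parts of σ_k(R * Q) into the lists (j, i, l)."""
    j, i, l = [], [], []
    for text, used in conjuncts:
        if used <= r_attrs:
            j.append(text)   # only attributes of R: push down to R
        elif used <= q_attrs:
            i.append(text)   # only attributes of Q: push down to Q
        else:
            l.append(text)   # mixes both: must stay above the product
    return j, i, l


conjuncts = [
    ("age > 18", {"age"}),
    ("city == 'Rome'", {"city"}),
    ("age > floor", {"age", "floor"}),
]
j, i, l = split_condition(conjuncts, {"name", "age"}, {"city", "floor"})
print(j, i, l)
```

Only the parts left in `l` have to be evaluated on the full product; the others filter R and Q before the product is computed.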

Useless projection

If a projection is done on all the attributes, it can be removed.