Experiments of Query Rewriting Algorithms (Pure and Pure_C)

Benchmarks

As benchmarks dedicated to existential rules are not available yet, we considered rule bases obtained by translation of description logic ontologies, more specifically DL-Lite_R ontologies (which allows to compare with DL tools producing unions of conjunctive queries):

the widely used benchmark composed of ontologies Adolena (A), StockExchange(S), University (U) and Vicodi (V). This benchmark was introduced in (Pérez-Urbina, Horrocks and Motik, ISWC 2009) and is available at www.cs.ox.ac.uk/projects/requiem/evaluation.html.
the DL-Lite versions of OpenGALEN2 (www.opengalen.org) and OBOprotein (www.obofoundry.org/), which come from a test suite proposed with the tool Rapid (Trivela, Stoilos, Chortaras and Stamou, DL 2013), available at www.image.ece.ntua.gr/~achort/rapid/. These two ontologies are respectively denoted by G and O hereafter.

Each ontology is provided with 5 handcrafted queries.

Table 1 shows the size of these ontologies, as well as the number of hierarchical and compilable rules (see our paper at IJCAI 2015 for more details).

**Table 1:** Types of rules in the ontologies
Ontology	# Hier. rules	# Non-hier. compilable rules	# Compilable rules	# All rules
A	72	4	76	102
S	16	28	44	52
U	36	36	72	77
V	202	20	222	222
G	26980	836	27816	50764
O	35390	0	35390	43351

Download the full benchmark: AGOSUV-bench.tar.gz.
This archive contains the DL-Lite and Dlgp v2 versions of each ontology, as well as the associated queries in Sparql, Dlgp v2 and in the REQUIEM query syntax.

Experiments

(May 2014)

In the following, PURE denotes our basic query rewriting algorithm and PURE_C its optimization with compiled rules. The output of PURE_C is called a ``pivotal UCQ''; it can be unfolded into a regular UCQ (or transformed into a Datalog query, or evaluated as such, depending on the capabilities of the query engine or assumptions on the data).

Timeout was set to 10 minutes.

Impact of rule compilation on the number of rewritings

Table 2 shows the size of the output UCQ (which is the same for all systems outputing a minimal, sound and complete UCQ), the size of the pivotal UCQ computed by PURE_C, the number of generated queries in each case (excluding the initial query, hence zero may occur) and the number of CQs generated by PURE_C by brute unfolding of the pivotal UCQ (i.e., unfolding without elimination of redundant queries). Missing values are due to timeouts. For Q₃, Q₄ and Q₅ on O, PURE did not end before the timeout, hence the size of the UCQ was computed with other tools (see the last table). Note that for Q₄, the size of the UCQ is precisely approximated by the brute-force unfolding (i.e., no redundancy elimination is actually needed).

**Table 2:** Impact of rule compilation on sizes
		UCQ	pivotal-UCQ	# generated CQs		brute unfolding
			PURE_C	PURE	PURE_C	PURE_C
A	Q1	27	2	459	13	702
	Q2	50	2	171	1	104
	Q3	104	1	316	0	104
	Q4	224	2	826	5	520
	Q5	624	1	1416	0	624
S	Q1	6	1	9	0	6
	Q2	2	1	137	0	2
	Q3	4	1	275	0	4
	Q4	4	1	450	0	4
	Q5	8	1	688	0	8
U	Q1	2	1	1	0	2
	Q2	1	1	105	0	1
	Q3	4	1	42	0	4
	Q4	2	1	2142	0	2
	Q5	10	1	153	0	10
V	Q1	15	1	14	0	15
	Q2	10	1	9	0	10
	Q3	72	1	117	0	72
	Q4	185	1	328	0	185
	Q5	30	1	59	0	30
G	Q1	2	1	2	0	2
	Q2	1152	1	1275	0	1152
	Q3	488	5	1514	4	488
	Q4	147	1	154	0	147
	Q5	324	19	908	18	324
O	Q1	27	20	28	21	27
	Q2	1356	1264	1355	1263	1356
	Q3	33887	1	-	0	33887
	Q4	34733	682	-	793	34733
	Q5	36612	-	-	-	-

We find a huge gap between the sizes of the output; the pivotal UCQ is often restricted to a single CQ even when the UCQ has thousands of CQs (which also explains the gap between the numbers of generated queries).

Impact of rule compilation on rewriting times

Table 3 shows the runtimes for PURE and PURE_C. Timeout (10 minutes) is denoted by TO. Time is measured by increments of 10 ms, hence 0 means less than 10 ms. All tests were performed on a DELL machine with a processor at 3.60 GHz and 16 GB of RAM.

**Table 3:** Impact of rule compilation on times (ms)
		Rewriting time	Unfolding time with redundancy elimination	Total time
		PURE_C	PURE_C	PURE	PURE_C
A	Q1	10	120	180	130
	Q2	0	40	90	40
	Q3	10	20	170	30
	Q4	0	130	280	130
	Q5	0	440	1500	440
S	Q1	0	0	0	0
	Q2	0	0	100	0
	Q3	0	0	70	0
	Q4	0	0	110	0
	Q5	10	10	120	20
U	Q1	0	0	10	0
	Q2	0	0	100	0
	Q3	0	0	30	0
	Q4	0	0	1500	0
	Q5	0	10	20	10
V	Q1	0	0	10	0
	Q2	0	10	10	10
	Q3	0	70	110	70
	Q4	0	60	120	60
	Q5	0	10	10	10
G	Q1	0	10	0	10
	Q2	50	570	1060	620
	Q3	70	190	1020	260
	Q4	10	0	20	10
	Q5	30	60	890	90
O	Q1	130	10	440	140
	Q2	1110	760	1160	1870
	Q3	90	557900	TO	557990
	Q4	430	TO	TO	TO
	Q5	TO	TO	TO	TO

We see that PURE_C remains faster than PURE when we include the time required to unfold the pivotal UCQ into a UCQ, except for Q2 on O, which comes from the fact that the pivotal UCQ is almost as large as the UCQ. Almost all the unfolding time is actually spent in checking redundancies (in the current implementation, this test is performed at the end by a pairwise comparison of queries).

Comparaison with other query rewriting tools

We compared Pure and Pure_C (version of April 2014) with the following query rewriting tools:

Nyaya: version of October 2013
processes general existential rules (as PURE and PURE_C)
tree-witness rewriting (OnTop): version of April 2014 on www.dcs.bbk.ac.uk/~roman/tw-rewriting/, run in Full mode
dedicated to DL-Lite
Rapid: version of May 2014 on www.image.ece.ntua.gr/~gstam/
dedicated to DL-Lite
Requiem: version of April 2014 on www.cs.ox.ac.uk/projects/requiem/home.html, run in optimized Full modality
dedicated to DL-Lite
Iqaros: version v0.2 on code.google.com/p/iqaros/
dedicated to DL-Lite

The reported runtime is computed internally by each tool. The runtime of PURE and PURE_C is the CPU time (computed with the getCurrentThreadCpuTime() method from the ThreadMXBean class; it was measured by increments of 10 ms by the OpenJDK 6 Java Runtime Environment).

Note that we do not include here experiments with tools that do not provide time measurements or gave erroneous results.

**Table 4:** Runtime comparison
		PURE	PURE_C	PURE_C to UCQ	Nyaya	tw-rew	Rapid	Requiem	Iqaros
A	Q1	180	10	130	1122	12	18	264	50
	Q2	90	0	40	862	11	23	105	59
	Q3	170	10	30	2363	9	34	134	190
	Q4	280	0	130	5557	12	48	252	136
	Q5	1500	0	440	33206	10	93	467	571
S	Q1	0	0	0	4	5	7	6	5
	Q2	100	0	0	4	7	9	164	17
	Q3	70	0	0	46	8	13	443	79
	Q4	110	0	0	7	8	12	674	65
	Q5	120	10	20	8	8	15	5981	296
U	Q1	10	0	0	8	8	6	5	13
	Q2	100	0	0	4	6	9	128	21
	Q3	30	0	0	12	8	7	211	47
	Q4	1500	0	0	6	8	13	1091	57
	Q5	20	0	10	10	8	15	2943	93
V	Q1	10	0	0	13	5	9	10	10
	Q2	10	0	10	51	9	5	10	13
	Q3	110	0	70	21	8	25	62	22
	Q4	120	0	60	28	8	32	136	35
	Q5	10	0	10	22	10	26	73	42
G	Q1	0	0	10	-	5	5	47	49
	Q2	1060	50	620	-	11	74	209048	5864
	Q3	1020	70	260	-	21	59	259103	9184
	Q4	20	10	10	-	6	10	190250	771
	Q5	890	30	90	-	26	40	238459	7401
O	Q1	440	130	140	-	19	23	3447	6673
	Q2	1160	1110	1870	-	578	953	21786	27818
	Q3	TO	90	557990	-	77	611	TO	TO
	Q4	TO	430	TO	-	1238	14698	TO	139980
	Q5	TO	TO	TO	-	TO	562223	TO	TO

We see that PURE_C (with or without unfolding) scales well on large DL-Lite ontologies (except for extreme cases which are difficult for all tools, see Ontology O). We emphasize that, being devoted to general existential rules, it does not exploit the specificities of DL-Lite. See our paper at IJCAI 2015 for further comments on these results.