Monotonic aggregations
The monotonic
library provides stateful (aka impure) functions for incremental and recursion-friendly
computation of aggregate values. These are available in streaming reasoning modes.
These functions are allowed in recursive rules. The currently supported functions include:
incremental computation of sums |
|
incremental computation of products |
|
incremental computation of counts |
|
|
incremental computation of minima CURRENTLY UNAVAILABLE |
|
incremental computation of maxima CURRENTLY UNAVAILABLE |
Upon invocation, all functions return the currently accumulated value for the respective aggregate.
sum
This function computes the incremental sum.
The syntax is the following:
sum(X, [K1, ..., Kn], [C1, ..., Cm])
Where:
-
X
is the value to be used in the incremental computation of the aggregation -
[K1, …, Kn]
is the list of group-by variables -
[C1, …, Cm]
is the list of contributors
a("one", 3, "a", 10).
a("one", 6, "c", 30).
a("one", 1, "b", 20).
a("one", 2, "c", 30).
a("two", 5, "f", 60).
a("two", 3, "e", 50).
a("two", 6, "g", 70).
a("two", 2, "d", 40).
a("two", 3, "d", 40).
@library("m:", "monotonic").
ssum(K, Sum) :- a(K, X, Z, U), Sum = m:sum(X, [K],[]).
ssumZ(K, Sum) :- a(K, X, Z, U), Sum = m:sum(X, [K],[Z]).
@output("ssum").
@output("ssumZ").
ssum("one", 3.0)
ssum("one", 9.0)
ssum("one", 10.0)
ssum("one", 12.0)
ssum("two", 5.0)
ssum("two", 8.0)
ssum("two", 14.0)
ssum("two", 16.0)
ssum("two", 19.0)
ssumZ("one", 3.0)
ssumZ("one", 9.0)
ssumZ("one", 10.0)
ssumZ("two", 5.0)
ssumZ("two", 8.0)
ssumZ("two", 14.0)
ssumZ("two", 16.0)
ssumZ("two", 17.0)
Since the order of processing the input tuples is undetermined, the only tuples that are guaranteed to be present are
ssum("one", 12.0)
and ssum("two", 19.0)
.
As for ssumZ("one", 12.0)
, since in both groups there are two facts with the same contributor Z, but the reasoner will always choose to add to the sum the one with the greatest X.
In the case of K = "one" there are two facts with contributor Z = "c":
-
a("one", 6, "c", 30)
-
a("one", 2, "c", 30)
As they are encountered in this order, the reasoner will sum a("one", 6, "c", 30)
, ignore a("one", 2, "c", 30)
, and the final sum will be ssumZ("one", 10.0)
.
In the case of group K = "two", instead:
-
a("two", 2, "d", 40)
-
a("two", 3, "d", 40)
The fact a("two", 2, "d", 40)
will be added to the incremental sum (ssumZ("two", 16.0)
); however once the reasoner encounters a("two", 3, "d", 40)
, since 3 > 2, it will subtract 2 and add 3 instead, reaching the final sum of ssumZ("two", 17.0)
.
Now consider the company control problem. Every company controls itself and any other company of which it contains more than 50% via companies that it controls. This can be encoded using the following rules.
owns("source", "c1", 1.0).
owns("source", "c2", 1.0).
owns("source", "c3", 1.0).
owns("c1", "c4", 0.3).
owns("c2", "c4", 0.3).
owns("c4", "sink", 0.3).
owns("c3", "sink", 0.3).
controls(I, I), controls(J, J) :- owns(I, J, _).
controls(I, K) :- controls(I, J), owns(J, K, W2), W = m:sum(W2, [I, K], []), W > 0.5.
@library("m:", "monotonic").
@output("controls").
controls('source', 'source').
controls('c1', 'c1').
controls('c2', 'c2').
controls('c4', 'c4').
controls('c3', 'c3').
controls('sink', 'sink').
controls('source', 'c1').
controls('source', 'c2').
controls('source', 'c3').
controls('source', 'c4').
controls('source', 'sink').
In addition to controls(C, C)
for each C
in ["source", "c1", "c2", "c3", "sink"]
, the engine will also infer
controls("source", C)
, for each C
in ["c1", "c2", "c3"]
, since source
fully owns each such C
. Consequently,
controls("source", "c4")
will also be inferred, since source
contains 60% of c4
via the now controlled
c1
and c2
. Finally, controls("source", "sink")
will also be inferred since source
contains 60% of sink
via the controlled c3
and c4
.
prod
Incrementally multiplies the facts.
prod(X, [K1, ..., Kn], [C1, ..., Cm])
Where:
-
X
is the value to be used in the incremental computation of the aggregation -
[K1, …, Kn]
is the list of group-by variables -
[C1, …, Cm]
is the list of contributors
a("one", 3, "a", 10).
a("one", 6, "c", 30).
a("one", 1, "b", 20).
a("one", 2, "c", 30).
a("two", 5, "f", 60).
a("two", 3, "e", 50).
a("two", 6, "g", 70).
a("two", 2, "d", 40).
a("two", 3, "d", 40).
@library("m:", "monotonic").
pprod(X, Sum) :- a(K, X, Z, U), Sum = m:prod(X, [K],[]).
pprodU(X, Sum) :- a(K, X, Z, U), Sum = m:prod(X, [K],[U]).
@output("pprod").
@output("pprodU").
pprod("one", 3)
pprod("one", 18)
pprod("one", 36)
pprod("two", 5)
pprod("two", 15)
pprod("two", 90)
pprod("two", 180)
pprod("two", 540)
pprodU("one", 3)
pprodU("one", 18)
pprodU("one", 6)
pprodU("two", 3)
pprodU("two", 18)
pprodU("two", 6)
pprodU("two", 5)
pprodU("two", 15)
pprodU("two", 90)
pprodU("two", 180)
Similarly to sum
, the pprod
relation should contain at least the tuples pprod("one", 36.0)
and pprod("two", 540.0)
.
In the case of pprodZ
, using the contributor U, only the fact with the contributor with the smallest X will be chosen; hence we will have that the product for K = "one" will first multiply by a("one", 6, "c", 30)
(getting pprodU("one", 18)
), but since it will later encounter a("one", 2, "c", 30)
and 2 < 6 it will divide by 6 and multiply by 2 (getting pprodU("one", 6)
), while for K = "two" it will multiply by a("two", 2, "d", 40)
and ignore a("two", 3, "d", 40)
since 2 < 3.
count
Incrementally counts the number of facts based on the group identifier.
count(K1, K2, ...)
Where:
-
[K1, …, Kn]
is a list of values, called keys, to be used as a group identifier (i.e. they play the role of group by variables in standard SQL).
a("one", 3, "a", 10).
a("one", 6, "c", 30).
a("one", 1, "b", 20).
a("one", 2, "c", 30).
a("two", 5, "f", 60).
a("two", 3, "e", 50).
a("two", 6, "g", 70).
a("two", 2, "d", 40).
a("two", 3, "d", 40).
@library("m:", "monotonic").
ccount(X, Sum) :- a(X, Y, Z, U), Sum = m:count(X).
ccount_other(X, Sum) :- a(X, Y, Z, U), Sum = m:count(X, Y).
@output("ccount").
@output("ccount_other").
The output of count
is fully determined and should contain all tuples count("one", i)
, for i = 1, …, 4
, and
count("two", i)
, for i = 1, …, 5
.
ccount_other('one', 1).
ccount_other('two', 1).
ccount_other('two', 2).
ccount('one', 1).
ccount('one', 2).
ccount('one', 3).
ccount('one', 4).
ccount('two', 1).
ccount('two', 2).
ccount('two', 3).
ccount('two', 4).
ccount('two', 5).
In the case of ccount
we have that the facts for each group X are counted (final count: 4 for group "one", 5 for group "two"). When using a second key group identifier, like Y, we see that we have 1 for group "one" and 2 for "two", as X = "one" and Y = 3 appears twice in the data.