Monotonic aggregations

The monotonic library provides stateful (aka impure) functions for incremental and recursion-friendly computation of aggregate values. These are available in streaming reasoning modes. These functions are allowed in recursive rules. The currently supported functions include:

sum

incremental computation of sums

prod

incremental computation of products

count

incremental computation of counts

min

incremental computation of minima CURRENTLY UNAVAILABLE

max

incremental computation of maxima CURRENTLY UNAVAILABLE

Upon invocation, all functions return the currently accumulated value for the respective aggregate.

sum

This function computes the incremental sum.

The syntax is the following:

sum(X, [K1, ..., Kn], [C1, ..., Cm])

Where:

  • X is the value to be used in the incremental computation of the aggregation

  • [K1, …​, Kn] is the list of group-by variables

  • [C1, …​, Cm] is the list of contributors

Example
a("one", 3, "a", 10).
a("one", 6, "c", 30).
a("one", 1, "b", 20).
a("one", 2, "c", 30).

a("two", 5, "f", 60).
a("two", 3, "e", 50).
a("two", 6, "g", 70).
a("two", 2, "d", 40).
a("two", 3, "d", 40).

@library("m:", "monotonic").

ssum(K, Sum) :- a(K, X, Z, U), Sum = m:sum(X, [K],[]).
ssumZ(K, Sum) :- a(K, X, Z, U), Sum = m:sum(X, [K],[Z]).

@output("ssum").
@output("ssumZ").
Expected result
ssum("one", 3.0)
ssum("one", 9.0)
ssum("one", 10.0)
ssum("one", 12.0)

ssum("two", 5.0)
ssum("two", 8.0)
ssum("two", 14.0)
ssum("two", 16.0)
ssum("two", 19.0)

ssumZ("one", 3.0)
ssumZ("one", 9.0)
ssumZ("one", 10.0)

ssumZ("two", 5.0)
ssumZ("two", 8.0)
ssumZ("two", 14.0)
ssumZ("two", 16.0)
ssumZ("two", 17.0)

Since the order of processing the input tuples is undetermined, the only tuples that are guaranteed to be present are ssum("one", 12.0) and ssum("two", 19.0).

As for ssumZ("one", 12.0), since in both groups there are two facts with the same contributor Z, but the reasoner will always choose to add to the sum the one with the greatest X.

In the case of K = "one" there are two facts with contributor Z = "c":

  • a("one", 6, "c", 30)

  • a("one", 2, "c", 30)

As they are encountered in this order, the reasoner will sum a("one", 6, "c", 30), ignore a("one", 2, "c", 30), and the final sum will be ssumZ("one", 10.0).

In the case of group K = "two", instead:

  • a("two", 2, "d", 40)

  • a("two", 3, "d", 40)

The fact a("two", 2, "d", 40) will be added to the incremental sum (ssumZ("two", 16.0)); however once the reasoner encounters a("two", 3, "d", 40), since 3 > 2, it will subtract 2 and add 3 instead, reaching the final sum of ssumZ("two", 17.0).

Now consider the company control problem. Every company controls itself and any other company of which it contains more than 50% via companies that it controls. This can be encoded using the following rules.

Example
owns("source", "c1", 1.0).
owns("source", "c2", 1.0).
owns("source", "c3", 1.0).

owns("c1", "c4", 0.3).
owns("c2", "c4", 0.3).

owns("c4", "sink", 0.3).
owns("c3", "sink", 0.3).

controls(I, I), controls(J, J) :- owns(I, J, _).
controls(I, K) :- controls(I, J), owns(J, K, W2), W = m:sum(W2, [I, K], []), W > 0.5.

@library("m:", "monotonic").

@output("controls").
Expected result
controls('source', 'source').
controls('c1', 'c1').
controls('c2', 'c2').
controls('c4', 'c4').
controls('c3', 'c3').
controls('sink', 'sink').
controls('source', 'c1').
controls('source', 'c2').
controls('source', 'c3').
controls('source', 'c4').
controls('source', 'sink').

In addition to controls(C, C) for each C in ["source", "c1", "c2", "c3", "sink"], the engine will also infer controls("source", C), for each C in ["c1", "c2", "c3"], since source fully owns each such C. Consequently, controls("source", "c4") will also be inferred, since source contains 60% of c4 via the now controlled c1 and c2. Finally, controls("source", "sink") will also be inferred since source contains 60% of sink via the controlled c3 and c4.

prod

Incrementally multiplies the facts.

prod(X, [K1, ..., Kn], [C1, ..., Cm])

Where:

  • X is the value to be used in the incremental computation of the aggregation

  • [K1, …​, Kn] is the list of group-by variables

  • [C1, …​, Cm] is the list of contributors

Example
a("one", 3, "a", 10).
a("one", 6, "c", 30).
a("one", 1, "b", 20).
a("one", 2, "c", 30).

a("two", 5, "f", 60).
a("two", 3, "e", 50).
a("two", 6, "g", 70).
a("two", 2, "d", 40).
a("two", 3, "d", 40).

@library("m:", "monotonic").

pprod(X, Sum) :- a(K, X, Z, U), Sum = m:prod(X, [K],[]).
pprodU(X, Sum) :- a(K, X, Z, U), Sum = m:prod(X, [K],[U]).

@output("pprod").
@output("pprodU").
Expected results
pprod("one", 3)
pprod("one", 18)
pprod("one", 36)

pprod("two", 5)
pprod("two", 15)
pprod("two", 90)
pprod("two", 180)
pprod("two", 540)

pprodU("one", 3)
pprodU("one", 18)
pprodU("one", 6)

pprodU("two", 3)
pprodU("two", 18)
pprodU("two", 6)
pprodU("two", 5)
pprodU("two", 15)
pprodU("two", 90)
pprodU("two", 180)

Similarly to sum, the pprod relation should contain at least the tuples pprod("one", 36.0) and pprod("two", 540.0).

In the case of pprodZ, using the contributor U, only the fact with the contributor with the smallest X will be chosen; hence we will have that the product for K = "one" will first multiply by a("one", 6, "c", 30) (getting pprodU("one", 18)), but since it will later encounter a("one", 2, "c", 30) and 2 < 6 it will divide by 6 and multiply by 2 (getting pprodU("one", 6)), while for K = "two" it will multiply by a("two", 2, "d", 40) and ignore a("two", 3, "d", 40) since 2 < 3.

count

Incrementally counts the number of facts based on the group identifier.

count(K1, K2, ...)

Where:

  • [K1, …​, Kn] is a list of values, called keys, to be used as a group identifier (i.e. they play the role of group by variables in standard SQL).

Example
a("one", 3, "a", 10).
a("one", 6, "c", 30).
a("one", 1, "b", 20).
a("one", 2, "c", 30).

a("two", 5, "f", 60).
a("two", 3, "e", 50).
a("two", 6, "g", 70).
a("two", 2, "d", 40).
a("two", 3, "d", 40).

@library("m:", "monotonic").

ccount(X, Sum) :- a(X, Y, Z, U), Sum = m:count(X).
ccount_other(X, Sum) :- a(X, Y, Z, U), Sum = m:count(X, Y).

@output("ccount").
@output("ccount_other").

The output of count is fully determined and should contain all tuples count("one", i), for i = 1, …​, 4, and count("two", i), for i = 1, …​, 5.

Expected results
ccount_other('one', 1).
ccount_other('two', 1).
ccount_other('two', 2).
ccount('one', 1).
ccount('one', 2).
ccount('one', 3).
ccount('one', 4).
ccount('two', 1).
ccount('two', 2).
ccount('two', 3).
ccount('two', 4).
ccount('two', 5).

In the case of ccount we have that the facts for each group X are counted (final count: 4 for group "one", 5 for group "two"). When using a second key group identifier, like Y, we see that we have 1 for group "one" and 2 for "two", as X = "one" and Y = 3 appears twice in the data.