Temporal Reasoning

Temporal reasoning adds native support for temporal intervals to the data. It supports the operators of DatalogMTL and includes some additional functionality.

The following restrictions apply:

  • Joins: Only NestedLoopJoins are supported.

  • Temporal Operators: Operators cannot be nested; write multiple rules instead. For example, B :- [-]<t1,t2> <-><t3,t4> A is forbidden and has to be written as B1 :- <-><t3,t4> A. B :- [-]<t1,t2> B1.

Temporal Intervals

To support these operations, there are two types of intervals:

  • absolute (datetime) intervals of the form <YYYY-MM-DD HH:mm:ss,YYYY-MM-DD HH:mm:ss>, used for facts. Absolute (integer, double) intervals are supported as an alternative, in case you want to work with numbers (e.g., timestamps). You are not allowed to mix different interval data types in a program.

  • relative intervals of the form <x,y>, used for operators, where x and y are numeric and x <= y.

Note that < stands for either an open "(" or a closed "[" bracket, and > stands for either an open ")" or a closed "]" bracket.

Example
[2020-01-01,2021-01-01)
(3,8)
(2.5,4]
[2020-01-01,2020-01-31 23:59:59]
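
To make the notation concrete, the following Python sketch splits an interval string into its two brackets and two endpoints. It is only a model of the notation described above, not the parser used by the engine; the function names parse_endpoint and parse_interval are hypothetical.

```python
from datetime import datetime

def parse_endpoint(text):
    """Parse an endpoint as a datetime, falling back to a number."""
    text = text.strip()
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            pass
    return float(text)  # raises ValueError if it is neither a date nor a number

def parse_interval(text):
    """Split '<a,b>' into (left_closed, left, right, right_closed)."""
    text = text.strip()
    if len(text) < 2 or text[0] not in "[(" or text[-1] not in "])":
        raise ValueError(f"bad brackets in {text!r}")
    left, right = text[1:-1].split(",")
    return text[0] == "[", parse_endpoint(left), parse_endpoint(right), text[-1] == "]"

print(parse_interval("[2020-01-01,2021-01-01)"))
print(parse_interval("(2.5,4]"))  # (False, 2.5, 4.0, True)
```

A closed bracket is represented as True and an open bracket as False; the same four-element layout corresponds to the four temporal metadata elements used in the mapping annotations.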

Temporal Facts

To make a fact temporal, annotate it with an absolute interval. Such a fact has the form A@<t1,t2>.

Example
A@[2020-01-01,2021-01-01).

Temporal Input/Output

When data is retrieved via or sent to record managers, one has to specify which columns are mapped to the temporal interval. This is done with one of the following annotations:

  • @temporalMapping annotation, one for each temporal metadata element (left/right bracket, left/right endpoint);

  • @temporalMappings annotation, one for the complete set of temporal metadata (all brackets and endpoints).

@temporalMapping (one element = one annotation)

The syntax is the following:

@temporalMapping(predicateName, position, temporalMappingType, defaultValue).

Additionally, the annotation can have a fifth, optional parameter, columnName:

@temporalMapping(predicateName, position, temporalMappingType, defaultValue, columnName).

Where:

  • predicateName is the name of the predicate

  • position is the column position in the record manager or -1 if the defaultValue is used or if the value does not need to be mapped to an output

  • temporalMappingType is the element of the temporal metadata the field represents. It can be:

    • LEFT_ENDPOINT

    • RIGHT_ENDPOINT

    • LEFT_BRACKET

    • RIGHT_BRACKET

  • defaultValue is the default value: a date, in case of the [LEFT|RIGHT]_ENDPOINT and a boolean value (#T or #F) in case of [LEFT|RIGHT]_BRACKET, or -1 if a position is provided.

  • columnName indicates the name of the database column to use for that field (most useful in output bindings).

Please note that a value that is read as temporal metadata by the @temporalMapping annotation will not be available for use as data.

This annotation, with a different syntax, was once called @timeMapping. This name is no longer supported.

Example

Input File has the following fields in this order:

ID, Name, ExtraDate, StartDate, Surname, EndDate

While Output File has the following fields in this order:

StartDate, ID, EndDate, Name, isClosedEnd

@input("a").
@bind("a","csv useHeaders=true","programs_data","time_mapping.csv").
@temporalMapping("a",3,"LEFT_ENDPOINT",-1).
@temporalMapping("a",5,"RIGHT_ENDPOINT",-1).
@temporalMapping("a",-1,"LEFT_BRACKET",#T).
@temporalMapping("a",-1,"RIGHT_BRACKET",#T).

@output("b").
@temporalMapping("b",0,"LEFT_ENDPOINT",-1).
@temporalMapping("b",2,"RIGHT_ENDPOINT",-1).
@temporalMapping("b",-1,"LEFT_BRACKET",#T).
@temporalMapping("b",4,"RIGHT_BRACKET",#T).
@bind("b","csv useHeaders=true","programs_data","time_mapping_output.csv").

b(ID,Name) :- a(ID,Name,ExtraDate,Surname).
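
The effect of the input mapping can be pictured with a short Python sketch. It is a model under assumptions, not engine code: the sample row and the helper split_row are hypothetical, and the sketch only illustrates that the columns at positions 3 and 5 become the interval while the remaining four columns stay available as data.

```python
import csv
import io

# Hypothetical sample row that follows the input layout
# ID, Name, ExtraDate, StartDate, Surname, EndDate.
SAMPLE = (
    "ID,Name,ExtraDate,StartDate,Surname,EndDate\n"
    "7,Ada,2019-12-31,2020-01-01,Lovelace,2021-01-01\n"
)

def split_row(row, left_pos, right_pos, left_closed=True, right_closed=True):
    """Separate the temporal metadata from the data columns of one row."""
    data = [value for i, value in enumerate(row) if i not in (left_pos, right_pos)]
    interval = (left_closed, row[left_pos], row[right_pos], right_closed)
    return data, interval

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header line (useHeaders=true in the binding)
data, interval = split_row(next(reader), left_pos=3, right_pos=5)
print(data)      # ['7', 'Ada', '2019-12-31', 'Lovelace']
print(interval)  # (True, '2020-01-01', '2021-01-01', True)
```

The four remaining values line up with the four arguments of a(ID,Name,ExtraDate,Surname) in the rule, and the brackets default to closed, as the #T defaults in the annotations specify.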

@temporalMappings (all elements in one annotation)

The @temporalMappings annotation (note the final s, making it plural) is an alternative to @temporalMapping that specifies the mapping of all temporal metadata elements in a single annotation.

@temporalMappings(predicateName, posStart, posEnd, posStartClosed, posEndClosed, defaultTemplate).

Where:

  • predicateName is the name of the predicate

  • posStart is the column position of the left endpoint in the record manager, or -1 if the default from defaultTemplate is used or if the value does not need to be mapped to an output

  • posEnd is the column position of the right endpoint in the record manager, or -1 if the default from defaultTemplate is used or if the value does not need to be mapped to an output

  • posStartClosed is the column position of the left bracket in the record manager, or -1 if the default from defaultTemplate is used or if the value does not need to be mapped to an output

  • posEndClosed is the column position of the right bracket in the record manager, or -1 if the default from defaultTemplate is used or if the value does not need to be mapped to an output

  • defaultTemplate is a special string that specifies the defaults as a template of the form (LEFT_BRACKET)(LEFT_ENDPOINT),(RIGHT_ENDPOINT)(RIGHT_BRACKET), where:

    • LEFT_BRACKET can be < for not specified, ( for open and [ for closed

    • LEFT_ENDPOINT can be _ (not specified), a default date in the yyyy-MM-dd or yyyy-MM-dd HH:mm:ss format, an int or a double

    • RIGHT_ENDPOINT can be _ (not specified), a default date in the yyyy-MM-dd or yyyy-MM-dd HH:mm:ss format, an int or a double

    • RIGHT_BRACKET can be > for not specified, ) for open and ] for closed

Go-to templates for the most common cases are:

  • <_,_> for a blank template: all the pos values are different from -1, so no defaults are needed

  • [_,_] for an interval closed on both sides (where the left and right endpoints are mapped in the record manager)

  • (_,_) for an interval open on both sides (where the left and right endpoints are mapped in the record manager)

  • [_,_) for an interval that is only left-closed (normal form)

  • (_,_] for an interval that is only right-closed
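
As an illustration of how a template decomposes into its four parts, here is a small Python sketch. The function parse_template and the use of None for "not specified" are assumptions made for this example; the engine's actual handling of templates may differ.

```python
LEFT_BRACKETS = {"<": None, "(": False, "[": True}   # None means "not specified"
RIGHT_BRACKETS = {">": None, ")": False, "]": True}

def parse_template(template):
    """Return (left_closed, left_default, right_default, right_closed)."""
    if (len(template) < 2
            or template[0] not in LEFT_BRACKETS
            or template[-1] not in RIGHT_BRACKETS):
        raise ValueError(f"bad template brackets: {template!r}")
    left, comma, right = template[1:-1].partition(",")
    if not comma:
        raise ValueError(f"missing comma in template: {template!r}")

    def default(text):
        # "_" marks an endpoint without a default value.
        text = text.strip()
        return None if text == "_" else text

    return (LEFT_BRACKETS[template[0]], default(left),
            default(right), RIGHT_BRACKETS[template[-1]])

print(parse_template("<_,_>"))           # (None, None, None, None)
print(parse_template("[_,_]"))           # (True, None, None, True)
print(parse_template("[2020-01-01,_>"))  # (True, '2020-01-01', None, None)
```

Every element that comes back as None has no default, so the corresponding pos parameter has to point at a real column.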

Example

Input File has the following fields in this order:

ID, Name, ExtraDate, StartDate, Surname, EndDate

While Output File has the following fields in this order:

StartDate, ID, EndDate, Name, isClosedEnd

@input("a").
@bind("a","csv useHeaders=true","programs_data","time_mapping.csv").
@temporalMappings("a",3,5,-1,-1,"[_,_]").

@output("b").
@temporalMappings("b",0,2,-1,4,"[_,_>").
@bind("b","csv useHeaders=true","programs_data","time_mapping_output.csv").

b(ID,Name) :- a(ID,Name,ExtraDate,Surname).

@temporalType

In order to work with temporal metadata of different types, we need to specify the type in use for each program. This is done with the annotation @temporalType.

@temporalType(type).

Where type can be:

  • "date" for dates. This is also the default setting: if no @temporalType is given, Vadalog assumes that the time is expressed in dates;

  • "int" for integers;

  • "double" for doubles.

Please note that you can use only one temporal type in a program, so all temporal predicates should have the same type.

Example
@input("a").
@bind("a","csv useHeaders=true","programs_data","time_mapping.csv").
@temporalMappings("a",3,5,-1,-1,"[_,_]").
@temporalType("int").

@output("b").

b(ID,Name) :- a(ID,Name,ExtraDate,Surname).

@timeGranularity

This annotation specifies the granularity for relative validity intervals. The following values are available:

  • milliseconds

  • seconds

  • minutes

  • hours

  • days

  • months

  • years

  • no_date (use if you do not use dates for facts, but integers or doubles)

The conversion factor between days and months is an estimated average number of days per month (a year has 365.2425 days). Therefore, we do not recommend mixing days (and smaller time units) with months (and larger time units).

Example
@timeGranularity("months").

Operations with granularity in Months and Years

Since the duration of months and years is not fixed (months have 28-31 days, years have 365-366 days), operations with a granularity of months or years need a convention. The property temporal.standardDurations decides whether standard durations are assumed for months and years.

temporal.standardDurations=true
temporal.standardMonthDuration=30
temporal.standardYearDuration=365

If temporal.standardDurations is set to false, the Vadalog engine throws an exception when trying to run operations with a time granularity of months or years.
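
The practical consequence of fixed durations can be seen in a short Python sketch. It assumes that a shift by n months is carried out as a shift by n times the standard month length in days; this is our reading of the properties above, not a documented formula, and the helper shift_standard is hypothetical.

```python
from datetime import datetime, timedelta

STANDARD_MONTH_DAYS = 30    # mirrors temporal.standardMonthDuration
STANDARD_YEAR_DAYS = 365    # mirrors temporal.standardYearDuration

def shift_standard(start, months=0, years=0):
    """Shift a timestamp by fixed-length months and years (assumed semantics)."""
    days = months * STANDARD_MONTH_DAYS + years * STANDARD_YEAR_DAYS
    return start + timedelta(days=days)

one_month = shift_standard(datetime(2020, 2, 1), months=1)
one_year = shift_standard(datetime(2020, 1, 1), years=1)
print(one_month.date())  # 2020-03-02, not 2020-03-01
print(one_year.date())   # 2020-12-31, because 2020 is a leap year
```

Under this assumption, a standard month or year does not generally land on the same calendar day, which is worth keeping in mind when results are compared with calendar dates.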

To switch between different granularities of operations, use relative validity intervals expressed as ISO 8601 durations instead.

ISO 8601 Relative Validity Intervals

To avoid being tied to a specific granularity, and to make it easy to specify complex relative intervals, e.g. "6 years, 2 months, 13 days and 10 minutes", Temporal Vadalog allows the user to specify such intervals as standard ISO 8601 durations, preceded by a hash ('#'), in the following format:

#P[n]Y[n]M[n]DT[n]H[n]M[n]S

Where:

  • # is a leading hash that differentiates the duration from other tokens

  • P stands for Period

  • [n]Y, [n]M, and [n]D are the blocks corresponding to years, months and days (date blocks). [n] is replaced by the value of the date element that follows (e.g. 9Y = 9 years). [n] can be a decimal number (using either '.' or ',' as a separator). Each block is optional.

  • T stands for Time and separates the date blocks from the time blocks; it is included only if there is a time part.

  • [n]H, [n]M, and [n]S are the blocks corresponding to hours, minutes and seconds (time blocks). They follow the same rules as the date blocks.

[More information on ISO 8601 Duration on Wikipedia](https://en.wikipedia.org/wiki/ISO_8601#Durations)

For example, this is how we would specify "6 years, 2 months, 13 days and 10 minutes" in a relative interval:

#P6Y2M13DT10M

Some other examples:

  • #P1Y: 1 year

  • #P1M: 1 month

  • #P1D: 1 day

  • #P1Y1D: 1 year and 1 day

  • #PT1H: 1 hour

  • #PT1M: 1 minute

  • #PT1S: 1 second

  • #P1Y1DT1S: 1 year, 1 day, and 1 second

  • #PT2.5H: 2 and a half hours

Additionally, it is possible to specify relative intervals also in weeks with the week format:

#P[n]W

Where the only block present is [n]W to specify the number of weeks.

For example, to specify 4 weeks:

#P4W

The Datetime format and the Week format can be mixed in the same relative interval (e.g. [#P1D, #P1W)), but they cannot be mixed with the other time granularities.
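
The two duration formats can be checked with a small Python sketch based on a regular expression. It is an illustration of the format rules listed above, not the tokenizer used by Vadalog; the names PATTERN and parse_duration are hypothetical.

```python
import re

NUM = r"\d+(?:[.,]\d+)?"  # decimals may use '.' or ',' as a separator
PATTERN = re.compile(
    rf"#P(?:(?P<weeks>{NUM})W|"
    rf"(?:(?P<years>{NUM})Y)?(?:(?P<months>{NUM})M)?(?:(?P<days>{NUM})D)?"
    rf"(?:T(?:(?P<hours>{NUM})H)?(?:(?P<minutes>{NUM})M)?(?:(?P<seconds>{NUM})S)?)?)"
)

def parse_duration(token):
    """Return the blocks of a '#P...' duration as a dict of floats."""
    match = PATTERN.fullmatch(token)
    blocks = {}
    if match is not None:
        blocks = {unit: float(value.replace(",", "."))
                  for unit, value in match.groupdict().items()
                  if value is not None}
    # Reject non-matches, empty durations, and a 'T' without a time part.
    if not blocks or token.endswith("T"):
        raise ValueError(f"not a valid duration: {token!r}")
    return blocks

print(parse_duration("#P6Y2M13DT10M"))
# {'years': 6.0, 'months': 2.0, 'days': 13.0, 'minutes': 10.0}
print(parse_duration("#PT2.5H"))  # {'hours': 2.5}
print(parse_duration("#P4W"))     # {'weeks': 4.0}
```

Because the date blocks must come before T and the time blocks after it, the letter M is unambiguous: it means months before T and minutes after it. A token that places a day block after T is rejected by this sketch.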

@temporal Annotation

In order to limit the reasoning to a certain time interval, the @temporal annotation is used. @temporal takes as parameters start and end of an interval (absolute value, same format as intervals used for facts), and when employed limits the reasoning to the interval set, returning only results that are also in that interval.

Examples

Integer endpoints:

@temporal(1,3).

Date endpoints:

@temporal(2021-02-01,2021-02-10).

@config Annotation

The @config annotation allows the programmer to choose the merge strategy for intervals and other settings for the reasoning on time. It takes two parameters: the property key and the property value.

Possible property keys:

  • "mergeTemporalDataBeforeOutput". Values: "true" or "false" (default). If true, adjacent intervals are merged before producing the output.

  • "mergeTemporalDataAfterInput". Values: "true" or "false" (default). If true, adjacent intervals are merged directly after the input.

  • "enableTemporalMerge" - Enables merging for non-required operations (e.g., the diamond operation). Values: "true" or "false" (default).

  • "discreteDomain" - Tells the engine that the program works on an integer domain, which allows for optimizations. At the moment it is only used to convert facts to closed interval representation.

  • "temporalMergeStrategy" - One of the following values can be chosen as value:

    • "default" - Uses the default value chosen by the engine (e.g. an all scan for the output, since we only want to produce the biggest intervals).

    • "all" - Uses the default value for the all strategy. This scan is a merge scan that merges all entries that can currently be fetched before forwarding anything to the following scans. It then forwards only the biggest intervals directly to the next scan. Note that this scan is not fully optimized yet, since the change tracking is not fully integrated and depends on buffer cache filtering.

    • "forward" - Uses the default value for the forward strategy. This strategy uses the streaming mode of Vadalog as known, i.e., it reads a fact, merges it with existing read intervals and directly forwards the merged interval.

  • "temporalMergePlacement" - to decide where to merge intervals in the pipeline:

    • a) "ONLY_REQUIRED": only when required

    • b) "EARLIEST_WHEN_REQUIRED" in the earliest position in the pipeline when there is an operator that needs it

    • c) "ALWAYS" always, whether there is an operator that requires it or not, in the earliest position a merge can be done

Example
@config("discreteDomain","true").

The defaults for these settings can be set in the vada.properties file. However, as the behavior of these configurations depends heavily on the program, they should normally remain as follows:

# TEMPORAL CONFIG
temporal.mergeTemporalDataBeforeOutput=false
temporal.enableTemporalMerge=false
temporal.temporalMergeStrategy=forward
temporal.temporalMergeStrategy.output=all
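
What merging intervals means can be sketched in a few lines of Python. This is one plausible reading of merging on a continuous domain (two intervals are merged when they overlap, or when they touch and the shared point belongs to at least one of them); it is not the engine's algorithm, and merge_intervals is a hypothetical helper.

```python
def merge_intervals(intervals):
    """Coalesce intervals given as (left_closed, left, right, right_closed)."""
    # Sort by left endpoint; for equal endpoints, closed brackets come first.
    ordered = sorted(intervals, key=lambda iv: (iv[1], not iv[0]))
    merged = []
    for left_closed, left, right, right_closed in ordered:
        if merged:
            m_lc, m_left, m_right, m_rc = merged[-1]
            # Mergeable if they overlap, or touch with the shared point covered.
            if left < m_right or (left == m_right and (m_rc or left_closed)):
                if right > m_right:
                    m_right, m_rc = right, right_closed
                elif right == m_right:
                    m_rc = m_rc or right_closed
                merged[-1] = (m_lc, m_left, m_right, m_rc)
                continue
        merged.append((left_closed, left, right, right_closed))
    return merged

# [1,3) and [3,5] share the point 3, so they become [1,5].
print(merge_intervals([(True, 1, 3, False), (True, 3, 5, True)]))
# [1,3) and (3,5] both exclude the point 3, so they stay separate.
print(merge_intervals([(True, 1, 3, False), (False, 3, 5, True)]))
```

The sketch collects all intervals before producing a result, which loosely resembles the "all" strategy; a "forward" style implementation would instead emit a merged interval each time a new fact is read.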

Temporal Operators

Temporal operators are applied to temporal predicates.

In general, a rule with a temporal operator has the form

 B :- <-><t1,t2> A
 B :- [-]<t1,t2> A
 B :- <+><t1,t2> A
 B :- [+]<t1,t2> A
 B :- (c) A

where A is a temporal predicate, and <t1,t2> is a relative temporal interval.

The interpretation of the operators is as follows:

Closing:

  • (c): If Fact A is valid in the interval <x,y>, then B is valid in [x,y]

Forward Propagating (derive facts in the future):

  • <->: If Fact A is valid in the interval <x,y>, then B is valid in <x+t1,y+t2>

  • [-]: If Fact A is valid in the interval <x,y>, B is valid in <x+t2,y+t1> iff x+t2 <= y+t1.

  • Since B :- A1 S<t1,t2> A2: You can define the since operator as follows:

S1 :- (c) A1, A2.
S2 :- <-><t1,t2> S1.
B :- S2, (c) A1.

Backward Propagating (derive facts in the past):

  • <+>: If Fact A is valid in the interval <x,y>, then B is valid in <x-t2,y-t1>

  • [+]: If Fact A is valid in the interval <x,y>, B is valid in <x-t1,y-t2> iff x-t1 <= y-t2.

  • Until B :- A1 U<t1,t2> A2: You can define the until operator as follows:

S1 :- (c) A1, A2.
S2 :- <+><t1,t2> S1.
B :- S2, (c) A1.

The brackets of a derived interval are determined as follows, unless the endpoint is infinity (which is always open):

  • <->: The interval is left-closed (right-closed) if the fact and the operator are left-closed (right-closed)

  • <+>: The interval is left-closed (right-closed) if the fact is left-closed (right-closed) and the operator is right-closed (left-closed)

  • [-]: The interval is left-closed (right-closed) if the fact is left-closed (right-closed) or the operator is right-open (left-open)

  • [+]: The interval is left-closed (right-closed) if the fact is left-closed (right-closed) or the operator is left-open (right-open)

  • (c): The interval is left-closed and right-closed

Example
a(1)@[2020-01-01,2020-01-06).
b(X) :- <->(3,7.5] a(X).
c(X) :- [-](2,4] a(X).
@output("b").
@output("c").
@timeGranularity("days").

The above example outputs b(1)@(2020-01-04,2020-01-13 12:00:00) and c(1)@[2020-01-05,2020-01-08].
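
The endpoint and bracket rules for the two forward operators can be replayed in Python. The sketch below is a model of the rules stated above with day granularity, not engine code; the helpers diamond_minus and box_minus are hypothetical. It uses a fact valid in [2020-01-01,2020-01-06) and the operator intervals (3,7.5] and (2,4].

```python
from datetime import datetime, timedelta

# Intervals are (left_closed, left, right, right_closed).

def diamond_minus(fact, op):
    """<->: <x,y> becomes <x+t1,y+t2>; closed only if fact AND operator are closed."""
    (f_lc, x, y, f_rc), (o_lc, t1, t2, o_rc) = fact, op
    return (f_lc and o_lc, x + timedelta(days=t1), y + timedelta(days=t2), f_rc and o_rc)

def box_minus(fact, op):
    """[-]: <x,y> becomes <x+t2,y+t1>; returns None if x+t2 > y+t1."""
    (f_lc, x, y, f_rc), (o_lc, t1, t2, o_rc) = fact, op
    left, right = x + timedelta(days=t2), y + timedelta(days=t1)
    if left > right:
        return None
    # Closed if the fact is closed OR the opposite operator bracket is open.
    return (f_lc or not o_rc, left, right, f_rc or not o_lc)

fact = (True, datetime(2020, 1, 1), datetime(2020, 1, 6), False)  # [2020-01-01,2020-01-06)
b = diamond_minus(fact, (False, 3, 7.5, True))                    # operator (3,7.5]
c = box_minus(fact, (False, 2, 4, True))                          # operator (2,4]
print(b)  # open on both sides, 2020-01-04 00:00 to 2020-01-13 12:00
print(c)  # closed on both sides, 2020-01-05 to 2020-01-08
```

The results correspond to (2020-01-04,2020-01-13 12:00:00) for the diamond operator and [2020-01-05,2020-01-08] for the box operator.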

Additional Operations

In addition, we support the following operators for facts of type date:

Triangle Up Operator

B :- /\{x} {units} [offset] A

where x is a number and units is a time granularity (plural), or no_date when the operator is used to aggregate temporal data with integers or doubles. This operator extends the interval to the length of the provided unit. offset shifts the grouping of time units with respect to the default one. (See Offset for more info.)

If x > 1, the operator extends the interval to a multiple of the unit (e.g., to handle periods). For example, 4 months creates the intervals (Jan, Feb, Mar, Apr), (May, Jun, Jul, Aug), (Sep, Oct, Nov, Dec), and 3 months creates the intervals (Jan, Feb, Mar), (Apr, May, Jun), (Jul, Aug, Sep), (Oct, Nov, Dec). The value of x must be a divisor of the time granularity.

That is for:

  • years: any value >= 1

  • months: 1, 2, 3, 4, 6, 12

  • days: 1

  • hours: 1, 2, 3, 4, 6, 8, 12, 24

  • minutes: divisors of 60

  • seconds: divisors of 60

B :-/\3 months A

Offset

To make the operator flexible enough for a variety of scenarios, it is possible to offset the default intervals. For example, to get the 3-month intervals (Mar-May), (Jun-Aug) instead of (Jan-Mar) and (Apr-Jun), use the offset 2.

Example
B :-/\3 months 2 A
Example
a(1,2)@(2020-02-10,2020-03-11].
c(X,Y) :- /\1 months a(X,Y).
d(X,Y) :- /\4 months a(X,Y).
@output("c").
@output("d").
@timeGranularity("days").

The above example outputs c(1,2)@[2020-02-01,2020-04-01) and d(1,2)@[2020-01-01,2020-05-01).
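
The rounding performed by the Triangle Up operator for month blocks can be modelled as follows. This Python sketch is an assumption-laden illustration rather than the engine's implementation: it handles only the months unit, ignores the brackets of the input interval (an end that falls exactly on a block boundary is treated as belonging to the next block), and the helper names are hypothetical.

```python
from datetime import datetime

def month_index(d):
    """Number of months since year 0, so that January is a multiple of 12."""
    return d.year * 12 + d.month - 1

def from_index(i):
    """First instant of the month with the given index."""
    return datetime(i // 12, i % 12 + 1, 1)

def triangle_up_months(start, end, x, offset=0):
    """Smallest run of whole x-month blocks covering start..end, as [first, last)."""
    # Round the start down to the beginning of its block.
    first = (month_index(start) - offset) // x * x + offset
    # Round the end down to its block, then step to the end of that block.
    last = (month_index(end) - offset) // x * x + offset + x
    return from_index(first), from_index(last)

fact = (datetime(2020, 2, 10), datetime(2020, 3, 11))
print(triangle_up_months(*fact, 1))  # 2020-02-01 to 2020-04-01
print(triangle_up_months(*fact, 4))  # 2020-01-01 to 2020-05-01
```

Both results agree with the outputs stated for c and d. Because x divides 12, blocks computed this way always align with January when the offset is 0.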

Triangle Down Operator

B :- \/{x} {units} [offset] A

where x is a number and units is a time granularity, or no_date in case the temporal type is integer or double. This operator reduces the interval to the length of the provided unit. offset shifts the grouping of time units with respect to the default one. (See Offset for more info.)

If the Triangle Up can be seen as a sort of ceiling operator, the Triangle Down is the corresponding flooring operator: it returns atoms with the largest interval made of blocks of x units included in the interval provided by the original atoms.

It is useful, for example, to get the full trimesters covered in a large interval of time.

If x > 1, the operator reduces the interval to a multiple of the unit (e.g., to handle periods). For example, 4 months creates the intervals (Jan, Feb, Mar, Apr), (May, Jun, Jul, Aug), (Sep, Oct, Nov, Dec), and 3 months creates the intervals (Jan, Feb, Mar), (Apr, May, Jun), (Jul, Aug, Sep), (Oct, Nov, Dec). The value of x must be a divisor of the time granularity.

That is for:

  • years: any value >= 1

  • months: 1,2,3,4,6,12

  • days: 1

  • hours: 1,2,3,4,6,8,12,24

  • minutes: divisors of 60

  • seconds: divisors of 60

Example
a(1,2)@(2020-02-10,2020-03-11].
c(X,Y) :- \/1 months a(X,Y).
d(X,Y) :- \/4 months a(X,Y).
@output("c").
@output("d").
@timeGranularity("days").

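Note that the interval (2020-02-10,2020-03-11] contains no complete calendar month, so by the definition above there is no block of whole months inside it for c or d to cover. The intervals c(1,2)@[2020-02-01,2020-04-01) and d(1,2)@[2020-01-01,2020-05-01) are the results of the Triangle Up operator on the same fact.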

Offset

To make the operator flexible enough for a variety of scenarios, it is possible to offset the default intervals. For example, to get the 3-month intervals (Mar-May), (Jun-Aug) instead of (Jan-Mar) and (Apr-Jun), use the offset 2.

Usage:

B :-\/3 months 2 A
Example
a(1,2)@(2020-03-10,2021-02-11].
c(X,Y) :- \/3 months a(X,Y).
d(X,Y) :- \/3 months 1 a(X,Y).
@output("c").
@output("d").
@timeGranularity("days").

The above example outputs c(1,2)@[2020-04-01,2021-01-01) and d(1,2)@[2020-05-01,2021-02-01).
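
The flooring behaviour of the Triangle Down operator, including the offset, can be modelled in the same spirit. The Python sketch below covers only the months unit and ignores the brackets of the input interval; it is an illustration under these assumptions, not the engine's implementation, and the helper names are hypothetical.

```python
from datetime import datetime

def month_index(d):
    """Number of months since year 0, so that January is a multiple of 12."""
    return d.year * 12 + d.month - 1

def from_index(i):
    """First instant of the month with the given index."""
    return datetime(i // 12, i % 12 + 1, 1)

def triangle_down_months(start, end, x, offset=0):
    """Largest run of whole x-month blocks inside start..end, or None."""
    first_month = month_index(start)
    if start != from_index(first_month):
        first_month += 1  # start lies inside a month: skip to the next month
    # Round up to the next block boundary (boundaries are offset, offset+x, ...).
    first = -(-(first_month - offset) // x) * x + offset
    # Round down to the last block boundary at or before the end.
    last = (month_index(end) - offset) // x * x + offset
    if last <= first:
        return None  # no complete block fits
    return from_index(first), from_index(last)

fact = (datetime(2020, 3, 10), datetime(2021, 2, 11))
print(triangle_down_months(*fact, 3))     # 2020-04-01 to 2021-01-01
print(triangle_down_months(*fact, 3, 1))  # 2020-05-01 to 2021-02-01
```

Both results agree with the outputs stated for c and d. For an interval that contains no complete block, such as 2020-02-10 to 2020-03-11 with blocks of 1 month, the sketch returns None.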

Conversion Between Temporal And Non-Temporal Reasoning

@temporalAtom

In case you want to manipulate the time outside of temporal reasoning, it is possible to convert temporal predicates to non-temporal ones and non-temporal predicates to temporal ones. To do this, annotate the predicate with @temporalAtom:

Unwrapping (Temporal to Non-Temporal):

b(X,A,B,C,D) :- a(X)@temporalAtom(A,B,C,D).

Wrapping (Non-Temporal to Temporal):

d(X)@temporalAtom(A,B,C,D) :- c(X,A,B,C,D).