Data Types, Constants and Variables

Data Types

The table below shows the data types supported by Vadalog, along with the literals for the respective constants.

Data type    Examples of constant literals
string       "string literal", "a string", ""
integer      1, 3, 5, -2, 0
double       1.22, 1.0, -2.3, 0.0
date         2012-10-20, 2013-09-19 11:10:00
boolean      #T, #F
set          {1}, {1,2}, {}, {"a"}, {2.0,30}
list         [1], [1,2], [], ["a"], [2.0,30]
unknown      (no literals)

Constants

Constants are immutable literal values of a specific data type.

String literals

A string value is a finite sequence of Unicode characters. For example, "hello" and "Bob" are strings.

A string literal is a double quote character (", U+0022), followed by zero or more character specifiers, followed by another double quote character.

Each character specifier determines one character that will be included in the string. The possible character specifiers are as follows:

Any character except a double quote, a backslash (\, U+005C), or a newline (U+000A). The character specifies itself for inclusion in the string.

\", indicating a double quote character (U+0022).

\b, indicating a backspace (U+0008).

\t, indicating a tab character (U+0009).

\n, indicating a newline character (U+000A).

\f, indicating a form feed character (U+000C).

\r, indicating a carriage return character (U+000D).

\\, indicating a single backslash (U+005C) (but see the note below!).

\', indicating a single quote character (U+0027).

\u followed by exactly four hexadecimal digits, indicating the Unicode character with the code point given by those hex digits. Hexadecimal digits that are letters may be given in upper or lower case.
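The character specifiers above can be read as a small decoding procedure. The following Python sketch is only an illustration of those rules, not the Vadalog parser; the function name decode_string_literal and the choice to raise ValueError on malformed input are assumptions made for this example.

```python
# Escape table taken from the character specifiers listed above.
SIMPLE_ESCAPES = {
    '"': '"',
    "b": "\b",
    "t": "\t",
    "n": "\n",
    "f": "\f",
    "r": "\r",
    "\\": "\\",
    "'": "'",
}
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_string_literal(literal: str) -> str:
    """Decode a double-quoted literal into the string value it denotes.

    Hypothetical helper: error handling is an assumption of this sketch.
    """
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("literal must be enclosed in double quotes")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == '"' or ch == "\n":
            raise ValueError("unescaped double quote or newline")
        if ch != "\\":
            out.append(ch)  # an ordinary character specifies itself
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        code = body[i + 1]
        if code == "u":
            digits = body[i + 2:i + 6]
            if len(digits) != 4 or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("\\u needs exactly four hex digits")
            out.append(chr(int(digits, 16)))
            i += 6
        elif code in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[code])
            i += 2
        else:
            raise ValueError("unknown escape: \\" + code)
    return "".join(out)


# The literal "a\tb" (written here as a raw Python string) denotes three characters.
print(len(decode_string_literal(r'"a\tb"')))
```

Upper-case and lower-case hexadecimal digits are accepted alike, as described for \u above.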

Integer literals

Values of type integer are the usual mathematical integers, restricted to a fixed range. These values are internally represented as 32-bit two’s complement binary numbers. They must therefore be in the range -(2^31) through 2^31 - 1, that is, from -2147483648 to 2147483647, inclusive.

Arithmetic operations use the integer arithmetic of the underlying hardware. So, for example, a result greater than the maximum value may silently wrap around to a negative number.
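The silent conversion can be reproduced outside Vadalog. The Python sketch below assumes plain 32-bit two’s complement wraparound, which is the usual hardware behavior; wrap_int32 is a hypothetical helper, since Python's own integers are unbounded.

```python
INT_MIN = -(2 ** 31)      # -2147483648
INT_MAX = 2 ** 31 - 1     #  2147483647


def wrap_int32(value: int) -> int:
    """Reduce an unbounded integer to its 32-bit two's complement value."""
    return (value + 2 ** 31) % 2 ** 32 - 2 ** 31


print(wrap_int32(INT_MAX + 1))  # -2147483648: one past the maximum wraps to the minimum
print(wrap_int32(INT_MIN - 1))  # 2147483647: one below the minimum wraps to the maximum
```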

Double literals

Values of type double are binary floating-point numbers, represented according to the IEEE 754 floating point specification. Internally, their representation uses 64 bits.

A floating-point literal is written with an integer part, a decimal fractional part introduced by a dot and/or a base-10 exponent introduced by E, and a suffix f. For example, a floating-point number can be written as 2.71f with a decimal part, 2E3f with an exponent part (equivalent to 2000.0f), or 2.71E3f with both decimal and exponent parts (equivalent to 2710.0f).

The internal representation of floating-point numbers uses base 2.

If an arithmetic operation produces NaN (for instance, through division by 0), the value is not stored and the operation fails.

If an arithmetic operation produces a -0, it is converted and stored as a +0.

If an arithmetic operation results in a number that cannot be stored using a 64-bit representation, it is stored as either positive infinity +inf or negative infinity -inf. Two +inf values are considered equal even if they result from different computations, and the same holds for two -inf values. Note that Vadalog provides no literal representation for infinite floating-point values, and no explicit way to check whether a value is infinite.
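Python floats are also 64-bit IEEE 754 doubles, so the behavior described above can be observed with a short sketch. It is an illustration only: parse_double_literal and normalize_zero are hypothetical helpers, and unlike Vadalog, Python does expose infinity checks such as math.isinf.

```python
import math


def parse_double_literal(text: str) -> float:
    """Parse a literal such as 2.71E3f, dropping the f suffix if present."""
    return float(text[:-1] if text.endswith("f") else text)


def normalize_zero(x: float) -> float:
    """Map -0.0 to +0.0 and leave every other value unchanged."""
    return 0.0 if x == 0 else x


largest = 1.7976931348623157e308   # largest finite 64-bit double
overflow_a = largest * 10          # too large for 64 bits
overflow_b = 1e200 * 1e200         # a different computation that also overflows

print(parse_double_literal("2.71E3f"))                   # 2710.0
print(math.isinf(overflow_a), overflow_a == overflow_b)  # True True
print(math.copysign(1.0, normalize_zero(-0.0)))          # 1.0, the sign is now positive
```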

Date literals

Values of type date are literals of the form YYYY-MM-DD HH24:MI:SS. For example, 2010-12-31 12:35:20 is a date literal. If HH24:MI:SS is omitted, 00:00:00 is assumed. Internally, dates are handled as UTC values.
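The defaulting rule for the time part can be sketched in Python as follows. The helper parse_date_literal is hypothetical and relies only on the standard datetime module; it is not how Vadalog itself parses dates.

```python
from datetime import datetime, timezone


def parse_date_literal(text: str) -> datetime:
    """Parse YYYY-MM-DD HH24:MI:SS, assuming 00:00:00 when the time is omitted."""
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        # No time part: strptime with a date-only format yields midnight.
        parsed = datetime.strptime(text, "%Y-%m-%d")
    return parsed.replace(tzinfo=timezone.utc)  # values are handled as UTC


print(parse_date_literal("2012-10-20").isoformat())
print(parse_date_literal("2013-09-19 11:10:00").isoformat())
```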

Boolean literals

The only two possible values for type Boolean are #T (true) and #F (false).

Set literals

Values of type set are literals of the form {} (empty set), {1,2}, {"a","b"}, {#T,#F}. Duplicates are automatically eliminated, and the order of elements is neither meaningful nor preserved.

Sets must be homogeneous, that is, all their elements must have the same data type.
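Python's built-in sets behave the same way with respect to duplicates and order, which makes for a quick illustration. The homogeneity check below is a hypothetical helper that compares exact Python types; how Vadalog itself decides whether two elements have the same data type is not specified here.

```python
def is_homogeneous(elements) -> bool:
    """Return True if all elements share one exact type (assumption of this sketch)."""
    return len({type(e) for e in elements}) <= 1


# Duplicates collapse and order is irrelevant.
print({1, 2, 2, 1} == {2, 1})          # True
print(len({"a", "b", "a"}))            # 2

print(is_homogeneous([1, 2]))          # True
print(is_homogeneous([1, "a"]))        # False: an integer mixed with a string
print(is_homogeneous([]))              # True: the empty set is trivially homogeneous
```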

Unknown literals

Unknown data types have no possible literals.

Variables

Variables have different interpretations in different programming paradigms. In imperative languages, they are essentially memory locations that may have some contents and the contents may change over time.

In algebra or physics, a variable represents a concrete value, and its function is somewhat similar to that of a pronoun in natural language. In effect, once we replace variables with concrete values, we obtain relations between concrete arithmetic expressions.

Variables in Vadalog are more like variables in algebra than like those of imperative programming languages.

Specifically, a good interpretation is the following.

Variables in Vadalog are like variables in first-order logic.

For example, consider the following statements:

"For any man X there exists a father Y"
"Every father X is a man"

These statements can be true or false depending on how we instantiate X and Y, that is, which concrete values we choose for them. The phrases "for any" and "every" (or "for all") express universal quantification, while "there exists" expresses existential quantification.

It should be noted that a Vadalog variable is local to the rule in which it occurs. This means that occurrences of the same variable name in different rules refer to different variables.

Variables cannot occur in facts.

A variable such as X is just a placeholder. In order to use it in a computation, we must instantiate it, i.e., replace it with a concrete value. The value is called the instantiation or binding.

There are several ways in which a Vadalog variable can be instantiated.

If the variable occurs in an atom in the body of a rule, the variable can then become instantiated to a value derived from the values in the predicate.

In general, a bound variable should be positively bound, i.e., it should have a binding occurrence that is not in the scope of a negation.

Variables in Vadalog need to be capitalized, and can contain underscores.

Anonymous Variables

To ignore certain arguments of a predicate in a rule body, one can use anonymous variables, written with the underscore symbol (_), as in the following example:

t("Text", 1, 2).
t("Text2", 1, 2).
b(X) :- t(X, _, _).
@output("b").
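The effect of the rule can be mimicked in Python, where an underscore is likewise a conventional throwaway name. This is only a sketch of what the rule computes; the names t_facts and derive_b are made up for the illustration.

```python
# The two t facts from the example, as tuples.
t_facts = [("Text", 1, 2), ("Text2", 1, 2)]


def derive_b(facts):
    """Mimic b(X) :- t(X, _, _): keep the first argument, ignore the other two."""
    return {(x,) for (x, _, _) in facts}


print(sorted(derive_b(t_facts)))  # [('Text',), ('Text2',)]
```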

Marked Nulls

A marked null represents an identifier for an unknown value. Marked nulls are produced in two ways: from nulls in the data sources (unless the data source supports marked nulls, all of them are assumed to have different identifiers), and from existential quantification.

The type of a marked null is always unknown.

The following two examples show possible uses of marked nulls. Many more are indeed possible and important in ontological reasoning.

Example 1

employee(1).
employee(2).
manager(Y,X) :- employee(X).
@output("manager").

This ontology states that every employee has a manager. Because Y occurs only in the head of the rule, it is existentially quantified. The expected result is:

Expected result

manager(z1,1). manager(z2,2).

where z1 and z2 are marked nulls, representing that there must be a manager for each of the employees, but their identity is unknown.
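The way one fresh null is introduced per employee can be sketched in Python. The sketch represents nulls as the strings "z1", "z2", and so on purely for display; real marked nulls are not strings, and apply_manager_rule is a hypothetical name.

```python
from itertools import count


def apply_manager_rule(employees):
    """Mimic manager(Y,X) :- employee(X): invent one fresh null Y per employee X."""
    fresh = count(1)
    return [(f"z{next(fresh)}", x) for x in employees]


print(apply_manager_rule([1, 2]))  # [('z1', 1), ('z2', 2)]
```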

Example 2

employee("Jack").
contract("Jack").
employee("Ruth").
contract("Ruth").
employee("Ann").
hired("Ann","Ruth").
manager(Z,X) :- employee(X).
hired(Y,X) :- manager(Y,X),contract(X).
contractSigned(X) :- hired(Y,X),manager(Y,Z).
@output("contractSigned").

Example 2 expresses a simple ontology stating that every employee X has a manager Y. If the manager Y sees that there is a pending contract for the respective employee X, then the manager hires the employee. Once a manager Y has hired an employee X, the respective contract of X is signed. If someone has been hired by an employee who is not a manager, then the contract will not be signed on that account. Observe that the name of the manager is unknown throughout the entire processing. The expected result is:

Expected result

contractSigned("Jack"). contractSigned("Ruth").
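To see why hired("Ann","Ruth") contributes nothing, the three rules can be traced by hand in a few lines of Python. This is a sketch of the reasoning, not of the Vadalog engine: the Null class is a hypothetical stand-in for marked nulls, and a single pass per rule is enough only because none of the rules here is recursive.

```python
from itertools import count

employee = {"Jack", "Ruth", "Ann"}
contract = {"Jack", "Ruth"}
hired = {("Ann", "Ruth")}


class Null:
    """Stand-in for a marked null: an identifier for an unknown value."""

    def __init__(self, n):
        self.label = f"z{n}"

    def __repr__(self):
        return self.label


fresh = count(1)

# manager(Z,X) :- employee(X).  One fresh null per employee.
manager = {(Null(next(fresh)), x) for x in sorted(employee)}

# hired(Y,X) :- manager(Y,X), contract(X).
hired |= {(y, x) for (y, x) in manager if x in contract}

# contractSigned(X) :- hired(Y,X), manager(Y,Z).  Y must manage someone.
managers = {y for (y, _) in manager}
contract_signed = {x for (y, x) in hired if y in managers}

print(sorted(contract_signed))  # ['Jack', 'Ruth']
```

"Ann" never appears in the first position of a manager fact, so the fact hired("Ann","Ruth") fails the last rule; Ruth's contract is signed only because of the unknown manager invented for her.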