py-typedlogic tutorial

This tutorial walks you through some of the basic features of the typedlogic library, and how to define logical formulae for Python data structures and programs.

The tutorial makes use of a simple database of Link structures, which represent direct steps in a path through different US states. We will define a predicate Path that represents the transitive closure of the Link predicate.

For example, if we have Links:

CA -> NV
NV -> AZ
AZ -> UT

Then we have 3 paths of length 1 (corresponding to the links), as well as paths of length 2 from CA to AZ and NV to UT, and a path of length 3 from CA to UT. For now, we assume unidirectional links.

Installation

To install the core package:

pip install typedlogic

This tutorial makes use of the Snakelog and Z3 integrations, so you would need these two extras:

pip install 'typedlogic[snakelog,z3]'

This tutorial is also a Jupyter notebook, which can be executed interactively.

Defining our core data structures

You can define your data model through a different mechanisms. Currently, the recommended ways are either

Pydantic classes, OR
Python dataclasses

Support for other mechanisms (e.g SQLModels, Pandera) is planned.

For this tutorial, we will use a ready-made pydantic file containing the definition of our data model, which will consist of two classes:

Link, representing a direct edge between two entities
Path, representing a path of one or more edges

(For the purposes of this tutorial, we will use a show function to display the contents of our program, but this step is not necessary for everyday use).

We will use this to show the contents of the paths/model.py file.

# convenience function for running in a notebook
from utils import show
show("paths/model.py")

from typedlogic.integrations.frameworks.pydantic import FactBaseModel

ID = str

class Link(FactBaseModel):
    source: ID
    target: ID

class Path(FactBaseModel):
    source: ID
    target: ID

We can use this just as we would any other Pydantic schema. Here we will import it, and create instances of the Link class, representing direct connections between US states. For our purposes here, we will consider these directed.

from paths.model import Link

data = [("CA", "NV"), ("NV", "AZ"), ("AZ", "UT"), ("CA", "OR"), ("OR", "WA"), ("WA", "ID"), ("ID", "MT")]
links = [Link(source=source, target=target) for source, target in data]
links[0:2]

[Link(source='CA', target='NV'), Link(source='NV', target='AZ')]

Because we chose to use Pydantic, we get excellent runtime validation on object creation. Our schema so far has minimal constraints, but we have declared that the types of the two fields are strings. This means that we will get a validation error if we try to create a Link with non-string values.

from pydantic import ValidationError

try:
    new_link = Link(1,2)
except ValidationError as e:
    print("We got a validation error!")
    print("** Don't worry, this is expected **")
    print(e)

We got a validation error!
** Don't worry, this is expected **
2 validation errors for Link
source
  Input should be a valid string [type=string_type, input_value=1, input_type=int]
    For further information visit https://errors.pydantic.dev/2.9/v/string_type
target
  Input should be a valid string [type=string_type, input_value=2, input_type=int]
    For further information visit https://errors.pydantic.dev/2.9/v/string_type

So far everything we have done in standard Pydantic! Now, onto so unique typedlogic features:

Adding axioms

Next we want to define logical axioms that determine what inferences to expect from our data model. In particular, we want to define axioms (in this case, simple rules) that

Derives one-hop Paths from the Link predicate.
Derives multi-hop Paths from the transitive closure of the one-hop Path predicate.

We have a ready made program for doing this:

show("paths/axioms.py")

from paths.model import Link, Path, ID
from typedlogic.decorators import axiom

@axiom
def link_implies_path(x: ID, y: ID):
    """
    The presence of a link implies the existence of a (one-hop) path.
    """
    if Link(source=x, target=y):
        assert Path(source=x, target=y)

@axiom
def transitivity(x: ID, y: ID, z: ID):
    """
    If there is a path from x to y and a path from y to z,
    then there is a path from x to z.
    """
    if Path(source=x, target=y) and Path(source=y, target=z):
        assert Path(source=x, target=z)

Note that typedlogic is based in Python, we can take advantage of python constructs like import to modularize our theory.

This uses the @axiom decorator to define the rules. These may look a little unusual at first - they are not intended to be executed directly, instead they provide a declarative specification of the rules of our system.

If you are familiar with first-order logic, you can read each decorated function as a "forall" (universal quantifier) statement, where the arguments are the (typed) variables in the quantifier. If you are not, don't worry for now. You can just read each of the axioms as a simple rule.

Type Checking

We can type check our program using the validate method on PythonParser.

Behind the scenes, this is delegating to mypy

from typedlogic.parsers.pyparser import PythonParser
python_parser = PythonParser()
import pathlib
python_parser.validate(pathlib.Path("paths/axioms.py"))

[]

As expected, no issues

Solvers

The rules are not executed directly - instead they are used as inputs for solvers. Different solvers will operate differently, but broadly they can be thought of as fulfilling two roles:

Finding new facts: Given a set of rules and a set of facts, the solver can derive new facts that are logically entailed by the rules and the facts.
Checking whether a set of facts is consistent with the rules, or whether the rules themselves are consistent.

Not all solvers are capable of fulfilling both roles, and solvers may vary in what expressions they understand and how fast they run. py-typedlogic provides as easy bridge onto a number of different solvers, through its integrations mechanism.

We will start with a simple pure python builtin solver called SnakeLog. This implements only a subset of logic (formally, it is a small subset of datalog), but it has the advantage of requiring no additional installation steps, and being fast for large datasets.

First we'll create the solver object, and load the axioms into it.

import paths.axioms as paths_axioms
from typedlogic.integrations.solvers.snakelog import SnakeLogSolver
solver = SnakeLogSolver()
solver.load(paths_axioms)

Next we'll add our data to the solver. In this case, we'll add the Link objects we created earlier.

for link in links:
    solver.add(link)

Now we will ask the solver for a model.

Note the term "model" is a bit ambiguous. If you are coming from a data modeling and Pydantic background, "model" means schema or data model. The use in formal systems and "model theory" is a bit different. For now you can think of it as representing a universe of things that might possibly be true.

We'll get onto what it would mean for there to be multiple models later, but our simple snakelog solver (like many datalog systems) will just yield a single model, so we'll get that.

model = solver.model()

We can ask the model to retrieve all ground terms (or facts) that can be proved and are consistent in that model.

Here we use iter_retrieve to get all the Path objects that can be derived from the axioms

for path in model.iter_retrieve("Path"):
    print(path)

Path(AZ, UT)
Path(CA, AZ)
Path(CA, ID)
Path(CA, MT)
Path(CA, NV)
Path(CA, OR)
Path(CA, UT)
Path(CA, WA)
Path(ID, MT)
Path(NV, AZ)
Path(NV, UT)
Path(OR, ID)
Path(OR, MT)
Path(OR, WA)
Path(WA, ID)
Path(WA, MT)

We can see it has given us all transitive / multi-hop paths that can be derived from the starting facts.

We can also derive the one-hop links, but this is not very interesting, corresponding to the initial database (because we provided no axioms to entail new links).

Let's check this:

original_links = {(link.source, link.target) for link in links}
entailed_links = {term.values for term in model.iter_retrieve("Link")}
assert original_links == entailed_links

We can encode more complex inference rules, using predicates with more arguments, encoding the complexity of our domain. We will return to some examples later.

For now, let's switch to a different use case, checking our data and our data models.

Satisfiability Checking

For this we're going to switch to a more expressive solver, the Z3 Prover. You will need to install this as an extra, either:

pip install 'typedlogic[z3]'

We will set up the solver in the same way.

This time we won't ask for a model, we will instead run check which checks for satisfiability.

from typedlogic.integrations.solvers.z3 import Z3Solver

solver = Z3Solver()
solver.load(paths_axioms)
for link in links:
    solver.add(link)
solver.check()

Solution(satisfiable=True)

So far so good! Our data and theory is consistent and satisfiable. But this is not surprising as we have not specified any axioms involving negation.

Let's add an axiom that says that there are no cycles, i.e there is no path that leads back to itself. We can write this in FOL as:

$$ \forall x. \neg Path(x, x) $$

But here we will write this in Python syntax using the @axiom decorator:

@axiom
def acyclicity(x: ID, y: ID):
    assert not (Path(source=y, target=x) and Path(source=x, target=y))

Here is what our modified program looks like:

show("paths/axioms_strict.py")

from paths.model import Link, Path, ID
from typedlogic.decorators import axiom

@axiom
def link_implies_path(x: ID, y: ID):
    """Same as before"""
    if Link(source=x, target=y):
        assert Path(source=x, target=y)

@axiom
def transitivity(x: ID, y: ID, z: ID):
    """Same as before"""
    if Path(source=x, target=y) and Path(source=y, target=z):
        assert Path(source=x, target=z)

@axiom
def acyclicity(x: ID, y: ID):
    """No path should lead from a node back to itself"""
    assert not (Path(source=y, target=x) and Path(source=x, target=y))

Now let's set up the solver with the new axioms and check for satisfiability again. We will also add a link that creates a cycle.

solver = Z3Solver()
import paths.axioms_strict as paths_axioms_strict
solver.load(paths_axioms_strict)
for link in links:
    solver.add(link)
solver.add(Link(source="MT", target="CA"))
solver.check().satisfiable

False

Hurrah! As expected, the solver has found that the data is not satisfiable, because we have added a cycle.

Currently there is no single easy way to see an explanation, but this will be provided in the future.

Multiple Models

Next we will explore what happens when we add sentences that lead to multiple possible worlds in which all sentences are true. These separate "worlds" are typically called "models" in formal logic.

We will use the Clingo solver, which is based on Answer Set Programming (ASP) which allows for enumeration of these models.

We will set up our base knowledge base as before:

from typedlogic.integrations.solvers.clingo import ClingoSolver
solver = ClingoSolver()
import paths.axioms_strict as paths_axioms_strict
solver.load(paths_axioms_strict)
for link in links:
    solver.add(link)

Next we'll add uncertainly using the ExactlyOne boolean sentence type:

from typedlogic import ExactlyOne
solver.add(ExactlyOne(Link("MT", "ND"), Link("MT", "SD")))

We'll check this gives two worlds:

models = list(solver.models())
assert len(models) == 2

Next we will explore which paths are present in each world:

for n, model in enumerate(models):
    print(f"Model: {n}")
    for t in model.iter_retrieve("Path", "CA", "ND"):
        print(f" has path CA to ND: {t}")
    for t in model.iter_retrieve("Path", "CA", "SD"):
        print(f" has path CA to SD: {t}")

Model: 0
 has path CA to SD: Path(CA, SD)
Model: 1
 has path CA to ND: Path(CA, ND)

As expected there is one world with a path from CA to SD, and another with a path from CA to ND.

We can collapse the number of universes by ruling out a path. So far we have only asserted the base one-hop links, but there is nothing to stop us making assertions about multi-hop paths. We can do this by adding a negation of a path to the solver, here stating there is no path from CA to ND:

from paths.model import Path
solver.add(~Path("CA", "ND"))

This will collapse the number of models to one:

models = list(solver.models())
assert len(models) == 1