# Josh Munn's Website

microkanren, cons, and Python

## Table of Contents

I've been implementing µKanren^{1}^{, }^{2} in Python. One challenge so far has been creating a sequence abstraction with semantics similar to Lisp lists. This is important as it's necessary to be able to decompose sequences into parts for unification, such that:

- I can get the first element of a sequence;
- I can get the rest of the sequence; and
- the "rest" of the sequence might be a sequence or a non-iterable value (including logic variables).

## Cons

The ubiquitous data structure in Lisp-like languages is a two part cell known as a cons or a pair. It can be thought of as a struct with two fields, instances of which are created using the function `cons`

. Conceptually, `(cons 1 2)`

yields the pair containing `1`

in its first part and `2`

in its second part.

┌───┬───┐ │ 1 ┊ 2 │ └───┴───┘ (cons 1 2)

Lists are composed of pairs - a proper list is a pair in which the second part contains either a proper list or the empty list.

`(cons 1 (cons 2 (cons 3 '())))`

creates the list containing 1, 2, and 3 ^{3}^{, }^{4}:

┌───┬───┐ ┌───┬───┐ ┌───┬───┐ │ 1 ┊ ╾────┤ 2 ┊ ╾────┤ 3 ┊ ∅ │ └───┴───┘ └───┴───┘ └───┴───┘ A proper list ┌───┬───┐ ┌───┬───┐ │ 1 ┊ ╾────┤ 2 ┊ 3 │ └───┴───┘ └───┴───┘ An improper list (not terminated by the empty list)

## Python

Getting back to Python: first you might try to emulate the behaviour of Lisp lists using the builtin `list`

(or `tuple`

), which meets our first two criteria and can be conveniently decomposed with unpacking assignments:

the_list = [1, 2, 3, 4] first, *rest = the_list # also the_list[0], the_list[1:] first, rest

(1, [2, 3, 4])

This unfortunately fails the third criteria - I want to be able to get the "rest" of the list, allowing the "rest" to be a list *or* something else.

I would like to be able to construct a sequence such that the terminating element is some entity that represents that the sequence is not yet fully instantiated - I don't know what the rest of the sequence is, yet - but it may become a proper sequence at some point in time. That last sentence seems a little "spooky" in the Fox Mulder sense, so here's an example:

`(1, 2, 3 ? x)`

represents a sequence in which the suffix is unknown;- if at some point in time
`x`

is found to represent the sequence`(4, 5)`

, the original sequence is now known to be`(1, 2, 3, 4, 5)`

; - if
`x`

is found to represent the single element`4`

, the sequence is now known to be`(1, 2, 3, 4)`

; and - if
`x`

is found to represent the empty sequence, the sequence is now known to be`(1, 2, 3)`

.

## A new data type

One could try to conditionally handle the retrieval of the "rest" part of a sequence everywhere it is needed, but I think a cleaner approach is to encapsulate the desired behaviour in a new data type. My first attempt was a class that encapsulated the semantics of unpacking assignments, `UnpackingAssignment`

. It looked something like this:

class UnpackingAssignment: def __init__(self, *heads, glob=None, rest=None): self.heads = heads self.glob = glob self.rest = rest pack = UnpackingAssignment # conflicting naming, but it's a convenient shorthand pack(1, 2, glob=[3, 4], rest=[5]) # a, b, *c, d = [1,2,3,4,5] pack(glob=[3, 4], rest=[5, 6]) # *a, b, c = [3,4,5,6] pack(1, glob=[2, 3]) # a, *b = [1,2,3]

The variadic nature of the fields makes unification tricky, particularly once you add logic variables and arbitrarily deeply nested structures into the mix. One area of difficulty is unifying `UnpackingAssignment`

instances that contain "overlapping" fields. In the following examples, variable names represent logic variables, `≡`

represents unification, and `{ u₀ ↦ v₀, u₁ ↦ v₁, ... }`

represents the resulting set of mappings.

Some simple cases:

`pack(1, glob=x) ≡ [1, 2, 3]`

`{ x ↦ [2, 3] }`

`pack(x, glob=[2, 3]) ≡ [1, 2, 3]`

`{ x ↦ 1 }`

`pack(glob=x, heads=[3]) ≡ [1, 2, 3]`

`{ x ↦ [1, 2]}`

`pack(x, glob=[2, 3], rest=[y]) ≡ [1, 2, 3, 4]`

`{ x ↦ 1, y ↦ 4 }`

And some more interesting ones:

`pack(1, glob=x, rest=y) ≡? [1, 2, 3, 4]`

- intuition says that the set of results should be:
`{ x ↦ [], y ↦ [1, 2, 3, 4] }`

`{ x ↦ [1], y ↦ [2, 3, 4] }`

`{ x ↦ [1, 2], y ↦ [3, 4] }`

(and so on)

- intuition says that the set of results should be:
`pack(1, 2, glob=x, rest=[4, 5]) ≡? pack(1, glob=y, pack(glob=x, rest=[4, 5]))`

- intuitively these terms should unify with
`{ x ↦ [3], y ↦ [2] }`

- intuitively these terms should unify with
`pack(x, glob=[2, 3, 4], rest=[5, 6, 7]) ≡? pack(1, 2, glob=[3], rest=y)`

- again, there is an intuitive set of results (
`{ x ↦ 1, y ↦ [4, 5, 6, 7] }`

), but translating these intuitions into an implementation feels a bit like solving the Towers of Hanoi in the dark.

- again, there is an intuitive set of results (