# Joshua Munn's Website

## About

## Log

### - microKanren, cons, and Python (part I)

I've been implementing µKanren^{1} in Python. One challenge so far has been creating a sequence abstraction with semantics similar to Lisp lists. This is important as it's necessary to be able to decompose sequences into parts for unification, such that:

- I can get the first element of a sequence;
- I can get the rest of the sequence; and
- the "rest" of the sequence might be a sequence or a non-iterable value (including logic variables).

#### Cons

The ubiquitous data structure in Lisp-like languages is a two part cell known as a cons or a pair. It can be thought of as a struct with two fields, instances of which are created using the function `cons`

. Conceptually, `(cons 1 2)`

yields the pair containing `1`

in its first part and `2`

in its second part.

┌───┬───┐ │ 1 ┊ 2 │ └───┴───┘ (cons 1 2)

Lists are composed of pairs - a proper list is a pair in which the second part contains either a proper list or the empty list.

`(cons 1 (cons 2 (cons 3 '())))`

creates the list containing 1, 2, and 3 ^{2}^{, }^{3}:

┌───┬───┐ ┌───┬───┐ ┌───┬───┐ │ 1 ┊ ╾────┤ 2 ┊ ╾────┤ 3 ┊ ∅ │ └───┴───┘ └───┴───┘ └───┴───┘ A proper list ┌───┬───┐ ┌───┬───┐ │ 1 ┊ ╾────┤ 2 ┊ 3 │ └───┴───┘ └───┴───┘ An improper list (not terminated by the empty list) I do enjoy box diagrams!

#### Python

Getting back to Python: first you might try to emulate the behaviour of Lisp lists using the builtin `list`

(or `tuple`

), which meets our first two criteria and can be conveniently decomposed with unpacking assignments:

the_list = [1, 2, 3, 4] first, *rest = the_list # also the_list[0], the_list[1:] first, rest

(1, [2, 3, 4])

This unfortunately fails the third criteria - I want to be able to get the "rest" of the list, allowing the "rest" to be a list *or* something else.

I would like to be able to construct a sequence such that the terminating element is some entity that represents that the sequence is not yet fully instantiated - I don't know what the rest of the sequence is, yet - but it may become a proper sequence at some point in time. That last sentence seems a little "spooky" in the Fox Mulder sense, so here's an example:

`(1, 2, 3 ? x)`

represents a sequence in which the suffix is unknown;- if at some point in time
`x`

is found to represent the sequence`(4, 5)`

, the original sequence is now known to be`(1, 2, 3, 4, 5)`

; - if
`x`

is found to represent the single element`4`

, the sequence is now known to be`(1, 2, 3, 4)`

; and - if
`x`

is found to represent the empty sequence, the sequence is now known to be`(1, 2, 3)`

.

#### A new data type

One could try to conditionally handle the retrieval of the "rest" part of a sequence everywhere it is needed, but I think a cleaner approach is to encapsulate the desired behaviour in a new data type. My first attempt was a class that encapsulated the semantics of unpacking assignments, `UnpackingAssignment`

. It looked something like this:

class UnpackingAssignment: def __init__(self, *heads, glob=None, rest=None): self.heads = heads self.glob = glob self.rest = rest pack = UnpackingAssignment # conflicting naming, but it's a convenient shorthand pack(1, 2, glob=[3, 4], rest=[5]) # a, b, *c, d = [1,2,3,4,5] pack(glob=[3, 4], rest=[5, 6]) # *a, b, c = [3,4,5,6] pack(1, glob=[2, 3]) # a, *b = [1,2,3]

The variadic nature of the fields makes unification tricky, particularly once you add logic variables and arbitrarily deeply nested structures into the mix. One area of difficulty is unifying `UnpackingAssignment`

instances that contain "overlapping" fields. In the following examples, variable names represent logic variables, `≡`

represents unification, and `{ u₀ ↦ v₀, u₁ ↦ v₁, ... }`

represents the resulting set of mappings.

Some simple cases:

`pack(1, glob=x) ≡ [1, 2, 3]`

`{ x ↦ [2, 3] }`

`pack(x, glob=[2, 3]) ≡ [1, 2, 3]`

`{ x ↦ 1 }`

`pack(glob=x, heads=[3]) ≡ [1, 2, 3]`

`{ x ↦ [1, 2]}`

`pack(x, glob=[2, 3], rest=[y]) ≡ [1, 2, 3, 4]`

`{ x ↦ 1, y ↦ 4 }`

And some more interesting ones:

`pack(1, glob=x, rest=y) ≡? [1, 2, 3, 4]`

- intuition says that the set of results should be:
`{ x ↦ [], y ↦ [1, 2, 3, 4] }`

`{ x ↦ [1], y ↦ [2, 3, 4] }`

`{ x ↦ [1, 2], y ↦ [3, 4] }`

(and so on)

- intuition says that the set of results should be:
`pack(1, 2, glob=x, rest=[4, 5]) ≡? pack(1, glob=y, pack(glob=x, rest=[4, 5]))`

- intuitively these terms should unify with
`{ x ↦ [3], y ↦ [2] }`

- intuitively these terms should unify with
`pack(x, glob=[2, 3, 4], rest=[5, 6, 7]) ≡? pack(1, 2, glob=[3], rest=y)`

- again, there is an intuitive set of results (
`{ x ↦ 1, y ↦ [4, 5, 6, 7] }`

), but translating these intuitions into an implementation feels a bit like solving the Towers of Hanoi in the dark.

- again, there is an intuitive set of results (

### - A miniKanren quine

Here's a miniKanren quine that runs on the miniKanren implementation from The Reasoned Schemer:

((lambda (_) (run* x (conde ((≡ x _)) ((≡ x (list 'quote _)))))) '(lambda (_) (run* x (conde ((≡ x _)) ((≡ x (list 'quote _)))))))

Depending on how one interprets the rules, the quine posted below may not qualify. From Wikipedia:

A quine is a computer program which takes no input and produces a copy of its own source code as its only output.

If you take `output`

to mean something like *"textual data written to a file"*, then it does satisfy the criteria. However, if you extend `output`

to mean *"the result of evaluation"*, it does not, as the result of evaluation is:

('(q:={q!r},print(f{q!r}))', None)

I guess this is indicative of differences between Python and languages in the Lisp family. The miniKanren quine above plays loose with line breaks, in the spirit of its output being the result of evaluating it in a REPL.

### - A Python quine

Here's a Python 3 quine I wrote:

(q:='(q:={q!r},print(f{q!r}))',print(f'(q:={q!r},print(f{q!r}))'))

## Footnotes:

^{1}

the (very readable) paper, graciously hosted by William Byrd