The Egison Programming Language

In this chapter and the next, we briefly explain the syntax of Egison. This chapter covers Egison as an ordinary purely functional programming language. Patterns and pattern matching, the most important feature of Egison, are explained in the next chapter. Readers familiar with other functional programming languages can skip this chapter.

Top Expressions

define and test Expressions
load and load-file Expressions

Built-in Data

Boolean Values
Integers
Rational Numbers
Floats
Characters
Strings
Undefined

Objects

Six Enclosures
Inductive Data
Tuples (Multiple Values)
Collections
Tensors
Hash Maps

Syntax of Egison

lambda expressions
cambda expressions
let expressions
let* expressions
letrec expressions
if expressions
capply expressions
generate-tensor expressions
tensor-shape expressions
seq expressions

Top Expressions

`define` and `test` Expressions

We first explain two kinds of top-level expressions: define expressions and test expressions. A define expression binds a variable to a value or a function. A test expression evaluates the given expression, so we can use it to try out a function defined with a define expression. The test keyword can be omitted.

(define $x 2)
(test (+ x 3));=>5
  
(define $f (lambda [$x $y] [(+ x y) (* x y)]))
(test (f 2 4));=>[6 8]
; We can omit 'test'.
(f 2 4);=>[6 8]

`load` and `load-file` Expressions

We can load Egison libraries with load expressions.

To load your own program, use a load-file expression. A load-file expression takes a full path or a relative path to the Egison program file.

; Load Egison library
(load "lib/core/number.egi")

; Load your program
(load-file "myfile.egi")

Built-in Data

Boolean values, integers, rational numbers, floating-point numbers, characters, and strings are provided as built-in data in Egison.

Boolean Values

#t represents true.
#f represents false.

;; Boolean values
#t ; True
#f ; False

Integers

A string that consists only of digits is a number literal. An integer literal is a number literal, or a - followed by a number literal. Complex numbers are also supported as built-in data, so Gaussian integers can be written using i.

;; Integers
1
0
-100

;; Gaussian integers
(+ 1 i)
(* 2 i)

Rational Numbers

Egison supports rational numbers. As the example below shows, a rational number is reduced to lowest terms.

;; Rational numbers
(/ 3 2)
(/ 6 4);=>(/ 3 2)

Floats

A floating-point literal is the concatenation of an integer literal, a period (.), and a number literal.

;; Float numbers
10.2
-100.1

;; Complex numbers
1.0+1.0i
-3.5i

Characters

A character literal is written as c# followed by the character.

;; Characters
c#a
c#1
c#\n

Strings

A string literal is a string enclosed in double quotes.

;; Strings
"Hello world!\n"

Undefined

undefined is a built-in value that you can use as a placeholder for code you have not written yet.

undefined

Objects

Six Enclosures

First of all, we explain the enclosures of Egison. You may be surprised by how many kinds of enclosures there are when you first see Egison code. Egison has six kinds of enclosures: parentheses ( ), angle brackets < >, square brackets [ ], braces { }, double brackets [| |], and double braces {| |}. They represent a procedure call, an inductive datum, a tuple, a collection, a tensor, and a hash map, respectively. Roughly speaking, use parentheses as in Lisp, angle brackets to write patterns or your own data, double brackets to create a tensor, and double braces to create a hash map. Square brackets and braces are both used to collect values. The difference is the number of values: square brackets collect a fixed number of values, while braces collect an arbitrary number.
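
As a quick side-by-side reference, the six enclosures can be sketched as follows. The values are reused from the examples in this chapter and are only illustrative.

```egison
(+ 1 2)              ; parentheses: procedure call
<Leaf 1>             ; angle brackets: inductive datum
[1 "abc"]            ; square brackets: tuple
{1 2 3}              ; braces: collection
[| 1 2 3 |]          ; double brackets: tensor
{| [1 11] [2 12] |}  ; double braces: hash map
```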

Inductive Data

  <Identifier value ...>

We can create a new object by combining objects. Such an object is called an inductive datum. An inductive datum can contain values and, in particular, other inductive data; this is why it is called "inductive". Note that the name must start with an uppercase letter.

;; Inductive data
<Card <Diamond> 13>
<Node 3 <Node <Leaf 1> <Leaf 2>> <Leaf 4>>

Tuples (Multiple Values)

  [value ...]

A tuple is expressed as a sequence of elements enclosed in square brackets.

;; Tuples
[]
[1 2]
[#t "abc" 4]

If a tuple consists of a single element, the tuple and the element are the same.

[1];=>1
[[[[[[[[[[["too many"]]]]]]]]]]];=>"too many"

Collections

  {value ...}

A collection is a sequence of elements enclosed in braces. Unlike a tuple, a collection with a single element is different from that element.

;; Collections
{}
{1 2 3 4 5}

A collection may contain another collection as an element. Normally, the elements of such an inner collection are not elements of the outer collection. Placing @ before the inner collection splices it, so that its elements become elements of the outer collection. Using this notation, you can construct a collection from subcollections.

{1 @{2 3} 4 5};=>{1 2 3 4 5}
{1 @{2 @{3 4}} 5};=>{1 2 3 4 5}

Tensors

  [|value ...|]

A tensor is a sequence of elements enclosed in double brackets. Appending an underscore _ and an index to a tensor gives the element at that index; as the examples below show, indices start at 1. If the index is larger than the size of the tensor, you will get an error.

;; Tensors
(define $t [| 1 2 3 4 5 |])
(tensor-shape t);=> {5}
t_1;=>1
t_5;=>5
t_8;=>error!

We can get the size of a tensor with tensor-shape.

(tensor-shape [| 1 2 3 4 5 |]);=>{5}

A tensor can have other tensors as its elements, which allows us to build multi-dimensional tensors.

[| [| 1 2 3 |] [| 4 5 6 |] [| 7 8 9 |] |]_1;=>[| 1 2 3 |]
[| [| 1 2 3 |] [| 4 5 6 |] [| 7 8 9 |] |]_2_3;=>6

Egison provides special syntax for tensors: generate-tensor, tensor-shape, and tensor-ref. generate-tensor gives an easy way to create complicated tensors, and tensor-shape returns the size of a tensor. generate-tensor and tensor-shape are described in their own subsections.

Hash Maps

  {|[key value] ...|}

A hash map is a sequence of key-value pairs enclosed in double braces. Appending an underscore _ and a key to a hash map gives the value associated with that key. If the key is not in the hash map, the result is undefined.

{| [1 11] [2 12] [3 13] [4 14] [5 15] |}_1;=>11
{| [1 11] [2 12] [3 13] [4 14] [5 15] |}_4;=>14
{| [1 11] [2 12] [3 13] [4 14] [5 15] |}_8;=>undefined

Syntax of Egison

`lambda` expressions

  (lambda [variable ...] formula)

Lambda expressions create functions, as in other functional programming languages. A lambda expression takes two arguments. The first is a tuple of variables, which are the parameters of the function. Note that [$x] and $x are the same. The second argument is a formula, which is the body of the function.

((lambda [$x $y] (+ x y)) 3 7);=>10
((lambda $x (not x)) #t);=>#f

Since version 3.0, lambda expressions also have a simpler notation, in which you omit both "lambda" and the parameter list. You refer to the i-th argument by writing $ followed by i. If each argument occurs exactly once and the occurrences appear in the same order as the arguments, you can also omit the number after $. That is, (lambda [$x $y] (+ x y)), (+ $1 $2), and (+ $ $) are the same. Although this notation is powerful, it is limited to simple functions: the body must be a single function application, and every occurrence of $i must be a direct argument of that application. For example, you can't write (+ $1 (* $2 2)) or (if $1 #f #t).

((+ $1 $2) 3 4);=>7
((+ $ $) 3 4);=>7
((* $1 $1) 5);=>25
((map $2 $1) {1 2 3} (+ $ 1));=>{2 3 4}

`cambda` expressions

  (cambda variable formula)

The name cambda comes from the combination of collection and lambda. cambda differs from lambda in that it can take an arbitrary number of arguments. For instance, the following code defines a function that calculates the average of its arguments.

(define $average (cambda $xs (/ (sum xs) (length xs))))
(average 1 2 3);=>2
(average 1 2 3 4 5);=>3

Note that cambda requires only one argument variable, which is bound to the collection consisting of its arguments (e.g. in the case of (average 1 2 3), the variable xs is bound to {1 2 3}).

`let` expressions

  (let {[variable formula] ...} formula)

A let expression takes two arguments. The first argument is a collection of pairs, each consisting of a variable and a formula. Each formula is evaluated when its variable is needed during evaluation of the second argument, and the variable is bound to the result. Since the formulas in the first argument are evaluated in the original environment, a formula cannot refer to variables bound in the same let expression.

(let {[$x 1] [$y 2]} (+ x y));=>3
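
The scoping rule can be sketched as follows, assuming a hypothetical top-level definition of x. Because the binding formulas are evaluated in the original environment, the x in the formula for y refers to the outer x, not to the x bound in the same let expression.

```egison
; Hypothetical top-level binding, for illustration only
(define $x 10)

; (+ x 1) sees the outer x (10), not the x bound to 1 in this let,
; so under the rule stated above y is 11
(let {[$x 1] [$y (+ x 1)]} y)
```

If the second binding should see the first, a let* expression is the appropriate form.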

`let*` expressions

  (let* {[variable formula] ...} formula)

A let* expression is syntactic sugar that avoids nested let expressions. It is desugared as follows.

(let* {[$x 2] [$y (+ x 1)]} y)
;=>(let {[$x 2]} (let {[$y (+ x 1)]} y))
;=>3

`letrec` expressions

  (letrec {[variable formula] ...} formula)

A letrec expression is the same as a let expression, except that the definitions in the first argument can be recursive. Mutual recursion is also allowed.

(letrec {[$x #t] [$y x]} (not y));=>#f

(letrec {[$evens {2 @(map (+ $ 2) evens)}]}
  (take 10 evens))
;=>{2 4 6 8 10 12 14 16 18 20}

(letrec {[$odds {1 @(map (+ $ 1) evens)}]
         [$evens (map (+ $ 1) odds)]}
  (take 10 evens))
;=>{2 4 6 8 10 12 14 16 18 20}

`if` expressions

  (if boolean formula formula)

This is an ordinary if expression. Note, however, that the first argument must evaluate to a boolean value (i.e. #t or #f).

(if #t "YES" "NO");=>"YES"
(if #f "YES" "NO");=>"NO"

`capply` expressions

  (capply function {value ...})

Use a capply expression when you have a function and its arguments as a collection and want to apply the function to them. The result is the result of applying the function to the elements of the collection. That is, (capply f {x₀ x₁ x₂}) is the same as (f x₀ x₁ x₂).

(capply + {1 2});=>3
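
As a small sketch of the same equivalence with a user-defined function, the arguments below are first bound to a variable and then passed with capply. The name args is chosen only for this illustration.

```egison
; args is a hypothetical name for the collected arguments
(define $args {3 4})

; By the equivalence above, this is the same as
; ((lambda [$x $y] (* x y)) 3 4), which is 12
(capply (lambda [$x $y] (* x y)) args)
```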

`generate-tensor` expressions

  (generate-tensor function collection-of-natural-numbers)

Egison is equipped with two ways to generate a tensor. One is to write elements explicitly using double brackets [| |]. The other is this generate-tensor expression.

(generate-tensor (lambda [$i] i) {5})
;=>[| 1 2 3 4 5 |]
(generate-tensor + {2 2})
;=>[| [| 2 3 |] [| 3 4 |] |]

The first argument is a function that takes the indices of an element and returns that element, and the second argument is a collection giving the size of each dimension. Note that the function must take as many arguments as the second argument has elements. For example, the first expression above creates a tensor of size 5, and the following expressions create tensors of size 5×3.

(generate-tensor * {5 3})
;=>[|[|1 2 3|] [|2 4 6|] [|3 6 9|] [|4 8 12|] [|5 10 15|]|]
(generate-tensor (lambda [$x $y] (+ (* 10 x) y)) {5 3})
;=>[|[|11 12 13|] [|21 22 23|] [|31 32 33|] [|41 42 43|] [|51 52 53|]|]
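
To connect the pieces, the sketch below binds a generated tensor to a variable (m is a name chosen for illustration) and then inspects it with tensor-shape and index access, both described in this chapter. The expected results follow from the output shown just above.

```egison
; m is a hypothetical name for the generated 5x3 tensor
(define $m (generate-tensor (lambda [$x $y] (+ (* 10 x) y)) {5 3}))

(tensor-shape m);=>{5 3}
m_2_3;=>23
m_5_1;=>51
```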

`tensor-shape` expressions

  (tensor-shape tensor)

A tensor-shape expression returns the size of each dimension of a given tensor as a collection.

(tensor-shape [| 1 2 3 4 5 6 |]);=>{6}
(tensor-shape [| |]);=>{}
(tensor-shape [| [| 1 2 3 |] |]);=>{1 3}
(tensor-shape (generate-tensor + {3 5}));=>{3 5}

`seq` expressions

  (seq expr expr)

Egison's seq expression derives from Haskell's seq function. The first argument of seq is evaluated strictly, and then the second argument is evaluated and returned. The most common use of seq is in the definition of the foldl function.

(define $foldl
  (lambda [$fn $init $ls]
    (match ls (list something)
      {[<nil> init]
       [<cons $x $xs>
        (let {[$z (fn init x)]}
          (seq z (foldl fn z xs)))]})))
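
With this definition in place, foldl can be used as sketched below. As in Haskell, the intent is presumably that the accumulator z is forced at each step instead of being carried along as an unevaluated expression.

```egison
; Sum a collection from the left, starting from 0
(foldl + 0 {1 2 3 4 5});=>15
```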

Basics of Syntax and Semantics