Syntax Overview

Tokens

Layout

A colon begins a layout block, which contains every token or bracketed item indented at least as much as the first token following the colon.

foo:
  bar
  baz
  quux
// ===
foo {
  bar
  baz
  quux
}

foo: bar
     baz
     quux
// ===
foo { bar
      baz
      quux }
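
Layout blocks can nest. As a sketch of how the rule applies (reusing the placeholder names above), the inner block below ends before quux, because quux is indented less than baz, the first token after the inner colon. It still belongs to the outer block, because it is indented as much as bar, the first token after the outer colon.

```kitten
foo:
  bar:
    baz
  quux
// ===
foo {
  bar {
    baz
  }
  quux
}
```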

Comments

// Single-line comment.

/*
  Multi-line
  comment.
*/

/*
  Nested
  /* multi-line */
  comment.
*/

Literals

Integers

Decimal integer literals consist of one or more decimal digits. Binary, octal, and hexadecimal literals begin with the appropriate prefix: 0b, 0o, or 0x, respectively:

0
1
123

0b1100  // 12

0o777   // 511

0xFF    // 255
0xabcd  // 43981

By default, integers have type Int32—the type of a literal is not inferred or polymorphic. You can select a different type with a suffix consisting of i (for signed integers) or u (for unsigned) followed by a number of bits. A Kitten implementation is guaranteed to support 8, 16, 32, and 64-bit integer sizes, but may support additional sizes such as 128-bit or arbitrary-precision integers.

1     // Int32
1i8   // Int8
1i16  // Int16
1i32  // Int32
1i64  // Int64
1u8   // UInt8
1u16  // UInt16
1u32  // UInt32
1u64  // UInt64

Integer literals may be prefixed with a sign character: +, -, or − (U+2212 MINUS SIGN).

Floating-point Numbers

A floating-point literal consists of decimal digits containing a decimal point, with at least one digit before or after the point, optionally followed by a decimal exponent in scientific notation.

1.0
0.5
.5      // 0.5f64
1.      // 1.0f64
3.14
1.0e+6  // 1000000.0

The default type of a floating-point literal is Float64. You can select a different type with a suffix of f32 or f64—implementations may support additional types such as f80.

1.0     // Float64
1.0f64  // Float64
1.0f32  // Float32

Like integer literals, floating-point literals (and their exponent parts) may be prefixed with a sign character.
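
For illustration, here are a few signed literals that follow from the rules above. Combining a sign with a type suffix is an assumption; the text does not show that combination explicitly.

```kitten
-1.0     // Float64
+0.5     // Float64
1.0e-3   // 0.001
-2.5e+2  // -250.0
-1.0f32  // Float32 (assumes a sign and a suffix can be combined)
```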

Characters

A character literal consists of a single Unicode code point or character escape, surrounded by single quotes (apostrophes). Character literals have type Char.

'a'
'é'
'\n'

Valid character escapes include:

  • \a—BEL (U+0007)
  • \b—BS (U+0008)
  • \f—FF (U+000C)
  • \n—LF (U+000A)
  • \r—CR (U+000D)
  • \t—TAB (U+0009)
  • \v—VT (U+000B)
  • \'—' (U+0027)
  • \"—" (U+0022)
  • \\—\ (U+005C)

You can use U+2018 LEFT SINGLE QUOTATION MARK and U+2019 RIGHT SINGLE QUOTATION MARK instead of apostrophes.
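
For example, the following literals should denote the same characters as the apostrophe-delimited forms shown earlier:

```kitten
‘a’
‘é’
‘\n’
```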

Lists

A list literal consists of a series of comma-separated terms between square brackets. A trailing comma is allowed.

[]          // <T> List<T>
[1]         // List<Int32>
[1.0, 2.0]  // List<Float64>
[
  'a',
  'b',
  'c',
]           // List<Char>
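
List literals can also hold other lists or suffixed literals. This is a small sketch, assuming element types compose in the same way as for the flat lists above:

```kitten
[[1, 2], [3]]  // List<List<Int32>>
[1u8, 2u8]     // List<UInt8>
['a', '\n',]   // List<Char>, with a trailing comma
```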

Text

A text literal consists of a series of zero or more Unicode characters or escapes surrounded by double quotes. It has type List<Char>.

""
"meow"
"foo\n\tbar\n"

You can use U+201C LEFT DOUBLE QUOTATION MARK and U+201D RIGHT DOUBLE QUOTATION MARK instead of ASCII double quotes. Unlike ASCII double quotes, the Unicode quotes can be nested, as long as they match, that is, every opening quote has a corresponding closing quote before the end of the literal:

“”
“meow”
“He said, “Meow”.”

Paragraphs

A paragraph literal is a text literal that may span multiple lines. It begins with three double quotes (""") followed by a newline, and ends with three double quotes. Every line in a paragraph literal must either begin with the same whitespace prefix or be blank. This indentation is stripped from the resulting text. Paragraph literals are often used for documentation.

about foo:
  docs: """
    This is a paragraph literal.

    It spans multiple lines.
    """
    // "This is a paragraph literal.\n\nIt spans multiple lines."

"""
This one ends with a newline.

"""
// "This one ends with a newline.\n"

Quotations

A quotation is a series of zero or more terms surrounded by curly brackets, or preceded by a colon and delimited by indentation. It may allocate a reference-counted closure.

{}     // <R...> (R... -> R...)
{ 1 }  // <R...> (R... -> R..., Int32)

// <R...> (R... -> R..., Int32, Int32 +IO)
:
  1
  2
  "foo" say

Terms

Words

A word is an identifier that refers to a function, such as say or map. There are a number of built-in words:

  • call invokes a closure

    <R..., S..., +P> (R..., (R... -> S... +P) -> S... +P)

    <R…, S…> (R…, (R… → S…) → S…)

  • jump tail-calls a closure

  • return jumps to the end of the current definition

  • loop jumps to the start of the current definition
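
As a brief sketch of call applied to quotation literals (the operator section (+ 1) is borrowed from the definitions shown later in this document):

```kitten
{ 1 } call        // leaves 1 on the stack
1 { (+ 1) } call  // leaves 2 on the stack
```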

Locals

A local variable introduction moves values from the stack into named local variables, switching from function-level programming to data-level programming. It consists of a rightward arrow (->) followed by a comma-separated list of one or more names, terminated with a semicolon.

"foo" -> name;
name say

1 2 3 -> x, y, z;
(x + y + z) say
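
Names are bound in the order the values were pushed, so the last name receives the top of the stack. This matches the definition of swap given under Type Signatures. A minimal sketch, with a and b as illustrative names:

```kitten
10 3 -> a, b;
(a - b)  // 7: a is 10 (pushed first), b is 3 (top of the stack)
```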

Instead of a semicolon, a block may be specified; this is syntactic sugar for a quotation beginning with local variable introductions.

-> x { x + 1 }
// ===
{ -> x; x + 1 }

This is intended to combine with case branches and do blocks:

match (something_optional)
case some -> x:
  x say
…
// ===
match (something_optional)
case some:
  -> x;
  x say
…

do (map) -> x:
  x + 1
// ===
{ -> x; x + 1 } map

do

do allows higher-order functions to be used as prefix control-flow syntax.

do (f) { g }
// ===
{ g } f

match

match is the inverse of a constructor: while a constructor takes fields from the stack and constructs an instance of an algebraic data type, match takes an ADT from the stack, dispatches on its tag, and expands the fields back onto the stack.

match (scrutinee)
case constructor_1 -> field_1, field_2, …:
  …
case constructor_2:
  …
else:
  …

If no else branch is specified, the default branch calls abort, making the match expression require the +Fail permission. If all constructors are covered by case branches, then the else branch is redundant and +Fail is not required.
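
The following sketch illustrates both cases, using the Optional type from the Types section. The word names or_zero and unwrap are hypothetical.

```kitten
// Every constructor of Optional is covered, so +Fail is not needed.
define or_zero (Optional<Int32> -> Int32):
  -> opt;
  match (opt)
  case some -> x:
    x
  case none:
    0

// No case for none and no else branch: the implicit default
// branch calls abort, so the signature needs +Fail.
define unwrap (Optional<Int32> -> Int32 +Fail):
  -> opt;
  match (opt)
  case some -> x:
    x
```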

if

if evaluates a Boolean condition. If true, it evaluates the true branch. Otherwise, it evaluates the else branch, if one is present. If no else branch is specified, it’s equivalent to else {}. You can specify multiple conditions by adding elif (condition) { block } clauses. if is syntactic sugar for match.

if (condition_1) {
  branch_1
} elif (condition_2) {
  branch_2
} else {
  false_branch
}

// ===

condition_1 match
case true {
  branch_1
} case false {
  condition_2 match
  case true {
    branch_2
  } case false {
    false_branch
  }
}

The condition of an if may be drawn from the stack.

condition
if {
  true_branch
} else {
  false_branch
}

condition
if {
  true_branch
}
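
As a concrete sketch, here is a hypothetical word bool_to_int written both ways, first with an explicit condition and then with the condition drawn from the stack. Both branches leave one Int32 on the stack.

```kitten
define bool_to_int (Bool -> Int32):
  -> b;
  if (b) {
    1
  } else {
    0
  }

// ===

define bool_to_int (Bool -> Int32):
  if { 1 } else { 0 }
```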

Program Elements

Vocabularies

A vocabulary is a group of related names, introduced with the vocab keyword followed by a name and a block. Vocabularies may be nested.

vocab math:

  define successor (Int32 -> Int32):
    (+ 1)

  vocab experimental:
    define predecessor (Int32 -> Int32):
      (- 1)

// ===

define math::successor (Int32 -> Int32):
  (+ 1)
define math::experimental::predecessor (Int32 -> Int32):
  (- 1)

To reduce nesting, the block may be replaced with a semicolon, in which case all following code until the next vocab element is placed in that vocabulary.

vocab math;

define successor (Int32 -> Int32):
  (+ 1)

vocab math::experimental;

define predecessor (Int32 -> Int32):
  (- 1)

Word Definitions

A word is a user-defined name for a function or infix operator.

define double (Int32 -> Int32):
  2 (*)

A definition begins with the define keyword, followed by a name, a type signature, and a block.
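
A definition can also bind its input to a local and use infix notation in the body. This is a minimal sketch; the word square is hypothetical.

```kitten
define square (Int32 -> Int32):
  -> x;
  x * x

3 square  // 9
```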

Metadata

An about block contains a set of key-value pairs, where the keys are Kitten identifiers and the values are untyped terms. It’s intended to subsume the special syntax, pragmas, and magic comments that other languages use for denoting metadata.

about +:
  docs: """
    The operation of an additive monoid
    with `zero` as the identity.
    """
  operator:
    left 6
  inline:
    always

Types

A type describes a user-defined shape of data constructed with sums and products of primitive types. There are three basic kinds of types, using the same notation:

// Enumerations
type Bool:
  case false
  case true

// Structures
type Pair<A, B>:
  case pair (A, B)

// Tagged Unions
type Optional<T>:
  case none
  case some (T)

It’s possible to explicitly specify that a constructor takes no arguments.

type Bool:
  case false ()
  case true ()

type Optional<T>:
  case none ()
  case some (T)
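
Each case introduces a constructor word that takes its fields from the stack, and match takes them apart again. This is a sketch using the types above; the word first is hypothetical.

```kitten
1 'a' pair  // Pair<Int32, Char>
5 some      // Optional<Int32>

// Take a pair apart again with match.
define first<A, B> (Pair<A, B> -> A):
  -> p;
  match (p)
  case pair -> a, b:
    a
```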

Permissions

A permission is a word that grants a permission to a closure that requires it: the word takes the closure from the stack and runs it with that permission available.

permission Locked
  <R..., S...> (R..., (R... -> S... +Locked) -> S...)
{
  take_lock
  call
  release_lock
}

Traits and Instances

A trait definition declares a generic function, which may have different implementations for different concrete types:

trait show<T> (T -> List<Char>)

These implementations are called instances of the trait, and the type of an instance must be an instance of the signature of its parent trait.

instance show (List<Char> -> List<Char>)
instance show<T> (List<T> -> List<Char>)

Note: generic instances are not implemented.
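
The instance declarations above omit their bodies. As a sketch, assuming an instance takes a block in the same way a define does (the text above does not confirm this), a show instance for the Bool type from the Types section might look like:

```kitten
// Assumed syntax: an instance body written like a define body.
// The condition of the if is drawn from the stack.
instance show (Bool -> List<Char>):
  if { "true" } else { "false" }
```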

Synonyms

A synonym is an alias for an existing name.

synonym name (existing_name)

Type Signatures

Type signatures are one of the most complex areas of Kitten’s syntax, but they use familiar conventions from other languages and provide syntactic sugar to remain readable.

All definitions denote functions, so function types are among the first types you will use. They are represented with a rightwards arrow (-> or Unicode →), with the types of the inputs and outputs written as comma-separated lists on the left and right sides, respectively.

// Basic function with one input and output
define inc (Int32 -> Int32):
  (+ 1)

// Multiple inputs
define add (Int32, Int32 -> Int32):
  (+)

// Multiple outputs
define inc_dec (Int32 -> Int32, Int32):
  -> x;
  (x + 1) (x - 1)

// No inputs
define two (-> Int32):
  2

// No outputs
define drop_int (Int32 ->):
  -> _;

A function type accepts a set of permissions, which represent actions the function is allowed to take. Permissions are written with a plus sign followed by a permission name, such as +IO, written after a function’s return types:

define yell (List<Char> -> +IO):
  (+ "!") say

define ask (List<Char> -> List<Char> +IO +Fail):
  print get_line -> answer;
  if (answer empty):
    "no input" fail
  else:
    answer

Words in Kitten are often generic, able to operate on values of any type. A generic type begins with a quantifier, consisting of a comma-separated list of type variable names in angle brackets (<>). By convention, type variables are named with single capital letters.

// Duplicate a value of any type.
define dup<T> (T -> T, T):
  -> x; x x

// Swap two values of any types.
define swap<A, B> (A, B -> B, A):
  -> x, y; y x

The quantifier part <T> is written with no space between it and the word name (e.g., dup<T>) in order to mimic the generic type syntax of other programming languages. However, it’s important to understand that the quantifier is associated with the type, not the name: it’s dup and <T> (T -> T, T), not dup<T> and (T -> T, T).
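
One more generic word in the same style, as a sketch. The name over is illustrative and is not taken from the text above.

```kitten
// Copy the value below the top of the stack onto the top.
define over<A, B> (A, B -> A, B, A):
  -> x, y; x y x

1 'a' over  // 1, 'a', 1
```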

There are a few basic kinds of type variables. Value type variables (such as T, A, and B above) are those that refer to a type like Int32 or List<Char>, which are inhabited by values. Stack type variables are suffixed with an ellipsis (... or Unicode …), and refer to a series of types on the stack. For example, the type of the call word is:

<R..., S...> (R..., (R... -> S...) -> S...)

This means that it takes some stack R... with a closure of type R... -> S... on top, and applies the closure to that stack, producing S... as the result stack. All functions are generic in the part of the stack that they don’t touch, so a type like Int32 -> Int32, Int32 is syntactic sugar for <S...> (S..., Int32 -> S..., Int32, Int32). A function with this type takes any stack S... with an Int32 on top, and returns the same stack S... with two Int32 values on top.

Finally, permission type variables are prefixed with a plus sign (+), and refer to sets of permissions such as +IO or +Unsafe +Fail. All functions are generic in the permissions that they don’t use, so a type like Int32 -> Int32, Int32 is syntactic sugar for <+P> (Int32 -> Int32, Int32 +P). By default, all function types in the same type signature are given the same implicit permission type variable. Take the type of map:

<A, B> (List<A>, (A -> B) -> List<B>)

This is syntactic sugar for:

<A, B, +P> (List<A>, (A -> B +P) -> List<B> +P)

This means that map requires the same set of permissions +P as the function you pass to it, because map calls that function.
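
The same sugar applies to user-defined higher-order words. In this sketch, twice is a hypothetical word, and the body assumes that naming a local holding a closure pushes the closure rather than calling it.

```kitten
// With sugar: the permission variable is implicit.
define twice<A> (A, (A -> A) -> A):
  -> x, f;
  x f call f call

// ===

// With the permission variable written out.
define twice<A, +P> (A, (A -> A +P) -> A +P):
  -> x, f;
  x f call f call

1 { (+ 1) } twice  // 3
```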
