Regular languages

By Martin McBride, 2018-05-09
Tags: none
Categories: none

A written language uses set of symbols that are be arranged according to certain rules, to express meaning.

The sentence above, of course, is written in the English language. The symbols it uses the characters of the alphabet, plus a few punctuation symbols. It has rules of spelling that allow you to construct words, and also rules of grammar that allow you to construct sentences. However, in the case of the English language, the rules are not completely precise. There are differences of opinion on the correct spelling of some words, and also some of the finer points of grammar. Not only that, but people often write English that is indisputably incorrect, but we can usually still understand it perfectly well.

Computer languages

Computer languages also have rules, but they are generally far more precisely defined than natural languages such English. For example, consider this Python code:

def sum(a, b):
    return a + b

This defines a function that adds two numbers together. Here are some of the rules:

  • The word def and return are called keywords, and must be written exactly as shown. We can't decide to use define to define our function, we can't even use a capital letter Def, that would be incorrect and the program wouldn't run.
  • We can choose our own name for the function sum, and the variables a and b. We could have called it add(x, y) for example. But there are rules about these names - the first character must be either a letter or an underscore, and the remaining characters must be letters, numbers or underscores.
  • The overall structure of the function definition must also follow rules. A simple function like this must start with def followed by the function signature and colon character. The body of the function follows this line, and must be indented.

Regular languages

A regular language is a particular type of computer language that obeys specific rules. Regular languages are often quite simple and apply to things such as number formats (eg 0xAA55 for a hex number), Huffman codes, simple file formats and similar.

There are various ways to define a regular language. One definition is that a regular language can be parsed by a finite state machine). We will look at that is more detail in that section.

Most programming languages, such as Python, Java, C etc are not regular languages, see this section

See also

Sign up to the Creative Coding Newletter

Join my newsletter to receive occasional emails when new content is added, using the form below:

Popular tags

555 timer abstract data type abstraction addition algorithm and gate array ascii ascii85 base32 base64 battery binary binary encoding binary search bit block cipher block padding byte canvas colour coming soon computer music condition cryptographic attacks cryptography decomposition decryption deduplication dictionary attack encryption file server flash memory hard drive hashing hexadecimal hmac html image insertion sort ip address key derivation lamp linear search list mac mac address mesh network message authentication code music nand gate network storage none nor gate not gate op-amp or gate pixel private key python quantisation queue raid ram relational operator resources rgb rom search sort sound synthesis ssd star network supercollider svg switch symmetric encryption truth table turtle graphics yenc