1. Introduction

In this tutorial, we’ll study the organization of a programming language. Like a natural language, a programming language uses a set of words considered valid. Moreover, it specifies rules for how to arrange these words in source code. Thus, a programming language must be able to judge whether the written code represents valid and logical statements.

So, the lexicon, syntax, and semantics of a programming language tell programmers how to express operations in their source code so that they execute correctly.

In the following section, we’ll first understand what a language’s lexicon, syntax, and semantics are in a broad sense. Then, we’ll see how these concepts map in practice to the context of programming languages. We conclude the tutorial by summarizing the explored concepts and presenting final remarks.

2. The Organization of Generic Languages

When we think or talk in our native language, we may not notice how formal it is. That is normal: learning a language naturally involves much less formal study and much more practice.

However, learning a second language usually makes its formal structures much clearer. In such cases, we study the lexicon, syntax, and semantics of the language being learned explicitly.

But the question is: what are a language lexicon, syntax, and semantics?

In short, a language lexicon includes the complete set of available terms. Practically, we can see the lexicon as a dictionary. This dictionary contains every word used and recognized by the language speakers.

A language syntax, in turn, represents the possible ways that we can put words from the lexicon together. So, talking about syntax is talking about well-defined rules to create sentences in a given language.

However, even recognized words arranged in a valid sentence do not guarantee that the sentence will make sense in the real world. The semantics of a language, in turn, indicates whether a sentence has a concrete meaning for those who hear or read it.

Programming languages pay special attention to their lexicon, syntax, and semantics because instructions for a computer must be expressed exactly, without ambiguity.

Thus, in the following sections, we’ll explore these concepts applied in the programming scenario.

3. The Organization of Programming Languages

Similar to natural languages, learning a programming language includes understanding its formal structures and organization.

In summary, the lexicon of a programming language is the list of reserved words it adopts. These words, in turn, are what we use to code commands and create data structures.

The words written by a programmer are known as lexemes. During lexical analysis, each lexeme is matched against predefined patterns and thus identified as a token of the programming language. Let’s see a simple pseudocode example:

[Figure: pseudocode example of lexical analysis, matching lexemes to tokens]
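The matching of lexemes against predefined patterns can be sketched in Python. This is a minimal illustration, not the lexer of any real language: the token categories, the patterns in `TOKEN_SPEC`, and the `tokenize` function are all assumptions made up for a tiny pseudocode language.

```python
import re

# Hypothetical token patterns for a tiny pseudocode language (assumed for illustration).
# Order matters: keywords are tried before general identifiers.
TOKEN_SPEC = [
    ("KEYWORD",    r"\b(?:if|else|while|return)\b"),
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[+\-*/=<>]"),
    ("SEPARATOR",  r"[();]"),
    ("SKIP",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return (token_type, lexeme) pairs; raise on an unrecognized character."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unrecognized character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates lexemes but is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

With these assumed patterns, `tokenize("x = y + 42;")` yields an identifier, an operator, an identifier, an operator, a number, and a separator, while a character outside the lexicon, such as `$`, is rejected with an error.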

Furthermore, each particular program (source code) also has its own lexicon. So, besides the standard list of reserved words from the programming language, the program lexicon includes other words defined exclusively for it, such as variable and function names.

However, executing commands, creating data structures, and using other coding resources typically demand not just one word but several. The rules for correctly combining the available words to achieve coding objectives form what we know as the syntax of the programming language.

If a programming statement is lexically and syntactically valid, we can compile/interpret it. But if it is not semantically valid, we can get unexpected or wrong behavior during program execution.

In practice, syntax analysis builds a derivation tree from the tokens identified in the lexical analysis. Let’s see an example:
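As a rough illustration of this distinction, the sketch below uses Python’s built-in `compile` and `exec` functions. The helper name `classify` is an assumption introduced here. Note that Python reports both lexical and syntactic problems as `SyntaxError`, and that it defers many semantic checks, such as the use of an undefined name, to run time.

```python
def classify(source):
    """Report the first stage at which a Python snippet fails."""
    try:
        # Lexical and syntax analysis happen here, before anything runs.
        code = compile(source, "<snippet>", "exec")
    except SyntaxError:
        return "syntax error"
    try:
        # The snippet is well-formed; problems from here on appear during execution.
        exec(code, {})
    except Exception as error:
        return f"runtime error: {type(error).__name__}"
    return "ok"
```

Under these assumptions, `classify("total = 1 +")` returns `"syntax error"`, whereas `classify("total = count + 1")` compiles without complaint and only fails during execution with a `NameError`, because `count` was never defined.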

[Figure: derivation tree built from the tokens of a statement]
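To give a feel for how a tree is built from tokens, here is a small recursive-descent sketch in Python. Everything in it is an assumption for illustration: the grammar (an assignment of the form `IDENTIFIER = term (+|- term)* ;`), the `(token_type, lexeme)` token format, and the function name `parse_assignment`. The result is a simplified tree of nested tuples, closer to an abstract syntax tree than to a full derivation tree with every grammar rule shown.

```python
def parse_assignment(tokens):
    """Parse `IDENTIFIER = expr ;` from (token_type, lexeme) tuples into nested tuples."""
    pos = 0

    def expect(kind, lexeme=None):
        # Consume the next token if it has the required type (and lexeme, if given).
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError(f"unexpected end of input, expected {lexeme or kind}")
        tok_kind, tok_lexeme = tokens[pos]
        if tok_kind != kind or (lexeme is not None and tok_lexeme != lexeme):
            raise SyntaxError(f"expected {lexeme or kind}, found {tok_lexeme!r}")
        pos += 1
        return tok_lexeme

    def term():
        # term -> NUMBER | IDENTIFIER
        if pos < len(tokens) and tokens[pos][0] == "NUMBER":
            return ("number", expect("NUMBER"))
        return ("name", expect("IDENTIFIER"))

    def expr():
        # expr -> term (('+' | '-') term)*, grouped left to right
        node = term()
        while pos < len(tokens) and tokens[pos] in (("OPERATOR", "+"), ("OPERATOR", "-")):
            op = expect("OPERATOR")
            node = ("binary", op, node, term())
        return node

    target = expect("IDENTIFIER")
    expect("OPERATOR", "=")
    value = expr()
    expect("SEPARATOR", ";")
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r} after ';'")
    return ("assign", ("name", target), value)
```

For the tokens of `x = y + 42;`, this sketch returns `("assign", ("name", "x"), ("binary", "+", ("name", "y"), ("number", "42")))`. A token sequence that breaks the assumed grammar, such as the tokens of `x = + ;`, raises a `SyntaxError` even though every individual token belongs to the lexicon.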

Semantics in a programming language indicates what does or does not make sense in the context of a given source code.

Some common semantic errors are, for example, using an uninitialized variable in an arithmetic expression or placing an operation immediately after a return statement in a function, where it can never execute.

It is important to note that the number and format of words and rules in the lexicon, syntax, and semantics can vary from one programming language to another. This depends on several technical aspects of these languages, such as whether a language is statically or dynamically typed.

Besides its traditional meaning in programming, the word semantics is also employed in other computing contexts, such as networked applications. Two examples are the semantic web and semantic social networks. Let’s explore these concepts a little bit:
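The second kind of error can be detected with a simple check over the syntax tree. The sketch below uses Python’s standard `ast` module. The function name `find_unreachable` and the rule it applies (flag any statement that follows a `return` in the same block) are assumptions chosen for illustration, and real compilers and linters perform far more thorough analyses.

```python
import ast

def find_unreachable(source):
    """Return line numbers of statements that follow a `return` in the same block."""
    unreachable = []
    for node in ast.walk(ast.parse(source)):
        # Statement blocks are stored as lists under these field names.
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue
            for index, statement in enumerate(block[:-1]):
                if isinstance(statement, ast.Return):
                    # Everything after the return in this block can never run.
                    unreachable.extend(s.lineno for s in block[index + 1:])
                    break
    return sorted(unreachable)
```

For example, given a function whose body is `return x` on line 2 followed by `print(x)` on line 3, the sketch reports line 3. Python itself accepts such code without any error, since it is lexically and syntactically valid.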

  • Semantic Web: the main objective of the semantic web is to attach meaningful metadata to resources across the World Wide Web. In this way, machines accessing data provided online can correlate and categorize heterogeneous entities from different sources based on this metadata
  • Semantic Social Networks: the application of semantic web concepts to online social networks. These semantic social networks have been used (experimentally) to make collaboration between decentralized organizations easier

4. Systematic Summary

Similar to natural languages, programming languages also include rules and formal structures. Thus, to write correct statements in a source code, we need to consider these rules and structures, which define their lexicon, syntax, and semantics.

First, the lexicon determines the words recognized by a programming language (we can see the lexicon as a kind of dictionary). The syntax of a programming language, in turn, indicates how to organize the words from the lexicon to create valid statements. Finally, the semantics of a programming language determines whether a statement makes sense in the broad context of a program’s source code.

The following table summarizes the characteristics of lexicon, syntax, and semantics of programming languages:

|                   | Lexicon | Syntax | Semantics |
|-------------------|---------|--------|-----------|
| Outline           | Recognized words of a language (dictionary) | Structure for creating valid statements | Logical use of statements in a particular context |
| Analysis Agents   | Lexer/Interpreter | Compiler/Interpreter | Compiler/Interpreter (partially) |
| Analysis Elements | Lexemes; Patterns; Tokens | Derivation tree | |

5. Conclusion

In this article, we studied the meaning of lexicon, syntax, and semantics in the context of computing, particularly programming languages. First, we investigated the general concepts of lexicon, syntax, and semantics. Then, we examined how these concepts work specifically in the context of programming languages. Finally, we compared the studied concepts in a systematic summary.

We can conclude that the lexicon, syntax, and semantics play a crucial role in programming languages. Of course, there are many other aspects of programming languages not explored in this tutorial. However, their lexicon, syntactical rules, and semantics are the core resources of any language, with other technical aspects expressed and enforced through them.