1. Overview

Many programs and services keep their configuration in a structured format, and YAML files are one of the most common ways to do so.

In the Bash shell, we need a tool to read and modify YAML content from the command line or from a script.

In this tutorial, we’re going to learn about the yq utility.

2. yq Basics

The yq command is usually not part of a standard Linux distribution, so we need to install it manually. After that, let’s go through the command’s basics.

2.1. Version Check

First, let’s check the version:

$ yq -V #or --version
yq (https://github.com/mikefarah/yq/) version 4.24.2

We should be aware that two widely used implementations of yq exist. The one used throughout this tutorial is known as the ‘Go implementation’, while the other is based on Python. Both have very similar capabilities but differ slightly in syntax.

For more details, we should visit the project’s documentation page.
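Since the two implementations differ in syntax, a script may want to check which one it’s talking to before running any expressions. The following is a minimal sketch: the function name yq_flavor is hypothetical, and only the Go version string shown above is taken from this tutorial. Treating every other output as ‘other’ is an assumption:

```shell
#!/bin/bash

# Classify a version string as printed by `yq --version`.
# The Go implementation prints its project URL, which contains "mikefarah".
yq_flavor() {
  case "$1" in
    *mikefarah*) echo "go" ;;
    *)           echo "other" ;;   # assumption: anything else is not the Go yq
  esac
}

# Only query the real command if it is installed.
if command -v yq >/dev/null 2>&1; then
  yq_flavor "$(yq --version 2>&1)"
else
  echo "yq is not installed"
fi
```

A script can then refuse to continue when the result isn’t go, instead of failing later on an expression the other implementation doesn’t understand.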

2.2. The Default Working Mode

Let’s note that the default yq mode is eval, which allows reading, searching, and editing YAML files. So, let’s print the whole content of the file personal_data.yaml:

$ yq personal_data.yaml

name: "My name"
surname: "Foo"
street_address:
  city: "My city"
  street: "My street"
  number: 1

3. Accessing Properties

We should use the path in the YAML file to retrieve the property’s value. First, let’s print the name property in personal_data.yaml:

$ yq '.name' personal_data.yaml

My name

Next, the brackets [] return all values of the street_address structure:

$ yq '.street_address[]' personal_data.yaml
My city
My street
1

3.1. Searching for Value

We can use the select operator to find a particular value. So let’s look for ‘Foo’ in the YAML file:

$ yq '.[] | select(. == "Foo")' personal_data.yaml

Foo

**Let’s note the use of the pipe operator | to pass the root node’s values to the select operator.**

Now let’s find all values which start with ‘My’ in the street_address node. So, we apply the wildcard ‘*’:

$ yq '.street_address[] | select(. == "My*")' personal_data.yaml

My city
My street

With the double dot .. operator, we can recursively traverse the document, starting from the given node. Thus, let’s find all values starting with ‘My’:

$ yq '.. | select(. == "My*")' personal_data.yaml

My name
My city
My street

Next, let’s narrow our search to the street_address node’s children only:

$ yq '.street_address | .. | select(. == "My*") ' personal_data.yaml

My city
My street

3.2. Changing Values

We can change or update properties using the assign operator ‘=’. Then, let’s change the street number:

$ yq '.street_address.number = 256' personal_data.yaml

name: "My name"
surname: "Foo"
street_address:
  city: "My city"
  street: "My street"
  number: 256

The result shows up in the standard output only, and the file itself stays unchanged. Thus, we need to use the -i option to modify the file in place.
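To see the in-place variant without touching the real file, we can try it on a scratch copy. This is only a sketch: the work_file and new_number names are hypothetical, and the yq calls are skipped unless the Go implementation is found on the PATH:

```shell
#!/bin/bash

# Scratch copy of the sample data, so personal_data.yaml stays untouched.
work_file=$(mktemp)
cat > "$work_file" <<'EOF'
name: "My name"
surname: "Foo"
street_address:
  city: "My city"
  street: "My street"
  number: 1
EOF

new_number=""
# Run only when the Go implementation of yq is available.
if command -v yq >/dev/null 2>&1 && yq --version 2>&1 | grep -q mikefarah; then
  # -i writes the result back to the file instead of standard output.
  yq -i '.street_address.number = 256' "$work_file"
  # Read the value back from the modified file.
  new_number=$(yq '.street_address.number' "$work_file")
  echo "number is now: $new_number"
else
  echo "Go yq not found, skipping" >&2
fi

rm -f "$work_file"
```

Working on a copy first is a cheap safeguard, because -i overwrites the file without keeping a backup.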

4. Working With the Nodes of YAML

We can query and modify the YAML structure as well. Hence, we can add, delete and find nodes.

4.1. Adding and Deleting Nodes

Let’s create a new property zip_code in the street_address by simply adding it to the path:

$ yq -i '.street_address.zip_code = 16' personal_data.yaml && cat personal_data.yaml

name: "My name"
surname: "Foo"
street_address:
  city: "My city"
  street: "My street"
  number: 1
  zip_code: 16

Next, in a similarly simple way, let’s remove the node with the del operator:

$ yq -i 'del(.street_address.zip_code)' personal_data.yaml && cat personal_data.yaml

name: "My name"
surname: "Foo"
street_address:
  city: "My city"
  street: "My street"
  number: 1

4.2. Retrieving Nodes’ Names

The to_entries operator returns keys together with their values. Then, let’s use it on the root level:

$ yq 'to_entries' personal_data.yaml

- key: name
  value: "My name"
- key: surname
  value: "Foo"
- key: street_address
  value:
    city: "My city"
    street: "My street"
    number: 1

The results are provided as array elements. Further, we can extract the keys:

$ yq 'to_entries | .[] | .key' personal_data.yaml

name
surname
street_address

and values:

$ yq 'to_entries | .[] | .value' personal_data.yaml

My name
Foo
city: "My city"
street: "My street"
number: 1

4.3. Searching for Nodes With has

Let’s try to find the Bash entry in the language data file languages.yaml:

$ yq languages.yaml

languages:
  - language:
      name: Bash the Bourne-Again shell
      feature: interpreted
  - language:
      name: C++
      feature: compiled, comes together with Bash well
  - language:
      name: Java
      feature: compiled and interpreted, different from Bash and C++

Obviously, we should not search file-wide for a pattern like *Bash*, because we’d obtain Java and C++ data as well. Thus, we’re going to search only the name nodes:

$ yq  '.. | select(has("name")) | select(.name == "*Bash*")' languages.yaml

name: Bash the Bourne-Again shell
feature: interpreted

Let’s note the use of the has operator. It returns true for nodes that contain the given key. The second select then refines the search by checking the content of the name node.

5. Working With Multiple Files

yq can work with multiple files, which are provided as arguments. Moreover, we can index these files and refer to them.

As an example, let’s amend languages.yaml with languages’ versions from versions.yaml file:

$ yq versions.yaml

Java: openjdk 11.0.14.1 2022-02-08
C++: gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)
Bash: GNU bash, version 5.1.8(1)-release (x86_64-redhat-linux-gnu)

We need to switch to the eval-all mode to read both files into memory:

$ yq eval-all '
    select(fi == 0) as $versions |
    select(fi == 1) |
    .languages[0].language.version = $versions.Bash |
    .languages[1].language.version = $versions.C++ |
    .languages[2].language.version = $versions.Java
' versions.yaml languages.yaml

languages:
  - language:
      name: Bash the Bourne-Again shell
      feature: interpreted
      version: GNU bash, version 5.1.8(1)-release (x86_64-redhat-linux-gnu)
  - language:
      name: C++
      feature: compiled, comes together with Bash well
      version: gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)
  - language:
      name: Java
      feature: compiled and interpreted, different from Bash and C++
      version: openjdk 11.0.14.1 2022-02-08

Let’s highlight the use of the file index fi to select the appropriate file. With as $versions, we defined a variable that holds the content of the first file. Finally, to add the version node, we referred to the elements of the languages array by index.

6. Interaction With a Bash Script

When programming in Bash, we need to use the language’s constructs to access the YAML structure.

6.1. Injecting Bash Variables Into a YAML File

Let’s update the configuration in the YAML file with Bash variables. For this, we’re going to use the env operator, which reads a variable from the environment. As an example, let’s create YAML content with the hostname:

$ yq --null-input '.hostname = env(HOSTNAME)'

hostname: fedora35

Let’s note the --null-input switch, which tells the command to create the YAML content without an input file.
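Because env looks a name up in the environment of the yq process, a plain shell variable that hasn’t been exported is invisible to it. The sketch below illustrates the difference without calling yq at all: a child sh process stands in for yq, and the variable name yq_demo_host is hypothetical:

```shell
#!/bin/bash

# Start from a clean state (hypothetical variable name).
unset yq_demo_host

# 1. Plain shell variable: not passed on to child processes.
yq_demo_host=fedora35
plain=$(sh -c 'echo "${yq_demo_host-unset}"')

# 2. Exported variable: part of the environment of every child.
export yq_demo_host
exported=$(sh -c 'echo "${yq_demo_host-unset}"')

# 3. Assignment prefixed to a command: placed in that command's environment only.
unset yq_demo_host
inline=$(yq_demo_host=fedora35 sh -c 'echo "${yq_demo_host-unset}"')

printf 'plain=%s exported=%s inline=%s\n' "$plain" "$exported" "$inline"
# prints: plain=unset exported=fedora35 inline=fedora35
```

So, a variable set inside a script needs either export or the prefix form VAR=value yq ... before env can read it.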

6.2. Variable’s Value as a Search Target

Now let’s search the YAML content for a value or node matching the variable’s value. So first, let’s find the ‘Foo’ value in personal_data.yaml:

$ targetVal=Foo yq '.[] | select(. == env(targetVal))' personal_data.yaml

Foo

Next, let’s use the variable as a lookup key on the root node:

$ targetKey=name yq  ' .[env(targetKey)] ' personal_data.yaml

My name

Finally, let’s search for a node that contains the city key with has:

$ targetKey=city yq '.. | select(has(env(targetKey)))' personal_data.yaml

city: "My city"
street: "My street"
number: 1

6.3. Variable Substitution With envsubst

Let’s consider a simple YAML template to collect basic system information:

$ yq system_data.yaml

hostname: ${HOSTNAME}
user: ${USER}
shell: ${SHELL}

Now we’re going to replace all ${} placeholders with actual values. Thus, we should use envsubst inside yq:

$ yq '.[] |= envsubst' system_data.yaml

hostname: fedora35
user: joe
shell: /bin/bash

6.4. Reading YAML Content Into an Array

Now let’s access YAML content inside a script. For this, we’re going to use an associative array, where the node’s name is the key. So, let’s import street_address from personal_data.yaml with the yaml_reader script:

#!/bin/bash

declare -A content

while IFS="=" read -r key value; do content["$key"]=$value; done < <(
  yq '.street_address | to_entries | map([.key, .value] | join("=")) | .[]' personal_data.yaml
)

for key in "${!content[@]}"; do printf "key %s, value %s\n" "$key" "${content[$key]}"; done

We used a while loop with IFS set to = to fill the array. The to_entries results are fed to the map operator, which uses join to concatenate each key-value pair into the key=value form that read then splits.

Now let’s check the results:

$ ./yaml_reader

key city, value My city
key number, value 1
key street, value My street
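It’s worth checking how the read loop behaves when a value itself contains the = character. The sketch below replays the same loop on simulated input: the here-document stands in for the yq output, and the note entry is a hypothetical addition that isn’t in personal_data.yaml:

```shell
#!/bin/bash

declare -A content

# Same loop as in yaml_reader, fed from a here-document instead of yq.
while IFS="=" read -r key value; do
  content["$key"]=$value
done <<'EOF'
city=My city
number=1
note=a=b
EOF

# read splits on the first "=" only: the last variable receives the rest of the line.
printf 'note: %s\n' "${content[note]}"
# prints: note: a=b
printf 'entries: %s\n' "${#content[@]}"
# prints: entries: 3
```

Values containing = are therefore kept intact, while a key containing = would be cut at its first =. Multi-line values would also break this line-based approach.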

7. Conclusion

In this tutorial, we learned about the yq command. First, we went through its basics. Then, we focused on searching and modifying the properties in the YAML file. Next, we studied an example of a multi-file operation.

*Meanwhile, we highlighted the importance of chaining yq’s operators to achieve our goals effectively.* It’s especially useful as the command offers a wide choice of operators.

Finally, we looked through ways for a Bash script to interact with YAML content.