How to Build a JSON String in the Linux Shell

1. Overview

Creating a JSON string from Bash variables can ensure interoperability with different tools or systems. For example, in DevOps and system administration, we may need to integrate shell scripts with RESTful APIs that typically require JSON payloads.

In addition, automating deployment processes often involves sending configuration data, status reports, or monitoring metrics to web services in JSON format. Finally, in a microservices architecture where different services communicate over HTTP, generating JSON from Bash scripts can facilitate the efficient exchange of configuration settings, environment variables, or operational data between services.

In this tutorial, we’ll explore several methods for converting different types of Bash variables into a valid JSON string.

2. Getting Started Example

While Bash treats most variables as strings by default, it can interpret them as integers in arithmetic contexts, and as arrays if explicitly defined. It also allows boolean-like behavior using integers, command exit statuses, and string comparisons. This flexibility leads to dynamic use of variables and even permits handling of edge cases such as null or undeclared variables, but it also requires careful attention to avoid unexpected behavior.

In contrast, the JSON standard (ECMA-404) defines a small, rigid set of data types: strings, numbers, arrays, objects, booleans, and null. Unlike Bash, which interprets variable types dynamically based on context, JSON's strict syntax avoids ambiguity in data representation. This rigidity simplifies data exchange between systems and programming languages and reduces the risk of type-related errors.

Let’s start with a simple example:

#!/bin/bash
 
# Bash Variables
name="Paco Bellez"
age=41
city="Santiago de Compostela"
has_car=true
languages=("English" "Italian" "Japanese")

While age is intended to be a number and has_car to be a boolean value, Bash treats them as strings by default.
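
To see this for ourselves, we can inspect the variables with the declare builtin. The following is only a minimal sketch that reuses two of the variables above:

```shell
#!/bin/bash

age=41
has_car=true

# declare -p prints each variable with its attributes;
# "--" means no attributes, i.e. a plain string variable
declare -p age has_car

# The same string is treated as an integer only inside an arithmetic context
echo $(( age + 1 ))
```

Both variables are reported without attributes, and the quoted values in the output confirm that Bash stores them as strings.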

2.1. Using printf and Manual Construction

Let’s convert Bash variables to JSON using printf:

#!/bin/bash

# Bash Variables
[...]

# Prepare languages as a JSON array string
languages_json=$(printf '"%s", ' "${languages[@]}")
languages_json=${languages_json%, } # Remove the trailing comma and space

# Conversion to JSON
json_string=$(printf '{
    "name": "%s",
    "age": %d,
    "city": "%s",
    "has_car": %s,
    "languages": [%s]
}' "$name" "$age" "$city" "$has_car" "$languages_json")

# Output JSON
echo "$json_string"

In this script, we prepare the JSON array by formatting each element of the languages variable into a quoted string followed by a comma and a space and then removing the trailing comma and space to create a valid JSON array string. Then printf generates the json_string in a readable multiline format, with each variable placed in the appropriate position in the JSON structure:

$ ./test.sh 
{
    "name": "Paco Bellez",
    "age": 41,
    "city": "Santiago de Compostela",
    "has_car": true,
    "languages": ["English", "Italian", "Japanese"]
}

During development, it's a good idea to run the generated output through a JSON validator, because printf inserts variable values verbatim. A value that contains a double quote or a backslash isn't escaped, so manual string manipulation with printf can easily result in malformed JSON.
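
If we have to stay with printf, one possible mitigation is to escape the most common special characters ourselves. The json_escape function below is a hypothetical helper, not part of any standard tool, and it's only a sketch: it covers backslashes, double quotes, newlines, and tabs, but not the other control characters that JSON requires to be escaped:

```shell
#!/bin/bash

# Hypothetical helper: escapes the characters that most often break
# hand-built JSON strings (backslash, double quote, newline, tab)
json_escape() {
    local s="$1"
    s=${s//\\/\\\\}      # backslash first, so later escapes are not doubled
    s=${s//\"/\\\"}      # double quote
    s=${s//$'\n'/\\n}    # newline
    s=${s//$'\t'/\\t}    # tab
    printf '%s' "$s"
}

name='Paco "El Gato" Bellez'
printf '{"name": "%s"}\n' "$(json_escape "$name")"
```

The order of the substitutions matters: if we escaped the double quotes first, the backslashes we just added would be doubled by the next step.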

2.2. Using jq With Bash Variables

Here we use jq, a powerful command-line JSON processor, to make the conversion of Bash variables to a JSON string much more robust. The benefit of using jq is that it ensures that all data is properly formatted and escaped, preventing problems such as incorrect JSON structure or special character issues:

#!/bin/bash

# Bash Variables
[...]

# Conversion to JSON without printf
json_string=$(jq -n \
    --arg name "$name" \
    --argjson age "$age" \
    --arg city "$city" \
    --argjson has_car "$has_car" \
    --argjson languages "$(jq -n '$ARGS.positional' --args "${languages[@]}")" \
    '{name: $name, age: $age, city: $city, has_car: $has_car, languages: $languages}')

# Output the JSON string
echo "$json_string"

Let’s look at some details:

-n → Doesn’t read from standard input or files, it’s useful for constructing JSON data from scratch
--arg → Given a variable name and value, treat the value as a string
--argjson → Given a variable name and value, treat the value as a JSON-encoded value (e.g., boolean or number)
$ARGS.positional → In conjunction with --args, treats each subsequent argument as a separate string element and includes them in an array
"${languages[@]}" → Bash construct that expands each element of the languages array into a separate argument
'{name: $name, age: $age, [...]}' → jq expression that constructs the final JSON object using the provided variables and values

Here is the result:

$ ./test.sh 
{
    "name": "Paco Bellez",
    "age": 41,
    "city": "Santiago de Compostela",
    "has_car": true,
    "languages": [
        "English",
        "Italian",
        "Japanese"
    ]
}
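
To isolate the difference between the string and JSON options of jq, we can pass the same value both ways. This is a small sketch with illustrative variable names (as_string and as_number aren't part of the script above):

```shell
#!/bin/bash

# --arg always yields a JSON string; --argjson parses the value as JSON
jq -nc --arg as_string 41 --argjson as_number 41 \
    '{as_string: $as_string, as_number: $as_number}'

# --args collects the remaining arguments into $ARGS.positional
jq -nc '$ARGS.positional' --args "English" "Italian" "Japanese"
```

The first command prints {"as_string":"41","as_number":41}, and the second prints ["English","Italian","Japanese"].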

If we need compact, single-line JSON, we can add the -c flag to the final jq call. This format can make data easier to handle and parse, especially when processes communicate through pipes or sockets:

$ ./test.sh 
{"name":"Paco Bellez","age":41,"city":"Santiago de Compostela",...}

As a final test, let’s see if jq can handle special characters:

#!/bin/bash
str=$'Hello\nWorld\n\tThis is a tab\n\\ backslash\nSingle quote: \'\nDouble quote: \"'
json_string=$(jq -nc --arg str "$str" '{str: $str}')
echo "$json_string"

The result is as expected:

$ ./test.sh 
{"str":"Hello\nWorld\n\tThis is a tab\n\\ backslash\nSingle quote: '\nDouble quote: \""}

In this case, using printf instead of jq would increase the complexity of the code due to the need to manually escape special characters in JSON formatting.

3. Complex Cases

Here we continue the exploration of jq with more complex cases.

3.1. null and Undeclared Variables

The JSON standard includes the null value, which is distinct from an empty string. Bash has no such value: an empty string is itself called a null string in the Bash manual, and a variable can also be undeclared, a state that has no counterpart in JSON. How to map empty and undeclared Bash variables to JSON is therefore ambiguous, and the right choice depends on the specific use case.

A possible workaround is to use a json_value() function that processes each variable passed to jq, returning its value if it exists and if it’s not an empty string, or null in all other cases. Of course, we need to use indirect expansion because the function needs to dynamically handle and dereference variable names passed as arguments:

#!/bin/bash

# Function to handle undefined or empty variables and output a JSON-compatible value
json_value() {
    # Store the first argument in a local variable 'var'
    local var="$1"
    
    # Check if the variable is not set
    if [ -z "${!var+x}" ]; then
        # If not set, print 'null'
        echo "null"
    
    # Check if the variable is set but empty
    elif [ -z "${!var}" ]; then
        # If set but empty, print 'null'
        echo "null"
    
    # If the variable is set and not empty
    else
        # Print the variable's value enclosed in double quotes
        echo "\"${!var}\""
    fi
}

# Example variables
null_var=          # null variable
empty_string=""
# undeclared_string is intentionally not declared
# $3 is also unset: inside json_value it refers to the function's own third argument, not the script's
regular_string="Hello Baeldung!"

# Convert variables to JSON using jq with --argjson
json_output=$(jq -n \
    --argjson null_var "$(json_value null_var)" \
    --argjson empty_string "$(json_value empty_string)" \
    --argjson undeclared_string "$(json_value undeclared_string)" \
    --argjson positional3 "$(json_value 3)" \
    --argjson regular_string "$(json_value regular_string)" \
    '{
        "null_var": $null_var,
        "empty_string": $empty_string,
        "undeclared_string": $undeclared_string,
        "positional3": $positional3,
        "regular_string": $regular_string
    }')

echo "$json_output"

The result is as expected:

$ ./test.sh 
{
    "null_var": null,
    "empty_string": null,
    "undeclared_string": null,
    "positional3": null,
    "regular_string": "Hello Baeldung!"
}

Our json_value function always returns a JSON-compatible value, even for regular strings, so it should always be used with --argjson rather than --arg. Otherwise, we may get this undesirable result:

    [...]
    "regular_string": "\"Hello Baeldung!\""
}

However, this approach is incompatible with values that aren’t null or strings, such as booleans, numbers, and arrays.
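
There's a second limitation: json_value wraps the value in double quotes by hand, so a string that contains a double quote or a backslash would produce invalid JSON. One possible variant delegates the quoting to jq. The name json_value_safe is our own, and the sketch still handles only strings and null:

```shell
#!/bin/bash

# Hypothetical variant of json_value: prints null for unset or empty
# variables, otherwise lets jq encode the value as a JSON string
json_value_safe() {
    local var="$1"
    if [ -z "${!var:-}" ]; then
        echo "null"
    else
        jq -n --arg v "${!var}" '$v'
    fi
}

quoted='He said "hi" \ bye'
jq -nc \
    --argjson quoted "$(json_value_safe quoted)" \
    --argjson missing "$(json_value_safe not_declared)" \
    '{quoted: $quoted, missing: $missing}'
```

Because jq performs the escaping, the embedded quotes and the backslash arrive intact in the final object, while the undeclared variable still becomes null.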

3.2. Handling Arrays of Complex Objects

In many real-world scenarios, we may need to handle arrays of complex objects rather than simple arrays or primitive types. For example, we may need to manage a list of users, where each user has a name, age, and a list of skills. We want to convert this list to a JSON array of objects:

#!/bin/bash

# Define arrays of user attributes
names=("Alice" "Bob" "Carol")
ages=(29 34 28)
declare -A skills
skills["Alice"]="Bash Python"
skills["Bob"]="JavaScript HTML CSS"
skills["Carol"]="Go Rust"

# Function to convert a space-separated list of skills into a JSON array
skills_to_json() {
    local skills_str="$1"
    local skills_arr=($skills_str)
    local skills_json=$(printf '"%s", ' "${skills_arr[@]}")
    echo "[${skills_json%, }]"
}

# Initialize an empty JSON array
json_users="[]"

# Construct JSON array of user objects
for i in "${!names[@]}"; do
    name=${names[$i]}
    age=${ages[$i]}
    skill_str=${skills[$name]}
    skill_json=$(skills_to_json "$skill_str")
    
    # Use jq to construct each user object and add it to the JSON array
    json_users=$(jq --argjson users "$json_users" \
        --arg name "$name" \
        --argjson age "$age" \
        --argjson skills "$skill_json" \
        '$users + [{"name": $name, "age": $age, "skills": $skills}]' <<<"$json_users")
done

# Output the JSON string
echo "$json_users"

Let’s break it down:

The names, ages, and skills arrays store user data
The skills_to_json function converts a space-separated list of skills into a JSON array
A loop iterates over the names array and constructs JSON objects for each user
We don't use -n here, so jq reads the current array from the here-string and runs the filter once per iteration; the filter takes the array from $users and appends the new object to it

Here is the result:

$ ./test.sh 
[
    {
        "name": "Alice",
        "age": 29,
        "skills": [
            "Bash",
            "Python"
        ]
    },
    {
        "name": "Bob",
        "age": 34,
        "skills": [
            "JavaScript",
            "HTML",
            "CSS"
        ]
    },
    {
        "name": "Carol",
        "age": 28,
        "skills": [
            "Go",
            "Rust"
        ]
    }
]

This approach lets us manage nested data structures while keeping our JSON construction logic clear and maintainable, although it invokes jq once per user.
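
As a variation on the loop above, we can let jq split the skills string and collect the objects at the end with the -s (slurp) option, which avoids both the printf helper and the repeated re-parsing of the growing array. This is only a sketch under the same sample data; build_users is a name we introduce for illustration:

```shell
#!/bin/bash

names=("Alice" "Bob" "Carol")
ages=(29 34 28)
declare -A skills=(["Alice"]="Bash Python" ["Bob"]="JavaScript HTML CSS" ["Carol"]="Go Rust")

build_users() {
    local i name
    for i in "${!names[@]}"; do
        name=${names[$i]}
        # Emit one compact object per line; split(" ") turns the
        # space-separated skills string into a JSON array
        jq -nc --arg name "$name" \
            --argjson age "${ages[$i]}" \
            --arg skills "${skills[$name]}" \
            '{name: $name, age: $age, skills: ($skills | split(" "))}'
    done | jq -s '.'    # -s gathers all objects into a single array
}

build_users
```

Since every value passes through --arg or --argjson, names or skills containing quotes are escaped by jq rather than by our own string handling.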

4. Conclusion

In this article, we explored several methods for converting Bash variables to a valid JSON string. We began by understanding the differences between how Bash and JSON handle data types, and the importance of JSON’s rigid syntax for consistency and error reduction.

First, we manually constructed JSON strings using printf, emphasizing the importance of careful string manipulation to avoid malformed JSON. Then we discussed the use of jq to make JSON conversion more efficient and robust, especially when dealing with special characters and complex data types.

Finally, we looked at more complex cases, such as handling null or undeclared variables, and managing arrays of complex objects. These methods provide a solid foundation for generating JSON in Bash, improving scripting functionality and system interoperability.
