Creating Nested JSON Files From Variables Using jq

1. Overview

JSON (JavaScript Object Notation) has become ubiquitous for data interchange. As applications grow more complex, so do data structures. Enter the nested JSON – a powerful way to represent hierarchical data. Yet, how do we efficiently create these structures, especially when working with variables and dynamic data?

In this tutorial, we build a software deployment configuration generator using jq. This tool creates a nested JSON configuration file for deploying a microservices-based application across multiple environments. As we progress, we enhance the generator to handle increasingly complex scenarios.

2. Nested JSON Structures

Let’s quickly recap what we mean by nested JSON. In essence, nesting in JSON occurs when an object or array contains other objects or arrays. This hierarchical structure enables the representation of complex relationships in the data.

Let’s create a basic structure for the deployment configuration generator:

{
  "app_name": "MyAwesomeApp",
  "version": "1.0.0",
  "environments": {
    "dev": {
      "url": "dev.myapp.com",
      "resources": {
        "cpu": "0.5",
        "memory": "512Mi"
      }
    },
    "prod": {
      "url": "myapp.com",
      "resources": {
        "cpu": "2",
        "memory": "2Gi"
      }
    }
  }
}

Thus, this structure nests environment-specific configurations within an environments object, which itself is nested in the root object. Each environment then has its own nested resources object.
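To make the nesting concrete, the following sketch reads values back out of a compact copy of the structure above. It assumes jq is installed; the config, prod_memory, and env_names shell variables are names chosen only for this illustration.

```shell
# Compact copy of the structure shown above
config='{"app_name":"MyAwesomeApp","environments":{"dev":{"url":"dev.myapp.com","resources":{"cpu":"0.5","memory":"512Mi"}},"prod":{"url":"myapp.com","resources":{"cpu":"2","memory":"2Gi"}}}}'

# Each dot in the path descends one level of nesting
prod_memory=$(printf '%s' "$config" | jq -r '.environments.prod.resources.memory')

# keys lists the names nested directly under environments
env_names=$(printf '%s' "$config" | jq -c '.environments | keys')

echo "$prod_memory"   # 2Gi
echo "$env_names"     # ["dev","prod"]
```

The path mirrors the hierarchy: root object, then environments, then one environment, then its resources.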

3. Preparation of Data

Before we start working on the JSON, we need to prepare the data. Real data might come from various sources:

environment variables
configuration files
command-line arguments

For this example, let’s set up some variables:

$ APP_NAME="MyAwesomeApp"
VERSION="1.0.0"
ENVIRONMENTS=("dev" "prod")
DEV_URL="dev.myapp.com"
PROD_URL="myapp.com"
DEV_CPU="0.5"
DEV_MEMORY="512Mi"
PROD_CPU="2"
PROD_MEMORY="2Gi"

In a production setting, we might source these from an .env file or fetch them dynamically. The key is to have the data readily accessible for jq to process.
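As a rough sketch of the .env approach, the snippet below writes a small file and loads it into the current shell. The temporary directory and the file contents are assumptions made for the illustration; a real project would already have its own .env file.

```shell
# Hypothetical .env file, created in a temporary directory for this sketch
workdir=$(mktemp -d)
cat > "$workdir/.env" <<'EOF'
APP_NAME="MyAwesomeApp"
VERSION="1.0.0"
DEV_URL="dev.myapp.com"
EOF

set -a              # export every variable assigned from here on
. "$workdir/.env"   # read the assignments into the current shell
set +a              # stop auto-exporting

echo "$APP_NAME $VERSION $DEV_URL"
rm -rf "$workdir"
```

Sourcing only works when the file contains plain shell assignments, so this is best kept to files we control.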

4. Basic Nested JSON Creation With jq

Now that we have the respective variables, let’s start building the nested JSON structure.

We begin with the basic configuration skeleton:

$ jq -nc \
  --arg name "$APP_NAME" \
  --arg version "$VERSION" \
  '{
    app_name: $name,
    version: $version,
    environments: {}
  }'
{"app_name":"MyAwesomeApp","version":"1.0.0","environments":{}}

This jq command creates a simple object with app_name and version fields, plus an empty environments object. The -n option tells jq to use null as the input instead of reading any, the -c option prints compact output on a single line, and --arg passes each shell variable into the filter as a string.
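Since --arg always produces a JSON string, a value that should stay a number can be passed with --argjson instead. The sketch below compares the two; REPLICAS is a hypothetical variable that isn't part of this tutorial's data.

```shell
REPLICAS="3"   # hypothetical variable, used only for this comparison

typed=$(jq -nc \
  --arg as_string "$REPLICAS" \
  --argjson as_json "$REPLICAS" \
  '{as_string: $as_string, as_json: $as_json}')

echo "$typed"   # {"as_string":"3","as_json":3}
```

With --argjson, the value must be valid JSON text, so an empty or unquoted word makes jq fail instead of silently producing a string.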

Next, let’s add each environment:

$ jq -nc \
  --arg name "$APP_NAME" \
  --arg version "$VERSION" \
  --arg dev_url "$DEV_URL" \
  --arg prod_url "$PROD_URL" \
  '{
    app_name: $name,
    version: $version,
    environments: {
      dev: {
        url: $dev_url,
        resources: {}
      },
      prod: {
        url: $prod_url,
        resources: {}
      }
    }
  }'
{"app_name":"MyAwesomeApp","version":"1.0.0","environments":{"dev":{"url":"dev.myapp.com","resources":{}},"prod":{"url":"myapp.com","resources":{}}}}

Here, we nested the dev and prod objects within the environments object, each with its own url field and an empty resources object.
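The dev and prod keys don't have to be written out by hand. One possible sketch, assuming jq 1.6 or later, builds the same environments object from a list of names with reduce and looks up each URL through $ENV. The NAME_URL naming convention and the envs_json variable are assumptions of this example.

```shell
# Exported so that jq can read them through $ENV (values mirror the ones above)
export DEV_URL="dev.myapp.com"
export PROD_URL="myapp.com"

envs_json=$(jq -nc '
  reduce $ARGS.positional[] as $e ({};
    # Look up e.g. DEV_URL for "dev"; the NAME_URL convention is assumed
    .[$e] = {url: $ENV[($e | ascii_upcase) + "_URL"], resources: {}}
  )
  | {environments: .}
' --args dev prod)

echo "$envs_json"
```

Adding another environment then means exporting one more variable and appending its name after --args.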

5. Complex Nesting

As the configuration grows more complex, we may need more advanced techniques.

Now, let’s add the resource configurations and make the environment creation more dynamic:

$ jq -nc \
  --arg name "$APP_NAME" \
  --arg version "$VERSION" \
  --arg dev_url "$DEV_URL" \
  --arg prod_url "$PROD_URL" \
  --arg dev_cpu "$DEV_CPU" \
  --arg dev_mem "$DEV_MEMORY" \
  --arg prod_cpu "$PROD_CPU" \
  --arg prod_mem "$PROD_MEMORY" \
  '{
    app_name: $name,
    version: $version,
    environments: {
      dev: {
        url: $dev_url,
        resources: {
          cpu: $dev_cpu,
          memory: $dev_mem
        }
      },
      prod: {
        url: $prod_url,
        resources: {
          cpu: $prod_cpu,
          memory: $prod_mem
        }
      }
    }
  }'
{"app_name":"MyAwesomeApp","version":"1.0.0","environments":{"dev":{"url":"dev.myapp.com","resources":{"cpu":"0.5","memory":"512Mi"}},"prod":{"url":"myapp.com","resources":{"cpu":"2","memory":"2Gi"}}}}

This works, but every value ends up as a JSON string, because --arg always passes strings. So, we can extend the filter with a post-processing step that normalizes the resource values:

$ jq -nc \
  --arg name "$APP_NAME" \
  --arg version "$VERSION" \
  --arg dev_url "$DEV_URL" \
  --arg prod_url "$PROD_URL" \
  --arg dev_cpu "$DEV_CPU" \
  --arg dev_mem "$DEV_MEMORY" \
  --arg prod_cpu "$PROD_CPU" \
  --arg prod_mem "$PROD_MEMORY" \
  '{
    app_name: $name,
    version: $version,
    environments: {
      dev: {
        url: $dev_url,
        resources: {
          cpu: $dev_cpu,
          memory: $dev_mem
        }
      },
      prod: {
        url: $prod_url,
        resources: {
          cpu: $prod_cpu,
          memory: $prod_mem
        }
      }
    }
  } | 
  .environments |= with_entries(
    .value.resources |= with_entries(
      .value |= 
        if . == "" then null 
        elif test("^[0-9]+$") then tonumber 
        else . 
        end
    )
  )'
{"app_name":"MyAwesomeApp","version":"1.0.0","environments":{"dev":{"url":"dev.myapp.com","resources":{"cpu":"0.5","memory":"512Mi"}},"prod":{"url":"myapp.com","resources":{"cpu":2,"memory":"2Gi"}}}}

This advanced jq filter performs a few steps:

creates our basic structure as before
pipes (|) the result to another operation that focuses on the environments object
the with_entries function can modify each key-value pair in the environments object
the nested with_entries modifies the resources object within each environment

For each resource value, the code uses some conditional logic:

if the value is an empty string, we set it to null
if it’s a string made up only of digits, we convert it to a number with tonumber, so "2" becomes 2, while "0.5" doesn’t match the pattern and stays a string
otherwise, we leave it as is

Thus, this approach enables more flexible handling of the input data and demonstrates how we can apply complex logic within the jq filter.
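The pattern ^[0-9]+$ matches whole numbers only, which is why "0.5" remains a string in the output above. If decimal values should be converted as well, a slightly wider pattern is one option. The sketch below applies it to a sample array and assumes a jq build with regex support.

```shell
normalized=$(jq -nc '
  ["0.5", "2", "512Mi", ""]
  | map(
      if . == "" then null
      # digits, optionally followed by a dot and more digits
      elif test("^[0-9]+(\\.[0-9]+)?$") then tonumber
      else .
      end
    )
')

echo "$normalized"   # [0.5,2,"512Mi",null]
```

Values with a unit suffix such as 512Mi still fall through to the else branch and stay strings.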

6. Working With External Data Sources

We often need to combine data from various sources. Let’s enhance the deployment configuration generator to incorporate service definitions from external files.

To begin with, we create a services.json file containing definitions for the microservices we require:

$ cat services.json
{
  "auth-service": {
    "port": 8080,
    "dependencies": ["user-db"]
  },
  "user-service": {
    "port": 8081,
    "dependencies": ["user-db", "email-service"]
  },
  "email-service": {
    "port": 8082,
    "dependencies": []
  }
}

Then, we modify the jq command to include this data:

$ jq -nc \
  --arg name "$APP_NAME" \
  --arg version "$VERSION" \
  --arg dev_url "$DEV_URL" \
  --arg prod_url "$PROD_URL" \
  --arg dev_cpu "$DEV_CPU" \
  --arg dev_mem "$DEV_MEMORY" \
  --arg prod_cpu "$PROD_CPU" \
  --arg prod_mem "$PROD_MEMORY" \
  --slurpfile services services.json \
  '{
    app_name: $name,
    version: $version,
    environments: {
      dev: {
        url: $dev_url,
        resources: {
          cpu: $dev_cpu,
          memory: $dev_mem
        }
      },
      prod: {
        url: $prod_url,
        resources: {
          cpu: $prod_cpu,
          memory: $prod_mem
        }
      }
    },
    services: $services[0]
  } | 
  .environments |= with_entries(
    .value.resources |= with_entries(
      .value |= 
        if . == "" then null 
        elif test("^[0-9]+$") then tonumber 
        else . 
        end
    )
  )'
{"app_name":"MyAwesomeApp","version":"1.0.0","environments":{"dev":{"url":"dev.myapp.com","resources":{"cpu":"0.5","memory":"512Mi"}},"prod":{"url":"myapp.com","resources":{"cpu":2,"memory":"2Gi"}}},
"services":{"auth-service":{"port":8080,"dependencies":["user-db"]},"user-service":{"port":8081,"dependencies":["user-db","email-service"]},"email-service":{"port":8082,"dependencies":[]}}}

Here, we introduced the --slurpfile option to read the services.json file. We use the $services[0] syntax because --slurpfile always binds the variable to an array of the JSON values in the file, even when the file contains a single object.
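To see the array wrapping in isolation, the following sketch loads a one-object file and prints the variable with and without the index. The temporary file and its contents are stand-ins for services.json.

```shell
# Temporary stand-in for services.json
tmp=$(mktemp)
printf '%s\n' '{"auth-service":{"port":8080}}' > "$tmp"

wrapped=$(jq -nc --slurpfile s "$tmp" '$s')        # the array --slurpfile creates
unwrapped=$(jq -nc --slurpfile s "$tmp" '$s[0]')   # the object inside it

echo "$wrapped"     # [{"auth-service":{"port":8080}}]
echo "$unwrapped"   # {"auth-service":{"port":8080}}
rm -f "$tmp"
```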

7. Large Nested Structures

As the JSON structures grow, performance can become a concern.

Let’s briefly look at some strategies to keep jq running effectively:

Minimize Pipes: While pipes are powerful, every shell pipe between separate jq invocations starts a new process that has to parse the JSON again. For larger datasets, we can try to combine operations in a single filter where possible.
Use Built-in Functions: jq’s built-in functions are well tested and usually more readable than custom solutions. For example, map(f) is shorthand for [.[] | f], so we can use it for transforming arrays.
Leverage Indexing: When dealing with large arrays, we can use indexing to access specific elements rather than filtering the entire array.
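As a small illustration of combining operations, the sketch below gets the same value twice: once by chaining two jq processes in the shell and once with a single filter. The doc variable is sample data invented for the comparison.

```shell
doc='{"environments":{"prod":{"replicas":3}}}'

# Two processes: the second jq has to parse the output of the first one again
chained=$(printf '%s' "$doc" | jq '.environments' | jq '.prod.replicas')

# One process: the pipe inside the filter passes values along in memory
single=$(printf '%s' "$doc" | jq '.environments | .prod.replicas')

echo "$chained $single"   # 3 3
```

The difference is negligible for a document this small and should only matter when the intermediate JSON is large.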

To demonstrate, let’s optimize the deployment config generator for multiple environments.

Initially, we move the environment configurations to the external environments.json file:

$ cat environments.json
{
  "dev": {
    "url": "dev.myapp.com",
    "resources": {
      "cpu": "0.5",
      "memory": "512Mi"
    },
    "replicas": 1
  },
  "staging": {
    "url": "staging.myapp.com",
    "resources": {
      "cpu": "1",
      "memory": "1Gi"
    },
    "replicas": 2
  },
  "prod": {
    "url": "myapp.com",
    "resources": {
      "cpu": "2",
      "memory": "2Gi"
    },
    "replicas": 3
  },
  "dr": {
    "url": "dr.myapp.com",
    "resources": {
      "cpu": "2",
      "memory": "2Gi"
    },
    "replicas": 3
  }
}

Then, we make the jq command more flexible:

$ jq -nc \
  --arg name "$APP_NAME" \
  --arg version "$VERSION" \
  --slurpfile services services.json \
  --slurpfile envs environments.json \
  '{
    app_name: $name,
    version: $version,
    environments: ($envs[0] | map_values(
      .resources |= map_values(
        if . == "" then null
        elif test("^[0-9]+$") then tonumber
        else .
        end
      )
    )),
    services: $services[0]
  }'
{"app_name":"MyAwesomeApp","version":"1.0.0","environments":{"dev":{"url":"dev.myapp.com","resources":{"cpu":"0.5","memory":"512Mi"},"replicas":1},
"staging":{"url":"staging.myapp.com","resources":{"cpu":1,"memory":"1Gi"},"replicas":2},"prod":{"url":"myapp.com","resources":{"cpu":2,"memory":"2Gi"},"replicas":3},
"dr":{"url":"dr.myapp.com","resources":{"cpu":2,"memory":"2Gi"},"replicas":3}},"services":{"auth-service":{"port":8080,"dependencies":["user-db"]},
"user-service":{"port":8081,"dependencies":["user-db","email-service"]},"email-service":{"port":8082,"dependencies":[]}}}

In this version, we moved the environment configurations to an external file (environments.json) and used map_values() to normalize the resources of every environment in one pass. This approach scales better for a large number of environments, since adding one only requires a new entry in environments.json and no change to the filter.
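For a transformation that yields exactly one result per value, map_values(f) behaves like the earlier with_entries(.value |= f) form. The sketch below runs both on a sample resources object; the norm helper is a name introduced here and is a shortened version of the conditional used above.

```shell
resources='{"cpu":"2","memory":"2Gi"}'

# Helper shared by both filters: digit-only strings become numbers
norm='def norm: if test("^[0-9]+$") then tonumber else . end;'

via_entries=$(printf '%s' "$resources" | jq -c "$norm"' with_entries(.value |= norm)')
via_map=$(printf '%s' "$resources" | jq -c "$norm"' map_values(norm)')

echo "$via_entries"   # {"cpu":2,"memory":"2Gi"}
echo "$via_map"       # {"cpu":2,"memory":"2Gi"}
```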

8. Validation

Ensuring the correctness of any configuration is important.

So, let’s save the generated JSON to a file and then run a few checks on it:

$ jq -n \
  --arg name "$APP_NAME" \
  --arg version "$VERSION" \
  --slurpfile services services.json \
  --slurpfile envs environments.json \
  '{
    app_name: $name,
    version: $version,
    environments: ($envs[0] | map_values(
      .resources |= map_values(
        if . == "" then null
        elif test("^[0-9]+$") then tonumber
        else .
        end
      )
    )),
    services: $services[0]
  }' | tee output.json | 
jq '
  if .app_name == "" then
    error("app_name is empty")
  elif .environments | length == 0 then
    error("no environments defined")
  elif .services | length == 0 then
    error("no services defined")
  else
    "Configuration is valid"
  end
'
"Configuration is valid"

This command generates the configuration, saves it to output.json, and then performs some basic validation checks.
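When error() fires, jq exits with a non-zero status, so a script can branch on the result of the checks. The following sketch wraps similar checks in a shell function. The validate function and the two sample documents are assumptions of this example, and the function hides the error message that a real script would probably keep.

```shell
# Hypothetical helper: reads a configuration on stdin, succeeds only if it passes
validate() {
  jq -e '
    if .app_name == "" then error("app_name is empty")
    elif (.environments | length) == 0 then error("no environments defined")
    elif (.services | length) == 0 then error("no services defined")
    else true
    end
  ' >/dev/null 2>&1
}

good='{"app_name":"MyAwesomeApp","environments":{"dev":{}},"services":{"auth-service":{}}}'
bad='{"app_name":"","environments":{},"services":{}}'

if printf '%s' "$good" | validate; then good_result="valid"; else good_result="invalid"; fi
if printf '%s' "$bad" | validate; then bad_result="valid"; else bad_result="invalid"; fi

echo "good: $good_result, bad: $bad_result"   # good: valid, bad: invalid
```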

9. Conclusion

In this article, we explored how to build a deployment configuration generator as an example of creating and handling a nested JSON structure dynamically via variables.

Overall, creating nested JSON structures from variables using jq is a powerful technique for generating complex configurations, data transformations, and more.
