How to Merge JSON Files in Linux

1. Introduction

JavaScript Object Notation (JSON) is a lightweight, text-based data format that has become integral to structuring and exchanging information. Its simplicity and versatility have led to its wide adoption.

Merging JSON files is a common task when we need to consolidate data from multiple sources into a single file that’s easier to read and analyze. In this tutorial, we’ll learn to merge JSON files on Linux using the jq command.

2. Understanding JSON File Structure

Before delving into merging JSON files, it’s important to grasp their structure. JSON represents data in key-value pair format, organized into objects and arrays.

To break it down, objects are enclosed within curly braces {} and it’s possible to represent an object inside another object. Below is a simple JSON format with a single object and key-value pairs:

{
   "id": 12345,
   "name": "John Doe", 
   "email": "[email protected]" 
}

The outermost pair of curly braces shows that the JSON data above consists of a single object. The data within the curly braces are the key-value pairs. In particular, a colon separates each key from its corresponding value (for example, "id": 12345), and commas separate the key-value pairs from one another.

3. Using jq to Merge JSON Files

The jq command is a command-line JSON processor that slices, filters, maps, and transforms structured data, which makes it well suited to merging JSON files. However, many Linux distributions don’t ship with it pre-installed, so we may need to install it first.

On Debian-based distributions such as Ubuntu, we can install it with the apt command:

$ sudo apt install -y jq

In the above command, sudo runs apt install with administrative privileges, and the -y option answers yes to the confirmation prompt so that the installation proceeds without user interaction.

Next, let’s look at the syntax for the jq command to merge JSON files:

$ jq [options] <jq filter> [file...]
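As a minimal illustration of this syntax, the sketch below passes one option, one filter, and one file. The file name sample.json is a hypothetical placeholder, and the commands run in a scratch directory so that no existing files are touched:

```shell
# Work in a scratch directory (an assumption of this sketch).
cd "$(mktemp -d)"

# sample.json is a hypothetical file with two of the keys from Section 2.
printf '%s\n' '{"id": 12345, "name": "John Doe"}' > sample.json

# [options] = -r (print strings without quotes)
# <jq filter> = .name (select the value of the "name" key)
# [file...] = sample.json
jq -r '.name' sample.json
# prints: John Doe
```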

Now that we have the syntax for using the jq command, we can create two simple JSON files.

3.1. Merging Simple JSON Files

Before we proceed with the jq command, note that we’re using two simple JSON files: employee1.json and employee2.json.
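The contents of these two files aren’t listed here, so the sketch below reconstructs them from the merged output shown later in this section; the exact contents should be treated as an assumption. It creates the files in a scratch directory, and dropping the first command creates them in the current directory instead:

```shell
# Work in a scratch directory (an assumption of this sketch).
cd "$(mktemp -d)"

# Contents inferred from the merged output, not given explicitly in the text.
cat > employee1.json <<'EOF'
{
  "name": "John Doe",
  "age": 30,
  "city": "New York"
}
EOF

cat > employee2.json <<'EOF'
{
  "name": "Jane Smith",
  "age": 25,
  "city": "Los Angeles"
}
EOF

# The "empty" filter prints nothing, but jq still exits with an error on
# malformed JSON, so this doubles as a quick validity check.
jq empty employee1.json employee2.json && echo "both files are valid JSON"
```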

Now, we can proceed to merge these two files using the jq command and understand the final output produced:

$ jq -s '.' employee1.json employee2.json > emp_details.json

The -s (--slurp) option tells jq to read all the input files into a single array, with one element per JSON value, and to run the filter once on that array instead of once per file. The dot (.) is the identity filter, which outputs its input unchanged, so here it simply prints the combined array.
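To make the effect of -s concrete, the following sketch runs the same filter with and without it. The files a.json and b.json are hypothetical stand-ins, and -c is used only to keep the output compact:

```shell
# Work in a scratch directory (an assumption of this sketch).
cd "$(mktemp -d)"
printf '%s\n' '{"name": "John Doe"}'   > a.json   # hypothetical stand-in
printf '%s\n' '{"name": "Jane Smith"}' > b.json   # hypothetical stand-in

# Without -s, the filter runs once per input: two separate JSON documents.
jq -c '.' a.json b.json
# {"name":"John Doe"}
# {"name":"Jane Smith"}

# With -s, all inputs are collected into one array and the filter runs once.
jq -c -s '.' a.json b.json
# [{"name":"John Doe"},{"name":"Jane Smith"}]
```

Only the second form produces a single valid JSON document, which is why the merge commands in this tutorial rely on -s.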

Finally, the redirect operator (>) writes the output to a file named emp_details.json in the current directory, creating the file if it doesn’t exist and overwriting it if it does.

Let’s view the merged file using the cat command:

$ cat emp_details.json  
[
  {
    "name": "John Doe",
    "age": 30,
    "city": "New York"
  },
  {
    "name": "Jane Smith",
    "age": 25,
    "city": "Los Angeles"
  }
]

The merge resulted in a single JSON file with an array that contains two objects. Each of these objects corresponds to the content of one of the two merged files.
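One way to confirm this without reading the whole file is to query the merged array with jq itself. The sketch below rebuilds the inputs in a scratch directory (their contents are inferred from the output above) and checks the element count and the names:

```shell
# Work in a scratch directory (an assumption of this sketch).
cd "$(mktemp -d)"
printf '%s\n' '{"name": "John Doe", "age": 30, "city": "New York"}'      > employee1.json
printf '%s\n' '{"name": "Jane Smith", "age": 25, "city": "Los Angeles"}' > employee2.json
jq -s '.' employee1.json employee2.json > emp_details.json

# Number of objects in the merged array.
jq 'length' emp_details.json
# 2

# The "name" value of every element, one per line.
jq -r '.[].name' emp_details.json
# John Doe
# Jane Smith
```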

3.2. Merging Complex JSON Files

Next, let’s apply the same syntax to a larger set of files with more complex, nested structures. To do so, we create additional JSON files alongside the ones we used earlier.

We now have five JSON files that we need to merge into a single JSON file. Since the files are all in the same directory and no other files with the .json extension are present there, we can use a wildcard to match all of them:

$ jq -s '.' *.json > All_data.json

The command above slurps the contents of every matching file into a single array, in the order in which the shell expands the wildcard, and redirects the result to a file named All_data.json.
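One practical caveat is that an All_data.json left over from an earlier run also matches *.json, so it would be picked up as an input the next time the command runs. A minimal sketch that sidesteps this writes the result to a separate directory. The directory name merged and the two small input files are assumptions made for illustration:

```shell
# Work in a scratch directory (an assumption of this sketch).
cd "$(mktemp -d)"
printf '%s\n' '{"name": "John Doe"}'   > employee1.json
printf '%s\n' '{"name": "Jane Smith"}' > employee2.json

# merged/ is a hypothetical output directory outside the wildcard's reach.
mkdir -p merged

# ./*.json only matches files in the current directory, so the output
# file is never read back in as an input.
jq -s '.' ./*.json > merged/All_data.json
jq -s '.' ./*.json > merged/All_data.json   # rerunning gives the same result

jq 'length' merged/All_data.json
# 2
```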

Below is the result of the merger:

[
  {
    "name": "John Doe",
    "age": 30,
    "city": "New York"
  },
  {
    "name": "Jane Smith",
    "age": 25,
    "city": "Los Angeles"
  },
  {
    "name": "Jane Smith",
    "age": 25,
    "city": "Los Angeles"
  },
  {
    "employee": {
      "id": 12345,
      "name": "John Doe",
      "contact": {
        "email": "[email protected]",
        "phone": "+1234567890"
      }
    }
  },
  {
    "store": {
      "name": "Green Grocer",
      "location": "Main Street, 123",
      "products": [
        {
          "name": "Apples",
          "price": 1.2,
          "quantity": 30
        },
        {
          "name": "Bananas",
          "price": 0.8,
          "quantity": 50
        },
        {
          "name": "Carrots",
          "price": 0.6,
          "quantity": 100
        }
      ]
    }
  }
]

The merged result is a single array whose elements range from flat objects to deeply nested ones. It’s worth noting that one of the keys, products in the store object, holds an array of objects.

4. Advanced Merging Techniques With jq

Let’s now go over some more advanced usage of jq. Combining its options and filters gives the command expanded capabilities, such as removing duplicates during merging and merging nested JSON data.

4.1. Merge JSON Arrays With Duplicate Removal

To see how duplicate removal works, let’s create two JSON files, file1.json and file2.json.

This is the content of file1.json:

{
  "numbers": [1, 2, 3],
  "letters": ["a", "b", "c"]
}

And this is the content of file2.json:

{
  "numbers": [3, 4, 5],
  "letters": ["c", "d", "e"]
}

Comparing the two files, we can see that they have the same keys and share some values: the number 3 and the letter c appear in both. To avoid duplicates in the merged result, for each key we collect the arrays from both files with map, concatenate them with add, and remove the duplicates with unique:

$ jq -s '{numbers: (map(.numbers) | add | unique), letters: (map(.letters) | add | unique)}' file1.json file2.json > merged.json

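After executing the command, this is the result (jq prints each array element on its own line by default; the arrays are condensed here for brevity):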

{
  "numbers": [1, 2, 3, 4, 5],
  "letters": ["a", "b", "c", "d", "e"]
}

The result above illustrates that the merge kept each value only once per key. Note that unique also sorts the values, so the original array order isn’t necessarily preserved.
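As a sketch of per-key deduplication that doesn’t depend on knowing the key names in advance, the following recreates the two files in a scratch directory and folds them together with reduce. It assumes every top-level value is an array, and the -c option is only there to print compact output:

```shell
# Work in a scratch directory (an assumption of this sketch).
cd "$(mktemp -d)"
printf '%s\n' '{"numbers": [1, 2, 3], "letters": ["a", "b", "c"]}' > file1.json
printf '%s\n' '{"numbers": [3, 4, 5], "letters": ["c", "d", "e"]}' > file2.json

# For every key of every input object, append its array to what has been
# collected so far, then let unique sort the values and drop duplicates.
jq -c -s '
  reduce .[] as $obj ({};
    reduce ($obj | keys_unsorted[]) as $k (.;
      .[$k] = ((.[$k] // []) + $obj[$k] | unique)))
' file1.json file2.json > merged.json

cat merged.json
# {"numbers":[1,2,3,4,5],"letters":["a","b","c","d","e"]}
```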

4.2. Merge JSON Objects With Nested Keys

To merge nested objects, we can use the filter reduce .[] as $item ({}; . * $item). With the -s option, the input files arrive as a single array; reduce starts from an empty object and folds each input object into the result with the * operator. Because * merges objects recursively, the inner objects are merged as well rather than replaced wholesale.

Let’s examine the JSON files we want to merge and then analyze the result.

Here is the first JSON file, file1.json:

{
  "person": {
    "name": "Abioye",
    "age": 30,
    "address": {
      "city": "New York",
      "country": "USA"
    }
  }
}

This is the second JSON file, file2.json:

{
  "person": {
    "name": "Bob",
    "address": {
      "zipcode": "10001"
    }
  }
}

With the data in place, we can carry out the merge:

$ jq -s 'reduce .[] as $item ({}; . * $item)' file1.json file2.json > merged.json

The result of the merge is shown below:

{
  "person": {
    "name": "Bob",
    "age": 30,
    "address": {
      "city": "New York",
      "country": "USA",
      "zipcode": "10001"
    }
  }
}

The result shows a deep merge: the nested address object contains the keys from both files, and age, which exists only in the first file, is preserved. Where both files define the same non-object key, such as name, the value from the later file (Bob) overrides the earlier one. In other words, files listed later on the command line take precedence.
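To see why the * operator matters here, the sketch below contrasts it with +, which merges only the top level. It recreates the two files in a scratch directory and, since there are just two inputs, writes .[0] * .[1] in place of the equivalent reduce expression; -c is used only for compact output:

```shell
# Work in a scratch directory (an assumption of this sketch).
cd "$(mktemp -d)"
printf '%s\n' '{"person": {"name": "Abioye", "age": 30, "address": {"city": "New York", "country": "USA"}}}' > file1.json
printf '%s\n' '{"person": {"name": "Bob", "address": {"zipcode": "10001"}}}' > file2.json

# "+" is a shallow merge: the second file's whole "person" object
# replaces the first one, so age, city, and country are lost.
jq -c -s '.[0] + .[1]' file1.json file2.json
# {"person":{"name":"Bob","address":{"zipcode":"10001"}}}

# "*" merges recursively: nested objects are combined, and where both
# sides define the same non-object key, the right-hand value wins.
jq -c -s '.[0] * .[1]' file1.json file2.json
# {"person":{"name":"Bob","age":30,"address":{"city":"New York","country":"USA","zipcode":"10001"}}}
```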

5. Conclusion

In this article, we discussed how to merge JSON files in Linux using the jq command. We began with a basic look at JSON data and its structure, then demonstrated the jq syntax for merging multiple files.

Furthermore, we explored more advanced uses of jq, such as removing duplicates and merging nested objects, and examined the output of each command to understand how the merge works.
