1. Overview

The getline function in AWK gives us explicit control over when and from where the next record is read. By default, awk reads the main input one record at a time and runs the main block for each record. With getline, we can also pull records on demand from the main input, from another file, or from the output of a command.

In this tutorial, we’ll explore the getline function in detail by writing AWK scripts to solve a sample scenario.

2. Scenario Setup

First of all, let’s note that we’ll use GNU AWK (gawk) to execute our scripts, since they rely on gawk-specific features such as the @include directive:

$ awk --version | head -1
GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)

Now, let’s look at the mean.awk script. It contains a function mean() that calculates the integer mean of an array of integers:

$ cat mean.awk
function mean(arr,    total, i, n) {
    # total, i, and n are extra parameters, which keeps them local to mean()
    n = length(arr)
    total = 0
    for (i = 1; i <= n; i++) {
        total += arr[i]
    }
    return int(total / n)
}

The mean() function accepts an array and returns the average value of the integers in the array.
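As a quick sanity check, the function can be exercised on its own before it is wired into a test script. The following is a minimal sketch under a few assumptions: the shell wrapper int_mean is a hypothetical name that isn’t part of the original scripts, and the element count is passed in explicitly (taken from the return value of split()) instead of calling length() on the array, so the sketch doesn’t depend on gawk:

```shell
# Hypothetical helper: print the integer mean of a space-separated list.
int_mean() {
    awk -v list="$1" '
    # total and i are extra parameters, so they stay local to mean()
    function mean(arr, n,    total, i) {
        total = 0
        for (i = 1; i <= n; i++) {
            total += arr[i]
        }
        return int(total / n)
    }
    BEGIN {
        n = split(list, nums, " ")   # split() returns the element count
        print mean(nums, n)
    }'
}

int_mean "3 2 1 6 8"   # prints 4
```

The three lists from the test plan give 2, 6, and 4, which matches the expected values listed in test_cases_mean.txt.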

Next, we’ve also got a test plan for the mean() function in the test_cases_mean.txt file:

$ cat test_cases_mean.txt
3
3
1 2 3
2
2
7 5
6
5
3 2 1 6 8
4

The first line denotes the total number of test cases. Then, each test case comprises three lines containing the count of numbers, space-separated numbers, and the expected mean value, respectively.

Lastly, let’s note that our goal is to write a test script to read and execute the test plan. So, we’ll reuse the mean.awk and test_cases_mean.txt files.

3. Reading the Next Record

We can use the getline function to read the next record from the main input file explicitly. Let’s learn more about this behavior in this section.

3.1. Writing the Test Script

Firstly, let’s write the @include statement in test_script_v1.awk to reuse the mean() function defined in the mean.awk script:

@include "mean.awk"

Now, let’s write the BEGIN block and call getline without any arguments:

BEGIN {
    getline
    printf("Count of test cases: %d\n", $0)
}

When we don’t specify any arguments, getline reads the next record from the main input file into the current record ($0). As a result, it also updates NF, NR, and FNR.
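Since getline can reach the end of the input, it is worth looking at its return value: 1 on success, 0 at end of input, and -1 on error. The sketch below illustrates this with a hypothetical helper, join_pairs, which isn’t part of the test scripts. It joins every two consecutive lines and handles an odd line count:

```shell
# Hypothetical helper: join each pair of consecutive input lines with a comma.
join_pairs() {
    awk '{
        first = $0
        # getline returns 1 on success, 0 at end of input, -1 on error
        if ((getline) > 0) {
            print first "," $0     # $0 now holds the record getline read
        } else {
            print first ","        # odd line count: no partner record
        }
    }'
}

printf 'a\nb\nc\n' | join_pairs
```

With the three input lines above, the output is a,b followed by c, since the last line has no partner and $0 stays unchanged when getline returns 0.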

Next, let’s start writing the main block in our test_script_v1.awk script:

N=$0
getline

Since we’ve already read the first line in the BEGIN block, the main block only gets input records from the second record onwards. So, we’ve initialized N with the first line of the first test case that contains the count of numbers. Additionally, we call getline to read the space-separated list of numbers into the current record ($0).

Moving on, we can split the input record ($0) to populate the nums array and pass it to the mean() function:

split($0, nums, " ")
actual=mean(nums)

Next, let’s call getline to read the next line containing the expected mean value and capture the value in the expected variable:

getline
expected=$0

Lastly, let’s compare the actual and expected values to determine if the mean function works correctly for this test case:

test_status=(actual==expected ? "pass": "fail")
print(test_status)

That’s it! Our script is ready for use.

3.2. Test Script in Action

Before executing our test script, let’s look at the test_script_v1.awk script in its entirety:

$ cat test_script_v1.awk
@include "mean.awk"
BEGIN {
    getline
    printf("Count of test cases: %d\n", $0)
}
{
    N=$0
    getline
    split($0,nums," ")
    actual=mean(nums)
    getline
    expected=$0
    test_status=(actual==expected ? "pass": "fail")
    print(test_status)
}

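It’s important to note that the main block executes once for each record that awk itself reads, until the end of the input file. Since each run consumes two more records through getline, one run handles exactly one test case.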

Now, we can run our test script to validate the test cases:

$ awk -f test_script_v1.awk test_cases_mean.txt
Count of test cases: 3
pass
pass
pass

Perfect! Our test script confirms that the mean function is working as expected.
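The script stores the count of numbers in N but never uses it afterward. One possible use, sketched below under assumptions, is to cross-check the declared count against the number of values actually present, and to stop if the plan ends in the middle of a test case. The helper name check_counts and its messages are hypothetical, and the sketch reads the plan from standard input:

```shell
# Hypothetical helper: verify that each declared count matches its number list.
check_counts() {
    awk '
    NR == 1 { next }                      # skip the total number of test cases
    {
        n = $0 + 0                        # declared count of numbers
        if ((getline) <= 0) { print "truncated"; exit 1 }
        got = split($0, nums, " ")        # actual count of numbers
        print (got == n ? "count ok" : "count mismatch")
        # consume the line that holds the expected mean
        if ((getline) <= 0) { print "truncated"; exit 1 }
    }'
}

printf '%s\n' 3 3 '1 2 3' 2 2 '7 5' 6 5 '3 2 1 6 8' 4 | check_counts
```

For the sample test plan, this prints count ok three times.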

4. Reading Into Variable

We can pass a variable as an argument to the getline function to read the next record from the main input file directly into the variable.

Let’s use this approach to write the test_script_v2.awk by modifying the test_script_v1.awk script:

$ cat test_script_v2.awk
@include "mean.awk"
BEGIN {
    getline count
    printf("Count of test cases: %d\n", count)
}
{
    N=$0
    getline nums_str
    split(nums_str, nums, " ")
    actual=mean(nums)
    getline expected
    test_status=(actual==expected ? "pass": "fail")
    print(test_status)
}

We can notice that getline with a variable combines the read and assign operations, so we no longer need a separate assignment for the expected variable. Unlike the plain form, it leaves $0 and NF untouched and only updates the variable, NR, and FNR. Further, we should remember that the main block executes repeatedly until the end of input.
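To see that reading into a variable doesn’t disturb the current record, we can try a small sketch. The helper name peek_next and the variable nxt are hypothetical (next is a reserved word in AWK, so it can’t serve as a variable name):

```shell
# Hypothetical helper: show the current record next to the record read into a variable.
peek_next() {
    awk '{
        # getline nxt fills only nxt; $0 keeps the record awk read for this run
        if ((getline nxt) > 0) {
            print "current=" $0 " next=" nxt " NR=" NR
        }
    }'
}

printf 'a\nb\nc\nd\n' | peek_next
```

This prints current=a next=b NR=2 and then current=c next=d NR=4, which shows that the record counter advances even though $0 does not change.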

Like earlier, let’s run the test script to validate the test plan:

$ awk -f test_script_v2.awk test_cases_mean.txt
Count of test cases: 3
pass
pass
pass

It works as expected.

5. Reading From a File

In this section, we’ll learn how to use getline to read from a file.

5.1. Usage

By default, getline reads the next record from the main input file. However, we can use the redirection operator to read from a different source:

getline [var] < file

When we provide a variable argument (var), the content goes into var. Otherwise, it’s read into the current record ($0).
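A common pattern with this form is a loop that reads until getline stops returning a positive value. The sketch below assumes a hypothetical helper, count_lines, that counts the lines of a file and reports a file that can’t be opened, in which case getline returns -1:

```shell
# Hypothetical helper: count the lines of a file using getline var < file.
count_lines() {
    awk -v file="$1" 'BEGIN {
        n = 0
        # Parenthesize the getline expression so that < applies to the file name only
        while ((status = (getline line < file)) > 0) {
            n++
        }
        if (status < 0) {                 # -1 means the file could not be read
            print "cannot read " file
            exit 1
        }
        close(file)
        print n
    }'
}

plan=$(mktemp)
printf '3\n3\n1 2 3\n2\n' > "$plan"
count_lines "$plan"    # prints 4
rm -f "$plan"
```

Without the check on the return value, a missing file would simply look like an empty one.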

5.2. Test Script

Using this approach, let’s write the test_script_v3.awk script that accepts the TEST_PLAN_FILE parameter:

$ cat test_script_v3.awk
@include "mean.awk"
BEGIN {
    getline count < TEST_PLAN_FILE
    printf("Count of test cases: %d\n", count)
    while (count-- > 0) {
        getline _ < TEST_PLAN_FILE # skip record
        getline num_str < TEST_PLAN_FILE
        split(num_str, nums, " ")
        actual=mean(nums)
        getline expected < TEST_PLAN_FILE
        test_status=(actual==expected ? "pass" : "fail")
        print(test_status)
    }
    close(TEST_PLAN_FILE)
}

Now, let’s break down the nitty-gritty of the logic.

Firstly, we transferred the logic into the BEGIN block and replaced the main block with a while loop, so the script no longer needs an input file on the command line. Further, we used the getline function to read from the TEST_PLAN_FILE file.

As a best practice, we should always close the file after use, which releases it and lets a later getline start reading from the first line again. Additionally, let’s note that it’s a common practice to use the _ variable when we want to read a record without the intention to use it.
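The effect of close() on the read position can be seen in a short sketch. The helper name first_line_twice is hypothetical, and the return values of getline are left unchecked to keep the example small:

```shell
# Hypothetical helper: show that close() makes the next getline start over.
first_line_twice() {
    awk -v file="$1" 'BEGIN {
        getline a < file     # reads line 1
        getline b < file     # reads line 2
        close(file)          # the next read reopens the file
        getline c < file     # reads line 1 again
        print a, b, c
    }'
}

sample=$(mktemp)
printf 'x\ny\nz\n' > "$sample"
first_line_twice "$sample"   # prints x y x
rm -f "$sample"
```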

Like always, let’s verify the test script by setting the TEST_PLAN_FILE parameter as test_cases_mean.txt:

$ awk -v TEST_PLAN_FILE="test_cases_mean.txt" -f test_script_v3.awk
Count of test cases: 3
pass
pass
pass

Great! We got this one right.

6. Reading From a Pipe

In this section, we’ll learn how to use getline to read from a pipe by fetching the test plan through an external command.

6.1. Usage

We can pipe the output of a shell command to getline and read its content:

cmd | getline [var]

On passing a variable (var) argument, getline puts the content into the variable. Otherwise, the content goes into the current record ($0).
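Both variants can be tried with a harmless command. The sketch below uses echo as a stand-in command, and the helper name pipe_demo is hypothetical:

```shell
# Hypothetical helper: compare cmd | getline with cmd | getline var.
pipe_demo() {
    awk 'BEGIN {
        cmd = "echo one two three"
        if ((cmd | getline) > 0) {         # no variable: fills $0 and splits fields
            print NF, $2
        }
        close(cmd)

        cmd = "echo 42"
        if ((cmd | getline answer) > 0) {  # with a variable: $0 is left alone
            print answer, $2
        }
        close(cmd)
    }'
}

pipe_demo
```

The output is 3 two followed by 42 two. The second line shows that $2 still comes from the first command, because reading into answer left the current record untouched.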

6.2. Test Script With Remote Test Plan

Using curl, we can fetch a test plan from a remote location. For simplicity, let’s see how to use a local URL to retrieve the test plan:

$ curl --silent file://$(pwd)/test_cases_mean.txt
3
3
1 2 3
2
2
7 5
6
5
3 2 1 6 8
4

We can replace the URL with a remote server URL that hosts our test plan.

Now, let’s look at the test_script_v4.sh script, which is an AWK program despite its file extension. It accepts the TEST_PLAN_URL parameter, fetches the remote test plan, and validates it:

$ cat test_script_v4.sh
@include "mean.awk"
BEGIN {
    remote_test_plan="curl --silent "TEST_PLAN_URL
    remote_test_plan | getline count
    printf("Count of test cases: %d\n", count)
    while (count-- > 0) {
        remote_test_plan | getline _  # skip record
        remote_test_plan | getline num_str
        split(num_str, nums, " ")
        actual=mean(nums)
        remote_test_plan | getline expected
        test_status=(actual==expected ? "pass" : "fail")
        print(test_status)
    }
    close(remote_test_plan)
}

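We defined the remote_test_plan variable with the command string to fetch the remote test plan. Further, we used getline to execute the remote_test_plan command and read its content from the pipe. Because every getline call uses the identical command string, awk runs curl only once, and each call reads the next line from the same open pipe until we close it.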
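If the fetch command produces no output, for instance because the server is unreachable, the first getline returns 0 and count stays empty. A guard for that case might look like the following sketch. It is written under assumptions: plan_count is a hypothetical helper, and echo and true stand in for curl so the example runs without a network:

```shell
# Hypothetical helper: read the test-case count from a command, or fail loudly.
plan_count() {
    awk -v fetch="$1" 'BEGIN {
        # <= 0 covers both an empty result (0) and a read error (-1)
        if ((fetch | getline count) <= 0) {
            print "no test plan received"
            close(fetch)
            exit 1
        }
        close(fetch)
        printf("Count of test cases: %d\n", count)
    }'
}

plan_count "echo 3"                        # stand-in for a working fetch
plan_count "true" || echo "fetch failed"   # stand-in for a fetch with no output
```

The first call prints Count of test cases: 3, while the second prints no test plan received followed by fetch failed.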

Lastly, let’s see the test_script_v4.sh script in action by passing the URL of the locally hosted test plan:

$ awk -v TEST_PLAN_URL="file://$(pwd)/test_cases_mean.txt" -f test_script_v4.sh
Count of test cases: 3
pass
pass
pass

Fantastic! It looks like we’ve nailed this one.

7. Conclusion

In this tutorial, we learned about the getline function in AWK. Along the way, we also used the split() and close() functions while solving the use case of writing a test script for validating a test plan.

Lastly, we explored multiple usage patterns of getline, namely, reading from the main input file, a secondary file, and the output of an external curl command via pipe expressions.