1. Introduction
Comments are human-readable parts of a source code file that machine interpreters usually ignore. How a comment is designated depends on the language and its syntax. In Bash scripts, a comment starts with a # octothorp (hash) character at the beginning of a word and continues until the next newline.
There are two general types of Bash comments:
#!/bin/bash
# line comment
command argument1 argument2 # inline comment
In this tutorial, we’ll talk about ways to remove all comments from a Bash script file. First, we explore methods for removing line comments. After that, we turn to inline comments, the parsing of which increases the complexity of the task.
Importantly, we also attempt to avoid removing the script shebang, as seen in the first line above.
We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. It should work in most POSIX-compliant environments.
2. Removing Bash Line Comments
Because handling every comment type requires much more complex logic, we initially concentrate only on removing line comments while keeping the shebang.
Our general regular expression (regex) consists of several elements:
- ^ marks the beginning of the line
- [[:blank:]] followed by an * asterisk means any number of spaces or tabs, including none
- # octothorp is the comment start (after any or no start-of-line whitespace)
- [^!] means exactly one character that isn’t an exclamation point
Let’s apply its variations with several tools.
2.1. Using grep
With grep, we can remove all line comments by leveraging the --invert-match or -v flag and our regex:
$ grep --invert-match '^[[:blank:]]*#[^!]' script.sh
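In essence, we keep only the non-matching lines. However, this solution keeps any shebang-like comment, not only the actual shebang on the first line. In addition, because [^!] has to match a character, a line that consists of a lone # is kept as well. Let’s see what can be improved.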
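To make the behavior of this regex concrete, we can run it against a small file. The snippet below is only a sketch: the sample file contents and the sample and result variable names are made up for illustration, and we use the short -v form of the flag.

```shell
# Build a small sample script in a temporary file (contents are illustrative)
sample=$(mktemp)
cat > "$sample" <<'EOF'
#!/bin/bash
# line comment
  # indented line comment
#!looks like a shebang
#
echo hello # inline comment
EOF

# Keep only the lines that do NOT match the line-comment regex
result=$(grep -v '^[[:blank:]]*#[^!]' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Here, both regular line comments disappear, while the real shebang, the shebang-like comment on line 4, the lone # on line 5, and the line with the inline comment all remain.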
2.2. Using sed
Alternatively, we could employ sed for our purposes:
$ sed -e '1s/^#[^!].*$//' -e '2,$s/^[[:blank:]]*#.*//g' script.sh
Here, the first -e command only works on line 1 and blanks it out if it’s a comment other than a shebang. The second -e command functions like its grep equivalent, but it uses the sed [s]ubstitute function on all lines after the first and doesn’t make an exception for shebang-like constructs. Since both commands substitute an empty string, each removed comment leaves an empty line behind.
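If we’d rather drop comment lines entirely, one possible variation deletes them instead of substituting. This is a minimal sketch and not the command from above: it assumes that a real shebang always starts with #! in the first column of line 1, and the sample file contents are invented for the demonstration.

```shell
# Build a small sample script in a temporary file (contents are illustrative)
sample=$(mktemp)
cat > "$sample" <<'EOF'
#!/bin/bash
# line comment
#!not a shebang, just a comment
echo hello
    # indented comment
echo bye
EOF

# 1{/^#!/b ... }  on line 1, a shebang branches to the end of the script and is printed
# /.../d          any other line starting with optional blanks and a # is deleted
result=$(sed -e '1{/^#!/b' -e '}' -e '/^[[:blank:]]*#/d' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Splitting the script over several -e options places a newline after the b command and before the closing brace, which should keep the construct portable across sed implementations.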
2.3. Using awk
Of course, we can also use awk:
$ awk '(NR==1 && !/^[[:blank:]]*#[^!]/) {print;} (NR>1 && !/^[[:blank:]]*#/) {print;}' script.sh
Similarly to grep, we negate each regex match with ! and print only the lines that don’t match. Again, we allow only a shebang comment on the first line (NR==1), while any line comments after that (NR>1) aren’t printed.
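The same idea can be written more compactly with next and a bare pattern. The following is a sketch under the assumption that a shebang always starts with #! in the first column of line 1, which is slightly stricter than the command above. The sample file contents are illustrative only.

```shell
# Build a small sample script in a temporary file (contents are illustrative)
sample=$(mktemp)
cat > "$sample" <<'EOF'
#!/bin/bash
# line comment
echo hello
   # indented comment
#!late shebang-like comment
echo bye
EOF

# Rule 1: print a shebang on line 1 and skip the remaining rules for that line
# Rule 2: a bare pattern without an action prints every line that isn't a line comment
result=$(awk 'NR==1 && /^#!/ {print; next} !/^[[:blank:]]*#/' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

With this input, only the shebang and the two echo lines are printed.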
2.4. Using vi
We can also achieve our aims by employing regular expressions in vi:
$ vi script.sh
[...]
:1s/^#[^!].*$//g | 2,$s/^[[:blank:]]*#.*//g
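To search and replace in vi, we can :s[ubstitute] by line number just like when using sed. Here, we combine two commands via a | (:bar). In Vim, a substitution that finds no match reports an error and stops the rest of the chain, so if line 1 holds a real shebang, we may need to add the e flag to the first command (//ge) for the second one to run.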
In most cases above, removing comments can leave blank or empty lines behind. We can match and delete them with a regex and a command like :g/^[[:space:]]*$/d in vi or its /^[[:space:]]*$/d equivalent in sed.
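As an illustration, we might append the cleanup as a third command to the sed call from section 2.2. This is a sketch with made-up sample file contents. We also leave out the g flag, since each pattern can match at most once per line.

```shell
# Build a small sample script in a temporary file (contents are illustrative)
sample=$(mktemp)
cat > "$sample" <<'EOF'
#!/bin/bash
# line comment
echo one

  # indented comment
echo two
EOF

# First two commands: blank out line comments, as in section 2.2
# Third command: delete every line that is empty or whitespace-only
result=$(sed -e '1s/^#[^!].*$//' \
             -e '2,$s/^[[:blank:]]*#.*//' \
             -e '/^[[:space:]]*$/d' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Notably, this also deletes blank lines that were already in the script, such as the one after echo one, which may or may not be desirable.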
3. Removing All Bash Comments
In general, we can attempt a solution that also works on inline comments. However, the previous one-liners are mostly ill-equipped for this: a single regex can’t keep track of the shell context around a # character, so the task requires vastly more complex logic.
While we can simply remove all characters after and including any # octothorp, this can lead to incorrect results in many cases:
- escaping
- quoting
- here-documents and here-strings
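To see how the naive approach fails, we can try it on a few lines that contain escaping and quoting. The snippet below is a sketch, and the sample lines and variable names are chosen only for illustration.

```shell
# Build a small sample script in a temporary file (contents are illustrative)
sample=$(mktemp)
cat > "$sample" <<'EOF'
command argument1 argument2 # inline comment
echo \#
echo "#" "#"
EOF

# Naive approach: drop everything from the first # to the end of each line,
# together with any blanks directly in front of it
result=$(sed 's/[[:blank:]]*#.*//' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

While the first line comes out as intended, the second ends in a dangling backslash and the third in an unterminated double quote, so the resulting script is no longer valid.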
Even purpose-built tools like the sed-based sed-octo-proctor miss some of the exceptions in a complex script:
$ cat script.sh
#!/bin/bash
# line comment
command argument1 argument2 # inline comment
echo \#
echo "#" "#"
echo '
# not a comment
'
cat <<EOI
# not a comment
command # not a comment
EOI
Still, there are sed-based implementations like dehash, which do manage to handle such exceptions as well:
$ ./dehash -o - script.sh
#!/bin/bash
command argument1 argument2
echo \#
echo "#" "#"
echo '
# not a comment
'
cat <<EOI
# not a comment
command # not a comment
EOI
Although it requires Go, another solution is the sophisticated shfmt shell script parser, which drops comments when minifying via its -mn flag:
$ shfmt -mn script.sh
#!/bin/bash
command argument1 argument2
echo \#
echo "#" "#"
echo '
# not a comment
'
cat <<EOI
# not a comment
command # not a comment
EOI
At least for this sample script, both dehash and shfmt handle every exception correctly. Moreover, the latter parses the full shell grammar, so it can also reformat scripts.
4. Summary
In this article, we saw ways to remove comments from a Bash script.
In conclusion, regex one-liners with grep, sed, awk, and vi can handle line comments, but only a few context-aware tools like dehash and shfmt also removed inline comments correctly in our tests.