1. Overview
The wget command, a Linux utility, is a crucial tool for downloading files from the web. Its extensive features and options make it versatile for fetching content from URLs (Uniform Resource Locators).
In this guide, we’ll delve into the various functionalities of the wget command, covering basic usage and more advanced features.
2. Common wget Command Options
To start, let's examine the basic syntax of the wget command:
wget [options] [URL]
Let’s break down the components of the wget command:
- [options]: These represent the various command-line flags available with wget.
- [URL]: This represents the URL from which to download the file.
When used without options, the command simply downloads the file at the specified URL and saves it under its original name. In that case, the syntax looks like this:
wget [URL]
Now, let’s explore some common options associated with the wget command:
- -P, --directory-prefix=PREFIX: Specifies the directory where the downloaded file is saved.
- -O, --output-document=FILE: Specifies the name of the downloaded file.
- -r, --recursive: Enables recursive downloading, which is useful for downloading entire websites.
- -np, --no-parent: Restricts downloading to the specified directory, preventing retrieval of files from parent directories.
- -c, --continue: Resumes a partially downloaded file.
- -q, --quiet: Suppresses output, making wget operate in quiet mode.
These options allow us to customize the behavior of wget according to our requirements.
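Several of these flags compose naturally. As a hedged sketch (the helper name and the example URL are placeholders, not part of wget itself), a small wrapper might combine quiet mode, resume support, and a target directory:

```shell
# quiet_fetch: hypothetical wrapper combining common wget flags.
quiet_fetch() {
    local dir="$1" url="$2"
    # -q suppresses progress output, -c resumes partial files,
    # -P saves the download into the given directory.
    wget -q -c -P "$dir" "$url"
}
# Example call (commented out; requires network access):
# quiet_fetch ~/Downloads https://www.example.com/file.txt
```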
3. Common wget Command Examples
Let's dive into some practical examples of using the wget command.
3.1. Download a Single File
The most basic use of the wget command is to download a single file from a URL.
For example, let's download a file named requirements.txt from a GitHub repository:
$ wget https://github.com/Abwonder/diabetesprediction/blob/main/requirements.txt
--2024-05-04 10:54:50-- https://github.com/Abwonder/diabetesprediction/blob/main/requirements.txt
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘requirements.txt’
requirements.txt [ <=> ] 144.66K 131KB/s in 1.1s
2024-05-04 10:54:52 (131 KB/s) - ‘requirements.txt’ saved [148135]
Note that the last path component of the URL tells wget which file to download; in this case, that's requirements.txt. The downloaded file keeps this name and is saved in the current working directory. Incidentally, the Length: unspecified [text/html] line in the output reveals that a GitHub blob URL serves the HTML page that displays the file, rather than the raw file contents.
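Since the blob URL above returns an HTML page, a common workaround is to rewrite it to point at GitHub's raw-file host. The sketch below shows this with bash parameter expansion, reusing the repository path from the example (whether that repository still exists is an assumption):

```shell
# Rewrite a GitHub "blob" page URL into the corresponding raw-file URL.
blob_url="https://github.com/Abwonder/diabetesprediction/blob/main/requirements.txt"
raw_url="${blob_url/github.com/raw.githubusercontent.com}"  # swap the host
raw_url="${raw_url/blob\//}"                                # drop the blob/ path segment
echo "$raw_url"
# wget "$raw_url"   # would fetch the actual file contents (needs network)
```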
3.2. Save Downloaded File to a Specific Directory
Proceeding from the previous example, we can specify the directory where the downloaded file should be saved using the -P option.
For instance, let's repeat the previous download and save the file to the Downloads directory:
$ wget -P ~/Downloads https://github.com/Abwonder/diabetesprediction/blob/main/requirements.txt
--2024-05-04 11:11:29-- https://github.com/Abwonder/diabetesprediction/blob/main/requirements.txt
.........
2024-05-04 11:11:30 (293 KB/s) - ‘/home/kali/Downloads/requirements.txt’ saved [148132]
In the command above, -P directs wget to save the file it's downloading to the Downloads directory. The -P option takes the path of the target directory, telling wget where to save the file.
As the output shows, the downloaded file is saved to /home/kali/Downloads/requirements.txt.
3.3. Rename Downloaded File
The -O option renames the downloaded file to the name we specify. Let's examine its use:
$ wget -O newbaeldung.txt https://github.com/Abwonder/diabetesprediction/blob/main/requirements.txt
..........
2024-05-04 12:09:01 (283 KB/s) - ‘newbaeldung.txt’ saved [148125]
The output shows that the file was downloaded and saved as newbaeldung.txt, with a size of 148125 bytes.
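To choose both the directory and the file name in one command, we can pass a full path to -O. This is a sketch wrapped in a hypothetical helper; note that when -O is given, wget writes to exactly the path specified, so -P isn't needed (and is ignored):

```shell
# save_as: hypothetical helper that downloads a URL to an exact path.
save_as() {
    local dest="$1" url="$2"
    # -O takes a full path, covering both directory and file name.
    wget -q -O "$dest" "$url"
}
# Example call (commented out; requires network access):
# save_as ~/Downloads/newbaeldung.txt https://github.com/Abwonder/diabetesprediction/blob/main/requirements.txt
```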
3.4. Download an Entire Website
Combining the --recursive and --no-parent options enables wget to recursively download an entire website for offline viewing. Let's try this combination on the www.skrill.com website:
$ wget --recursive --no-parent https://www.skrill.com/en/
.........
2024-05-04 12:42:50 (60.7 KB/s) - ‘www.skrill.com/en/index.html’ saved [29495/29495]
Loading robots.txt; please ignore errors.
.........
2024-05-04 12:43:23 (35.6 MB/s) - ‘www.skrill.com/robots.txt’ saved [930/930]
--2024-05-04 12:43:23-- https://www.skrill.com/en/business/
.........
2024-05-04 12:44:13 (324 KB/s) - ‘www.skrill.com/en/business/index.html’ saved [23662/23662]
--2024-05-04 12:44:13-- https://www.skrill.com/en/support/
............
Saving to: ‘www.skrill.com/en/support/index.html’
The output shows that these options cause wget to follow the links it finds and download every reachable page on the website.
In addition, wget keeps the original file names and directory structure of the website by default.
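For a copy that's actually browsable offline, wget offers companion flags: --convert-links rewrites links to point at the local copies, and --page-requisites also fetches the CSS, images, and scripts each page needs. The sketch below wraps this in a hypothetical helper:

```shell
# mirror_site: hypothetical wrapper for an offline-browsable site copy.
# --convert-links rewrites links for local browsing;
# --page-requisites also fetches the CSS, images, and scripts per page.
mirror_site() {
    local url="$1"
    wget --recursive --no-parent --convert-links --page-requisites "$url"
}
# Example call (commented out; downloads a whole site):
# mirror_site https://www.skrill.com/en/
```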
3.5. Resume a Partially Downloaded File
The -c option allows wget to resume a download that was interrupted. For example, let's interrupt a running wget download using Ctrl+C:
$ wget https://github.com/Abwonder/LinearRegression4
............
Saving to: ‘LinearRegression4’
LinearRegression4 [ <=> ] 36.01K 145KB/s ^C
We interrupted the download above with Ctrl+C; now we can use the -c option to resume it. Let's take a look at how this option works:
$ wget -c https://github.com/Abwonder/LinearRegression4
............
Saving to: ‘LinearRegression4’
LinearRegression4
The first output shows the download being interrupted with the Ctrl+C key combination, while the second shows it being resumed.
In addition, with -c, wget checks whether the target file already exists in the current directory and, if it's incomplete, continues the download from where it stopped.
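The -c flag pairs well with wget's retry options on flaky connections. As a hedged sketch (the helper name is hypothetical, and the URL is a placeholder), the wrapper below retries a download up to five times, resuming from the partial file each time:

```shell
# resume_fetch: hypothetical helper for unreliable links.
resume_fetch() {
    local url="$1"
    # -c resumes a partial file; --tries caps the retry attempts;
    # --waitretry waits up to the given seconds between retries.
    wget -c --tries=5 --waitretry=2 "$url"
}
# Example call (commented out; requires network access):
# resume_fetch https://github.com/Abwonder/LinearRegression4
```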
4. Advanced wget Command Examples
Advanced wget options extend the command's capabilities even further, so knowing how to use them is an added advantage.
Let’s explore more advanced usage scenarios of the wget command.
4.1. Limit Download Speed
The wget command can throttle its download speed via the rate-limiting option --limit-rate.
For instance, to limit download speed to 100 KB/s:
$ wget --limit-rate=100k www.skrill.com
By adjusting the value passed to --limit-rate, we change the maximum download rate. In the command above, the rate is capped at 100 KB/s.
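The --limit-rate option also accepts unit suffixes: k for kilobytes and m for megabytes per second. A small hypothetical wrapper makes the cap a parameter:

```shell
# throttled_fetch: hypothetical wrapper exposing the rate cap.
throttled_fetch() {
    local rate="$1" url="$2"
    wget --limit-rate="$rate" "$url"
}
# Example calls (commented out; require network access):
# throttled_fetch 100k https://www.skrill.com   # cap at 100 KB/s
# throttled_fetch 1m   https://www.skrill.com   # cap at 1 MB/s
```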
4.2. Download Files from a List
The wget command can also read URLs from a file and download each one in turn. For this example, let's combine the URLs we referenced previously into a list named urls.txt using the nano text editor:
www.skrill.com/en/index.html
https://www.skrill.com/en/
https://github.com/Abwonder/LinearRegression4
www.skrill.com
www.example.com
Next, we can proceed to apply the -i option to the wget command:
$ wget -i urls.txt
In the command above, the -i option instructs wget to iterate over the URLs in the list and download them consecutively, one after the other, into the current working directory on our host system.
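Since the list file is just plain text with one URL per line, we don't strictly need an editor to build it. The sketch below recreates urls.txt from the URLs used earlier and shows the -i invocation (commented out, since it needs network access):

```shell
# Build the URL list non-interactively; one URL per line.
cat > urls.txt <<'EOF'
www.skrill.com/en/index.html
https://www.skrill.com/en/
https://github.com/Abwonder/LinearRegression4
www.skrill.com
www.example.com
EOF
wc -l < urls.txt        # the list holds five URLs
# wget -i urls.txt      # downloads each URL into the current directory
```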
5. Conclusion
In this article, we extensively discussed the usage of the wget command, including examples and outputs for better understanding. The wget command is a versatile tool for downloading files from the web on Linux.
Understanding its various options and capabilities allows for efficient file retrieval and management, and by mastering wget, we can streamline the process of downloading files and automate tasks effectively.