Download an Entire Website Using wget

06-01-2021

I was recently doing some research and found a few sites that I really liked. Whether it was their styling, their layout, or something else, I was drawn in. I wanted to dive deeper into how they were built, but using the developer tools and poking around in minified files was too manual and time-consuming. I wanted a way to download everything so I could use my code editor and the other tools I'm used to.

The Solution

I found this handy article on downloading an entire website using wget. The final command I used looked something like this:

wget -r --reject mp3,mp4 -e robots=off https://brianchildress.co/

Breaking this down:

wget: A GNU command-line tool for retrieving files over HTTP, HTTPS, and FTP

-r: Recursive; follows links from the starting page and downloads the pages and files it finds (up to five levels deep by default)

--reject: Skips files whose names end in any of the suffixes (or match any of the patterns) in the comma-separated list provided

-e robots=off: Tells wget to ignore the site's robots.txt file. By default wget respects robots.txt and skips anything marked off limits to robots; this option turns that behavior off

<url>: The URL of the site we want to download from
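
To make the pieces above concrete, here is a minimal sketch that assembles the same command from its parts and prints it for review instead of running it. The helper names join_by_comma and build_download_cmd are my own, made up for illustration; only the wget options themselves come from the command above.

```shell
# Hypothetical helper: join all arguments with commas, the format
# wget's --reject option expects (for example: mp3,mp4).
join_by_comma() (
  IFS=,
  printf '%s\n' "$*"
)

# Hypothetical helper: print the wget command for a URL and a list of
# file extensions to skip. It only prints; nothing is downloaded.
build_download_cmd() (
  url="$1"
  shift
  reject="$(join_by_comma "$@")"
  printf 'wget -r --reject %s -e robots=off %s\n' "$reject" "$url"
)

# Print the command so it can be checked before running it by hand.
build_download_cmd "https://brianchildress.co/" mp3 mp4
```

Printing first is a deliberate choice here: a recursive download can pull in a lot of files, so it helps to see exactly what will run before starting it.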

Additional options worth considering

-m: Mirror; turns on recursion with infinite depth and time-stamping, and keeps FTP directory listings.
-p: Page requisites; retrieves the images, stylesheets, and other files needed to display each HTML page.
-k: Convert links; rewrites links in the downloaded HTML so they point to local files.
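
As a rough sketch of how these options might be combined with the earlier command, the wrapper below swaps -r for -m (mirror already turns recursion on) and adds -p and -k so the local copy can be browsed offline. The mirror_site function and the DRY_RUN variable are assumptions of mine, not part of wget; with DRY_RUN=1 the command is printed rather than executed.

```shell
# Hypothetical wrapper: mirror a site for offline browsing.
# DRY_RUN=1 prints the wget command instead of running it.
mirror_site() (
  url="$1"
  # -m: mirror, -p: page requisites, -k: convert links to local files
  set -- wget -m -p -k --reject mp3,mp4 -e robots=off "$url"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
)

# Dry run: shows the command that would be executed.
DRY_RUN=1 mirror_site "https://brianchildress.co/"
```

Dropping DRY_RUN=1 from the last line would start the actual download. wget saves the files into a directory named after the host.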