I was recently doing some research and found a few sites I really liked. Whether it was the styling, the layout, or something else, I was drawn in. I wanted to dive deeper into how they implemented their solutions, but using the browser's developer tools and poking around in minified files was too manual and time-consuming. I wanted a way to download everything so I could use my code editor and the other tools I was already comfortable with.
wget -r --reject mp3,mp4 -e robots=off https://brianchildress.co/
Breaking this down:
wget: the GNU command-line utility for downloading files over the web
-r: Recursive, downloads all assets and resources recursively
--reject: Skips any file whose extension appears in the comma-separated list provided
-e robots=off: Ignores the robots.txt file, so wget will also download files the site marks as off limits to robots
<url>: The URL of the site we want to download from
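As a counterpart to --reject, wget also supports --accept, which keeps only files matching the listed extensions. A sketch, assuming you only wanted a site's images (the extension list here is illustrative):

```shell
# Recursively download only image files, ignoring robots.txt.
# Adjust the --accept list for the file types you actually want.
wget -r --accept jpg,jpeg,png,gif,svg -e robots=off https://brianchildress.co/
```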
Additional options worth considering
-m: Mirror, enables recursion and time-stamping, maintains FTP directory listings.
-p: Page-requisites, retrieves all images, etc. needed to display HTML page.
-k: Convert-links, alters links in downloaded HTML to point to local files.
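Combining those options gives a command suited to offline browsing. This is a sketch; the --wait pause and the site-mirror output directory are my own additions, not required:

```shell
# Mirror the site for offline browsing:
#   -m enables recursion with time-stamping,
#   -p pulls page requisites (images, CSS, scripts),
#   -k rewrites links in the downloaded HTML to point at local copies.
# --wait=1 pauses one second between requests to be polite to the server,
# and -P puts everything under a site-mirror/ directory.
wget -m -p -k -e robots=off --wait=1 -P site-mirror https://brianchildress.co/
```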