Downloading an Entire Web Site with wget or HTTrack
If you ever need to download an entire Web site, perhaps for off-line viewing, wget can do the job—for example:
$ wget \
--recursive \
--no-clobber \
--page-requisites \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--domains website.org \
--no-parent \
http://www.website.org/tutorials/html/
This command downloads the Web site www.website.org/tutorials/html/.
The options are:
- --recursive: download the entire Web site.
- --domains website.org: don't follow links outside website.org.
- --no-parent: don't follow links outside the directory tutorials/html/.
- --page-requisites: get all the elements that compose the page (images, CSS and so on).
- --html-extension: save files with the .html extension.
- --convert-links: convert links so that they work locally, off-line.
- --restrict-file-names=windows: modify filenames so that they will also work in Windows.
- --no-clobber: don't overwrite any existing files (useful if the download is interrupted and resumed).
Source: http://bit.ly/14dEluH
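For reference, the same command can also be written with wget's single-letter flags (every long option above except --restrict-file-names has a short equivalent); it should behave the same way, but check wget --help on your version to be sure:
$ wget -r -nc -p -E -k --restrict-file-names=windows \
-D website.org -np \
http://www.website.org/tutorials/html/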
Is that all? No! There are a number of other ways and options available that you can use.
Download a single page together with everything needed to display it, converting the links for local viewing:
wget -p -k http://www.example.com/
Mirror a path but keep only the PDF and JPG files:
wget -A pdf,jpg -m -p -E -k -K -np http://site/path/
Mirror a path with page requisites, .html extensions and converted links (keeping backups of the originals), without ascending to the parent directory:
wget -m -p -E -k -K -np http://site/path/
Crawl with random pauses between requests, ignore robots.txt and identify as Mozilla:
wget --no-clobber --convert-links --random-wait -r -p -E -e robots=off -U mozilla http://site/path/
Mirror a site using the filenames suggested by Content-Disposition headers, again with a Mozilla user agent:
wget --user-agent=Mozilla --content-disposition --mirror --convert-links -E -K -p http://example.com/
But is that all? Are those any better? No!
The best solution I came across is HTTrack: it's fast, and it organizes the downloaded content with working local URLs, so it's the best tool for scraping and cloning an online website to use offline.
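As a rough sketch (the URL, output directory and filter below are placeholders, so adjust them to your site), a basic HTTrack mirror from the command line looks something like this:
$ httrack "http://www.example.com/" -O ./example-mirror "+*.example.com/*" -v
Here -O sets the local output directory, the "+..." filter keeps the crawl inside example.com, and -v shows progress while it runs. If you prefer a guided setup, the WinHTTrack/WebHTTrack GUI walks you through the same options.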