HTTrack Advanced Configurations on Ubuntu 20.04 LTS

HTTrack is a website copier (an offline browser) that downloads a site from the web into a local directory. This is very useful for web developers who want to keep a clean offline copy of their web applications, and it also helps them track down front-end problems.

Here at LinuxAPT, as part of our Server Management Services, we regularly help our customers with software configuration queries on Linux systems.

In a previous post, we explained how to install HTTrack on Ubuntu 20.04. In this context, we shall look into its configuration in more detail.


How to install HTTrack on Ubuntu?

If you haven't installed HTTrack yet, open a terminal and run the following command:

$ sudo apt install httrack webhttrack
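
To confirm that the installation worked, you can check that both programs are on your PATH. The following is a minimal sketch; the function name check_httrack_tools is only an illustration and is not part of HTTrack.

```shell
# Hypothetical helper: report whether the two programs installed above
# can be found on the PATH.
check_httrack_tools() {
    for tool in httrack webhttrack; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not installed"
        fi
    done
}

check_httrack_tools
```

If either line reports "not installed", run the apt command again and check its output for errors.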

On Linux, the graphical interface to HTTrack is WebHTTrack, which runs in a web browser. The httrack package also provides a command-line tool. The standalone desktop interface, WinHTTrack, is a Windows program, so it is not an option for us.


How to run HTTrack?

Once it is installed, start the web interface from the command line:

$ webhttrack

When WebHTTrack starts, it opens the "Welcome to WebHTTrack" page (server-ip:8080/server/index.html), where you can configure it further.


How to configure HTTrack on Ubuntu?

1. Select a language

HTTrack first prompts you to select a language. If English is already selected and that is what you want, leave it as it is. Otherwise, select the appropriate language and move ahead.


2. Enter project details

Here, we add the project details, such as the project name and the path where the mirror will be stored. In this example, the project data comes from Linuxapt.com.
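
As a rough illustration of how the project details fit together, the sketch below prepares a project directory by joining a base path and a project name. It assumes that the mirror ends up in a subdirectory of the base path named after the project; the function name and the example values are hypothetical.

```shell
# Hypothetical helper: create and print the directory for a project.
# Assumption: the mirror is stored under <base path>/<project name>.
make_project_dir() {
    base_path=$1
    project_name=$2
    project_dir="$base_path/$project_name"
    mkdir -p "$project_dir" || return 1
    echo "$project_dir"
}
```

For example, make_project_dir "$HOME/websites" linuxapt would create and print a linuxapt directory under websites in your home directory.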


3. Select Action and Add URLs

Here, we select an action from the list and add URLs. Which action to choose depends on what we want to achieve. The actions differ as follows:

  • Download web site(s): Copies the sites you list, using the default options, so that you can browse them locally.
  • Download web site(s) + questions: Does the same as the previous action, but asks you what to do whenever it finds a link that may or may not be worth downloading.
  • Get individual files: Downloads only the files you list (for example, a .css or .html file) without following the links inside HTML pages.
  • Download all sites in pages (multiple mirrors): Mirrors every site that is linked from the pages you list, for example all the sites in a bookmark file.
  • Test links in pages (bookmark test): Tests the links found in the pages you list, which is useful for checking the links on a particular page.
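
The same actions are also available from the httrack command-line tool. The sketch below prints, without running it, the command that roughly corresponds to each action in the list above. The mapping to the -w, -W, -g, -Y and --testlinks options is our reading of the httrack help text, so treat it as an assumption and verify it with httrack --help on your version. The function name and the action keywords are hypothetical.

```shell
# Hypothetical helper: print (do not run) an httrack command for an action.
# Option mapping is an assumption based on `httrack --help`:
#   -w  mirror web sites            -W  mirror, but ask questions
#   -g  just get the listed files   -Y  mirror all links in first-level pages
#   --testlinks  test links in pages
httrack_action_cmd() {
    action=$1
    url=$2
    dest=$3
    case "$action" in
        mirror)       opt="-w" ;;
        mirror-ask)   opt="-W" ;;
        get-files)    opt="-g" ;;
        mirror-links) opt="-Y" ;;
        test-links)   opt="--testlinks" ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    # -O sets the directory for the mirror, log files and cache.
    echo "httrack $url -O $dest $opt"
}

httrack_action_cmd get-files http://linuxapt.com "$HOME/websites/linuxapt"
```

Printing the command first lets you review it before you paste it into a terminal and start a download.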


4. Testing the HTTrack configuration

For this test, we select 'Get individual files' and enter http://linuxapt.com as the URL.


5. Enter URL

Here, we enter the URL and, if the site requires a login, the credentials for it.


6. Add Settings

Click OK to add settings and set any options as required.


7. Get Ready to Mirror website

We are now ready to mirror the selected website. For this test case, however, we will save the settings and exit.
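
A mirror that has already been downloaded can later be refreshed from the command line. The sketch below assumes that running httrack --update from inside the project directory updates that mirror without asking for confirmation; check httrack --help before relying on it. The function name and the DRY_RUN variable are hypothetical.

```shell
# Hypothetical helper: refresh an existing mirror in place.
# Set DRY_RUN=1 to print the command instead of running it.
update_mirror() {
    project_dir=$1
    if [ ! -d "$project_dir" ]; then
        echo "no such project directory: $project_dir" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "cd $project_dir && httrack --update"
    else
        # Run in a subshell so the caller's working directory is unchanged.
        (cd "$project_dir" && httrack --update)
    fi
}
```

For example, DRY_RUN=1 update_mirror "$HOME/websites/linuxapt" only prints the command, which is a safe way to check the path first.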



This article covers the main WebHTTrack setup steps on Ubuntu 20.04. WebHTTrack backs up complete websites for offline access and modifies the links automatically so that the copy can be browsed locally. Despite ubiquitous Internet access, users often have good reason to create offline copies of websites, whether for archiving or to provide the content on an intranet. Manual mirroring is time-consuming and cumbersome; tools like WebHTTrack automate it and make it convenient to update the content later. You are now ready to mirror a website using HTTrack on Ubuntu 20.04.
