Downloading a Password-Protected Website for Offline Access
Recently, I needed to download a website with content for offline access.
This seems like a fairly trivial task, and it is for most simple websites.
However, if you need to download a website whose content is hidden behind authentication, it’s a different story.
There are various authentication methods, which complicates the task. For simple methods, such as HTTP Basic authentication, most tools have options to provide the login and password.
For example:
wget --user=username --password=password -r -np -k -p -H https://example.com/
In my case, authentication occurs on the frontend, and the token is stored in cookies. wget does not run the site’s JavaScript, so it cannot go through that login flow itself, and simply providing the login and password won’t work.
In this situation, the easiest solution is to pass prepared cookies containing the appropriate tokens. To do this, log in to the website in your browser, then export its cookies with the Get cookies.txt extension: https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid
Click export, and the corresponding file will be saved to your disk.
Next, you can specify this file in the wget command using the parameter:
--load-cookies…
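Put together, the whole download might look roughly like the sketch below. The file name cookies.txt, the URL, and the mirror_site helper are placeholders I made up for illustration; the wget flags are the same ones used in the earlier example, plus --load-cookies.

```shell
# Placeholder values: replace them with your exported file and the real site.
COOKIES_FILE="${COOKIES_FILE:-cookies.txt}"
SITE_URL="${SITE_URL:-https://example.com/}"

# mirror_site is a hypothetical helper name, not part of wget.
mirror_site() {
  # Stop early if the exported cookie file is missing or empty.
  if [ ! -s "$COOKIES_FILE" ]; then
    echo "cookie file not found or empty: $COOKIES_FILE" >&2
    return 1
  fi
  # --load-cookies  send the cookies from the exported file with each request
  # -r              download recursively
  # -np             never ascend to the parent directory
  # -k              convert links so the pages work offline
  # -p              fetch images, CSS and other page requisites
  # -H              also follow links to other hosts
  wget --load-cookies "$COOKIES_FILE" -r -np -k -p -H "$SITE_URL"
}
```

After exporting the cookies, running mirror_site in the same shell starts the download. If the token in the cookie expires, the pages will come back as login screens again, so it is worth exporting a fresh file right before the run.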