Monday, November 2, 2009

Saving websites

I frequently save websites I come across for reference. Fortunately, there are several ways to do this, depending on your needs. If you need only the text of a page, both Firefox and Internet Explorer let you choose File/Save (Page) As with the option "Web page, HTML only". Both browsers can also save the whole page, including graphics, as a single web archive file: in Firefox, choose File/Save Page As/Web archive MHTML (this option may depend on an installed add-on), and in IE choose File/Save As/Web archive, single file (*.mht). Alternatively, you can save the complete web page, which downloads the graphics and other elements such as Cascading Style Sheets to a separate folder alongside the HTML file: choose Save As/Web page complete in both browsers.
A further option is to download an entire site. Depending on the site, you could be setting yourself up for a VERY big download, but if you really want to dig into the depths of a site, it will yield a lot of useful data. For this you need a separate piece of software, for instance HTTrack, a free program that saves all the pages within a domain, including images, while maintaining the link structure. It does not save the pages behind external links, such as those in advertising. I guess if I were an industrial spy, a tool like this would be a must-have in my armoury...
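
For the curious, the "stay within the domain" rule can be sketched in a few lines of Python. This is only an illustration of the idea, not how HTTrack itself is implemented; the names LinkCollector and same_domain_links are made up for this example, and the sketch only decides which links a mirroring tool would follow, without downloading anything.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_domain_links(page_url, html):
    """Return absolute links found in html that stay on page_url's host.

    Assumption for this sketch: "same domain" means an identical host name.
    """
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    kept = []
    for href in collector.hrefs:
        # Resolve relative links against the page they were found on.
        absolute = urljoin(page_url, href)
        # Drop external links (for example advertising) and duplicates.
        if urlparse(absolute).netloc == host and absolute not in kept:
            kept.append(absolute)
    return kept
```

Given a page at http://www.example.com/index.html that links to /about.html and to an advert on another host, same_domain_links would keep only the first link. A real mirroring tool repeats this step for every page it fetches, which is why a large site can turn into a very large download.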
