A simple Node.js web scraper using website-scraper to download an entire website.
Make sure you have Node.js installed on your machine.
- Clone the repository:
  `git clone https://github.com/Bahrul-Rozak/url-to-code.git`
- Navigate to the project directory:
  `cd url-to-code`
- Install dependencies:
  `npm install`
- Open `index.mjs` in your preferred code editor.
- Set the `websiteUrl` variable to the URL of the website you want to scrape:
  `const websiteUrl = 'https://example.com';`
- Customize other options if needed (e.g., `maxDepth`, `directory`).
- Run the scraper:
  `node index.mjs`
- Check the `./result` directory for the downloaded website.
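
Before running the scraper, it can help to confirm that the value assigned to `websiteUrl` is a well-formed HTTP(S) URL. The sketch below is only an illustration: `assertHttpUrl` is a hypothetical helper that is not part of this repository or of website-scraper, and it relies solely on the `URL` class built into Node.js.

```javascript
// Hypothetical helper (not part of this project): validates a scrape target.
function assertHttpUrl(value) {
  // The URL constructor throws a TypeError when the input cannot be parsed.
  const parsed = new URL(value);
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  // Return the normalized form of the URL.
  return parsed.href;
}

const websiteUrl = assertHttpUrl('https://example.com');
console.log(websiteUrl); // https://example.com/
```

Note that the normalized form gains a trailing slash for a bare origin, which matters if the same string is later used for prefix matching.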
- `urls`: An array of URLs to scrape.
- `urlFilter`: A function to filter URLs. The example keeps only URLs that start with the specified `websiteUrl`.
- `recursive`: If `true`, the scraper follows links recursively.
- `maxDepth`: Maximum recursion depth.
- `prettifyUrls`: If `true`, URLs are prettified (the default filename, such as `index.html`, is dropped from links).
- `filenameGenerator`: File naming strategy, set to `'bySiteStructure'` in the example.
- `directory`: Output directory for the downloaded website.
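
As a rough illustration of how these options fit together, the sketch below assembles them into one object. `buildOptions` is a hypothetical helper, and the values chosen for `recursive`, `maxDepth`, and `prettifyUrls` are placeholders rather than the settings in this repository's script.

```javascript
// Hypothetical helper: collects the options described above in one object.
function buildOptions(websiteUrl, directory = './result') {
  return {
    urls: [websiteUrl],
    // Keep only URLs that begin with the target site's URL.
    urlFilter: (url) => url.startsWith(websiteUrl),
    recursive: true,        // placeholder: follow links found on each page
    maxDepth: 3,            // placeholder: the real script may use another value
    prettifyUrls: true,     // placeholder
    filenameGenerator: 'bySiteStructure',
    directory,
  };
}

const options = buildOptions('https://example.com');
console.log(options.urlFilter('https://example.com/about'));  // true
console.log(options.urlFilter('https://other.example.org/')); // false
```

Assuming a recent, ESM-only release of website-scraper, an object like this would be passed to its default export, for example `import scrape from 'website-scraper'` followed by `await scrape(options)`. The library's documentation states that the output directory should not already exist, so a previous `./result` folder may need to be removed before a new run. A plain prefix check also accepts hosts such as `https://example.com.other.test`, so comparing `new URL(url).origin` would be a stricter alternative.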
- Thanks to website-scraper for providing an easy-to-use web scraping library.
Happy downloading! 🕸️