This is a simple command-line program implemented in Go that fetches web pages and saves them to disk for later retrieval and browsing.
Make sure you have Docker installed on your machine.
Clone the repository:
git clone https://github.com/karthick2696/web-parser.git
Change directory (cd) into the repository:
cd web-parser
Build the Docker image:
docker build -t web-parser .
To fetch one or more web pages, run the following command:
docker run web-parser https://www.google.com https://autify.com
To also record metadata, pass the --metadata flag:
docker run web-parser --metadata https://www.google.com https://autify.com
Output:
Because the program runs inside Docker, the HTML files are written to the container's filesystem, not your host. To view them, copy the files from the container to your local machine.
Get the container ID of the web-parser container using the command below:
docker ps -a
Output:
Copy the container ID.
- Copy the HTML files out of the container using the command below
- Replace container-id with the value you copied in the first step
docker cp container-id:/app/. ./output
- The files will be copied to the output folder of your local repository
- Navigate to the output folder to view the HTML files using the commands below
cd output
ls
Output:
See the attached image for a reference covering all three steps.
Every docker run creates a new container with a new container ID, so always take the ID of the most recent container when copying files. Otherwise you will copy stale files from an older container.
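If you prefer not to copy the ID by hand, the two steps above can be combined into a single command. This assumes the most recently created container is your latest web-parser run: the docker ps flags -l (latest container) and -q (print only the ID) select it automatically.

```shell
# Copy the HTML files from the most recently created container
# (-l = latest container, -q = print only its ID)
docker cp "$(docker ps -lq)":/app/. ./output
```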