A simple dockerised Python application that takes a list of page URLs, extracts the links from each page, and checks every link's HTTP status code by sending a HEAD request to the host.
It requires Docker Engine to be installed.
The “path-to-data-folder” is the path to a folder on your host that must contain a file named urls-to-analyze.txt. The file must list one page URL per line.
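As an illustration of the expected file format, the sketch below reads such a list in Python. It is not the tool's own code: the function name read_page_urls is hypothetical, and skipping blank lines is an assumption about how stray empty lines are handled.

```python
from pathlib import Path


def read_page_urls(path):
    """Return the page URLs listed in the file, one per line.

    Assumption: blank lines and surrounding whitespace are ignored.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

Calling read_page_urls("path-to-data-folder/urls-to-analyze.txt") would return the page URLs as a list of strings.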
The tool scans the HTML of each page and extracts the links it contains.
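The extraction step can be sketched with Python's standard html.parser module. This is only a minimal illustration: which parser the tool actually uses is not stated here, and the names LinkExtractor and extract_links are hypothetical. The sketch assumes that only the href attributes of anchor tags count as links and that relative links are resolved against the page URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all anchor links found in the given HTML text."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```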
If the optional environment variable EXCLUDE_LINKS is set, URLs containing that string are skipped and not checked. This is useful if you want to extract and check only the external links on your site; in that case, pass EXCLUDE_LINKS=yourdomain.com.
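The exclusion rule amounts to a substring test on each URL. The following sketch shows that behaviour under the assumption that the match is a plain, case-sensitive substring check; the function name filter_links is hypothetical and not taken from the tool.

```python
import os


def filter_links(links, exclude=None):
    """Drop every link that contains the exclude string.

    With no exclude string (None or empty), all links are kept.
    """
    if not exclude:
        return list(links)
    return [link for link in links if exclude not in link]


# The value would normally come from the container's environment.
exclude_value = os.environ.get("EXCLUDE_LINKS")
```

With EXCLUDE_LINKS=yourdomain.com, a link such as https://yourdomain.com/about is skipped, while https://other.example/x is still checked.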
At the end, the tool reports the status code of each link (200, 301, 404, etc.).
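A status check of this kind can be sketched with the standard urllib module. This is an illustration rather than the tool's implementation: the names head_status and NoRedirect are hypothetical, and the sketch assumes that redirects are not followed, so that a 301 is reported as 301 instead of the status of the redirect target.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx codes are reported."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def head_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status code.

    Returns None when the host cannot be reached at all.
    """
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 3xx, 4xx and 5xx responses arrive here; the code is still useful.
        return error.code
    except urllib.error.URLError:
        return None
```

A HEAD request returns only the response headers, so the status of a link can be read without downloading the page body.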