Revision 3 (Zhoujie Li, 04.09.2023 08:45) → Revision 4/11 (Zhoujie Li, 04.09.2023 09:18)
h1. Link checker

h2. Introduction

The Link Checker script helps you identify and manage broken links and images on a website. It automates checking the URLs within a specified domain and reports the status of each link and image.

h2. Script

* Link destination (script location): extension/Resources/Public/scripts
* Script name: web_crawler.py
* Script name: web_crawler_dev.py

h2. Usage

h3. Configuration

Before using the Link Checker script, you *need* a configuration file. It looks like this:

!clipboard-202309040834-symyc.png!

* "startUrl": The URL where link checking begins.
* "login_url": The URL used for logging in, if a login is required. If empty, "startUrl" is used instead.
* "username" and "password": Login credentials. If you don't have login credentials, leave these fields *empty*; this is *important!*
* "max_depth": The maximum depth to which links are crawled.
* "target_path": The path to which link checking is restricted (e.g., /blog).
* "target_string": A unique string to look for.
* "blacklist": URLs to exclude from checking.

h3. Running the Script

You can run the Link Checker script using the following command: