Revision 3 (Zhoujie Li, 04.09.2023 08:45) → Revision 4/11 (Zhoujie Li, 04.09.2023 09:18)

h1. Link checker 

 h2. Introduction 

 The Link Checker script helps you identify and manage broken links and images on a website. It automates checking the URLs within a specified domain and produces a detailed report on the status of each link and image. 


 h2. Script location 

 Script location: extension/Resources/Public/scripts 
 Script name: web_crawler.py 
 Script name: web_crawler_dev.py 

 h2. Usage 

 h3. Configuration 

 Before using the Link Checker script, you *need* a configuration file. 
 It looks like this: 
 !clipboard-202309040834-symyc.png! 

 * "startUrl": The URL where link checking begins. 
 * "login_url": The URL of the login page, if a login is required. If empty, "startUrl" is used instead. 
 * "username and password": Login credentials. *Important:* if you have no login credentials, leave these fields *empty*. 
 * "max_depth": The maximum link depth to crawl. 
 * "target_path": A path that restricts link checking (e.g., /blog). 
 * "target_string": A unique string to look for. 
 * "blacklist": URLs to exclude from checking. 
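
The options above can be illustrated with a minimal Python sketch. It is not taken from web_crawler.py. It assumes the configuration is stored as JSON, that the credentials use the separate keys "username" and "password", and that blacklist entries are matched as URL prefixes. The helper names @load_config@, @needs_login@ and @should_check@ and the default values are hypothetical.

```python
import json
from urllib.parse import urlparse

# Keys follow the option list above. The JSON format, the separate
# "username"/"password" keys and all default values are assumptions.
DEFAULTS = {
    "startUrl": "",
    "login_url": "",
    "username": "",
    "password": "",
    "max_depth": 2,
    "target_path": "",
    "target_string": "",
    "blacklist": [],
}


def load_config(text):
    """Parse JSON config text and apply the fallbacks described above."""
    config = {**DEFAULTS, **json.loads(text)}
    if not config["startUrl"]:
        raise ValueError("startUrl is required")
    # An empty login_url falls back to startUrl.
    if not config["login_url"]:
        config["login_url"] = config["startUrl"]
    return config


def needs_login(config):
    """Empty credentials mean that no login is attempted."""
    return bool(config["username"] and config["password"])


def should_check(url, depth, config):
    """Hypothetical filter combining max_depth, blacklist and target_path."""
    if depth > config["max_depth"]:
        return False
    # Assumption: a blacklist entry excludes every URL that starts with it.
    if any(url.startswith(entry) for entry in config["blacklist"]):
        return False
    # An empty target_path places no restriction on the URL path.
    return urlparse(url).path.startswith(config["target_path"])
```

In this sketch, a configuration with "target_path" set to /blog and "max_depth" set to 1 would skip a link such as https://example.org/about, as well as any /blog link found deeper than one level below the start URL. The real script may apply these options differently.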

 h3. Running the Script 

 You can run the Link Checker script using the following command: