Link checker » History » Version 4

Zhoujie Li, 04.09.2023 09:18

h1. Link checker

h2. Introduction

The Link Checker script helps you identify and manage broken links and images on a website. It automates the process of checking URLs within a specified domain and provides detailed reports on the status of each link and image.

h2. Script destination

Script location: extension/Resources/Public/scripts
Script names: web_crawler.py and web_crawler_dev.py

h2. Usage

h3. Configuration

Before using the Link Checker script, you *need* a configuration file.
It looks like this:
!clipboard-202309040834-symyc.png!

* "startUrl": The URL where link checking begins.
* "login_url": The URL of the login page, if a login is required. If empty, the "startUrl" is used instead.
* "username" and "password": The login credentials. If you don't have login credentials, leave these fields *empty*. This is *important!*
* "max_depth": The maximum depth to which links are crawled.
* "target_path": The path to which link checking is restricted (e.g., /blog).
* "target_string": A unique string to look for.
* "blacklist": URLs to exclude from checking.
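
Because the configuration is only shown as a screenshot, the following is a minimal sketch of what such a file might contain and how it could be sanity-checked before a run. It rests on several assumptions that the page does not confirm: that the file is JSON, that "username" and "password" are separate keys, that "max_depth" is an integer and "blacklist" a list, and that all keys must be present. The helper @check_config@ and all example values are hypothetical and are not part of web_crawler.py.

```python
import json

# Hypothetical example values. The key names come from the field list above;
# the JSON layout and the value types are assumptions.
EXAMPLE_CONFIG = {
    "startUrl": "https://www.example.com/",
    "login_url": "",        # empty: "startUrl" is used instead
    "username": "",         # leave empty when no login is needed
    "password": "",
    "max_depth": 3,
    "target_path": "/blog",
    "target_string": "",
    "blacklist": ["https://www.example.com/logout"],
}

# Assumption: every documented key is expected to be present.
EXPECTED_KEYS = set(EXAMPLE_CONFIG)


def check_config(config):
    """Return a list of problems found in a configuration dictionary."""
    problems = [f"missing key: {key}" for key in sorted(EXPECTED_KEYS - set(config))]
    if "startUrl" in config and not config["startUrl"]:
        problems.append("startUrl must not be empty")
    if "max_depth" in config:
        depth = config["max_depth"]
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if not isinstance(depth, int) or isinstance(depth, bool) or depth < 0:
            problems.append("max_depth must be a non-negative integer")
    if "blacklist" in config and not isinstance(config["blacklist"], list):
        problems.append("blacklist must be a list of URLs")
    return problems


if __name__ == "__main__":
    # Print the example as JSON text, then the result of the check.
    print(json.dumps(EXAMPLE_CONFIG, indent=4))
    print(check_config(EXAMPLE_CONFIG) or "no problems found")
```

Running a check like this before starting a crawl catches a missing key or a wrongly typed value early. Compare the key names and value types against the screenshot above before relying on it.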

h3. Running the Script

You can run the Link Checker script using the following command: