Link checker » History » Version 9

Zhoujie Li, 30.10.2023 15:46

h1. Link checker

h2. Introduction

The Link Checker script helps you identify and manage broken links and images on a website. It automates checking the URLs within a specified domain and reports the status of each link and image in detail.

h2. Script destination

Script location: extension/Resources/Public/scripts

Script name: web_crawler.py

h2. Usage

h3. Configuration

Before using the Link Checker script, you *need* a configuration file. It looks like this:

!clipboard-202309040834-symyc.png!

* "startUrl": The URL where link checking begins.
* "login_url": The login URL, if a login is required. If it is empty, "startUrl" is used instead.
* "username" and "password": The login credentials. If you have no login credentials, leave these fields *empty*. This is *important!*
* "max_depth": The maximum depth to which links are crawled.
* "target_path": The path to which link checking is restricted (e.g., /blog).
* "target_string": A unique string to look for.
* "blacklist": URLs to exclude from checking.

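As a rough illustration of the fields listed above, the following Python sketch writes a configuration file. Only the key names are taken from the list; every value, the value types (for example a list for "blacklist"), the use of separate "username" and "password" keys, the file name conf_example.json, and the helper write_config are assumptions made for this example. The screenshot above remains the authoritative layout.

```python
import json
from pathlib import Path

# Hypothetical example values. Only the key names come from the field list;
# the value types and the separate username/password keys are assumptions.
config = {
    "startUrl": "https://example.com/",
    "login_url": "",              # empty: "startUrl" is used for the login
    "username": "",               # leave empty when no login is needed
    "password": "",
    "max_depth": 3,
    "target_path": "/blog",
    "target_string": "example-marker",
    "blacklist": ["https://example.com/logout"],
}


def write_config(path, data):
    """Serialize the configuration as pretty-printed JSON and return the path."""
    target = Path(path)
    target.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return target


if __name__ == "__main__":
    print(write_config("conf_example.json", config))
```

Generating the file from a dictionary keeps the JSON syntactically valid and makes it harder to forget the empty credential fields.
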
h3. Ignore CSS class

The script also skips elements that carry the CSS class "link-checker-skip".

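How web_crawler.py implements this is not documented on this page. The sketch below shows one possible way to skip marked elements using only the Python standard library. The class LinkCollector and the function collect_targets are hypothetical names, and the sketch checks the class only on the link or image tag itself, not on its parent elements.

```python
from html.parser import HTMLParser

SKIP_CLASS = "link-checker-skip"


class LinkCollector(HTMLParser):
    """Collect link and image targets, skipping tags marked with SKIP_CLASS."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # The class attribute may hold several space-separated class names.
        classes = (attr.get("class") or "").split()
        if SKIP_CLASS in classes:
            return  # marked element: do not check it
        if tag == "a" and attr.get("href"):
            self.targets.append(attr["href"])
        elif tag == "img" and attr.get("src"):
            self.targets.append(attr["src"])


def collect_targets(html):
    """Return the href/src values of all unmarked links and images."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.targets
```

With this approach, an editor can exclude a single link from checking by adding the class to its tag, without touching the "blacklist" in the configuration file.
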
h3. Running the script

Run the Link Checker script with the following command, passing the configuration file and either all or an index as the last argument:

!clipboard-202309051340-pj0ak.png!

<pre>
python web_crawler.py conj.json "all or <index>"
</pre>

h2. Result/Output

The script generates detailed reports. These reports include:

* Broken links and images with their response codes.
* Denied links with 403 Forbidden errors.
* Redirects to the home page.
* Successfully checked links.

The results are saved in log files (detail.log and summary.log) and in a CSV file containing the broken links.

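The report categories above can be pictured with a small classification sketch. It is an illustration under assumptions, not the logic of web_crawler.py: the function names, the category labels, the rule that every status of 400 or higher other than 403 counts as broken, and the CSV columns "url" and "status" are all hypothetical.

```python
import csv
import io
from urllib.parse import urlsplit


def _normalize(url):
    """Compare URLs by host and path, ignoring case of host and trailing slash."""
    parts = urlsplit(url)
    return (parts.netloc.lower(), parts.path.rstrip("/"))


def classify(status, requested_url, final_url, home_url):
    """Return 'denied', 'broken', 'redirect_home' or 'ok' for one checked URL.

    status is the HTTP status code, or None if no response was received.
    """
    if status == 403:
        return "denied"
    if status is None or status >= 400:
        return "broken"
    landed_on_home = _normalize(final_url) == _normalize(home_url)
    asked_for_home = _normalize(requested_url) == _normalize(home_url)
    if landed_on_home and not asked_for_home:
        return "redirect_home"
    return "ok"


def broken_links_csv(results):
    """Build CSV text from (url, status, category) tuples, keeping broken ones."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["url", "status"])  # hypothetical column names
    for url, status, category in results:
        if category == "broken":
            writer.writerow([url, status])
    return buffer.getvalue()
```

Treating 403 separately from other errors mirrors the list above, where denied links are reported apart from broken ones.
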
h3. Summary log:

*0 errors*

!clipboard-202309051417-p5lf8.png!

*1 or more errors*

!clipboard-202309051415-bmmau.png!

h3. Detail log:

!clipboard-202309051423-xa8x4.png!