Zhoujie Li, 04.09.2023 09:18


Link checker

Introduction

The Link Checker script helps you identify and manage broken links and images on a website. It automatically checks the URLs within a specified domain and reports the status of each link and image.
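
To illustrate the kind of check described above, here is a minimal sketch that collects link and image URLs from a page and keeps only the links on the same domain. It is not taken from web_crawler.py: the names LinkCollector and same_domain_urls and the example URLs are hypothetical, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href targets of <a> tags and src targets of <img> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Resolve relative URLs against the page URL so they can be requested later.
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))


def same_domain_urls(urls, start_url):
    """Keep only URLs whose host matches the host of start_url."""
    host = urlparse(start_url).netloc
    return [u for u in urls if urlparse(u).netloc == host]


# Hypothetical page content and URLs, for illustration only.
html = '<a href="/blog/post">Post</a><a href="https://other.example/x">X</a><img src="logo.png">'
collector = LinkCollector("https://www.example.com/")
collector.feed(html)
internal = same_domain_urls(collector.links, "https://www.example.com/")
```

A real checker would then request each collected URL and record its HTTP status. That step is left out here because it depends on how the script handles login and sessions.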

Script destination

Script location: extension/Resources/Public/scripts
Script name: web_crawler.py
Script name: web_crawler_dev.py

Usage

Configuration

Before using the Link Checker script, you need a configuration file.
It contains the following fields:

  • "startUrl": The URL where the link checking will begin.
  • "login_url": The URL of the login page, if login is required. If empty, the "startUrl" is used instead.
  • "username and password": Login credentials. If you have no login credentials, leave them empty. This is important!
  • "max_depth": The maximum depth to crawl links.
  • "target_path": The path to which link checking is restricted (e.g., /blog).
  • "target_string": A unique string to search for.
  • "blacklist": URLs to exclude from checking.
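
The fields above could be put together as in the following sketch. Several details are assumptions and not confirmed by this page: that the file is JSON, that the credentials are stored under separate "username" and "password" keys, that "max_depth" is an integer, and that "blacklist" is a list of URL prefixes. The helper should_check is hypothetical and only illustrates how "max_depth", "target_path" and "blacklist" might restrict the crawl.

```python
import json
from urllib.parse import urlparse

# All values are placeholders. The credential key names and the value
# types are assumptions, not the script's confirmed format.
config = {
    "startUrl": "https://www.example.com/",
    "login_url": "",   # empty: fall back to startUrl
    "username": "",    # empty: no login
    "password": "",
    "max_depth": 3,
    "target_path": "/blog",
    "target_string": "",
    "blacklist": ["https://www.example.com/blog/archive"],
}

# Serialized form, as it might appear in the configuration file.
config_text = json.dumps(config, indent=2)


def should_check(url, depth, cfg):
    """Return True if url is within max_depth, under target_path, and not blacklisted."""
    if depth > cfg["max_depth"]:
        return False
    # Assumed semantics: a blacklist entry excludes every URL that starts with it.
    if any(url.startswith(blocked) for blocked in cfg["blacklist"]):
        return False
    return urlparse(url).path.startswith(cfg["target_path"])
```

With these placeholder values, a URL such as https://www.example.com/blog/post at depth 1 would be checked, while https://www.example.com/about (outside "target_path") and anything under the blacklisted archive path would be skipped.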

Running the Script

You can run the Link Checker script using the following command:
