Feature #17670

open

WeasyPrint (instead of wkhtml) for PDF with header/footer/toc/...

Added by Carsten Rose 3 months ago. Updated about 1 month ago.

Status: New
Priority: Normal
Assignee: -
Target version:
Start date: 21.01.2024
Due date:
% Done: 0%
Estimated time:
Discuss:
Priority planning: No
Vote:

Description

At https://wkhtmltopdf.org/status.html, wkhtmltopdf strongly warns that using it on untrusted HTML content can lead to the server being taken over.

That is not the case for us, but WeasyPrint should still be worth a look.

The main problem with wkhtmltopdf is that as soon as a file (js, css, png, jpg, ...) cannot be loaded, it aborts immediately and gives no hint about the cause.

Actions #1

Updated by Enis Nuredini 3 months ago

Research and summary of the key points:

WeasyPrint is based on Python, so integrating it into Python projects would be very easy if desired.

Installation requirements:
• Python 3.7.0 or newer (check with python3 --version)
• Pango 1.44 or newer (check with pango-view --version)

apt install python3-pip python3-cffi python3-brotli libpango-1.0-0 libharfbuzz0b libpangoft2-1.0-0
Simple command:
weasyprint https://weasyprint.org /tmp/weasyprint-website.pdf

weasyprint https://weasyprint.org /tmp/weasyprint-website.pdf \
    -s <(echo 'body { font-family: serif !important }')
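
If the command line is kept, the call could be wrapped from Python so that a failure reports whatever WeasyPrint wrote to stderr instead of only an exit code. A minimal sketch: the function names `build_weasyprint_command` and `run_weasyprint` are made up for illustration, the `weasyprint` executable is assumed to be on the PATH, and only the `-s` stylesheet option from the example above is used.

```python
import subprocess
from typing import List, Sequence


def build_weasyprint_command(source: str, target: str,
                             stylesheets: Sequence[str] = ()) -> List[str]:
    """Build the argument list for the weasyprint command-line tool."""
    command = ["weasyprint", source, target]
    for stylesheet in stylesheets:
        # -s adds a user stylesheet (file name or URL), as in the example above.
        command += ["-s", stylesheet]
    return command


def run_weasyprint(source: str, target: str,
                   stylesheets: Sequence[str] = ()) -> str:
    """Run weasyprint once and return its stderr output (warnings, if any).

    Raises RuntimeError containing the captured stderr when the exit code
    is non-zero, so the caller sees the reported cause.
    """
    result = subprocess.run(
        build_weasyprint_command(source, target, stylesheets),
        capture_output=True,  # requires Python 3.7+
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"weasyprint failed ({result.returncode}):\n{result.stderr}"
        )
    return result.stderr


# Hypothetical usage:
# warnings = run_weasyprint("https://weasyprint.org",
#                           "/tmp/weasyprint-website.pdf")
```

Passing the arguments as a list avoids shell quoting issues; the `<(echo ...)` process substitution from the shell example would need a real stylesheet file here.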

Command-line API: https://doc.courtbouillon.org/weasyprint/stable/api_reference.html#command-line-api

Key differences:

  • WeasyPrint focuses on accurate rendering of HTML/CSS content rather than on a large feature set.
  • Better error reporting: Because WeasyPrint is written in Python and focuses on CSS compliance, errors often come with clear descriptions of what went wrong, such as unsupported CSS properties, missing resources, or layout issues.
  • No JS support: WeasyPrint has excellent CSS support, often better than wkhtmltopdf in terms of CSS compliance.
    However, it does not support JavaScript, so any dynamic content in the HTML that relies on JavaScript will not be rendered correctly.
  • Slower performance: WeasyPrint might be slower in comparison, especially with complex layouts or very long documents, due to its Python-based rendering engine.
  • Some options need more CSS knowledge and cannot be controlled as precisely: To add headers and footers in WeasyPrint, you would typically use CSS page rules (@page). This involves specifying the content and style within your HTML/CSS. For example, page-margin boxes such as @top-center and @bottom-center inside an @page rule can carry the header and footer content.
  • Fewer command-line options: WeasyPrint has fewer command-line options and relies more on CSS for layout and formatting. While it supports crucial features like page size and margins via CSS, some of the fine-grained control provided by wkhtmltopdf command-line options might not be available.
  • No authentication or other web-related options: Features like cookies, CORS, custom HTTP headers, or disabling/enabling web security, which are available as wkhtmltopdf options, have no equivalent WeasyPrint command-line options. These are relevant when converting web pages that require authentication or specific HTTP interactions.
  • One HTML document at a time: wkhtmltopdf can convert multiple HTML files into a single PDF and control the cover, table of contents, and pages separately. WeasyPrint focuses more on converting a single HTML document at a time.
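
The header/footer point in the list above could look roughly like this with an @page rule and page-margin boxes. This is only a sketch under assumptions: `build_page_css` and `render_with_header_footer` are hypothetical helper names, the header text is assumed to be a single line, and the page size and margin defaults are arbitrary.

```python
def css_string(text: str) -> str:
    """Quote text as a CSS string literal (single-line text assumed)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_page_css(header_text: str, footer_prefix: str = "Page",
                   page_size: str = "A4", margin: str = "2cm") -> str:
    """Return an @page rule with a centered header and a page-number footer."""
    header = css_string(header_text)
    footer = css_string(footer_prefix + " ")
    return (
        "@page {\n"
        f"  size: {page_size};\n"
        f"  margin: {margin};\n"
        # Margin boxes are filled through the 'content' property.
        f"  @top-center {{ content: {header}; }}\n"
        # counter(page) / counter(pages) give current and total page numbers.
        f"  @bottom-center {{ content: {footer} counter(page) \" / \" counter(pages); }}\n"
        "}\n"
    )


def render_with_header_footer(source_url: str, target_path: str,
                              header_text: str) -> None:
    """Render one document with the generated page CSS."""
    # Imported here so the CSS helper is usable without WeasyPrint installed.
    from weasyprint import CSS, HTML

    HTML(source_url).write_pdf(
        target_path,
        stylesheets=[CSS(string=build_page_css(header_text))],
    )
```

The same CSS could also live in a static stylesheet passed with `-s` on the command line; generating it in Python is only useful when the header text changes per document.
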
Actions #2

Updated by Enis Nuredini 3 months ago

To avoid starting WeasyPrint anew for every document, a Python API can be used instead of the command line: https://doc.courtbouillon.org/weasyprint/stable/api_reference.html#python-api

from weasyprint import HTML, CSS
HTML('https://weasyprint.org/').write_pdf('/tmp/weasyprint-website.pdf',
    stylesheets=[CSS(string='body { font-family: serif !important }')])
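
Building on the call above, several documents could be rendered in one Python process while collecting what WeasyPrint logs for each of them (WeasyPrint reports problems through the standard `logging` logger named `weasyprint`). A sketch only: `ListHandler` and `render_batch` are hypothetical names, and reusing one `CSS` object across documents is an assumption of this example.

```python
import logging
from typing import Dict, Iterable, List, Tuple


class ListHandler(logging.Handler):
    """Collect log messages of level WARNING and above in a list."""

    def __init__(self) -> None:
        super().__init__(level=logging.WARNING)
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


def render_batch(jobs: Iterable[Tuple[str, str]],
                 css_text: str = "") -> Dict[str, List[str]]:
    """Render (source, target) pairs; return logged messages per target."""
    # Imported once per process, which is the point of using the Python API.
    from weasyprint import CSS, HTML

    logger = logging.getLogger("weasyprint")
    stylesheets = [CSS(string=css_text)] if css_text else []
    report: Dict[str, List[str]] = {}
    for source, target in jobs:
        handler = ListHandler()
        logger.addHandler(handler)
        try:
            HTML(source).write_pdf(target, stylesheets=stylesheets)
        finally:
            # Always detach, so messages do not leak into the next document.
            logger.removeHandler(handler)
        report[target] = handler.messages
    return report


# Hypothetical usage:
# report = render_batch(
#     [("https://weasyprint.org/", "/tmp/weasyprint-website.pdf")],
#     css_text="body { font-family: serif !important }",
# )
```

An empty message list for a target then means WeasyPrint logged no warnings or errors for that document.
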
Actions #3

Updated by Carsten Rose about 1 month ago

  • Tracker changed from Support to Feature