Feature #10715

Replace wkhtmltopdf: puppeteer

Added by Marc Egger about 1 year ago. Updated about 2 months ago.

Status: New
Priority: High
Assignee:
Target version:
Start date: 08.06.2021
Due date:
% Done: 0%
Estimated time:
Discuss:

Description

Priority 4 from Nicola

Variants:

Tools

These probably cannot run any JS, but in return they are very fast as a library or API:


Files

final.pdf (59.4 KB), Carsten Rose, 07.06.2021 15:30
qfq.cookie.8ZORgh (1 KB), Carsten Rose, 10.06.2021 07:52
#1

Updated by Marc Egger about 1 year ago

  • Description updated (diff)
#2

Updated by Carsten Rose about 1 year ago

  • Subject changed from "Replace wkhtmltopdf" to "Replace wkhtmltopdf: puppeteer"
#4

Updated by Carsten Rose about 2 months ago

  • Priority changed from Normal to High
  • Target version changed from 21.8.0 to 21.7.0
#5

Updated by Carsten Rose about 2 months ago

  • Description updated (diff)
#6

Updated by Carsten Rose about 2 months ago

  • Description updated (diff)
#7

Updated by Carsten Rose about 2 months ago

Test

Environment

# Test via npm: the installation worked, but the binary appeared to hang
# WRONG: CR simply did not wait long enough. On the first start, a complete Chrome (>300 MB) is downloaded into the user's home directory!

[root@host] 
$ npm install -g crpdf

# The call hangs
$ crpdf https://math.uzh.ch final.pdf 

# Installation of the prebuilt Linux package from https://github.com/JorgenEvens/crPDF/releases
[root@host] mv /var/tmp/crpdf /usr/local/bin/crpdf-pkg

# The call works for root AND for www-data
[root@host] cd; crpdf-pkg 'http://webwork16.math.uzh.ch/crose/qfq/index.php?id=annotatefabric&form=annotateGraphic&r=1' final.pdf
[www-data@host] cd; crpdf-pkg 'http://webwork16.math.uzh.ch/crose/qfq/index.php?id=annotatefabric&form=annotateGraphic&r=1' final.pdf
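
The calls above could be wrapped in a small script; here is a minimal sketch in Python. It assumes only what the test showed: the binary is called as crpdf-pkg URL OUTPUT. The function names and the 600 second timeout are hypothetical. The generous timeout is meant to avoid mistaking a slow first start for a hang, as happened with the npm variant.

```python
import shlex
import subprocess


def build_crpdf_command(url, output, binary="crpdf-pkg"):
    """Return the argument list for one crpdf call: BINARY URL OUTPUT."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("expected an http(s) URL, got: " + url)
    return [binary, url, output]


def render_pdf(url, output, timeout_s=600):
    """Run crpdf without a shell, so '&' and '?' in the URL need no quoting."""
    cmd = build_crpdf_command(url, output)
    # timeout_s is an assumed value, not a measured one.
    return subprocess.run(cmd, check=True, timeout=timeout_s)


if __name__ == "__main__":
    demo = build_crpdf_command(
        "http://webwork16.math.uzh.ch/crose/qfq/index.php"
        "?id=annotatefabric&form=annotateGraphic&r=1",
        "final.pdf",
    )
    # Print the shell-quoted equivalent of the argument list.
    print(shlex.join(demo))
```

Passing the arguments as a list avoids the single quotes that the manual shell calls need around the URL.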
#8

Updated by Carsten Rose about 2 months ago

#9

Updated by Carsten Rose about 2 months ago

  • Assignee changed from Carsten Rose to Benjamin Baer
  • Start date changed from 04.06.2020 to 08.06.2021

List of wkhtmltopdf options: https://wkhtmltopdf.org/usage/wkhtmltopdf.txt

Goal: at least the options currently in use should work.

The cookies are important. Note: all current cookies (QFQ, FE, PHP, ...) are copied into the --cookie-jar file.

user arguments
------------------

--margin-bottom=20mm
--footer-left="Seite: [page]/[toPage]" 
--footer-right="1234, UZH Doc.Mobility" 
--footer-font-size=8
--footer-spacing=10

QFQ added arguments
-------------------
--cookie-jar '/tmp/qfq.cookie...'
'--custom-header User-Agent ' . escapeshellarg($_SERVER['HTTP_USER_AGENT'] ?? '') 
--custom-header-propagation
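
A rough sketch, in Python, of how the user arguments above might be translated into options for Puppeteer's page.pdf(). The option names margin, displayHeaderFooter and footerTemplate and the template classes pageNumber and totalPages come from the Puppeteer API. Everything else is an assumption: the function names, the footer layout, and the mapping of [toPage] to totalPages, which is only an approximation. --footer-spacing is not handled, since there is no direct equivalent.

```python
import html

# wkhtmltopdf footer placeholders -> Puppeteer template classes.
# Mapping [toPage] to totalPages is an approximation.
PLACEHOLDERS = {
    "[page]": '<span class="pageNumber"></span>',
    "[toPage]": '<span class="totalPages"></span>',
}


def footer_cell(text):
    """Escape the text, then swap wkhtmltopdf placeholders for spans."""
    cell = html.escape(text)
    for token, span in PLACEHOLDERS.items():
        cell = cell.replace(token, span)
    return cell


def to_pdf_options(args):
    """Translate a dict of wkhtmltopdf arguments into page.pdf() options."""
    options = {}
    if "margin-bottom" in args:
        options["margin"] = {"bottom": args["margin-bottom"]}
    left = args.get("footer-left", "")
    right = args.get("footer-right", "")
    if left or right:
        # wkhtmltopdf's default footer font size is 12.
        size = args.get("footer-font-size", "12")
        options["displayHeaderFooter"] = True
        # Hypothetical layout: left and right text at opposite ends.
        options["footerTemplate"] = (
            '<div style="font-size:%spt;width:100%%;display:flex;'
            'justify-content:space-between;">'
            "<span>%s</span><span>%s</span></div>"
            % (size, footer_cell(left), footer_cell(right))
        )
    return options
```

The resulting dict can be serialized as JSON and handed to whatever script calls page.pdf(). Whether crpdf accepts such options is not confirmed here.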
#11

Updated by Benjamin Baer about 2 months ago

So that I can finally tidy up my browser windows, here are a few notes on the current status.

Our crpdf: https://git.math.uzh.ch/utilities/crpdf-custom
Download builds: https://www.math.uzh.ch/repo/?dir=crpdf

About Cookies:
Bug in old Puppeteer: https://github.com/puppeteer/puppeteer/issues/717
Passing cookies to Puppeteer: https://stackoverflow.com/questions/50418994/pass-signed-cookie-to-puppeteer
Multiple Cookies: https://stackoverflow.com/questions/50584770/passing-multiple-cookies-to-puppeteer
Spread Syntax: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Spread_syntax
Set Cookie has no effect: https://github.com/puppeteer/puppeteer/issues/1342

BrowserFetcher problems: newer versions of browserFetcher no longer support the current getStandardVersion.
Puppeteer v10 uses Chromium 92.0.4512.0, see the top of: https://github.com/puppeteer/puppeteer/blob/v10.0.0/docs/api.md
To get the slug needed to download that version (e.g. 885287 for Linux Chrome 92.0.4512.0), go to: http://omahaproxy.appspot.com/
Downloading with 885287 fails with an error saying it was not found.
https://github.com/puppeteer/puppeteer/blob/v10.0.0/docs/api.md#class-browserfetcher

About NPM:
cmd: https://nodejs.org/en/knowledge/command-line/how-to-parse-command-line-arguments/

#13

Updated by Carsten Rose about 2 months ago

  • crpdf (Puppeteer) by Benj: https://git.math.uzh.ch/utilities/crPDF/-/tree/master
  • The JSON must have the following structure (an object with a "cookies" array):
    {"cookies": [
            {
                    "name": "test",
                    "value": "3fh822ji4f244",
                    "domain": "w16.math.uzh.ch",
                    "path": "/my/" 
            },
            {
                    "name": "mondieux",
                    "value": "oh mon dieux",
                    "domain": "w16.math.uzh.ch",
                    "path": "/my/" 
            }
    ]}
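
A minimal sketch, in Python, of producing this structure from plain name/value pairs. The function name build_cookie_payload and its parameters are hypothetical. It assumes all cookies share one domain and path, as in the example above.

```python
import json


def build_cookie_payload(cookies, domain, path="/"):
    """Wrap name/value pairs in the {"cookies": [...]} structure."""
    entries = [
        {"name": name, "value": value, "domain": domain, "path": path}
        for name, value in cookies.items()
    ]
    # ensure_ascii=False keeps non-ASCII cookie values readable.
    return json.dumps({"cookies": entries}, indent=4, ensure_ascii=False)


if __name__ == "__main__":
    print(build_cookie_payload(
        {"test": "3fh822ji4f244", "mondieux": "oh mon dieux"},
        domain="w16.math.uzh.ch",
        path="/my/",
    ))
```

Since the --cookie-jar file contains all current cookies (QFQ, FE, PHP, ...), the same set would probably have to be passed here.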
    
