Fix font handling #5

Open
PylotLight opened this issue Jul 5, 2024 · 17 comments

@PylotLight

I never had to deal with fonts in Python, so I'm not sure why I'm being forced to define all this stuff I don't want to deal with here.

fs, err := fc.LoadFontsetFile(fontmapCache)
fontconfig := text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))

All I want here is a simple HTML-to-PDF conversion, from a string to a file.
err := pdf.HtmlToPdf(os.Stdout, utils.InputString(html), fs)

@benoitkugler
Owner

Your snippet is almost correct: just pass fontconfig instead of fs to HtmlToPdf.

The Python implementation uses C dependencies to handle fonts. This module uses a pure Go implementation, which relies on an on-disk cache to store font information. We have chosen to expose the path to the font cache.

I'm working towards enabling go-text as a replacement for the text engine, so the FontConfiguration creation will change slightly in the future. The references to fcfonts.NewFontMap and fc.Standard will no longer be needed.

@PylotLight
Author

PylotLight commented Jul 8, 2024

I don't have the fontmapCache file present, so I can't get this example to work currently.
Copying the test file exactly gives:

	var fontconfig text.FontConfiguration
	const fontmapCache = "pdf/test/cache.fc"
	fs, _ := fc.LoadFontsetFile(fontmapCache)
	fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
	err := goweasyprint.HtmlToPdf(os.Stdout, utils.InputString(html), fontconfig)

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xff864a]

goroutine 1 [running]:
github.com/benoitkugler/textprocessing/pango.(*GlyphString).fallbackShape(0xc0008157c0, {0xc00013a0b0, 0x2c, 0x2c}, 0xc000694ff0)
/home/User/go/pkg/mod/github.com/benoitkugler/[email protected]/pango/glyphs.go:213 +0x14a

@benoitkugler
Owner

See the file pdf/draw_test.go and this snippet:

// this command has to run once
fmt.Println("Scanning fonts...")
_, err := fc.ScanAndCache(fontmapCache)
if err != nil {
	log.Fatal(err)
}

fs, err := fc.LoadFontsetFile(fontmapCache)
if err != nil {
	log.Fatal(err)
}
fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))

@PylotLight
Author

PylotLight commented Jul 8, 2024

2024/07/09 09:44:17 invalid font dir /usr/share/texmf/fonts/opentype/public stat /usr/share/texmf/fonts/opentype/public: no such file or directory

Yeah, this font stuff is just really not working for me. Perhaps I'll just wait for the replacement of these parts ;p
I got go-wkhtmltopdf working for now, but I'll come back to this one if I can ever get it working.

@benoitkugler
Owner

The error message is just a warning; it shouldn't be fatal.
What is the error returned by fc.ScanAndCache?

@PylotLight
Author

Full example

func main() {
	html := ""

	var fontconfig text.FontConfiguration
	const fontmapCache = "pdf/test/cache.fc"
	fmt.Println("Scanning fonts...")
	_, err := fc.ScanAndCache(fontmapCache)
	if err != nil {
		log.Fatal(err)
	}

	fs, err := fc.LoadFontsetFile(fontmapCache)
	if err != nil {
		log.Fatal(err)
	}
	fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
	err = goweasyprint.HtmlToPdf(os.Stdout, utils.InputString(html), fontconfig)
}
Scanning fonts...
2024/07/09 21:59:38 invalid font dir /usr/share/texmf/fonts/opentype/public stat /usr/share/texmf/fonts/opentype/public: no such file or directory
2024/07/09 21:59:39 open pdf/test/cache.fc: no such file or directory
exit status 1

@benoitkugler
Owner

Thank you for the full example. There is something strange, though: only one fatal log should happen (since log.Fatal exits the program).
Could you be even more specific and print all the errors? (That is, add fmt.Println(err).)

@PylotLight
Author

PylotLight commented Jul 9, 2024

That was with an empty HTML string; this link has a sample page in it.
https://pastecode.dev/s/twxqkbfe

Scanning fonts...
2024/07/10 00:42:37 invalid font dir /usr/share/texmf/fonts/opentype/public stat /usr/share/texmf/fonts/opentype/public: no such file or directory
open pdf/test/cache.fc: no such file or directory
loading font set: open pdf/test/cache.fc: no such file or directory
webrender.progress: 2024/07/10 00:42:40 Step 1 - Fetching and parsing HTML
webrender.progress: 2024/07/10 00:42:40 Step 3 - Applying CSS - 1 sheet(s)
webrender.progress: 2024/07/10 00:42:40 Step 4 - Creating formatting structure
webrender.progress: 2024/07/10 00:42:40 Step 5 - Creating layout - Page 1
webrender.progress: 2024/07/10 00:42:40 Step 6 - Drawing pages
webrender.progress: 2024/07/10 00:42:40 Step 7 - Adding PDF metadata
%PDF-1.7
%[binary marker bytes omitted]
4 0 obj
<</DecodeParms [ null ] /Filter [/FlateDecode] /Length 69 >>
stream
[compressed stream data omitted]
endstream
endobj
3 0 obj
<<
/Type/Page
/Parent 2 0 R
/MediaBox [0 0 595.27563 841.88983]
/BleedBox [0 0 595.27563 841.88983]
/TrimBox [0 0 595.27563 841.88983]
/Contents [4 0 R]
>>
endobj
2 0 obj
<</Type/Pages/Count 1/Kids [3 0 R]>>
endobj
1 0 obj
<<
/Type/Catalog
/Pages 2 0 R
>>
endobj
5 0 obj
<<
/Producer (Go-WebRender 0.59)
>>
endobj
xref
0 6
0000000000 65535 f 
0000000401 00000 n 
0000000349 00000 n 
0000000178 00000 n 
0000000015 00000 n 
0000000449 00000 n 
trailer
<<
/Size 6
/Root 1 0 R
/Info 5 0 R
>>
startxref
500

@benoitkugler
Owner

Could you add the exact Go sample you use? I still don't get why the program does not exit at the first log.Fatal.

@PylotLight
Author

PylotLight commented Jul 14, 2024

Could you add the exact Go sample you use? I still don't get why the program does not exit at the first log.Fatal.

package main

import (
	"fmt"
	"os"

	goweasyprint "github.com/benoitkugler/go-weasyprint"
	fc "github.com/benoitkugler/textprocessing/fontconfig"
	"github.com/benoitkugler/textprocessing/pango/fcfonts"
	"github.com/benoitkugler/webrender/text"
	"github.com/benoitkugler/webrender/utils"
)

func main() {
	html := `<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>My Website</title>
  </head>
  <body>
    <main>
        <h1>Welcome to My Website</h1>  
    </main>
  </body>
</html>
`

	var fontconfig text.FontConfiguration
	const fontmapCache = "pdf/test/cache.fc"
	fmt.Println("Scanning fonts...")
	_, err := fc.ScanAndCache(fontmapCache)
	if err != nil {
		fmt.Println(err.Error())
	}

	fs, err := fc.LoadFontsetFile(fontmapCache)
	if err != nil {
		fmt.Println(err.Error())
	}
	fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
	err = goweasyprint.HtmlToPdf(os.Stdout, utils.InputString(html), fontconfig)
	if err != nil {
		fmt.Println(err.Error())
	}
}

@benoitkugler
Owner

The issue is here:

_, err := fc.ScanAndCache(fontmapCache)
if err != nil {
    fmt.Println(err.Error())
}

I think the directories needed for the font cache file do not exist on your machine. The path is defined as
const fontmapCache = "pdf/test/cache.fc"

Could you adjust this constant to something like <a directory you own>/cache.fc, or maybe simply cache.fc? Thank you.

@PylotLight
Author

PylotLight commented Jul 14, 2024

But I don't have that file, and there would be no reason to, given it was never explained in any doc anywhere?
Nvm, it might be working; I'll test it at work in the morning.

@PylotLight
Author

Alrighty, it wrote my file, but it didn't process the inline CSS inside the string like wkhtmltopdf does.

	// weasyprint
	var fontconfig text.FontConfiguration
	const fontmapCache = "cache.fc"
	fmt.Println("Scanning fonts...")
	_, err = fc.ScanAndCache(fontmapCache)
	if err != nil {
		return err
	}

	fs, err := fc.LoadFontsetFile(fontmapCache)
	if err != nil {
		return err
	}
	fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
	file, err := os.Create(filename)
	if err != nil {
		return err
	}
	err = goweasyprint.HtmlToPdf(file, utils.InputString(buf.String()), fontconfig)
	if err != nil {
		return err
	}
	// wkhtml
	pdfg, err := wkhtmltopdf.NewPDFGenerator()
	if err != nil {
		log.Fatal(err)
	}
	pdfg.AddPage(wkhtmltopdf.NewPageReader(strings.NewReader(buf.String())))
	err = pdfg.Create()
	if err != nil {
		log.Fatal(err)
	}
	err = pdfg.WriteFile(filename)
	if err != nil {
		log.Fatal(err)
	}

@benoitkugler
Owner

Can you post the exact HTML string you use? I didn't grasp which CSS you are referring to.

@PylotLight
Author

PylotLight commented Jul 15, 2024

Can you post the exact HTML string you use? I didn't grasp which CSS you are referring to.

https://paste.ofcode.org/iCum4BQTjKeWcQkhexMVJp

@benoitkugler
Owner

Thank you.
Which CSS is not processed by GoWeasyprint?

@PylotLight
Author

PDF result:
AU_PRD_RITM17270697.pdf

wkhtmltopdf, given the same string, generates the correct coloring on each cell and the correct content. Here it just doesn't load properly for some reason.
