In RAGLite we need a dependable estimate of the font size of a span. I thought it might be interesting for you to know how we obtain it with pdftext:
    import re
    from typing import Any


    def extract_font_size(span: dict[str, Any]) -> float:
        """Extract the font size from a text span."""
        font_size: float = 1.0
        if span["font"]["size"] > 1:  # A value of 1 appears to mean "unknown" in pdftext.
            font_size = span["font"]["size"]
        elif digit_sequences := re.findall(r"\d+", span["font"]["name"] or ""):
            font_size = float(digit_sequences[-1])
        elif "\n" not in span["text"]:  # Occasionally a span can contain a newline character.
            if round(span["rotation"]) in (0.0, 180.0, -180.0):
                font_size = span["bbox"][3] - span["bbox"][1]
            elif round(span["rotation"]) in (90.0, -90.0, 270.0, -270.0):
                font_size = span["bbox"][2] - span["bbox"][0]
        return font_size
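To make the fallback order easier to follow, here is a small sketch that mirrors the same branches but also reports which source the estimate came from. The helper name `estimate_with_source`, the source labels, and the sample spans are made up for illustration; only the span keys (`font`, `text`, `rotation`, `bbox`) are taken from the function above, and real pdftext spans may carry more fields.

```python
import re
from typing import Any


def estimate_with_source(span: dict[str, Any]) -> tuple[float, str]:
    """Return (font size, source of the estimate). Hypothetical helper, not part of pdftext."""
    if span["font"]["size"] > 1:
        # The reported size is trusted whenever it is not the "unknown" value of 1.
        return float(span["font"]["size"]), "font.size"
    if digit_sequences := re.findall(r"\d+", span["font"]["name"] or ""):
        # Fall back to the last run of digits in the font name, e.g. "CMR10" -> 10.
        return float(digit_sequences[-1]), "font.name"
    if "\n" not in span["text"]:
        rotation = round(span["rotation"])
        x0, y0, x1, y1 = span["bbox"]
        if rotation in (0, 180, -180):
            return float(y1 - y0), "bbox height"  # Horizontal text: the line height.
        if rotation in (90, -90, 270, -270):
            return float(x1 - x0), "bbox width"  # Vertical text: the line height lies along x.
    return 1.0, "default"


# Made-up spans, one per branch.
spans = [
    {"font": {"size": 11.0, "name": "Helvetica"}, "text": "Hello", "rotation": 0.0, "bbox": [0.0, 0.0, 30.0, 11.0]},
    {"font": {"size": 1, "name": "CMR10"}, "text": "Hello", "rotation": 0.0, "bbox": [0.0, 0.0, 30.0, 11.0]},
    {"font": {"size": 1, "name": None}, "text": "Hi", "rotation": 0.0, "bbox": [10.0, 20.0, 40.0, 32.0]},
    {"font": {"size": 1, "name": None}, "text": "Hi", "rotation": 90.0, "bbox": [10.0, 20.0, 19.0, 80.0]},
    {"font": {"size": 1, "name": None}, "text": "a\nb", "rotation": 0.0, "bbox": [0.0, 0.0, 5.0, 24.0]},
]
results = [estimate_with_source(span) for span in spans]
for size, source in results:
    print(f"{size:5.1f}  from {source}")
```

Note that the bounding-box branches measure the full line height rather than the nominal point size, so they are rough estimates compared with the first two sources.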