PDF to PNG with PDFium #408

Open
wants to merge 15 commits into
base: master
17 changes: 8 additions & 9 deletions README.md
@@ -9,7 +9,6 @@
![Last Commit](https://img.shields.io/github/last-commit/dvisionlab/Larvitar)
![GitHub stars](https://img.shields.io/github/stars/dvisionlab/Larvitar?style=social)


**Larvitar** is a modern, lightweight TypeScript library for medical imaging applications. Built on top of the Cornerstone ecosystem, Larvitar provides tools for rendering, analyzing, and interacting with medical images, including support for advanced modalities like multiframe images, NRRD, and ECG synchronization.

## 🛠 Current Version
@@ -34,6 +33,7 @@ Check out the [releases page](https://github.com/dvisionlab/Larvitar/releases) f
Comprehensive documentation is available on the [Larvitar Documentation Page](https://larvitar.dvisionlab.com).

### Sections

1. [**Core API**](https://larvitar.dvisionlab.com/api/): Learn how to parse, load, and render DICOM images.
2. [**Modules**](https://larvitar.dvisionlab.com/api/): Explore the segmentation tools, color maps, and advanced rendering features.
3. [**Examples**](https://larvitar.dvisionlab.com/guide/examples.html): See working examples for ECG synchronization, NRRD image loading, segmentation tools, and more.
@@ -67,16 +67,17 @@ To start developing Larvitar or contribute to the project:
2. **Install dependencies**:
```bash
yarn install
```
```
3. **Start the development server**:
```bash
yarn run dev
```
```bash
yarn run dev
```
4. **Open the development environment**:
- Serve the examples folder using a static server (e.g., `http-server` or visual studio code live server).
- Navigate to http://localhost:5500/docs/examples/<example_name>.html (or the port configured in your dev server).
- Serve the examples folder using a static server (e.g., `http-server` or visual studio code live server).
- Navigate to http://localhost:5500/docs/examples/<example_name>.html (or the port configured in your dev server).

## 📝 License

Larvitar is licensed under the MIT License. Feel free to use, modify, and distribute it in your projects.

## 🤝 Contributing
@@ -92,8 +93,6 @@ Larvitar has adopted a [Code of Conduct](CODE_OF_CONDUCT.md) that we expect proj
- Laura Borghesi, D/Vision Lab | [LinkedIn](https://linkedin.com/in/laura-borghesi-160557218)
- Sara Zanchi, D/Vision Lab | [LinkedIn](https://linkedin.com/in/sara-zanchi-113a4b61)


<p align="center">
<img src="https://press.r1-it.storage.cloud.it/logo_trasparent.png" width="200" title="D/Vision Lab Logo" alt="D/Vision Lab Logo">
</p>

10 changes: 9 additions & 1 deletion bundler/webpack.common.js
@@ -18,15 +18,23 @@ module.exports = {
test: /\.tsx?$/,
use: "ts-loader",
exclude: /node_modules/
},
// webAssembly support
{
test: /\.wasm$/,
type: "asset/resource"
}
]
},
resolve: {
extensions: [".tsx", ".ts", ".js", ".d.ts"],
extensions: [".tsx", ".ts", ".js", ".d.ts", ".wasm"],
fallback: {
fs: false,
path: false,
crypto: false
}
},
experiments: {
asyncWebAssembly: true
}
};
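
With `type: "asset/resource"`, webpack emits the matched file into the output directory and the import evaluates to its URL, which is presumably where the hashed `.wasm` file under `dist/` in this PR comes from. The consumer still has to fetch that URL. The sketch below shows one way to do that with a basic sanity check on the WebAssembly header; `fetchWasmBinary`, `Fetcher`, and `BinaryResponse` are hypothetical names introduced for illustration and are not part of Larvitar or `@hyzyla/pdfium`.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface BinaryResponse {
  ok: boolean;
  statusText: string;
  arrayBuffer(): Promise<ArrayBuffer>;
}
type Fetcher = (url: string) => Promise<BinaryResponse>;

// Every WebAssembly binary starts with the four bytes "\0asm".
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d];

// Hypothetical helper: fetch the emitted .wasm asset and check its header.
async function fetchWasmBinary(
  wasmUrl: string,
  fetcher: Fetcher
): Promise<ArrayBuffer> {
  const response = await fetcher(wasmUrl);
  if (!response.ok) {
    throw new Error(`Failed to fetch WebAssembly binary: ${response.statusText}`);
  }
  const buffer = await response.arrayBuffer();
  const header = new Uint8Array(buffer, 0, Math.min(4, buffer.byteLength));
  const valid =
    header.length === 4 && WASM_MAGIC.every((byte, i) => header[i] === byte);
  if (!valid) {
    throw new Error("Response is not a WebAssembly binary");
  }
  return buffer;
}
```

In a browser, the global `fetch` could be passed as the `fetcher` argument; injecting it keeps the helper testable outside the browser.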
2 changes: 2 additions & 0 deletions decs.d.ts
@@ -4,6 +4,8 @@ declare module "cornerstone-wado-image-loader";
declare module "cornerstone-web-image-loader";
declare module "cornerstone-file-image-loader";
declare module "dicom-character-set";
declare module "@hyzyla/pdfium/browser/cdn";
declare module "@hyzyla/pdfium/pdfium.wasm";

declare global {
interface Document {
Binary file added dist/fbea52d058cabba1610b.wasm
Binary file not shown.
157 changes: 83 additions & 74 deletions imaging/parsers/pdf.ts
@@ -1,58 +1,13 @@
/** @module imaging/parsers/pdf
* @desc This file provides functionalities for
* managing pdf files using pdfjs-dist library
* managing pdf files using PDFium
*/

// external libraries
import {
getDocument,
GlobalWorkerOptions,
PDFPageProxy,
PageViewport
} from "pdfjs-dist";
GlobalWorkerOptions.workerSrc = require("pdfjs-dist/build/pdf.worker");

import { PDFiumLibrary } from "@hyzyla/pdfium/browser/cdn";
// internal libraries
import { pdfType } from "../types";
import { populateFileManager } from "../imageManagers";

/**
* This module provides the following functions to be exported:
* convertToPNG(pdf, pageNumber)
* generateFiles(fileURL)
*/

/**
* Convert a pdf page to a png image in base64 format
* @instance
* @function convertToPNG
* @param {pdfType} pdf - The pdf object
* @param {number} pageNumber - The page number to be converted
* @returns {string} The png image in base64 format
*/
export const convertToPNG = async function (
pdf: pdfType,
pageNumber: number
): Promise<string> {
const page: PDFPageProxy = await pdf.getPage(pageNumber);
const viewport: PageViewport = page.getViewport({ scale: 1.5 });
const canvas: HTMLCanvasElement = document.createElement("canvas");
canvas.height = viewport.height;
canvas.width = viewport.width;

const context: CanvasRenderingContext2D | null = canvas.getContext("2d");
if (context === null) {
throw new Error("Failed to get 2D context from canvas");
}

const renderContext = {
canvasContext: context,
viewport: viewport
};
await page.render(renderContext).promise;
return canvas.toDataURL("image/png");
};

/**
* Generate an array of files from a pdf file
* @instance
@@ -62,45 +17,99 @@ export const convertToPNG = async function (
*/
export const generateFiles = async function (fileURL: string): Promise<File[]> {
let files: File[] = [];
await getDocument(fileURL).promise.then(async (pdf: pdfType) => {
// cycle through pages
for (let i = 0; i < pdf.numPages; i++) {
let aFile: File | null = await generateFile(pdf, i + 1);
files[i] = aFile;
aFile = null;
}
const response = await fetch(fileURL);
if (!response.ok) {
throw new Error(`Failed to fetch PDF file: ${response.statusText}`);
}
const pdfFile = await response.blob();

if (pdfFile.type !== "application/pdf") {
throw new Error("Invalid MIME type, expected application/pdf");
}

const buff = await pdfFile.arrayBuffer();

// Initialize the library and load the document
const library = await PDFiumLibrary.init({
disableCDNWarning: true
});
return files; // Add this line to return the files array

const usableBuffer = new Uint8Array(buff);
const pdfdocument = await library.loadDocument(usableBuffer);
const pages = await pdfdocument.pages();

for (const page of pages) {
let aFile = await generateFile(page);
files.push(aFile);
}

pdfdocument.destroy();
library.destroy();

return files;
};
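
In the new `generateFiles`, `pdfdocument.destroy()` and `library.destroy()` run only if every page renders without throwing. One possible way to guarantee cleanup is a `try`/`finally` wrapper, sketched below. `Destroyable` and `withDestroyable` are hypothetical names; the sketch assumes only that the resource exposes a `destroy()` method, as the objects in the diff do.

```typescript
// Hypothetical helper: not part of Larvitar or @hyzyla/pdfium.
interface Destroyable {
  destroy(): void;
}

// Create a resource, run `use` with it, and always call destroy(),
// whether `use` resolves or rejects.
async function withDestroyable<T extends Destroyable, R>(
  create: () => Promise<T> | T,
  use: (resource: T) => Promise<R>
): Promise<R> {
  const resource = await create();
  try {
    return await use(resource);
  } finally {
    resource.destroy();
  }
}
```

Nested calls could wrap first the library and then the document, so that the document is destroyed before the library even when a page fails to render.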

// internal functions

/**
*
* Generate a single PNG file for a PDF page
* @instance
* @function generateFile
* @param {pdfType} pdf - The pdf object
* @param {number} pageNumber - The page number to be converted
* @returns {File} The png image of the pdf page in a File object
* @param {any} page - The PDF page object
* @returns {File} The PNG image of the PDF page as a File object
*/
async function generateFile(pdf: pdfType, pageNumber: number): Promise<File> {
const pngDataURL: string = await convertToPNG(pdf, pageNumber);
let byteString: string | null = atob(pngDataURL.split(",")[1]);
let ab: ArrayBuffer | null = new ArrayBuffer(byteString.length);
let ia: Uint8Array | null = new Uint8Array(ab);
for (let j = 0; j < byteString.length; j++) {
ia[j] = byteString.charCodeAt(j);
}
let blob: Blob | null = new Blob([ab], {
type: "image/png"
async function generateFile(page: any): Promise<File> {
// Render PDF page to bitmap data using PDFium
const image = await page.render({
scale: 3, // TODO: adjust scale
render: "bitmap"
});
let file: File | null = new File([blob], `pdf_page_${pageNumber}.png`, {

// Create a canvas element to convert the bitmap to PNG
const canvas = document.createElement("canvas");
canvas.width = image.width;
canvas.height = image.height;
const ctx = canvas.getContext("2d");
if (!ctx) {
throw new Error("Failed to get 2D context from canvas");
}

// Create an ImageData object from the bitmap data
const imageData = new ImageData(
new Uint8ClampedArray(image.data), // Use the bitmap data from PDFium
image.width,
image.height
);

// Draw the image data onto the canvas
ctx.putImageData(imageData, 0, 0);

// Convert the canvas content to a Blob (PNG format)
const blob = await new Promise<Blob | null>(resolve =>
canvas.toBlob(blob => resolve(blob), "image/png")
);

if (!blob) {
throw new Error("Failed to create PNG blob from canvas");
}

// Optionally create a File object
const file = new File([blob], `pdf_page_${page.number}.png`, {
type: "image/png"
});
populateFileManager(file);
byteString = null;
ab = null;
ia = null;
blob = null;
return file;
}
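
The `ImageData` constructor throws unless the buffer length is exactly `4 * width * height`. A small guard before constructing it can turn that into a clearer error. The sketch below assumes the bitmap returned by `page.render` uses 4 bytes per pixel (RGBA), which the diff does not confirm; `assertRgbaBitmap` is a hypothetical helper.

```typescript
// Hypothetical guard: validates that a raw bitmap matches its dimensions,
// assuming 4 bytes per pixel (RGBA).
function assertRgbaBitmap(
  data: Uint8Array,
  width: number,
  height: number
): void {
  if (
    !Number.isInteger(width) ||
    !Number.isInteger(height) ||
    width <= 0 ||
    height <= 0
  ) {
    throw new RangeError(`Invalid bitmap size: ${width}x${height}`);
  }
  const expected = width * height * 4;
  if (data.length !== expected) {
    throw new RangeError(
      `Bitmap has ${data.length} bytes, expected ${expected} for ${width}x${height} RGBA`
    );
  }
}
```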

// Helper function to download files
function downloadFile(file: File | Blob): void {
const link = document.createElement("a");
const url = URL.createObjectURL(file);
link.href = url;
//@ts-ignore
link.download = file.name ?? "downloadPdf";
document.body.appendChild(link);
link.click();
document.body.removeChild(link);
URL.revokeObjectURL(url);
}
9 changes: 6 additions & 3 deletions package.json
@@ -20,6 +20,8 @@
],
"scripts": {
"coverage": "typescript-coverage-report",
"generate-docs": "node_modules/.bin/jsdoc -c jsdoc.json",
"postinstall": "patch-package",
"build": "webpack --config ./bundler/webpack.prod.js",
"dev": "webpack --progress --config ./bundler/webpack.dev.js",
"docs:dev": "vuepress dev docs",
@@ -29,10 +31,12 @@
"contributors": [
"Mattia Ronzoni <[email protected]> (https://www.dvisionlab.com)",
"Sara Zanchi <[email protected]> (https://www.dvisionlab.com)",
"Laura Borghesi Re <[email protected]> (https://www.dvisionlab.com)"
"Ale Re <[email protected]> (https://www.dvisionlab.com)",
"Laura Borghesi <[email protected]> (https://www.dvisionlab.com)"
],
"license": "MIT",
"dependencies": {
"@hyzyla/pdfium": "^2.1.2",
"@rollup/plugin-commonjs": "^17.1.0",
"cornerstone-core": "^2.6.1",
"cornerstone-file-image-loader": "^0.3.0",
@@ -48,7 +52,7 @@
"lodash": "^4.17.15",
"pako": "^1.0.10",
"papaparse": "^5.3.0",
"pdfjs-dist": "3.11.174",
"patch-package": "^8.0.0",
"plotly.js-dist-min": "^2.27.1",
"uuid": "^8.3.2"
},
@@ -59,7 +63,6 @@
"@types/hammerjs": "^2.0.41",
"@types/lodash": "^4.14.192",
"@types/papaparse": "^5.3.7",
"@types/pdfjs-dist": "^2.10.378",
"@types/plotly.js": "^2.12.30",
"@types/plotly.js-dist-min": "^2.3.4",
"@types/uuid": "^9.0.1",
15 changes: 15 additions & 0 deletions patches/@hyzyla+pdfium+2.1.2+001+initial.patch
@@ -0,0 +1,15 @@
diff --git a/node_modules/@hyzyla/pdfium/package.json b/node_modules/@hyzyla/pdfium/package.json
index 044d678..482edeb 100644
--- a/node_modules/@hyzyla/pdfium/package.json
+++ b/node_modules/@hyzyla/pdfium/package.json
@@ -18,8 +18,8 @@
"types": "./dist/index.esm.d.ts"
},
"./browser/cdn": {
- "default": "./dist/index.esm.cdn.js",
- "types": "./dist/index.esm.cdn.d.ts"
+ "types": "./dist/index.esm.cdn.d.ts",
+ "default": "./dist/index.esm.cdn.js"
},
"./browser/base64": {
"default": "./dist/index.esm.base64.js",
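
The patch only swaps the order of the `types` and `default` keys under `"./browser/cdn"`. Key order matters in conditional exports: conditions are tried in object order and `default` always matches, so a `types` entry listed after `default` is never reached. The sketch below is a deliberately simplified model of that lookup, not the actual Node.js or TypeScript resolver; `resolveCondition` is a hypothetical function.

```typescript
type ConditionMap = Record<string, string>;

// Simplified model: return the target of the first condition that matches.
// "default" matches unconditionally, so anything listed after it is unreachable.
function resolveCondition(
  entry: ConditionMap,
  active: ReadonlySet<string>
): string | undefined {
  for (const [condition, target] of Object.entries(entry)) {
    if (condition === "default" || active.has(condition)) {
      return target;
    }
  }
  return undefined;
}

// Key order before the patch: "default" shadows "types".
const beforePatch: ConditionMap = {
  default: "./dist/index.esm.cdn.js",
  types: "./dist/index.esm.cdn.d.ts"
};

// Key order after the patch: "types" is checked first.
const afterPatch: ConditionMap = {
  types: "./dist/index.esm.cdn.d.ts",
  default: "./dist/index.esm.cdn.js"
};

const typeResolution = new Set<string>(["types", "import"]);
console.log(resolveCondition(beforePatch, typeResolution)); // ./dist/index.esm.cdn.js
console.log(resolveCondition(afterPatch, typeResolution)); // ./dist/index.esm.cdn.d.ts
```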