While running the 'pdf_converter' function #351

@suresh96458

I converted a few text files into PDF files and am now running the `pdf_converter` function to extract paragraphs and load them into a data frame. The call fails with the error below, and I am unsure whether the problem is caused by Tika or by my files. As a sample, I am attaching two of the PDF files I am using, along with the error I am getting.

```
$ df = pdf_converter(directory_path='/home/xxxx/Downloads/test/')

2020-03-12 11:43:42,470 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2020-03-12 11:43:47,476 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2020-03-12 11:43:52,480 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2020-03-12 11:43:57,486 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2020-03-12 11:43:57,489 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
Unexpected error: <class 'RuntimeError'>
Unable to process file NetworkEngineer1.pdf
```

NetworkEngineer1.pdf
NetworkEngineer2.pdf
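
The log shows the Tika server never confirming startup, so the failure happens before either PDF is read. Tika is a Java application, and one possible cause (an assumption, the log does not confirm it) is that no working Java runtime is available on the machine. A minimal sketch of how that could be checked with the standard library only; `check_java` is a hypothetical helper, not part of `pdf_converter` or Tika:

```python
import shutil
import subprocess


def check_java(executable="java"):
    """Return (ok, detail) describing whether a Java runtime can be run.

    Hypothetical diagnostic helper: it only tests that the executable is
    on PATH and answers `-version`. It does not start or contact Tika.
    """
    path = shutil.which(executable)
    if path is None:
        return False, f"'{executable}' not found on PATH"
    try:
        result = subprocess.run(
            [path, "-version"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"could not run {path}: {exc}"
    # `java -version` normally writes its banner to stderr, not stdout.
    lines = (result.stderr or result.stdout).strip().splitlines()
    banner = lines[0] if lines else "no version output"
    return result.returncode == 0, banner


if __name__ == "__main__":
    ok, detail = check_java()
    print("Java available:", ok)
    print(detail)
```

If this reports that Java is missing, installing a Java runtime and rerunning the conversion would be the first thing to try. If Java is present, the cause lies elsewhere (for example, the Tika server jar failing to download or starting too slowly), and this check does not cover those cases.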

@andrelmfarias
