Missing text after redaction #867
              
                
                  
                  
Answered by JorjMcKie
NoraishaYusuf asked this question in Looking for help
              
        Jan 25, 2021 
      
    
Answer selected by NoraishaYusuf
  
  
        
    


Problem 1:
Hard to tell without looking at that file (probably confidential anyway). But there are things like damaged PDFs ...
You could try cleaning the file / the page before processing to reveal / remove any errors, either with the MuPDF command-line tool or from within the script:
mutool clean -gggsc file.pdf
page.clean_contents(sanitize=True)

Problem 2:
MuPDF normally uses the full font-defined line height when identifying search hits. If the PDF is made with smaller distances between lines, then the hit rectangles of adjacent lines may overlap somewhat. The redaction logic of MuPDF in turn removes every character overlapping the redaction rectangle, and the result of this is what you saw.
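
To make the overlap in Problem 2 concrete, here is a small stand-alone sketch in plain Python. It does not call MuPDF or PyMuPDF; it only models the geometry. All names (`Rect`, `line_box`, `affected`) and all numbers (12 pt font height, 10 pt line distance, 75% of the height above the baseline) are hypothetical values chosen for illustration, and each line is treated as a single box instead of individual characters. Shrinking the rectangle before redacting is shown as one possible mitigation; it is an assumption of this sketch, not something stated in the answer above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Rect") -> bool:
        # Strict comparison: boxes that only touch at an edge do not overlap.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)

    def shrink_vertically(self, margin: float) -> "Rect":
        # Move the top edge down and the bottom edge up by `margin`.
        return Rect(self.x0, self.y0 + margin, self.x1, self.y1 - margin)


def line_box(baseline_y: float, font_height: float = 12.0,
             width: float = 200.0, ascent_share: float = 0.75) -> Rect:
    """Full-height box of one text line (y grows downwards)."""
    top = baseline_y - ascent_share * font_height
    return Rect(0.0, top, width, top + font_height)


# Three lines of 12 pt text whose baselines are only 10 pt apart,
# so neighbouring full-height boxes overlap by 2 pt.
lines = {n: line_box(100.0 + 10.0 * n) for n in range(3)}


def affected(redact_rect: Rect) -> list:
    """Line numbers whose box overlaps the redaction rectangle."""
    return [n for n, box in lines.items() if box.intersects(redact_rect)]


hit = lines[1]                                # pretend the search hit is line 1
full = affected(hit)                          # full-height rectangle
shrunk = affected(hit.shrink_vertically(2.5))  # rectangle reduced top and bottom

print("full-height rectangle touches lines:", full)
print("shrunk rectangle touches lines:", shrunk)
```

In this model the full-height rectangle reaches into both neighbouring lines, while the reduced one stays within line 1. In a real script the corresponding step would be to reduce the height of each hit rectangle before passing it to `page.add_redact_annot`; the margin that works depends on the fonts and line spacing of the actual file.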