    Fix incorrect anchoring in new PDF.js releases · 8e288390
    Robert Knight authored
    Fix anchoring of text quotes and positions in PDF.js releases (>=
    2.9.359) that include https://github.com/mozilla/pdf.js/pull/13257.
    
    The client's anchoring relies on the text content of pages extracted via
    PDF.js's text APIs (`PDFPage.getTextContent`) to match the `textContent`
    of the hidden text layer element. In older PDF.js releases, achieving
    this alignment required excluding text items with all-whitespace text,
    because PDF.js did not create elements in the text layer for these. In
    PDF.js releases that include https://github.com/mozilla/pdf.js/pull/13257
    this filtering is no longer needed, and applying it breaks the alignment.
    
    The fix in this commit is to feature-detect whether the active version
    of PDF.js includes this change, and to filter out all-whitespace text
    items only when it does not.
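
The conditional filtering described above could look roughly like the sketch below. It is an illustration, not the code from this commit: `getPageText` and the `layerHasWhitespaceItems` flag are hypothetical names, and the flag stands in for the result of the feature detection, whose mechanism is not described in this message. The only PDF.js detail assumed is that each text item returned by `PDFPage.getTextContent` carries its text in a `str` property.

```javascript
/**
 * Build the text of a page from its text items.
 *
 * @param {{str: string}[]} items - Shaped like the `items` array resolved by
 *   `PDFPage.getTextContent()`.
 * @param {boolean} layerHasWhitespaceItems - Hypothetical flag: true if the
 *   active PDF.js version renders all-whitespace items into the text layer.
 */
function getPageText(items, layerHasWhitespaceItems) {
  const kept = layerHasWhitespaceItems
    ? items
    : // Older PDF.js: drop items whose text contains no non-whitespace
      // character, since no text layer element exists for them.
      items.filter(item => /\S/.test(item.str));
  return kept.map(item => item.str).join('');
}
```

Passing the flag in as a parameter keeps the filtering logic testable independently of whichever PDF.js version is loaded.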
    
    Future changes to PDF.js could cause mismatches between the result of
    `PDFPage.getTextContent` and the rendered text layer in other ways, so a
    sanity check has been added which logs a console warning if a mismatch
    is detected.
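
A minimal sketch of such a sanity check follows, assuming the extracted page text and the text layer's `textContent` are both available as strings. The function name, parameters and warning wording are hypothetical and are not taken from this commit.

```javascript
/**
 * Compare the text extracted for a page with the text of its rendered
 * text layer, and warn on the console if they differ.
 *
 * @param {string} extractedText - Text built from the page's text items.
 * @param {string} textLayerText - `textContent` of the text layer element.
 * @param {number} pageIndex - Index of the page, used only in the message.
 * @return {boolean} True if the two strings are identical.
 */
function checkTextLayerMatches(extractedText, textLayerText, pageIndex) {
  if (extractedText === textLayerText) {
    return true;
  }
  console.warn(
    `Text layer content does not match extracted page text (page ${pageIndex})`
  );
  return false;
}
```

Because the check only logs, a mismatch does not block anchoring; it makes the problem visible to whoever is debugging a future PDF.js upgrade.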
pdf.js 16.5 KB