Simplify PDF text extraction
With some earlier versions of PDF.js, we used PDF.js's PDFFindController as the data source for text extraction. At some point we stopped using the extraction routines shipped with it, because they did not always put adequate spacing between the pieces of text, so we ended up concatenating the pieces ourselves. Later, for newer versions of PDF.js, we introduced another way of accessing the same information that bypasses PDFFindController completely. This change unifies the access: we now do the same on all PDF.js versions.
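As a rough illustration of concatenating the text pieces ourselves, here is a minimal sketch. It is not the code from this change: the helper name `joinTextItems` is hypothetical, and it assumes the input has the shape `{ items: [{ str }] }` that PDF.js returns from `page.getTextContent()`. The single-space joining rule is likewise an assumption chosen for the example.

```javascript
// Hypothetical helper, not part of PDF.js or of this change.
// Assumes `textContent` looks like { items: [{ str: "..." }, ...] },
// the shape produced by PDF.js's page.getTextContent().
function joinTextItems(textContent) {
  return textContent.items
    .map((item) => item.str.trim())        // drop stray whitespace around each piece
    .filter((piece) => piece.length > 0)   // skip pieces that are empty after trimming
    .join(" ");                            // put exactly one space between pieces
}

// Example with hand-written data standing in for a real page.
const sample = { items: [{ str: "Hello " }, { str: "" }, { str: " world" }] };
console.log(joinTextItems(sample));
```

Because the helper only depends on the `items` array, the same function could be fed from whichever access path a given PDF.js version offers.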