jsoup: Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety
Open source Java HTML parser, with the best of HTML5 DOM methods and CSS selectors, for easy data extraction.
Qoppa Software – Java PDF Library and Tools
PDF Studio: Work with PDFs on Windows, macOS, and Linux on desktops & tablets. We Do It All PDF: Whatever your PDF needs are today or in the future, we have a solution for you: creation, conversion, high-fidelity rendering and printing, digital signatures, text extraction, redaction, optimization, validation and more… For rendering, consider our support […]
PyMuPDF 1.24.5 documentation
PyMuPDF is a high-performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
automeris.io: AI assisted data extraction from charts using WebPlotDigitizer
Web based software to extract data from plots, charts, bar charts etc.