| TIKA-3034 | Detector always returns text/plain when scanning Mathematica files | Unassigned | Tung Nguyen | Open | Unresolved |
| TIKA-3001 | Throw TaggedIOException when we open the HWP file with the Tika-App GUI | Unassigned | Kim Ju Young | Resolved | Fixed |
| TIKA-3000 | Users should be able to configure POI's IOUtils.setByteArrayMaxOverride | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2999 | PDFParser should set, not add digital signature value | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2996 | Add dropThreshold to PDFParserConfig | Unassigned | Felix Sonntag | Resolved | Fixed |
| TIKA-2994 | ExceptionUtils should let TikaException subclasses through | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2993 | tika-server's /rmeta endpoint shouldn't throw an exception for a parse exception | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2990 | Add mime detection via xml root for xfdf | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2989 | Add mime detection via xml root for xdp | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2988 | Add mime for alternative fdf format | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2984 | Try to unify unit tests around TikaTest functions | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2983 | tika-server should add the file name to the metadata when a file url is passed in | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2982 | Tika misidentifies encrypted xlsx, docx, and pptx files as doc | Tim Allison | Feng Jiao Jiang | Resolved | Fixed |
| TIKA-2979 | tika-server shouldn't throw an exception for a non-supported format | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2978 | maxMainMemoryBytes should be long on | Tim Allison | Christian Ribeaud | Resolved | Fixed |
| TIKA-2976 | Add an XLZ parser | Dave Meikle | Dave Meikle | Closed | Implemented |
| TIKA-2975 | XLIFF 1.2 Parser | Dave Meikle | Dave Meikle | Resolved | Fixed |
| TIKA-2974 | unable to extract recursive metadata using tika rest server | Unassigned | Martha Thompson | Resolved | Fixed |
| TIKA-2971 | Link to download OpenNLP models needs to be http not https | Unassigned | Eric Pugh | Resolved | Fixed |
| TIKA-2970 | Configuring Tesseract for OCR of PDF via Tika Config is not working | Tim Allison | Eric Pugh | Resolved | Fixed |
| TIKA-2969 | Unit test for TesseractOCRParserTest.java has confusing behavior when Tesseract not on path | Tim Allison | Eric Pugh | Resolved | Fixed |
| TIKA-2968 | Display specific command for Tesseract if you are running in Verbose mode | Tim Allison | Eric Pugh | Resolved | Fixed |
| TIKA-2965 | Add a metadata flag for XFA and XMP in PDFs | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2960 | Detected 1 vulnerable components: [ERROR] com.fasterxml.jackson.core:jackson-databind:jar:2.9.8 | Unassigned | Ramesh Thumati | Resolved | Fixed |
| TIKA-2959 | TabularFormatsTest test fails in Germany | Unassigned | Tilman Hausherr | Resolved | Fixed |
| TIKA-2958 | XmlBeanDefinitionStoreException with SpringExample | Unassigned | Tilman Hausherr | Resolved | Fixed |
| TIKA-2955 | PDF parsing to XHTML results in tika attempting to write invalid HTML characters. | Unassigned | Luke Butters | Resolved | Fixed |
| TIKA-2954 | Remove Magic Numbers from ImageMetadataExtractor | Unassigned | Chris Z | Resolved | Fixed |
| TIKA-2953 | Vulnerable "commons-compress : 1.18" is present in tika-bundle 1.22. | Unassigned | Aman Mishra | Resolved | Fixed |
| TIKA-2951 | Upgrade to PDFBox 2.0.17 | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2950 | Add metadata value for signed ooxml files | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2947 | Following Tika documentation results in a build of Tika version 1.12. | Unassigned | Dan Becker | Resolved | Fixed |
| TIKA-2945 | AutoDetectParser should skip the content type detection if Metadata already has it | Sergey Beryozkin | Sergey Beryozkin | Open | Unresolved |
| TIKA-2942 | HEIC files are detected as "video/quicktime" media type | Unassigned | Roman Ivanov | Resolved | Fixed |
| TIKA-2941 | OSGI bundle and app are not self-contained | Unassigned | Peng Cheng | Resolved | Fixed |
| TIKA-2934 | OOXML parser fails to parse XLSX files with missing cellRef properties | Unassigned | Yahav Amsalem | Resolved | Fixed |
| TIKA-2931 | Tika CLI shouldn't log with System.out.println | Tim Allison | Eric Pugh | Resolved | Fixed |
| TIKA-2925 | General dependency/plugin upgrades for 1.23 | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2922 | Regression issue with detecting .dotx and .xlam MS Office mime-types | Unassigned | Pascal Essiembre | Resolved | Fixed |
| TIKA-2906 | Modularize tika-eval's language stats from the application | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2894 | Add support for WebAssembly (Content-Type application/wasm, or .wasm extension) | Dave Meikle | Fredrik Söderström | Resolved | Fixed |
| TIKA-2892 | ForkParser deadlock when InputStreamResource catches/returns IOException | Luís Filipe Nassif | Luís Filipe Nassif | Resolved | Fixed |
| TIKA-2890 | Critical security vulnerability in dependencies | Unassigned | Kyle DuPont | Resolved | Fixed |
| TIKA-2851 | Upgrade to POI 4.1.1 when available | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2830 | Detect Media type of HEIF file correctly | Unassigned | Laurent Grangier | Resolved | Duplicate |
| TIKA-2624 | Rendering PDFs for OCR with Tesseract uses different DPI than claimed | Tim Allison | Ewan Mellor | Resolved | Fixed |