ASF JIRA

Tika
1.23
Key descending
1-46 of 46 as at: 28/Mar/24 15:02
Type Patch Info Key Summary Assignee Reporter Priority Status Resolution Created Updated Due Development
Bug TIKA-3034

Detector always returns text/plain when scanning Mathematica files

Unassigned Tung Nguyen Blocker Open Unresolved  
Bug TIKA-3001

Throw TaggedIOException when we open the HWP file with the Tika-App GUI

Unassigned Kim Ju Young Major Resolved Fixed  
Task TIKA-3000

Users should be able to configure POI's IOUtils.setByteArrayMaxOverride

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-2999

PDFParser should set, not add digital signature value

Unassigned Tim Allison Trivial Resolved Fixed  
Improvement TIKA-2996

Add dropThreshold to PDFParserConfig

Unassigned Felix Sonntag Minor Resolved Fixed  
Improvement TIKA-2994

ExceptionUtils should let TikaException subclasses through

Unassigned Tim Allison Minor Resolved Fixed  
Improvement TIKA-2993

tika-server's /rmeta endpoint shouldn't throw an exception for a parse exception

Unassigned Tim Allison Major Resolved Fixed  
Improvement TIKA-2990

Add mime detection via xml root for xfdf

Tim Allison Tim Allison Minor Resolved Fixed  
Improvement TIKA-2989

Add mime detection via xml root for xdp

Tim Allison Tim Allison Minor Resolved Fixed  
Improvement TIKA-2988

Add mime for alternative fdf format

Unassigned Tim Allison Major Resolved Fixed  
Improvement TIKA-2984

Try to unify unit tests around TikaTest functions

Unassigned Tim Allison Major Resolved Fixed  
Improvement TIKA-2983

tika-server should add the file name to the metadata when a file url is passed in

Tim Allison Tim Allison Major Resolved Fixed  
Bug TIKA-2982

Tika misidentifies encrypted xlsx, docx, and pptx files as doc

Tim Allison Feng Jiao Jiang Blocker Resolved Fixed  
Improvement TIKA-2979

tika-server shouldn't throw an exception for a non-supported format

Unassigned Tim Allison Minor Resolved Fixed  
Improvement TIKA-2978

maxMainMemoryBytes should be long on

Tim Allison Christian Ribeaud Major Resolved Fixed  
New Feature TIKA-2976

Add an XLZ parser

Dave Meikle Dave Meikle Minor Closed Implemented  
New Feature TIKA-2975

XLIFF 1.2 Parser

Dave Meikle Dave Meikle Minor Resolved Fixed  
Bug TIKA-2974

unable to extract recursive metadata using tika rest server

Unassigned Martha Thompson Major Resolved Fixed  
Improvement TIKA-2971

Link to download OpenNLP models needs to be http not https

Unassigned Eric Pugh Minor Resolved Fixed  
Improvement TIKA-2970

Configuring Tesseract for OCR of PDF via Tika Config is not working

Tim Allison Eric Pugh Critical Resolved Fixed  
Improvement TIKA-2969

Unit test for TesseractOCRParserTest.java has confusing behavior when Tesseract not on path

Tim Allison Eric Pugh Minor Resolved Fixed  
Improvement TIKA-2968

Display specific command for Tesseract if you are running in Verbose mode

Tim Allison Eric Pugh Minor Resolved Fixed  
Improvement TIKA-2965

Add a metadata flag for XFA and XMP in PDFs

Tim Allison Tim Allison Trivial Resolved Fixed  
Bug TIKA-2960

Detected 1 vulnerable components: [ERROR] com.fasterxml.jackson.core:jackson-databind:jar:2.9.8

Unassigned Ramesh Thumati Major Resolved Fixed  
Bug TIKA-2959

TabularFormatsTest test fails in Germany

Unassigned Tilman Hausherr Minor Resolved Fixed  
Bug TIKA-2958

XmlBeanDefinitionStoreException with SpringExample

Unassigned Tilman Hausherr Trivial Resolved Fixed  
Bug TIKA-2955

PDF parsing to XHTML results in tika attempting to write invalid HTML characters.

Unassigned Luke Butters Major Resolved Fixed  
Improvement TIKA-2954

Remove Magic Numbers from ImageMetadataExtractor

Unassigned Chris Z Trivial Resolved Fixed  
Bug TIKA-2953

Vulnerable "commons-compress : 1.18" is present in tika-bundle 1.22.

Unassigned Aman Mishra Major Resolved Fixed  
Improvement TIKA-2951

Upgrade to PDFBox 2.0.17

Unassigned Tim Allison Major Resolved Fixed  
Improvement TIKA-2950

Add metadata value for signed ooxml files

Tim Allison Tim Allison Major Resolved Fixed  
Bug TIKA-2947

Following Tika documentation results in a build of Tika version 1.12.

Unassigned Dan Becker Minor Resolved Fixed  
Improvement TIKA-2945

AutoDetectParser should skip the content type detection if Metadata already has it

Sergey Beryozkin Sergey Beryozkin Minor Open Unresolved  
Bug TIKA-2942

HEIC files are detected as "video/quicktime" media type

Unassigned Roman Ivanov Major Resolved Fixed  
Bug TIKA-2941

OSGI bundle and app are not self-contained

Unassigned Peng Cheng Major Resolved Fixed  
Bug TIKA-2934

OOXML parser fails to parse XLSX files with missing cellRef properties

Unassigned Yahav Amsalem Major Resolved Fixed  
Improvement TIKA-2931

Tika CLI shouldn't log with System.out.println

Tim Allison Eric Pugh Minor Resolved Fixed  
Bug TIKA-2925

General dependency/plugin upgrades for 1.23

Unassigned Tim Allison Minor Resolved Fixed  
Bug TIKA-2922

Regression issue with detecting .dotx and .xlam MS Office mime-types

Unassigned Pascal Essiembre Minor Resolved Fixed  
Task TIKA-2906

Modularize tika-eval's language stats from the application

Tim Allison Tim Allison Major Resolved Fixed  
Improvement TIKA-2894

Add support for WebAssembly (Content-Type application/wasm, or .wasm extension)

Dave Meikle Fredrik Söderström Major Resolved Fixed  
Bug TIKA-2892

ForkParser deadlock when InputStreamResource catches/returns IOException

Luís Filipe Nassif Luís Filipe Nassif Major Resolved Fixed  
Improvement TIKA-2890

Critical security vulnerability in dependencies

Unassigned Kyle DuPont Major Resolved Fixed  
Task TIKA-2851

Upgrade to POI 4.1.1 when available

Tim Allison Tim Allison Major Resolved Fixed  
New Feature TIKA-2830

Detect Media type of HEIF file correctly

Unassigned Laurent Grangier Major Resolved Duplicate  
Bug TIKA-2624

Rendering PDFs for OCR with Tesseract uses different DPI than claimed

Tim Allison Ewan Mellor Major Resolved Fixed