ASF JIRA

Tika
1.21
Key descending
1-39 of 39 as at: 23/Apr/24 06:59
Type Patch Info Key Summary Assignee Reporter Priority Status Resolution Created Updated Due Development
Bug TIKA-2891

ForkClient "fillBootstrapJar()" lack few "MANIFEST.MF" properties

Unassigned Quentin Laville Blocker Resolved Fixed  
Bug TIKA-2884

Tika Parse - Null Pointer

Unassigned Ravi Major Closed Invalid  
Bug TIKA-2877

Tika 1.20 suffer from 3 separate CVE vulnerabilities

Tim Allison Pat cashman Critical Resolved Fixed  
Task TIKA-2873

Some password protected xlsx files no longer open with password

Tim Allison Tim Allison Blocker Resolved Fixed  
Bug TIKA-2869

Can't parse pdf in version 1.20 - Pkcs7Parser (DEF length 465542 object truncated by 465479)

Tim Allison Edans Sandes Major Resolved Fixed  
Task TIKA-2868

Fix DL4JVGG16Net to work with dl4j-beta3

Tim Allison Tim Allison Blocker Resolved Fixed  
Task TIKA-2867

EpubParser -- add check for null zipEntry

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-2866

EpubParser should allow .htm

Tim Allison Tim Allison Major Resolved Fixed  
Task TIKA-2865

Parameterize minConfidence for csv detection; bump default higher

Tim Allison Tim Allison Major Resolved Fixed  
Task TIKA-2864

Fix regression in RFC822 parsing time

Tim Allison Tim Allison Blocker Resolved Fixed  
Task TIKA-2863

Add comparison reports for time to process per mime

Tim Allison Tim Allison Major Resolved Fixed  
Bug TIKA-2854

upgrade out-of-date dependencies with outstanding CVEs

Unassigned Andrew Pavlin Major Resolved Fixed  
Task TIKA-2852

Add reports for missing/unaligned files in tika-eval Compare mode

Unassigned Tim Allison Major Resolved Fixed  
Bug TIKA-2849

TikaInputStream copies the input stream locally

Tim Allison Boris Petrov Major Resolved Fixed  
Task TIKA-2846

Add per page unicode mapping stats to the metadata in the PDFParser

Tim Allison Tim Allison Major Resolved Fixed  
Task TIKA-2845

Override ProcessPages in PDFTextStripper

Tim Allison Tim Allison Major Resolved Fixed  
Task TIKA-2841

Improve robustness of parsers of zip-based files on truncated files

Tim Allison Tim Allison Major Resolved Fixed  
Bug TIKA-2840

windows batch file not detected

Tim Allison chandra Major Resolved Fixed  
Bug TIKA-2838

RTF document processing glues comment fields together with text without whitespace

Tim Allison Karl Wright Major Resolved Fixed  
Bug TIKA-2836

Tika core API

Tim Allison chandra Major Resolved Fixed  
Task TIKA-2835

Upgrade to PDFBox 2.0.15 when available

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-2834

Upgrade to PDFBox 2.0.14 when available

Tim Allison Tim Allison Major Resolved Fixed  
Task TIKA-2827

Improve tika-eval comparison reports to include mime types in A and B for diffs

Tim Allison Tim Allison Major Resolved Fixed  
Task TIKA-2826

Add a csv/tsv parser

Tim Allison Tim Allison Minor Resolved Fixed  
Task TIKA-2825

Make interrupter in tika-batch's child process actually optional

Unassigned Tim Allison Trivial Resolved Fixed  
Improvement TIKA-2823

Remove printstacktrace in XMLReaderUtils

Tim Allison Tim Allison Major Resolved Fixed  
Improvement TIKA-2822

Update common tokens files for tika-eval

Tim Allison Tim Allison Trivial Resolved Fixed  
Improvement TIKA-2819

Update jaxb & activation

Tim Allison Hans Brende Major Resolved Fixed  
Bug TIKA-2816

Error when sending request to /tika with header X-Tika-OCRMinFileSizeToOcr

Tim Allison Anssi Törmä Major Resolved Fixed  
Improvement TIKA-2810

Back off to tagsoup when xml parser fails on Tika xhtml in tika-eval

Tim Allison Tim Allison Major Resolved Fixed  
Improvement TIKA-2809

Add reports for structure tags to tika-eval

Tim Allison Tim Allison Minor Resolved Fixed  
Bug TIKA-2807

.docx text extract leaves out rich text content-control inside of a text box

Tim Allison Claudia Mickiewicz Critical Resolved Fixed  
Task TIKA-2801

Tika includes 2 vulnerable components

Tim Allison Maxim Solodovnik Critical Resolved Fixed  
Improvement TIKA-2765

Regression extracting text from corrupted docx files

Tim Allison Luís Filipe Nassif Minor Resolved Fixed  
Improvement TIKA-2756

Switch to commons-lang 3

Tim Allison Robert Munteanu Major Resolved Fixed  
Task TIKA-2726

Handle truncated ooxml more robustly

Tim Allison Tim Allison Major Resolved Duplicate  
Bug TIKA-2601

Invalid XHTML output (overlapping a and formatting tags) for some WORD documents

Konstantin Gribov Filip Major Closed Fixed  
Bug TIKA-2555

Text with [underline] + [another format] in word document generates overlapping html tags.

Konstantin Gribov Serban Alexe Minor Resolved Fixed  
Bug TIKA-2310

Try to order chapters in epub correctly

Tim Allison Tim Allison Minor Resolved Fixed