ASF JIRA

Tika
1.18
Key descending
1-47 of 47 as at: 26/Apr/24 15:18
Type Patch Info Key Summary Assignee Reporter Priority Status Resolution Created Updated Due Development
Bug TIKA-3198

Extracting ppt with chart gives excel in which data is missing

Unassigned sagar Major Open Unresolved  
Task TIKA-2635

Require ImageMagick path be specified on Windows OS

Unassigned Tim Allison Minor Resolved Fixed  
Task TIKA-2634

Upgrade Jackson to 2.9.5

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-2620

Set sys property to get better rendering speed by default

Unassigned Tim Allison Trivial Resolved Fixed  
Task TIKA-2618

LabelRecord and LabelSSTRecord text can be overwritten in xls

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-2617

Ignore NPOIFS IOOBE in PPT attachments

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-2616

message/news now incorrectly identified as rfc822

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-2614

RFC822 treats non-multipart as attachment

Unassigned Tim Allison Blocker Resolved Fixed  
Improvement TIKA-2613

Tesseract 4.0 has removed -psm, so Tika must update

Unassigned Ewan Mellor Major Resolved Fixed  
Sub-task TIKA-2607

TIKA-2579 Exchange levigo-jbig2-imageio with pdfbox-jbig2-imageio:3.0.0

Unassigned Andreas Meier Major Resolved Fixed  
Bug TIKA-2604

Error with certain jar paths on OS X

Tim Allison Sasha Goodman Blocker Resolved Fixed  
Task TIKA-2600

Don't use md5 checksum due to changes to the release distribution policy

Tim Allison Tim Allison Blocker Resolved Fixed  
Improvement TIKA-2598

Fix dependency convergence

Tim Allison Guillaume Smet Blocker Resolved Fixed  
Bug TIKA-2594

Mail detected as application/xhtml+xml

Unassigned Andreas Meier Major Resolved Fixed  
Improvement TIKA-2592

HTML with charset unicode handled as utf-16 instead of utf-8

Unassigned Andreas Meier Minor Resolved Fixed  
Bug TIKA-2591

Some tiffs (Big Endian with fax compression) are showing up as x-tar

Unassigned daniel schmidt Major Resolved Fixed  
Bug TIKA-2590

ExcelExtractor: cannot choose listening to the selected records only

Unassigned Grigoriy Alekseev Critical Resolved Fixed  
Bug TIKA-2588

Tika detecting/parsing pptx with embedded Excel worksheet(s)...

Tim Allison Brian McColgan Major Closed Fixed  
Bug TIKA-2587

DKIM signed mails recognized as text/plain

Unassigned Andreas Meier Major Resolved Fixed  
Improvement TIKA-2585

TikaInputStream support for resetting via a factory of InputStreams

Unassigned Nick Burch Major Resolved Fixed  
Improvement TIKA-2584

Tika should have a way to pass arbitrary Tesseract options

Unassigned Ewan Mellor Minor Resolved Fixed  
Bug TIKA-2582

Tesseract 4.0 includes a FF character by default, breaking parsers

Unassigned Ewan Mellor Major Resolved Fixed  
Bug TIKA-2580

SafeContentHandler documentation is incorrect about replacement character

Unassigned Ewan Mellor Minor Resolved Fixed  
Improvement TIKA-2579

Update to PDFBox 2.0.9 when available

Tim Allison David Pilato Major Closed Fixed  
Bug TIKA-2578

Mails not recognized when unknown X-headers are present

Tim Allison Andreas Meier Major Resolved Fixed  
Improvement TIKA-2576

Add application/zstd detection and parser

Unassigned Andreas Meier Minor Resolved Fixed  
Bug TIKA-2571

Swallows security exception and returns null

Unassigned Nik Everett Minor Resolved Fixed  
Task TIKA-2570

Tika 1.17 uses vulnerable Jackson version 2.9.2

Unassigned Julian Reschke Minor Resolved Fixed  
Bug TIKA-2569

Grouped Text boxes in .ppt

Tim Allison Richard A Major Resolved Fixed  
Bug TIKA-2568

Fully encrypted 7Z file not detected as such

Luís Filipe Nassif Luís Filipe Nassif Minor Resolved Fixed  
Bug TIKA-2567

Tika mistakenly determines mimetype of .min.js file as matlab

Unassigned Anto Major Resolved Fixed  
Bug TIKA-2564

Tika client cannot extract files from embedded archive formats

Tim Allison Marc Prud'hommeaux Major Resolved Fixed  
Improvement TIKA-2563

Extract embedded objects in HTML and javascript

Tim Allison Tim Allison Trivial Resolved Fixed  
Bug TIKA-2561

Tika Parser includes outdated/vulnerable version of JSoup

Unassigned Asela Major Resolved Fixed  
Improvement TIKA-2559

Expose language metadata from PDF documents

Unassigned Matt Sheppard Major Resolved Fixed  
Improvement TIKA-2556

org.json package clash

Unassigned Andrei Rebegea Major Resolved Fixed  
Bug TIKA-2547

RFC822 w multipart/mixed first text element should be treated as body, not attachment

Unassigned Tim Allison Major Resolved Fixed  
Improvement TIKA-2541

Referenced version of Apache SIS (org.apache.sis) is branch EOL

Unassigned Richard Jones Major Resolved Fixed  
Improvement TIKA-2535

Use latest org.opengis:geoapi to avoid rejected/EOL'd jsr-275 dependency

Tim Allison Richard Jones Major Resolved Fixed  
Improvement TIKA-2528

Fix key location, keys file and download link

Unassigned Tim Allison Minor Resolved Fixed  
Bug TIKA-2527

Typos in tika-mimetypes.xml

Unassigned Andreas Meier Minor Resolved Fixed  
Improvement TIKA-2524

Create/integrate a parser for XPS

Tim Allison Peter Davies Major Resolved Fixed  
Bug TIKA-2509

TesseractOCRParser ignores configured ImageMagickPath in processImage method

Dave Meikle Richard Jones Major Resolved Fixed  
Improvement TIKA-2390

Extract images embedded in Html

Unassigned Luís Filipe Nassif Minor Resolved Duplicate  
Improvement TIKA-2338

Change Scope of Jai-ImageIO-Core dependency

Luís Filipe Nassif Luís Filipe Nassif Major Resolved Fixed  
Bug TIKA-1191

ForkParser / ClassLoaderProxy does not define package

Unassigned Nicolas Belisle Major Resolved Fixed  
Bug TIKA-879

Detection problem: message/rfc822 file is detected as text/plain.

Unassigned Konstantin Gribov Major Closed Duplicate