ASF JIRA

Tika
1.8
Sorted by: Key descending
1–57 of 57 as at: 25/Apr/24 00:00
Type Patch Info Key Summary Assignee Reporter Priority Status Resolution Created Updated Due Development
Bug TIKA-1597

RTF with embedded image parsing produces div before html

Unassigned Konstantin Gribov Major Closed Fixed  
Improvement TIKA-1596

tika-cli logging cleanup

Konstantin Gribov Konstantin Gribov Trivial Closed Fixed  
Bug TIKA-1591

Tika Parsers uses wrong version of bouncycastle

Konstantin Gribov Java Developer Major Resolved Fixed  
Bug TIKA-1590

A particular PDF seems to trigger an infinite loop when being converted to HTML

Unassigned Matt Sheppard Major Closed Duplicate  
Bug TIKA-1589

Mp3 parser does not add duration to metadata if there are no ID3 tags

Unassigned Max Daniline Major Resolved Fixed  
Improvement TIKA-1587

ForkParser::setJavaCommand should take List<String>

Konstantin Gribov Oleg Oshmyan Major Resolved Fixed  
Bug TIKA-1584

Tika 1.7 possible regression (nested attachment files not getting parsed)

Tim Allison Rob Tulloh Blocker Resolved Fixed  
Bug TIKA-1581

jhighlight license concerns

Unassigned Karl Wright Major Resolved Fixed  
New Feature TIKA-1580

ISA-Tab parsers

Chris A. Mattmann Giuseppe Totaro Minor Resolved Fixed  
Improvement TIKA-1578

Add file type description to HDFParsers

Ann Burgess Ann Burgess Major Resolved Fixed  
Improvement TIKA-1576

Upgrade metadata-extractor to version 2.7.2

Tyler Bui-Palsulich Tyler Bui-Palsulich Major Resolved Fixed  
Improvement TIKA-1575

Upgrade to PDFBox 1.8.9 when available

Unassigned Tim Allison Minor Closed Fixed  
New Feature TIKA-1572

Utility script for pushing 3rd Party UCAR Dependencies to Maven Central

Lewis John McGibbney Lewis John McGibbney Major Resolved Fixed  
Improvement TIKA-1571

Upgrade UCAR dependencies to 4.5.5

Lewis John McGibbney Lewis John McGibbney Major Resolved Fixed  
Bug TIKA-1569

Doc typo Mime Magic Detction

Unassigned Hari Sekhon Trivial Resolved Fixed  
Bug TIKA-1567

WelcomeResource in TikaServer doesn't print PathParam prefix

Chris A. Mattmann Chris A. Mattmann Major Resolved Fixed  
Bug TIKA-1565

image/gif parse error

Tyler Bui-Palsulich lixin Major Resolved Fixed  
Bug TIKA-1563

Use .gz as the default extension for application/gzip

Unassigned Adam Lamar Minor Resolved Fixed  
Improvement TIKA-1561

GCMD Directory Interchange Format (.dif) identification

Chris A. Mattmann Luke sh Trivial Resolved Fixed  
New Feature TIKA-1558

Create a Parser Blacklist

Tyler Bui-Palsulich Tyler Bui-Palsulich Major Resolved Fixed  
Task TIKA-1556

Clean up whitespace in tika-server

Tim Allison Tim Allison Trivial Resolved Fixed  
Bug TIKA-1554

Improve EMF file detection

Chris A. Mattmann Luís Filipe Nassif Major Closed Fixed  
Bug TIKA-1549

Two times speed increase of language profile distance calculation

Chris A. Mattmann Toke Eskildsen Major Resolved Fixed  
Bug TIKA-1548

System property added while catching exception on parsing PDF encrypted doc

Unassigned David Pilato Major Resolved Fixed  
Improvement TIKA-1547

Use POST for tika-server form resources

Tyler Bui-Palsulich Tyler Bui-Palsulich Major Resolved Fixed  
Bug TIKA-1544

empty lines are not preserved

Unassigned mortee Minor Resolved Fixed  
Bug TIKA-1543

TesseractOCRParser.setTesseractPath() doesn't work on Linux

Unassigned Sean Zhao Major Closed Fixed  
Task TIKA-1542

Substitute Apache TTF test file for current non-Apache friendly file

Tim Allison Tim Allison Trivial Resolved Fixed  
Improvement TIKA-1541

StringsParser: a simple strings-based parser for Tika

Chris A. Mattmann Giuseppe Totaro Major Resolved Fixed  
Improvement TIKA-1539

GRB file magic bytes and extension matching

Chris A. Mattmann Luke sh Major Resolved Fixed  
Bug TIKA-1537

Installation on OSX 10.10.2 generates OutOfMemory Error during parser tests

Chris A. Mattmann Andrew Hwang Minor Resolved Fixed  
Improvement TIKA-1530

MP4Parser parses duration but does not set it

Chris A. Mattmann Oskar Wickström Minor Resolved Fixed  
Improvement TIKA-1521

Handle password protected 7zip files

Unassigned Nick Burch Major Resolved Fixed  
Bug TIKA-1519

Don't allow whatever is in http-equiv Content-Type to overwrite actual Content-Type in HtmlParser

Unassigned Tim Allison Trivial Resolved Fixed  
Bug TIKA-1514

http-equiv content-type extraction should pick first parseable content value

Unassigned Tim Allison Trivial Closed Won't Fix  
Bug TIKA-1512

WordParser fails on many Word files

Unassigned F Seid Major Resolved Fixed  
New Feature TIKA-1511

Create a parser for SQLite3

Unassigned Luís Filipe Nassif Major Resolved Fixed  
Bug TIKA-1489

PDF Text extraction without permission

Unassigned Tilman Hausherr Major Resolved Fixed  
New Feature TIKA-1483

Create a Latin1 charset raw string parser

Chris A. Mattmann Luís Filipe Nassif Major Resolved Fixed  
Bug TIKA-1482

ForkParser throws exceptions when process some large pdf files

Unassigned Sean Zhao Critical Closed Fixed  
New Feature TIKA-1423

Build a parser to extract data from GRIB formats

Vineet Ghatge Vineet Ghatge Critical Resolved Fixed  
Bug TIKA-1416

Refactor Translator Exception Handling

Unassigned Tyler Bui-Palsulich Major Resolved Fixed  
Bug TIKA-1408

Fix version for tikadotnet to be tracked along with trunk and release version

Chris A. Mattmann Chris A. Mattmann Major Resolved Fixed  
Bug TIKA-1388

Tika IOUtils java.lang.OutOfMemoryError

Unassigned Alejandro León Mora Minor Closed Not A Problem  
Improvement TIKA-1383

Simplify TikeServerCli endpoint setup code

Sergey Beryozkin Sergey Beryozkin Trivial Resolved Fixed  
Bug TIKA-1365

Incorrectly MimeType detection for Apache Lucene web site

Chris A. Mattmann Tien Nguyen Manh Major Resolved Fixed  
Bug TIKA-1309

RTF TextExtractor ignores consecutive linebreaks

Unassigned Aleksandr Dubinsky Major Resolved Fixed  
Bug TIKA-1307

Jenkins Java7 job requires a profile in order to build 'tika-java7' module.

Unassigned Lewis John McGibbney Major Resolved Done  
Improvement TIKA-1286

Adding MS Visio VSDX to mime-types detection

Unassigned Pascal Essiembre Minor Resolved Fixed  
Bug TIKA-1273

old tika-server jar artifact contains no manifest so not able to invoke from shell

Unassigned Lewis John McGibbney Minor Resolved Fixed  
Improvement TIKA-1269

Self-hosted documentation for the JAX-RS Server

Tyler Bui-Palsulich Nick Burch Major Closed Fixed  
Bug TIKA-1079

Word document hits AIOOBE in SummaryExtractor.parseSummaries

Unassigned Michael McCandless Major Resolved Fixed  
Bug TIKA-1072

AIOOBE when handling embedded document in .doc file

Unassigned Michael McCandless Major Resolved Fixed  
Bug TIKA-1028

Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

Unassigned Juha Haaga Major Resolved Fixed  
Bug TIKA-995

XHTMLContentHandler doesn't pass attributes of body element

Unassigned Markus Jelsma Major Reopened Unresolved  
Improvement TIKA-936

encoding of ZipArchiveInputStream

Chris A. Mattmann Shinichiro Abe Major Resolved Fixed  
Improvement TIKA-241

Rar archive support

Unassigned Jan Goyvaerts Minor Resolved Fixed