| TIKA-1597 | RTF with embedded image parsing produces div before html | Unassigned | Konstantin Gribov | Closed | Fixed |
| TIKA-1596 | tika-cli logging cleanup | Konstantin Gribov | Konstantin Gribov | Closed | Fixed |
| TIKA-1591 | Tika Parsers uses wrong version of bouncycastle | Konstantin Gribov | Java Developer | Resolved | Fixed |
| TIKA-1590 | A particular PDF seems to trigger an infinite loop when being converted to HTML | Unassigned | Matt Sheppard | Closed | Duplicate |
| TIKA-1589 | Mp3 parser does not add duration to metadata if there are no ID3 tags | Unassigned | Max Daniline | Resolved | Fixed |
| TIKA-1587 | ForkParser::setJavaCommand should take List<String> | Konstantin Gribov | Oleg Oshmyan | Resolved | Fixed |
| TIKA-1584 | Tika 1.7 possible regression (nested attachment files not getting parsed) | Tim Allison | Rob Tulloh | Resolved | Fixed |
| TIKA-1581 | jhighlight license concerns | Unassigned | Karl Wright | Resolved | Fixed |
| TIKA-1580 | ISA-Tab parsers | Chris A. Mattmann | Giuseppe Totaro | Resolved | Fixed |
| TIKA-1578 | Add file type description to HDFParsers | Ann Burgess | Ann Burgess | Resolved | Fixed |
| TIKA-1576 | Upgrade metadata-extractor to version 2.7.2 | Tyler Bui-Palsulich | Tyler Bui-Palsulich | Resolved | Fixed |
| TIKA-1575 | Upgrade to PDFBox 1.8.9 when available | Unassigned | Tim Allison | Closed | Fixed |
| TIKA-1572 | Utility script for pushing 3rd Party UCAR Dependencies to Maven Central | Lewis John McGibbney | Lewis John McGibbney | Resolved | Fixed |
| TIKA-1571 | Upgrade UCAR dependencies to 4.5.5 | Lewis John McGibbney | Lewis John McGibbney | Resolved | Fixed |
| TIKA-1569 | Doc typo Mime Magic Detction | Unassigned | Hari Sekhon | Resolved | Fixed |
| TIKA-1567 | WelcomeResource in TikaServer doesn't print PathParam prefix | Chris A. Mattmann | Chris A. Mattmann | Resolved | Fixed |
| TIKA-1565 | image/gif parse error | Tyler Bui-Palsulich | lixin | Resolved | Fixed |
| TIKA-1563 | Use .gz as the default extension for application/gzip | Unassigned | Adam Lamar | Resolved | Fixed |
| TIKA-1561 | GCMD Directory Interchange Format (.dif) identification | Chris A. Mattmann | Luke sh | Resolved | Fixed |
| TIKA-1558 | Create a Parser Blacklist | Tyler Bui-Palsulich | Tyler Bui-Palsulich | Resolved | Fixed |
| TIKA-1556 | Clean up whitespace in tika-server | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-1554 | Improve EMF file detection | Chris A. Mattmann | Luís Filipe Nassif | Closed | Fixed |
| TIKA-1549 | Two times speed increase of language profile distance calculation | Chris A. Mattmann | Toke Eskildsen | Resolved | Fixed |
| TIKA-1548 | System property added while catching exception on parsing PDF encrypted doc | Unassigned | David Pilato | Resolved | Fixed |
| TIKA-1547 | Use POST for tika-server form resources | Tyler Bui-Palsulich | Tyler Bui-Palsulich | Resolved | Fixed |
| TIKA-1544 | empty lines are not preserved | Unassigned | mortee | Resolved | Fixed |
| TIKA-1543 | TesseractOCRParser.setTesseractPath() doesn't work on Linux | Unassigned | Sean Zhao | Closed | Fixed |
| TIKA-1542 | Substitute Apache TTF test file for current non-Apache friendly file | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-1541 | StringsParser: a simple strings-based parser for Tika | Chris A. Mattmann | Giuseppe Totaro | Resolved | Fixed |
| TIKA-1539 | GRB file magic bytes and extension matching | Chris A. Mattmann | Luke sh | Resolved | Fixed |
| TIKA-1537 | Installation on OSX 10.10.2 generates OutOfMemory Error during parser tests | Chris A. Mattmann | Andrew Hwang | Resolved | Fixed |
| TIKA-1530 | MP4Parser parses duration but does not set it | Chris A. Mattmann | Oskar Wickström | Resolved | Fixed |
| TIKA-1521 | Handle password protected 7zip files | Unassigned | Nick Burch | Resolved | Fixed |
| TIKA-1519 | Don't allow whatever is in http-equiv Content-Type to overwrite actual Content-Type in HtmlParser | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-1514 | http-equiv content-type extraction should pick first parseable content value | Unassigned | Tim Allison | Closed | Won't Fix |
| TIKA-1512 | WordParser fails on many Word files | Unassigned | F Seid | Resolved | Fixed |
| TIKA-1511 | Create a parser for SQLite3 | Unassigned | Luís Filipe Nassif | Resolved | Fixed |
| TIKA-1489 | PDF Text extraction without permission | Unassigned | Tilman Hausherr | Resolved | Fixed |
| TIKA-1483 | Create a Latin1 charset raw string parser | Chris A. Mattmann | Luís Filipe Nassif | Resolved | Fixed |
| TIKA-1482 | ForkParser throws exceptions when process some large pdf files | Unassigned | Sean Zhao | Closed | Fixed |
| TIKA-1423 | Build a parser to extract data from GRIB formats | Vineet Ghatge | Vineet Ghatge | Resolved | Fixed |
| TIKA-1416 | Refactor Translator Exception Handling | Unassigned | Tyler Bui-Palsulich | Resolved | Fixed |
| TIKA-1408 | Fix version for tikadotnet to be tracked along with trunk and release version | Chris A. Mattmann | Chris A. Mattmann | Resolved | Fixed |
| TIKA-1388 | Tika IOUtils java.lang.OutOfMemoryError | Unassigned | Alejandro León Mora | Closed | Not A Problem |
| TIKA-1383 | Simplify TikeServerCli endpoint setup code | Sergey Beryozkin | Sergey Beryozkin | Resolved | Fixed |
| TIKA-1365 | Incorrectly MimeType detection for Apache Lucene web site | Chris A. Mattmann | Tien Nguyen Manh | Resolved | Fixed |
| TIKA-1309 | RTF TextExtractor ignores consecutive linebreaks | Unassigned | Aleksandr Dubinsky | Resolved | Fixed |
| TIKA-1307 | Jenkins Java7 job requires a profile in order to build 'tika-java7' module. | Unassigned | Lewis John McGibbney | Resolved | Done |
| TIKA-1286 | Adding MS Visio VSDX to mime-types detection | Unassigned | Pascal Essiembre | Resolved | Fixed |
| TIKA-1273 | old tika-server jar artifact contains no manifest so not able to invoke from shell | Unassigned | Lewis John McGibbney | Resolved | Fixed |
| TIKA-1269 | Self-hosted documentation for the JAX-RS Server | Tyler Bui-Palsulich | Nick Burch | Closed | Fixed |
| TIKA-1079 | Word document hits AIOOBE in SummaryExtractor.parseSummaries | Unassigned | Michael McCandless | Resolved | Fixed |
| TIKA-1072 | AIOOBE when handling embedded document in .doc file | Unassigned | Michael McCandless | Resolved | Fixed |
| TIKA-1028 | Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment. | Unassigned | Juha Haaga | Resolved | Fixed |
| TIKA-995 | XHTMLContentHandler doesn't pass attributes of body element | Unassigned | Markus Jelsma | Reopened | Unresolved |
| TIKA-936 | encoding of ZipArchiveInputStream | Chris A. Mattmann | Shinichiro Abe | Resolved | Fixed |
| TIKA-241 | Rar archive support | Unassigned | Jan Goyvaerts | Resolved | Fixed |