| Key | Summary | Assignee | Reporter | Status | Resolution |
|---|---|---|---|---|---|
| TIKA-3198 | Extracting PPT with chart gives Excel in which data is missing | Unassigned | sagar | Open | Unresolved |
| TIKA-2800 | Include num of unique common/alphabetic tokens (types) in tika-eval | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2799 | Consider reverting jackcess | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2798 | Consider reverting junrar | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2795 | Error starting Tika 2.0 server with -spawnChild on Ubuntu | Tim Allison | Mario Bisonti | Resolved | Fixed |
| TIKA-2791 | Add structure tags to tika-eval | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2788 | Upgrade to PDFBox 2.0.13 when available | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2785 | Switch parent/child IPC to mmap file from stdout/stderr in tika-server | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2784 | Add static grabbing of stdout/err to MockParser | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2782 | Protect IPC via stdout in child process in tika-server in -spawnChild mode | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2780 | Intermittent failures in batch mode when STDIN = /dev/null | Tim Allison | Jeroen | Resolved | Fixed |
| TIKA-2779 | Integrate/parameterize new rotated text handling in PDFBox | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2778 | Upgrade jaxb-runtime and javax.activation | Tim Allison | Hans Brende | Resolved | Fixed |
| TIKA-2777 | Unbounded regex in Optimaize can lead to really, really slow processing | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2776 | Tika server child restart | Tim Allison | Mario Bisonti | Resolved | Fixed |
| TIKA-2773 | Upgrade SQLite to 3.25.2 | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2770 | Convert EnviHeader "map info" from UTM to LatLon | Lewis John McGibbney | Kristen Cheung | Resolved | Fixed |
| TIKA-2764 | Allow configuration to include or exclude deleted text in WordPerfect 6.x files | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2763 | PDFParser - java.io.IOException: Missing root object specification in trailer | Unassigned | Lewis John McGibbney | Resolved | Not A Problem |
| TIKA-2762 | Capture short fields (<150 chars) in EnviParserHeader Metadata | Lewis John McGibbney | Lewis John McGibbney | Resolved | Fixed |
| TIKA-2761 | XML Structured Text Is Missing Metadata Fields for mp3 files | Tim Allison | Nick Sincaglia | Resolved | Fixed |
| TIKA-2760 | LinkContentHandler does not report hyperlinks | Unassigned | Markus Jelsma | Closed | Not A Problem |
| TIKA-2759 | ScriptsExtractor incorrectly reports JavaScript to characters() in SAX ContentHandler | Tim Allison | Markus Jelsma | Resolved | Fixed |
| TIKA-2754 | Log file name in tika-server on exception/error | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2753 | ChildProcess does not use the JAVA_HOME | Tim Allison | Julien Massiera | Resolved | Fixed |
| TIKA-2751 | Upgrade to POI 4.0.1 when available | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2735 | Notes and footer contents are duplicated when extracting text from PowerPoint slides | Tim Allison | feng ye | Resolved | Fixed |
| TIKA-2637 | ParsingReader.read throws exception when no bytes are available | Tim Allison | Boris Petrov | Resolved | Fixed |
| TIKA-2599 | Hyperlink surrounded by italics not closed properly | Dave Meikle | Ronan O'Sullivan | Resolved | Fixed |
| TIKA-2550 | ToTextHandler includes <style/> element content | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2147 | ClassCastException on a valid Word template | Unassigned | Seva Alekseyev | Resolved | Fixed |