| TIKA-873 | Tika --extract fails for DOC | Unassigned | Albert L. | Resolved | Fixed |
| TIKA-816 | (XLS/XLSX) Improperly formatted date/time in text content. | Unassigned | Albert L. | Resolved | Fixed |
| TIKA-847 | Add regular expression support to the MagicDetector | Jukka Zitting | Andrew Jackson | Resolved | Fixed |
| TIKA-900 | Tika fails to detect ISO9660 disk images | Jukka Zitting | Andrew Jackson | Resolved | Fixed |
| TIKA-863 | MailContentHandler should not create AutoDetectParser on each call | Unassigned | Andrzej Bialecki | Resolved | Fixed |
| TIKA-561 | Support EMLX file detection | Jukka Zitting | Antoni Mylka | Resolved | Fixed |
| TIKA-945 | Upgrade tika-server to CXF 2.6.1 | Chris A. Mattmann | Chris A. Mattmann | Resolved | Fixed |
| TIKA-892 | Tika does not use the HTML5 meta charset tag when determining charset | Jukka Zitting | Chris Jones | Resolved | Fixed |
| TIKA-877 | Embedded document not extracted (regression) | Maxim Valyanskiy | Daniel Bonniot de Ruisselet | Resolved | Fixed |
| TIKA-939 | Windows Media Video file detected as Windows Media Audio | Unassigned | Emil Burzo | Resolved | Fixed |
| TIKA-431 | Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. | Jukka Zitting | Erik Hetzner | Resolved | Fixed |
| TIKA-923 | iWork keynote content on master slides are not being parsed | Michael McCandless | Erik Peterson | Resolved | Fixed |
| TIKA-924 | iWork number table titles not being parsed | Michael McCandless | Erik Peterson | Resolved | Fixed |
| TIKA-876 | Signed pdf parsing | Jukka Zitting | Fausto Cruzeiro de Moraes | Resolved | Fixed |
| TIKA-910 | Text contained in text boxes or shapes in Keynote docs runs together | Michael McCandless | Gabriel Valencia | Resolved | Fixed |
| TIKA-907 | Comments embedded in Pages documents not supported | Unassigned | Gabriel Valencia | Resolved | Fixed |
| TIKA-906 | Headers, footers, and footnotes not extracted from Pages documents | Unassigned | Gabriel Valencia | Resolved | Fixed |
| TIKA-905 | Embedded text boxes and shapes with text not supported | Unassigned | Gabriel Valencia | Resolved | Duplicate |
| TIKA-904 | Pages documents created in Layout mode not supported | Michael McCandless | Gabriel Valencia | Resolved | Fixed |
| TIKA-834 | server problem only 1st result is correct additional runs include data from 1st run | Jukka Zitting | George Kappel | Resolved | Fixed |
| TIKA-901 | Provide version number in tika-server | Chris A. Mattmann | Ingo Renner | Resolved | Fixed |
| TIKA-943 | Add parameter to tika-app to supply password for decryption | Jukka Zitting | Jan Høydahl | Resolved | Fixed |
| TIKA-827 | ForkServer fails to report issues if an exception is not properly serializable | Unassigned | Jerome Lacoste | Resolved | Fixed |
| TIKA-832 | ForkParser is unfriendly to code that prints things to its output | Jukka Zitting | Jerome Lacoste | Resolved | Fixed |
| TIKA-935 | TikaException thrown when trying to parse archive (*.ar) files | Chris A. Mattmann | John Mastarone | Resolved | Fixed |
| TIKA-853 | java.io.IOException with TikaGUI and testMP4.m4a | Unassigned | John Mastarone | Resolved | Fixed |
| TIKA-896 | OSGi deployment without declarative services | Jukka Zitting | Jörg Ehrlich | Resolved | Fixed |
| TIKA-908 | Adding XMP specification part one namespaces and properties | Jukka Zitting | Jörg Ehrlich | Resolved | Fixed |
| TIKA-507 | Parser for font files | Unassigned | Jukka Zitting | Resolved | Fixed |
| TIKA-884 | Dynamic loading of Parser and Detector services | Jukka Zitting | Jukka Zitting | Resolved | Fixed |
| TIKA-951 | Bundle activation policy for Eclipse | Jukka Zitting | Jukka Zitting | Resolved | Fixed |
| TIKA-593 | Tika network server | Chris A. Mattmann | Jukka Zitting | Resolved | Fixed |
| TIKA-932 | Upgrade to Commons Compress 1.4.1 | Jukka Zitting | Jukka Zitting | Resolved | Fixed |
| TIKA-322 | Improve encoding detection speed and accuracy | Jukka Zitting | Jukka Zitting | Resolved | Fixed |
| TIKA-471 | Avoid Charset name bottleneck when multiple threads are using HtmlParser | Jukka Zitting | Kenneth William Krugler | Resolved | Fixed |
| TIKA-502 | Add programming language mime-types | Jukka Zitting | Kenneth William Krugler | Resolved | Fixed |
| TIKA-941 | Detecting KML / KMZ files | Jukka Zitting | Marco Quaranta | Resolved | Fixed |
| TIKA-940 | Support detecting 7-zip format | Unassigned | Marco Quaranta | Resolved | Fixed |
| TIKA-883 | Extract embedded images in PPT | Maxim Valyanskiy | Maxim Valyanskiy | Resolved | Fixed |
| TIKA-882 | IllegalArgumentException: No part found for relationship | Maxim Valyanskiy | Maxim Valyanskiy | Resolved | Fixed |
| TIKA-931 | Tika's PDFParser fails to parse documents embedded in a PDF Package | Jukka Zitting | Michael McCandless | Resolved | Fixed |
| TIKA-757 | Address TODOs when we upgrade to next POI release (3.8 beta 5) | Unassigned | Michael McCandless | Resolved | Fixed |
| TIKA-948 | Embedded PDF extracted incorrectly as MS Works file from Word 97-2003 doc | Michael McCandless | Michael McCandless | Resolved | Fixed |
| TIKA-929 | Consistent, namespaced definitions for office file related metadata | Jukka Zitting | Nick Burch | Resolved | Fixed |
| TIKA-890 | Improve detection of Android Packages (APK) | Unassigned | Nick Burch | Resolved | Fixed |
| TIKA-886 | OOXMLExtractorFactory can leave files open | Nick Burch | Nick Burch | Resolved | Fixed |
| TIKA-949 | Mimetype magic needed for mapping formats such as XMind Pro and MindMapper | Nick Burch | Nick Burch | Resolved | Fixed |
| TIKA-700 | Upgrade to POI 3.8 as available | Nick Burch | Nick Burch | Resolved | Fixed |
| TIKA-747 | Ogg Vorbis and FLAC Parsers | Nick Burch | Nick Burch | Resolved | Fixed |
| TIKA-875 | Temporary file leak in ImageParser | Michael McCandless | Niels Beekman | Resolved | Fixed |
| TIKA-874 | Identify FITS (Flexible Image Transport System) files | Chris A. Mattmann | Peter May | Resolved | Fixed |
| TIKA-930 | Consolidation of Some Tika Core Properties | Unassigned | Ray Gauss II | Resolved | Fixed |
| TIKA-927 | Composite Properties | Unassigned | Ray Gauss II | Resolved | Fixed |
| TIKA-926 | Data Typed Metadata.set(...) Value Methods Should Call Metadata.set(Property...) | Unassigned | Ray Gauss II | Resolved | Fixed |
| TIKA-925 | Remove DublinCore From Metadata and Deprecate String Properties | Unassigned | Ray Gauss II | Resolved | Fixed |
| TIKA-842 | IPTC Properties Should be Defined Completely and Independently of the Drew Library | Unassigned | Ray Gauss II | Resolved | Fixed |
| TIKA-859 | DublinCore Metadata Keys Should be Prefixed and Property Objects | Unassigned | Ray Gauss II | Resolved | Fixed |
| TIKA-947 | AbstractMetadataHandler addMetadata Does not Check Property.isMultiValuePermitted | Unassigned | Ray Gauss II | Resolved | Fixed |
| TIKA-916 | NullPointerException processing XPS file | Unassigned | Rob Tulloh | Resolved | Fixed |
| TIKA-861 | Parse links in PDF | Unassigned | Sasha Goodman | Resolved | Fixed |
| TIKA-870 | Allow to use call parseToString with a additional parameter of MaxStringLength, so it can be changed per call | Michael McCandless | Shay Banon | Resolved | Fixed |
| TIKA-482 | Refactor image and jpeg parsers for access to MetadataExtractor API | Unassigned | Staffan Olsson | Resolved | Fixed |
| TIKA-913 | MagicMime detection of msdos executables does not work | Unassigned | Torsten Krah | Resolved | Fixed |
| TIKA-355 | DublinCore constants should be prefixed with "dc." | Unassigned | Vivek Magotra | Resolved | Fixed |