ASF JIRA

Tika
1.13
Key descending
1-70 of 70 as at: 08/May/24 21:57
Type Key Summary Assignee Reporter Priority Status Resolution
Bug TIKA-2326

java.lang.OutOfMemoryError: Java heap space

Unassigned Md Major Closed Fixed  
Bug TIKA-2172

can not read Arabic file

Unassigned Ahmad Sawalhah Major Resolved Won't Fix  
Bug TIKA-2143

POI deprecated method used in TIKA 1.13

Unassigned sbathrutheen Major Open Unresolved  
Bug TIKA-1967

Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@10b8c32

Unassigned kostali Major Resolved Not A Problem  
Improvement TIKA-1965

Added types to Grobid quantities parser

Dave Meikle Can Menekse Minor Resolved Fixed  
Bug TIKA-1961

OutOfMemory when parsing shapes xml from xlsx files with multi-byte Unicode characters

Tim Allison Andrei Rebegea Major Closed Fixed  
Bug TIKA-1960

Put legacy language detection code back into 1.x=trunk

Tim Allison Tim Allison Blocker Resolved Fixed  
Improvement TIKA-1959

Upgrade to PDFBox 2.0.1/JempBox 1.8.12

Unassigned Tim Allison Minor Closed Fixed  
Bug TIKA-1956

NPE in WordParser when trying to getPicOffset

Tim Allison Ramit Wadhwa Major Resolved Fixed  
Improvement TIKA-1955

MIME types updates and additions for Scientific Data based on TREC-DD-Polar

Chris A. Mattmann Chris A. Mattmann Major Resolved Fixed  
Task TIKA-1950

Clean up jdom version conflict

Unassigned Tim Allison Minor Resolved Fixed  
Improvement TIKA-1949

Upgrade to Commons Compress 1.11

Tim Allison Nick Burch Major Resolved Fixed  
Improvement TIKA-1948

Catch exceptions per page in PDFParser

Tim Allison Tim Allison Minor Resolved Fixed  
Improvement TIKA-1944

Add mime magic for apple single/double files

Unassigned Nick C Minor Resolved Fixed  
Improvement TIKA-1943

Include support for Yandex Translate API

Chris A. Mattmann Mark Duske Minor Resolved Fixed  
Task TIKA-1939

Preparation for Tika 1.13 release

Unassigned Tim Allison Major Resolved Fixed  
Bug TIKA-1937

LinkContentHandler skips script tags

Unassigned Joseph Naegele Major Resolved Fixed  
Sub-task TIKA-1935

TIKA-1936 ISArchiveParser not releasing resources

Tim Allison Tim Allison Trivial Resolved Fixed  
Sub-task TIKA-1934

TIKA-1936 GeographicInformationParserTest leaving behind temp file in trunk

Tim Allison Tim Allison Trivial Resolved Fixed  
Sub-task TIKA-1933

TIKA-1936 ForkParser leaves tmp jars behind on Windows (at least)

Unassigned Tim Allison Trivial Resolved Fixed  
Sub-task TIKA-1932

TIKA-1936 Clear resources in ParserDecorator

Tim Allison Tim Allison Trivial Resolved Fixed  
Bug TIKA-1927

NPE in JDBCTableReader

Tim Allison Nick C Minor Resolved Fixed  
Bug TIKA-1926

JSON TEI Exception

Chris A. Mattmann Ayesha Hasan Minor Resolved Fixed  
Task TIKA-1924

Upgrade com.googlecode.mp4parser's isoparser to 1.1.18

Unassigned Tim Allison Trivial Resolved Fixed  
Improvement TIKA-1918

Shouldn't have to specify outputSuffix in tika-batch

Tim Allison Tim Allison Trivial Resolved Fixed  
Improvement TIKA-1917

Just a quick fix to allow NLTK Parser extract measurement information from text

Chris A. Mattmann Manali Shah Major Resolved Fixed  
Bug TIKA-1916

NPE in OpenDocumentParser

Tim Allison Nick C Trivial Resolved Fixed  
Bug TIKA-1914

ExecutableParser doesn't call start document

Unassigned Nick C Trivial Resolved Fixed  
New Feature TIKA-1913

Integrate MIT Information Extraction(MITIE) into Tika to perform Named Entity Recognition

Chris A. Mattmann Manali Shah Major Resolved Fixed  
Bug TIKA-1906

ExternalParser No Longer Supports Commands in Array Format

Ray Gauss II Ray Gauss II Major Resolved Fixed  
Bug TIKA-1898

backslashes in mime-type : application/vnd.mif are wrong

Unassigned Steffen Netz Minor Resolved Fixed  
Task TIKA-1895

Upgrade to POI 3.15-beta1 when available

Unassigned Tim Allison Major Resolved Fixed  
New Feature TIKA-1894

Add XMPMM metadata extraction to JempboxExtractor

Unassigned Tim Allison Minor Resolved Fixed  
Improvement TIKA-1893

Add new mimetype for *.icns (Apple Icon Image Format) files

Unassigned Manisha Kampasi Minor Resolved Fixed  
Improvement TIKA-1892

Mime Magic for application/x-mobipocket-ebook and application/x-shapefile

Unassigned Suman Kashyap Minor Resolved Fixed  
Improvement TIKA-1890

Update mimetype for application/vnd.ms-cab-compressed

Unassigned Ajay Kumar Loganathan Ravichandran Major Resolved Fixed  
Sub-task TIKA-1888

TIKA-1955 Update mimetype for application/x-netcdf

Unassigned Ajay Kumar Loganathan Ravichandran Major Resolved Fixed  
Sub-task TIKA-1886

TIKA-1955 Updating tika-mimetypes.xml to detect .hfa files

Chris A. Mattmann Nandan Chandrashekar Minor Resolved Fixed  
Sub-task TIKA-1885

TIKA-1955 Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

Chris A. Mattmann Adesh Gupta Critical Resolved Fixed  
Sub-task TIKA-1882

TIKA-1955 Scientific MIME updates to .cab files, .xar and .mobi and .mov files based on TREC-DD-Polar analysis

Chris A. Mattmann Manisha Kampasi Minor Resolved Fixed  
Sub-task TIKA-1881

TIKA-1955 Updates to MIME types for Postscript, WordPerfect, image and RSS based on Polar analysis

Chris A. Mattmann Namitha Sanjeeva Ganiga Minor Resolved Fixed  
Improvement TIKA-1878

Upgrade Apache SIS 0.6

Unassigned Hendy Irawan Trivial Resolved Fixed  
Bug TIKA-1877

On updating the tika-mimetypes.xml to detect .fts file format, tika detector does not return anything

Unassigned Prasad Nagaraj Subramanya Minor Resolved Fixed  
New Feature TIKA-1876

Integrate Natural Language Toolkit (NLTK) into Tika to perform Named Entity Recognition

Chris A. Mattmann Manali Shah Major Resolved Fixed  
Improvement TIKA-1875

Updating tika-mimetypes.xml to detect .NC files

Unassigned Prasad Nagaraj Subramanya Minor Resolved Fixed  
Improvement TIKA-1872

Backport tika-langdetect from 2.x branch to 1.13 branch

Chris A. Mattmann Trevor Lewis Major Resolved Fixed  
Task TIKA-1871

Update Tika JAXRS wiki page with the info about multipart/form-data

Sergey Beryozkin Sergey Beryozkin Major Resolved Fixed  
Bug TIKA-1870

Relocating RichTextContentHandler into tika-core from tika-server

Unassigned John Patrick Major Resolved Fixed  
Bug TIKA-1869

Jackson update to latest version

Unassigned John Patrick Major Resolved Fixed  
Bug TIKA-1868

create clean tika-server jar and shaded classifier jar

Unassigned John Patrick Major Closed Won't Fix  
Bug TIKA-1866

Out of memory error on Word document

Unassigned Shawn Johnson Major Resolved Fixed  
Bug TIKA-1862

Exception in thread "Thread-9" java.lang.UnsatisfiedLinkError: /usr/lib/jvm/jre/lib/amd64/headless/libmawt.so: libcups.so.2: cannot open shared object file: No such file or directory

Unassigned Avinash Major Resolved Invalid  
Improvement TIKA-1861

Upgrade to sqlite-jdbc 3.8.11.2

Unassigned Tim Allison Trivial Resolved Fixed  
Improvement TIKA-1857

Enhance PDFParser to extract text from XFA forms

Unassigned Pascal Essiembre Major Resolved Fixed  
Bug TIKA-1856

Error while parsing an ogg file

Unassigned Yash Tanna Major Resolved Fixed  
Task TIKA-1846

Set up Hudson (or similar?) with new Git repo

Lewis John McGibbney Tim Allison Major Resolved Fixed  
Bug TIKA-1845

Unable to extract content from certain RTFs using tika-server versions since 1.5

Tim Allison Ian Williams Major Resolved Fixed  
Bug TIKA-1844

PooledTimeSeriesParser takes precedence over MP4Parser

Unassigned Tim Allison Minor Resolved Fixed  
Bug TIKA-1836

Convertion DOC->TXT failed due to POI issue

Unassigned Jorge Spinsanti Major Resolved Fixed  
Improvement TIKA-1830

Upgrade to PDFBox 1.8.11 when available

Tim Allison Tim Allison Major Closed Fixed  
Improvement TIKA-1823

Support detecting DWF format

Unassigned Luca Moretti Minor Resolved Fixed  
Bug TIKA-1821

Problem in Tika().detect for xml file signed in CADES

Unassigned Alessandro De Angelis Major Resolved Fixed  
Bug TIKA-1816

Lenient testing for NamedEntityParser

Unassigned Thamme Gowda Major Resolved Fixed  
Bug TIKA-1801

Integrate MITIE Named Entity Recognition support

Chris A. Mattmann Chris A. Mattmann Major Resolved Duplicate  
Improvement TIKA-1723

Integrate language-detector into Tika

Kenneth William Krugler Kenneth William Krugler Minor Resolved Fixed  
New Feature TIKA-1696

Language Identification with Text Processing Toolkit from MITLL

Chris A. Mattmann Paul Ramirez Major Resolved Fixed  
Improvement TIKA-1657

Allow easier XML serialization of TikaConfig

Unassigned Tim Allison Minor Resolved Fixed  
Bug TIKA-1473

Apache Tika is not working for .docx documents

Unassigned Franco Catto Major Resolved Fixed  
Improvement TIKA-1435

Update rome dependency to 1.5

Chris A. Mattmann aoeu Minor Resolved Fixed  
Improvement TIKA-1285

Upgrade to PDFBox 2.0.0 when available

Unassigned Jeremy Anderson Major Closed Fixed