ASF JIRA

Tika
2.5.0
Key descending
1–42 of 42 as at: 25/Apr/24 07:17
Type Patch Info Key Summary Assignee Reporter Priority Status Resolution Created Updated Due Development
Improvement TIKA-3890

Identifying an efficient approach for getting page count prior to running an extraction

Unassigned Ethan Wilansky Blocker Closed Fixed  
Improvement TIKA-3880

Tika not picking-up setByteArrayMaxOverride from tika-config

Unassigned Ethan Wilansky Blocker Closed Resolved  
Task TIKA-3867

Add pipes reporter that updates stats in a file on disk

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-3866

Update to PDFBox 2.0.27

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-3865

Add a composite PipesReporter

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-3863

Add a pipes reporter for OpenSearch

Unassigned Tim Allison Minor Resolved Fixed  
Task TIKA-3860

Pull tesseract 5 in our docker image for the next 2.x release

Unassigned Tim Allison Major Resolved Fixed  
Bug TIKA-3859

Wrong filename glob for Zstandard

Tim Allison Robin Schimpf Major Resolved Fixed  
Task TIKA-3856

Upgrade to jempbox 1.8.17

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-3855

Implement upsert for OpenSearch emitter

Unassigned Tim Allison Major Resolved Fixed  
Task TIKA-3854

Bump main's development version to 2.5.0-SNAPSHOT

Unassigned Tim Allison Trivial Resolved Fixed  
Wish TIKA-3853

Enable configuring digests via autodetectparserconfig

Unassigned Tim Allison Major Resolved Fixed  
Wish TIKA-3852

Extract signature info from PDFs

Unassigned Tim Allison Minor Resolved Fixed  
Wish TIKA-3851

Add detection for e57

Unassigned Tim Allison Trivial Resolved Fixed  
Wish TIKA-3849

Throw UnsupportedFormatException or similar for really old mdb files

Unassigned Tim Allison Minor Resolved Fixed  
Bug TIKA-3848

IllegalArgumentException in DBFColumnHeader.setType()

Unassigned Tilman Hausherr Major Resolved Fixed  
Bug TIKA-3847

NullPointerException when processing pdf document (Allow proceed on RuntimeException)

Tilman Hausherr Yurii Major Resolved Fixed  
Improvement TIKA-3846

Improve JDBC emitter to handle attachments and batch updates

Unassigned Tim Allison Trivial Resolved Fixed  
Improvement TIKA-3845

Add a callable wrapper for the pipesiterator

Unassigned Tim Allison Trivial Resolved Fixed  
Improvement TIKA-3844

Improve extraction of PDF subset info

Unassigned Tim Allison Minor Resolved Fixed  
Improvement TIKA-3843

use commons-io byte array streams

Unassigned PJ Fanning Major Resolved Fixed  
Improvement TIKA-3842

Revert slf4j core back to 1.x?

Unassigned Tim Allison Minor Resolved Fixed  
Improvement TIKA-3840

Add extraction of ODF version

Unassigned Tim Allison Minor Resolved Fixed  
Bug TIKA-3839

Property com.ctc.wstx.maxEntityCount is not supported

Unassigned Lakatos Gyula Minor Resolved Fixed  
New Feature TIKA-3836

Add initial jdbc emitter

Unassigned Tim Allison Minor Resolved Fixed  
Bug TIKA-3833

bzip2 MIME type is detected as bzip instead when using tika-core

Unassigned Eduardas Kazakas Major Resolved Fixed  
Bug TIKA-3832

Required array length is too large (OOM) error when reading a PDF file

Unassigned Lakatos Gyula Major Resolved Fixed  
Task TIKA-3831

Allow for retries in S3Fetcher

Unassigned Tim Allison Trivial Resolved Fixed  
Improvement TIKA-3825

ForkParser allow shutdown

Unassigned Ben Gilbert Major Resolved Fixed  
Task TIKA-3824

RegexCaptureParser should add metadata items, not set

Unassigned Tim Allison Trivial Resolved Fixed  
New Feature TIKA-3820

Kafka Tika Pipes Support

Unassigned Nicholas DiPiazza Major Resolved Fixed  
Task TIKA-3818

Remove pdfdebugger from tika (2)

Tilman Hausherr Tilman Hausherr Trivial Resolved Fixed  
Bug TIKA-3815

Inconsistent Date/Time information extracted from Exif data

Luís Filipe Nassif Luís Filipe Nassif Major Resolved Fixed  
Bug TIKA-3812

Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

Unassigned Eugen Caruntu Minor Resolved Fixed  
Bug TIKA-3810

Vtt file (encoding UTF-8 with BOM) seen as text/plain

Unassigned Giorgiana Ciobanu Major Resolved Fixed  
Task TIKA-3804

Improve configurability of renderers in the PDFParser

Tim Allison Tim Allison Major Resolved Fixed  
Task TIKA-3800

Consider wrapping 'unrar' commandline executable as a parser to handle rar v5

Unassigned Tim Allison Minor Resolved Fixed  
Task TIKA-3799

Refactor FuzzingCLI to use PipesParser

Unassigned Tim Allison Major Resolved Fixed  
Bug TIKA-3796

IncludeHeadersAndFooters is not being passed through via tika-config to the MSOffice parser

Tim Allison Tim Allison Minor Resolved Fixed  
Improvement TIKA-3795

General upgrades for 2.5.0

Unassigned Tilman Hausherr Minor Resolved Fixed  
Bug TIKA-3794

ocrImageType is not configurable via headers in tika-server

Tim Allison Tim Allison Minor Resolved Fixed  
Improvement TIKA-3767

Use junit's @TempDir where possible

Unassigned Tim Allison Minor Resolved Fixed