Labels
is-bug: From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF
workflow-images: From a user's perspective, image handling is the affected feature/workflow
Description
When extracting FlateDecode-compressed grayscale images with one bit per component (/BitsPerComponent 1), the _handle_flate function determines the wrong image mode. As a result, Pillow's Image.frombytes raises ValueError: not enough image data when the data is passed to it.
Although _get_mode_and_invert_color is called before _handle_flate and handles /BitsPerComponent correctly, the mode it returns is overwritten inside _handle_flate (around line 283 of pypdf/_xobj_image_helpers.py) by the result of _get_imagemode, which can be incorrect in this case.
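To illustrate why a wrong mode produces this particular error, here is a minimal sketch that uses only Pillow and no pypdf code. It assumes Pillow is installed; the 16x4 buffer and the helper raster_size are made up for illustration and are not taken from the attached PDF or from pypdf. A 1-bit image packs eight pixels per byte, so the same buffer that fills mode "1" is far too short for an 8-bit mode such as "L".

```python
from PIL import Image

WIDTH, HEIGHT = 16, 4


def raster_size(bits_per_pixel: int, width: int, height: int) -> int:
    """Bytes needed for a single-channel raster whose rows are padded to whole bytes."""
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height


# Hypothetical decoded stream: 1 bit per pixel, 2 bytes per row, 4 rows = 8 bytes.
packed = bytes([0b10101010, 0b01010101]) * HEIGHT

# Mode "1" matches the packed layout, so this succeeds.
one_bit = Image.frombytes("1", (WIDTH, HEIGHT), packed)

# Mode "L" expects one byte per pixel (64 bytes here), so the same data is too short.
error_message = None
try:
    Image.frombytes("L", (WIDTH, HEIGHT), packed)
except ValueError as exc:
    error_message = str(exc)

print(raster_size(1, WIDTH, HEIGHT), raster_size(8, WIDTH, HEIGHT), error_message)
```

If the mode chosen inside _handle_flate assumes more bits per pixel than the stream actually contains, the same mismatch would be expected to occur with the real image data.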
Environment
Which environment were you using when you encountered the problem?
$ python -m platform
macOS-26.0.1-x86_64-i386-64bit
$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==6.4.0, crypt_provider=('cryptography', '44.0.1'), PIL=12.0.0
Code + PDF
This is a minimal, complete example that shows the issue:
from pypdf import PdfReader

reader = PdfReader("pypdf_bug_3534_iccbased.pdf")
page = reader.pages[0]
for image in page.images:
    img = image.image  # ValueError: not enough image data
pypdf_bug_3534_iccbased.pdf (this file can be added to tests)
Traceback
This is the complete traceback I see:
Traceback (most recent call last):
  File "pypdf/_page.py", line 473, in __iter__
    yield self[i]
  File "pypdf/_page.py", line 469, in __getitem__
    return self.get_function(lst[index])
  File "pypdf/_page.py", line 654, in _get_image
    imgd = _xobj_to_image(cast(DictionaryObject, xobjs[id]))
  File "pypdf/filters.py", line 891, in _xobj_to_image
    img, image_format, extension, _ = _handle_flate(
  File "pypdf/_xobj_image_helpers.py", line 285, in _handle_flate
    img = Image.frombytes(mode2, size, data)  # reloaded as mode may have changed
  File "site-packages/PIL/Image.py", line 3144, in frombytes
    im.frombytes(data, decoder_name, decoder_args)
  File "site-packages/PIL/Image.py", line 868, in frombytes
ValueError: not enough image data