get_page_images(): only return images actually referenced by the given page #4961

Open
andreasntr wants to merge 2 commits into pymupdf:main from
andreasntr:fix-page-images-duplication

@andreasntr

As per Document.get_page_images() docs:

this is not the list of images that are actually displayed.

This fix makes get_page_images() return only the images actually referenced by the given page, by filtering on the xrefs returned by Page.get_image_info(xrefs=True).

The list of image xrefs for each page is saved at Document creation time; filtering is performed only when get_page_images is invoked on the document.
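
For readers who want the same behaviour without this change, the filtering idea can be sketched in user code. This is only an illustration of the workaround described above, not the code in this PR: the helper names `filter_referenced_images` and `referenced_page_images` are hypothetical, and the sketch assumes that each get_page_images() entry is a tuple starting with the xref and that get_image_info(xrefs=True) reports an "xref" of 0 for inline or unidentified images.

```python
def filter_referenced_images(page_images, image_info):
    """Keep only the get_page_images() entries whose xref appears in image_info.

    page_images: list of tuples as returned by Document.get_page_images(),
                 where the first item of each tuple is the image xref.
    image_info:  list of dicts as returned by Page.get_image_info(xrefs=True).
    """
    # Assumption: an xref of 0 marks an inline or unidentified image, so skip it.
    shown = {info["xref"] for info in image_info if info.get("xref", 0) > 0}
    # Preserve the original order of the get_page_images() entries.
    return [item for item in page_images if item[0] in shown]


def referenced_page_images(doc, pno):
    """Hypothetical wrapper; `doc` is assumed to be an open PyMuPDF Document."""
    page_images = doc.get_page_images(pno)
    image_info = doc[pno].get_image_info(xrefs=True)
    return filter_referenced_images(page_images, image_info)
```

Unlike the approach described above, this sketch calls get_image_info() on every invocation instead of caching the per-page xrefs at Document creation time, so it trades speed for simplicity.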

Request

I was able to write this code because I used this kind of workaround in a project of mine based on PyMuPDF. However, I'm not able to fully test it because the process of creating the test environment is not clear (pyproject.toml is empty, for example). I'll be happy to test it once instructions are provided.

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@andreasntr
Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Apr 3, 2026