@pennersr

With this PR, images are processed using PIL. This adds support for any PIL-supported image format. Additionally, numpy is used to speed up image processing. This PR effectively makes #90 and #117 obsolete.

url='http://code.google.com/p/pyfpdf',
license='LGPLv3+',
download_url="https://github.com/reingart/pyfpdf/tarball/%s" % fpdf.__version__,
install_requires=['numpy>=1.15.4', 'Pillow>=5.3.0'],
Collaborator

Disagree with numpy dependency

Author

@pennersr pennersr Nov 13, 2018

Please have a look here:

re_c = re.compile('(...).'.encode("ascii"), flags=re.DOTALL)

Here, bytes representing RGBA pixel values are processed using regular expressions. This is extremely slow. If you care about performance while generating PDFs, this won't cut it.
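
To make the point concrete, here is a minimal sketch of what a regex-based channel split looks like. Only the `re_c` pattern is taken from the line quoted above; the alpha pattern `re_a`, the function name `split_rgba_regex` and the sample pixels are assumptions for illustration, not the library's actual code.

```python
import re

# Pattern quoted above: capture three bytes (R, G, B), then match one more (A).
# DOTALL lets '.' match every byte value, including b'\n'.
re_c = re.compile('(...).'.encode("ascii"), flags=re.DOTALL)
# Assumed counterpart that keeps only the alpha byte (not quoted in this thread).
re_a = re.compile('...(.)'.encode("ascii"), flags=re.DOTALL)


def split_rgba_regex(rgba):
    """Split packed RGBA bytes into (rgb, alpha) using regex substitution."""
    # Each substitution invokes a Python callback once per pixel.
    rgb = re_c.sub(lambda m: m.group(1), rgba)
    alpha = re_a.sub(lambda m: m.group(1), rgba)
    return rgb, alpha


# Two RGBA pixels; the green value 10 is b'\n', which is why DOTALL matters.
pixels = bytes([255, 0, 0, 128, 0, 10, 0, 64])
rgb, alpha = split_rgba_regex(pixels)
```

The per-pixel callback is the likely cost here: a large image means millions of Python-level function calls.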

have all the deps you want in my fork

Author

Are you referring to this code?:

https://github.com/alexanderankin/pyfpdf/blob/master/fpdf/image_parsing.py#L199

In any case, the way the pixels are being processed (by means of regular expressions separating RGBA into RGB and A) is really, really slow. So, yes, numpy is an additional dependency, but at least image handling is much more performant now.
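
For comparison, a rough sketch of the same RGB/alpha split done with numpy slicing. This is not necessarily the code in this PR; the function name `split_rgba_numpy` and its `width`/`height` parameters are hypothetical, and it assumes the input is tightly packed 8-bit RGBA.

```python
import numpy as np


def split_rgba_numpy(rgba, width, height):
    """Split packed 8-bit RGBA bytes into (rgb, alpha) byte strings."""
    # View the buffer as a (height, width, 4) array without copying it.
    pixels = np.frombuffer(rgba, dtype=np.uint8).reshape(height, width, 4)
    # Slicing selects channels for all pixels at once; tobytes() copies
    # the selection out in row-major order.
    rgb = pixels[:, :, :3].tobytes()
    alpha = pixels[:, :, 3].tobytes()
    return rgb, alpha


# A 2x1 image: two RGBA pixels.
data = bytes([255, 0, 0, 128, 0, 10, 0, 64])
rgb, alpha = split_rgba_numpy(data, width=2, height=1)
```

The work happens inside numpy's compiled loops instead of one Python callback per pixel, which is where the speedup would come from.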

Oh, I'm saying I'll accept such a PR on my fork, that's all. I agree with this move.

Actually, can you help me test it? I copied this bit into my branch. My tests generate a PDF, compute its hash, compare that against the expected hash, and then os.unlink the generated PDF. However, the proposed get_img_info function inserts an extra object and stream for the thumbnail preview, which of course makes the tests fail. Should I just redo all my tests? Thoughts?
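
The test flow described here (generate, hash, compare, unlink) could be sketched roughly as follows. The helper names `file_sha256`, `check_output_hash` and `fake_generate` are hypothetical, and the fake generator stands in for real PDF output; this is not the fork's actual test harness.

```python
import hashlib
import os
import tempfile


def file_sha256(path):
    """Return the hex SHA-256 digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def check_output_hash(generate, expected_hash):
    """Run generate(path), compare the output hash, then delete the file."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    os.close(fd)
    try:
        generate(path)
        return file_sha256(path) == expected_hash
    finally:
        # Always clean up, even if generation or hashing raised.
        os.unlink(path)


def fake_generate(path):
    """Placeholder for a real PDF-producing function."""
    with open(path, "wb") as fh:
        fh.write(b"%PDF-1.3\n% placeholder body\n")


expected = hashlib.sha256(b"%PDF-1.3\n% placeholder body\n").hexdigest()
matches = check_output_hash(fake_generate, expected)
```

Because the comparison is byte-exact, any change to the output, such as an extra object and stream, changes the hash, so the stored expected hashes have to be regenerated after a change like this one.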

I updated mine to 2.0.1 with all these changes, but it broke a lot of my tests :/

Author

I pushed another fix to this branch to properly deal with unsupported extensions. Not sure about the inner workings of the test suite, but it seems to me that due to the Pillow changes the hashes of the resources changed.

>> Tests: 28
Test 1 / 28 : test_cache.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 2 / 28 : test_corebox.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 3 / 28 : test_e1252.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 4 / 28 : test_imgmask.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 5 / 28 : test_invoice.py
HASHER  FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    
Test 6 / 28 : test_issue14.py
HASHER  FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    
Test 7 / 28 : test_issue33.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 8 / 28 : test_issue35.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 9 / 28 : test_issue41.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 10 / 28 : test_issue60.py
SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 11 / 28 : test_issue62.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 12 / 28 : test_issue63.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 13 / 28 : test_issue70.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 14 / 28 : test_issue71.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 15 / 28 : test_issue78.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 16 / 28 : test_issue82.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 17 / 28 : test_jpeg.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 18 / 28 : test_nbpages.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 19 / 28 : test_output.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 20 / 28 : test_page_orient.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 21 / 28 : test_page_size.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 22 / 28 : test_py3k.py
OK      OK      OK      OK      OK      OK      SKIP    SKIP    OK      SKIP    SKIP    SKIP    SKIP    
Test 23 / 28 : test_simple.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 24 / 28 : test_stretching.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 25 / 28 : test_template.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 26 / 28 : test_ttfonts.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 27 / 28 : test_unicode.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 28 / 28 : test_winfonts.py
SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    

@Alilino

Alilino commented Feb 2, 2019

@pennersr I tried your version and honestly it was a breakthrough. Usually it takes me 230s at best to generate a PDF, and now it is only 7s!!! Many thanks!

@Lucas-C

Lucas-C commented Jan 6, 2021

Note that image handling has improved a lot, using Pillow, in fpdf2.
