feat: Dropped custom image handling, added full Pillow support #118
62102b0 to fcb0802
    url='http://code.google.com/p/pyfpdf',
    license='LGPLv3+',
    download_url="https://github.com/reingart/pyfpdf/tarball/%s" % fpdf.__version__,
    install_requires=['numpy>=1.15.4', 'Pillow>=5.3.0'],
Disagree with the numpy dependency.
Please have a look here:
Line 1945 in 08e28af
    re_c = re.compile('(...).'.encode("ascii"), flags=re.DOTALL)
Here, bytes representing RGBA pixel values are processed using regular expressions. This is extremely slow. If you care about performance while generating PDFs, this won't cut it.
have all the deps you want in my fork
Are you referring to this code?
https://github.com/alexanderankin/pyfpdf/blob/master/fpdf/image_parsing.py#L199
In any case, the way the pixels are being processed (by means of regular expressions separating RGBA into RGB and A) is really slow. So, yes, numpy is an additional dependency, but at least image handling is much more performant now.
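To make the comparison above concrete, here is a minimal sketch of both approaches to separating interleaved RGBA bytes into RGB and A. The function names `split_rgba_regex` and `split_rgba_numpy` are hypothetical and not taken from either code base; the regex variant only mirrors the `'(...).'` pattern quoted above, and the numpy variant is one plausible way to do the split, not necessarily what this PR does.

```python
import re

import numpy as np


def split_rgba_regex(data: bytes):
    """Regex approach: match every 4-byte pixel and keep part of it."""
    # DOTALL is required because a pixel byte may equal b"\n" (value 10).
    rgb = re.sub(b"(...).", lambda m: m.group(1), data, flags=re.DOTALL)
    alpha = re.sub(b"...(.)", lambda m: m.group(1), data, flags=re.DOTALL)
    return rgb, alpha


def split_rgba_numpy(data: bytes):
    """numpy approach: view the buffer as an (n_pixels, 4) uint8 array."""
    pixels = np.frombuffer(data, dtype=np.uint8).reshape(-1, 4)
    rgb = pixels[:, :3].tobytes()   # first three columns: R, G, B
    alpha = pixels[:, 3].tobytes()  # last column: A
    return rgb, alpha


# Two pixels; the first red value is 10 (a newline byte) on purpose.
sample = bytes([10, 20, 30, 255, 40, 50, 60, 128])
regex_result = split_rgba_regex(sample)
numpy_result = split_rgba_numpy(sample)
```

Both functions return the same `(rgb, alpha)` pair. The numpy version avoids one Python-level callback per pixel, which is where the regex version spends most of its time on large images.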
Oh, I'm saying I'll accept such a PR on my fork, that's all. I agree with this move.
Actually, can you help me test it? I copied this bit into my branch. The way I have my tests set up is that each one generates a PDF, computes its hash, compares the hash, and then os.unlinks the generated PDF. However, the proposed get_img_info function inserts an extra object and stream for the thumbnail preview, which of course makes the test fail. I should probably just redo all my tests? Thoughts?
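The test flow described above (generate, hash, compare, unlink) can be sketched roughly as follows. This is an illustration only: `check_output`, `file_sha256`, and the two writer functions are hypothetical stand-ins, and the byte strings are not real PDF output from either fork.

```python
import hashlib
import os
import tempfile

BASELINE = b"%PDF-1.3\n1 0 obj\n<< /Type /XObject >>\nendobj\n"
EXTRA_OBJECT = b"2 0 obj\n<< /Type /XObject >>\nendobj\n"


def file_sha256(path: str) -> str:
    """Return the hex digest of the file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def check_output(write_pdf, expected_hash: str) -> bool:
    """Generate a file, compare its hash, then delete the file."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    os.close(fd)
    try:
        write_pdf(path)
        return file_sha256(path) == expected_hash
    finally:
        os.unlink(path)  # always clean up the generated file


def write_baseline(path: str) -> None:
    with open(path, "wb") as fh:
        fh.write(BASELINE)


def write_with_extra_object(path: str) -> None:
    # Stand-in for output that gained one more object and stream.
    with open(path, "wb") as fh:
        fh.write(BASELINE + EXTRA_OBJECT)


expected = hashlib.sha256(BASELINE).hexdigest()
baseline_ok = check_output(write_baseline, expected)
extra_ok = check_output(write_with_extra_object, expected)
```

Because the comparison is byte-exact, any additional object in the output changes the digest, so stored hashes have to be regenerated whenever the image pipeline changes what it emits.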
I updated mine to 2.0.1 with all these changes, but it broke a lot of my tests :/
I pushed another fix to this branch to properly deal with unsupported extensions. Not sure about the inner workings of the test suite, but it seems to me that due to the Pillow changes the hashes of the resources changed.
>> Tests: 28
Test 1 / 28 : test_cache.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 2 / 28 : test_corebox.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 3 / 28 : test_e1252.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 4 / 28 : test_imgmask.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 5 / 28 : test_invoice.py
HASHER  FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    
Test 6 / 28 : test_issue14.py
HASHER  FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    
Test 7 / 28 : test_issue33.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 8 / 28 : test_issue35.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 9 / 28 : test_issue41.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 10 / 28 : test_issue60.py
SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 11 / 28 : test_issue62.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 12 / 28 : test_issue63.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 13 / 28 : test_issue70.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 14 / 28 : test_issue71.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 15 / 28 : test_issue78.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 16 / 28 : test_issue82.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 17 / 28 : test_jpeg.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 18 / 28 : test_nbpages.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 19 / 28 : test_output.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 20 / 28 : test_page_orient.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 21 / 28 : test_page_size.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 22 / 28 : test_py3k.py
OK      OK      OK      OK      OK      OK      SKIP    SKIP    OK      SKIP    SKIP    SKIP    SKIP    
Test 23 / 28 : test_simple.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 24 / 28 : test_stretching.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 25 / 28 : test_template.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 26 / 28 : test_ttfonts.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 27 / 28 : test_unicode.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 28 / 28 : test_winfonts.py
SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
fcb0802 to 3403fb7
@pennersr I tried your version and honestly it was a breakthrough. Usually it takes me 230s at best to generate a PDF, and now it is only 7s!!! Many thanks!
Note that image handling has improved a lot, using Pillow, in fpdf2.
With this PR, images are processed using PIL. This adds support for any PIL-supported image format. Additionally, numpy is used to speed up image processing. This PR effectively makes #90 and #117 obsolete.
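As a rough illustration of what "processed using PIL" can look like, the sketch below opens any Pillow-readable image and returns its size plus separate RGB and alpha byte strings. The helper name `load_rgb_and_alpha` is hypothetical and is not the `get_img_info` function from this PR; the in-memory PNG exists only to keep the example self-contained.

```python
import io

from PIL import Image


def load_rgb_and_alpha(source):
    """Open an image and return (width, height, rgb_bytes, alpha_bytes)."""
    with Image.open(source) as img:
        # Normalise every input format and mode to 8-bit RGBA.
        rgba = img.convert("RGBA")
    width, height = rgba.size
    rgb = rgba.convert("RGB").tobytes()     # colour channels only
    alpha = rgba.getchannel("A").tobytes()  # transparency channel only
    return width, height, rgb, alpha


# Build a tiny 2x1 semi-transparent PNG in memory as test input.
buf = io.BytesIO()
Image.new("RGBA", (2, 1), (10, 20, 30, 128)).save(buf, format="PNG")
buf.seek(0)

info = load_rgb_and_alpha(buf)
```

Since `Image.open` selects the decoder from the file contents, the same helper would work for any other format that the installed Pillow build supports.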