Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image optimisation operations #69

Merged
merged 16 commits into from
Jul 10, 2023
Merged

Image optimisation operations #69

merged 16 commits into from
Jul 10, 2023

Conversation

zerolab
Copy link
Collaborator

@zerolab zerolab commented Oct 30, 2018

@kaedroho this is a first pass at using image optimisation binaries.

Opening fairly early to get initial feedback.

On my To-Do list:

  • Decide on the best place to trigger the optimisation
  • Optimisation chain to choose the best match for a given image
  • Tests
  • Documentation

Future To-Do:

  • Possible async processing

@kaedroho
Copy link
Contributor

kaedroho commented Oct 30, 2018

This looks good so far, thanks!

I think it would be really nice if this could be fit in to an Image class somehow. For example:

class JpegOptim(Image):
    def __init__(self, file):
        self.file = file

    @classmethod
    @Image.converter_from(JpegImageFile)  # Will require an @converter_to(JpegImageFile) in Pillow/Wand
     def from_jpeg_file(cls, file):
        return cls(file)

    @Image.operation
    def optimize(self):
        # Loads, optimizes and saves the file

This has the advantage of being able to bypass Pillow/Wand if the image to be optimised is already saved as a JPEG file. But the example above suffers from some issues such as not being able to select output format or filename, and leaves choosing an optimiser up to Willow's routing (which might not be adequate for this use case). I'm happy for this to be done differently if it can't be fit into Image classes, so please just take this as some food for thought!

@zerolab
Copy link
Collaborator Author

zerolab commented Oct 30, 2018

That is exactly the type of feedback I was looking for. Thanks Karl.
I think it can be possible to piggyback on https://github.com/wagtail/Willow/blob/master/willow/image.py#L89

Will let it simmer for a day or two and come back it fresh

@zerolab zerolab marked this pull request as draft February 21, 2022 14:35
@zerolab
Copy link
Collaborator Author

zerolab commented Jun 18, 2022

Note to self: cwebp and oxipng

@zerolab zerolab changed the title [WIP] Image optimisation operations Image optimisation operations Sep 16, 2022
willow/plugins/pillow.py Outdated Show resolved Hide resolved
willow/plugins/pillow.py Outdated Show resolved Hide resolved
willow/plugins/pillow.py Outdated Show resolved Hide resolved
Copy link

@ababic ababic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work so far @zerolab. Really love how this is shaping up. The registry is particularly lovely :)

@zerolab zerolab added this to the Maintenance milestone milestone Jun 21, 2023
@zerolab zerolab marked this pull request as ready for review July 5, 2023 09:52
@zerolab zerolab requested a review from ababic July 6, 2023 08:23
willow/image.py Outdated
return

has_named_file = False
if isinstance(image_file, SpooledTemporaryFile):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for this method is: create a NamedTemporaryFile if needed, optimize in place, and replace the passed image_file. If image_file is a string, we assume a path and try to optimize that directly

note: this is probably the most controversial bit.
Latest Wagtail passes on a SpooledTemporaryFile as of wagtail/wagtail#10284.

Given willow is used in non-Wagtail contexts, I wanted to support: BytesIO (which is what Wagtail passed prior to wagtail/wagtail#10284), NamedTemporaryFile, + string and bytes

The reason for string is that we want to support file paths (#108). I am unsure about bytes, though.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This process is a little different to what I was expecting, but not too different. I suppose always converting to NamedTemporaryFile makes sense, as I suspect more optimization packages support file paths as inputs vs raw bytes (plus, it keeps the optimizer design simple). We can also avoiding loading full file contents into memory if we're careful (which it looks like we are being).

willow/optimizers/base.py Outdated Show resolved Hide resolved
docs/concepts.rst Outdated Show resolved Hide resolved
Copy link

@ababic ababic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking very good indeed, @zerolab - fabulous work! I've added a few things for consideration, but nothing blocking :)

docs/guide/optimize.rst Outdated Show resolved Hide resolved

WILLOW_OPTIMIZERS = "jpegoptim,optipng"
# or as a list of optimizer library names
WILLOW_OPTIMIZERS = ["jpegoptim", "optipng"]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice inclusion :)

willow/image.py Outdated
return

has_named_file = False
if isinstance(image_file, SpooledTemporaryFile):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This process is a little different to what I was expecting, but not too different. I suppose always converting to NamedTemporaryFile makes sense, as I suspect more optimization packages support file paths as inputs vs raw bytes (plus, it keeps the optimizer design simple). We can also avoiding loading full file contents into memory if we're careful (which it looks like we are being).

willow/image.py Outdated Show resolved Hide resolved
willow/image.py Outdated
Comment on lines 176 to 182
if hasattr(image_file, "seek"):
# rewind and replace the image file with the optimized version
image_file.seek(0)
with open(file_path, "rb") as f:
image_file.write(f.read())

if hasattr(image_file, "truncate"):
image_file.truncate() # bring the file size down to the actual image size
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some nice code right here :)

willow/optimizers/base.py Outdated Show resolved Hide resolved
Comment on lines 23 to 34
return [
"-q 80", # compression factor. 100 produces the highest quality.
"-m 6", # inspect all encoding possibilities for best file size
"-pass 10", # max number of passes
"-mt", # use multithreading if possible
file_path,
f"-o {file_path}",
]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: It would be nice if some of these defaults could be configured via settings/env vars. Would make tweaking much easier and possibly prevent a bit of "Why is this the default? I have a different opinion! Argh!" argy-bargy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the most we could offer is the quality setting, but I'd rather we do this once people get noisy about it :D incremental tweaks and so on

willow/plugins/pillow.py Outdated Show resolved Hide resolved
willow/plugins/pillow.py Outdated Show resolved Hide resolved
willow/registry.py Outdated Show resolved Hide resolved
@zerolab zerolab requested a review from ababic July 7, 2023 11:28
willow/registry.py Outdated Show resolved Hide resolved
docs/concepts.rst Outdated Show resolved Hide resolved
docs/guide/extend.rst Outdated Show resolved Hide resolved
docs/guide/optimize.rst Outdated Show resolved Hide resolved
Comment on lines +252 to +250
def test_optimize_with_spooled_temporary_file(
self, mock_unlink, mock_named_temporary_file, mock_process
):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argument order here appears not to match the order of the patches.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct. You can think of

@a
@b
def func():
    ...

as a(b(func()) so b passes its param first, then a. in our case unlink, then named temp file

https://stackoverflow.com/a/27342171

zerolab and others added 6 commits July 7, 2023 17:13
if `WILLOW_OPTIMIZERS` is not set or is set to a false-y value, nothing gets registered
When set to a true-ish value -> all valid optimizers will be registered.
When set to a list or a comma-separated string of binary names, only valid optimizers from the list will be registered

Co-Authored-By: Arnar Tumi Þorsteinsson <[email protected]>
Copy link

@ababic ababic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superb work on the test additions, @zerolab. That feels like the last piece of the puzzle. Just a couple of small 'cleanliness' tweaks to the file handling to consider, then we can get this mother merged!! 🤟

willow/image.py Outdated
image_file.truncate() # bring the file size down to the actual image size

if named_file_created:
os.unlink(file_path)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit delayed on this one (sorry!), it's important this happen regardless of what issues might occur following temp file creation. So, I would recommend wrapping the above lines in a try:, and these couple in a finally:.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call. at first I wanted it to fail loudly, but then again that is probably not a good idea for auxiliary operations. done

with open(image_file, "rb") as f:
self.assertEqual(self.optimized_size, os.fstat(f.fileno()).st_size)

os.unlink(image_file)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try/finally approach might be a good call in these tests too

Copy link

@ababic ababic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will do nicely. Thanks @zerolab!

@zerolab zerolab merged commit 278dae4 into wagtail:main Jul 10, 2023
@zerolab zerolab deleted the image-optimisiation-operations branch July 10, 2023 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants