Documentation for "insert_htmlbox"
Modify page.rst and the respective recipe file.
JorjMcKie authored and julian-smith-artifex-com committed Dec 19, 2023
1 parent 328f7ee commit 008b871
Showing 4 changed files with 120 additions and 5 deletions.
Binary file added docs/images/img-htmlbox4.png
Binary file added docs/images/img-htmlbox5.png
2 changes: 1 addition & 1 deletion docs/page.rst
@@ -641,7 +641,7 @@ In a nutshell, this is what you can do with PyMuPDF:

* New in v1.23.8

PDF only. Insert text into the specified rectangle. The method has similarities with methods :meth:`Page.insert_textbox` and :meth:`TextWriter.fill_textbox`, but is **much more powerful**. This is achieved by letting a :ref:`Story` object do all the required processing.
**PDF only:** Insert text into the specified rectangle. The method has similarities with methods :meth:`Page.insert_textbox` and :meth:`TextWriter.fill_textbox`, but is **much more powerful**. This is achieved by letting a :ref:`Story` object do all the required processing.

* Parameter `text` may be a string as in the other methods. But it will be **interpreted as HTML source** and may therefore also contain HTML language elements -- including styling. The `css` parameter may be used to pass in additional styling instructions.

123 changes: 119 additions & 4 deletions docs/recipes-text.rst
@@ -425,13 +425,17 @@ Some default values were used above: font size 11 and text alignment "left". The

How to Fill a Box with HTML Text
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Method :meth:`Page.insert_htmlbox` offers a **much more powerful** way to insert text in a rectangle. Instead of simple, plain text, this method accepts HTML source, which may not only contain HTML tags but also styling instructions to influence things like font, font weight (bold) and style (italic), color and much more. It is also possible to mix multiple fonts and languages, output HTML tables and insert images. Any URI links are also supported.
Method :meth:`Page.insert_htmlbox` offers a **much more powerful** way to insert text in a rectangle.

Instead of simple, plain text, this method accepts HTML source, which may not only contain HTML tags but also styling instructions to influence things like font, font weight (bold) and style (italic), color and much more.

It is also possible to mix multiple fonts and languages, to output HTML tables and to insert images and URI links.

For even more styling flexibility, an additional CSS source may also be given.
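
Because the text is interpreted as HTML source, plain text that happens to contain characters like `<` or `&` should be escaped first. The following is a minimal sketch using only Python's standard library. The helper name `plain_to_html` is hypothetical and not part of PyMuPDF, and the sketch assumes that the HTML parser behind :ref:`Story` resolves the standard entities `&lt;`, `&gt;` and `&amp;`.

```python
import html


def plain_to_html(text):
    """Escape plain text so it can be passed as HTML source.

    Hypothetical helper for illustration (not part of PyMuPDF).
    """
    # Replace &, < and > by their HTML entities.
    escaped = html.escape(text, quote=False)
    # Turn line breaks into explicit HTML line breaks.
    return escaped.replace("\n", "<br>")


snippet = plain_to_html("if a < b & c > d:\n    pass")
```

The resulting string could then be passed as the `text` argument of :meth:`Page.insert_htmlbox`.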

The method is based on the :ref:`Story` class. Therefore, complex script systems like Devanagari, Nepali, Tamil and many others are supported and written correctly thanks to using the HarfBuzz library - which provides this feature, called *"text shaping"*.
The method is based on the :ref:`Story` class. Therefore, complex script systems like Devanagari, Nepali, Tamil and many others are supported and written correctly thanks to using the HarfBuzz library - which provides this so-called **"text shaping"** feature.

Any required fonts to output characters are automatically pulled in from the Google NOTO font library - as a fallback when the optionally supplied user font(s) do not contain some glyphs.
Any required fonts to output characters are automatically pulled in from the Google NOTO font library - as a fallback (when the -- optionally supplied -- user font(s) do not contain some glyphs).

As a small glimpse into the features offered here, we will output the following HTML-enriched text::

@@ -464,6 +468,9 @@ The result will look like this:

.. image:: images/img-htmlbox1.*

How to Output HTML Tables and Images
.......................................

Here is another example that outputs a table with this method. This time, we are including all the styling in the HTML source itself. Note also how an image can be included, even within a table cell::

    import fitz_new as fitz
@@ -530,7 +537,10 @@ The result will look like this:
.. image:: images/img-htmlbox2.*


Our third example will demonstrate the automatic multi-language support that also includes text shaping for complex scripting systems like Devanagari and right-to-left languages::
How to Output Languages of the World
.......................................

Our third example will demonstrate the automatic multi-language support. It includes automatic **text shaping** for complex script systems like Devanagari and for right-to-left languages::

    import fitz

@@ -552,6 +562,8 @@ Our third example will demonstrate the automatic multi-language support that als
    doc = fitz.open()
    page = doc.new_page()
    rect = (50, 50, 200, 500)

    # join greetings into one text string
    text = " ... ".join(greetings)

    # the output of the above is simple:
@@ -562,4 +574,107 @@ And this is the output:

.. image:: images/img-htmlbox3.*

How to Specify your Own Fonts
.................................

Define your font files in CSS syntax using the `@font-face` statement. You need a separate `@font-face` for every combination of font weight and font style (e.g. bold or italic) you want to be supported. The following example uses the famous MS Comic Sans font in its four variants regular, bold, italic and bold-italic.

As these four font files are located in the system folder `C:/Windows/Fonts`, the method needs an :ref:`Archive` definition that points to that folder::

    """
    How to use your own fonts with method Page.insert_htmlbox().
    """
    import fitz_new as fitz

    # Example text
    text = """Lorem ipsum dolor sit amet, consectetur adipisici elit, sed
    eiusmod tempor incidunt ut labore et dolore magna aliqua. Ut enim ad
    minim veniam, quis nostrud exercitation <b>ullamco <i>laboris</i></b>
    nisi ut aliquid ex ea commodi consequat. Quis aute iure
    <span style="color: red;">reprehenderit</span>
    in <span style="color: green;font-weight:bold;">voluptate</span> velit
    esse cillum dolore eu fugiat nulla pariatur. Excepteur sint obcaecat
    cupiditat non proident, sunt in culpa qui
    <a href="https://www.artifex.com">officia</a> deserunt mollit anim id
    est laborum."""

    """
    We need an Archive object to show where font files are located.
    We intend to use the font family "MS Comic Sans".
    """
    arch = fitz.Archive("C:/Windows/Fonts")

    # These statements define which font file to use for regular, bold,
    # italic and bold-italic text.
    # We assign an arbitrary common font-family for all 4 font files.
    # The Story algorithm will select the right file as required.
    # We request to use "comic" throughout the text.
    css = """
    @font-face {font-family: comic; src: url(comic.ttf);}
    @font-face {font-family: comic; src: url(comicbd.ttf);font-weight: bold;}
    @font-face {font-family: comic; src: url(comicz.ttf);font-weight: bold;font-style: italic;}
    @font-face {font-family: comic; src: url(comici.ttf);font-style: italic;}
    * {font-family: comic;}
    """

    doc = fitz.Document()
    page = doc.new_page(width=150, height=150)  # make small page

    page.insert_htmlbox(page.rect, text, css=css, archive=arch)

    doc.subset_fonts(verbose=True)  # build subset fonts to reduce file size
    doc.ez_save(__file__.replace(".py", ".pdf"))

.. image:: images/img-htmlbox4.*
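
The four `@font-face` rules in the example above follow a regular pattern, one rule per combination of font weight and font style. As a minimal sketch, they could also be generated from a small mapping. The helper `font_face_css` and the dictionary `comic_files` are hypothetical names used for illustration only and are not part of PyMuPDF; the file names are the ones used in the example above.

```python
def font_face_css(family, files):
    """Build one @font-face rule per (weight, style) combination.

    Hypothetical helper for illustration (not part of PyMuPDF).
    `files` maps (weight, style) tuples to font file names.
    """
    rules = []
    for (weight, style), filename in files.items():
        rule = f"@font-face {{font-family: {family}; src: url({filename});"
        # Only non-default weights and styles need to be spelled out.
        if weight != "normal":
            rule += f"font-weight: {weight};"
        if style != "normal":
            rule += f"font-style: {style};"
        rules.append(rule + "}")
    return "\n".join(rules)


# The four Comic Sans variants from the example above.
comic_files = {
    ("normal", "normal"): "comic.ttf",
    ("bold", "normal"): "comicbd.ttf",
    ("bold", "italic"): "comicz.ttf",
    ("normal", "italic"): "comici.ttf",
}

# Add the rule that requests the font family for all text.
css = font_face_css("comic", comic_files) + "\n* {font-family: comic;}"
```

The string `css` built this way contains the same rules as the hand-written CSS and could be passed to :meth:`Page.insert_htmlbox` together with the :ref:`Archive` that points to the font folder.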

How to Request Text Alignment
................................

This example combines multiple requirements:

* Rotate the text by 90 degrees anti-clockwise.
* Use a font from package `pymupdf-fonts <https://pypi.org/project/pymupdf-fonts/>`_. You will see that the respective CSS definitions are a lot easier in this case.
* Align the text with the "justify" option.

::

    """
    How to use a pymupdf font with method Page.insert_htmlbox().
    """
    import fitz_new as fitz

    # Example text
    text = """Lorem ipsum dolor sit amet, consectetur adipisici elit, sed
    eiusmod tempor incidunt ut labore et dolore magna aliqua. Ut enim ad
    minim veniam, quis nostrud exercitation <b>ullamco <i>laboris</i></b>
    nisi ut aliquid ex ea commodi consequat. Quis aute iure
    <span style="color: red;">reprehenderit</span>
    in <span style="color: green;font-weight:bold;">voluptate</span> velit
    esse cillum dolore eu fugiat nulla pariatur. Excepteur sint obcaecat
    cupiditat non proident, sunt in culpa qui
    <a href="https://www.artifex.com">officia</a> deserunt mollit anim id
    est laborum."""

    """
    This is similar to font file support. However, we can use a convenience
    function for creating required CSS definitions.
    We still need an Archive for finding the font binaries.
    """
    arch = fitz.Archive()

    # We request to use "myfont" throughout the text.
    css = fitz.css_for_pymupdf_font("ubuntu", archive=arch, name="myfont")
    css += "* {font-family: myfont;text-align: justify;}"

    doc = fitz.Document()

    page = doc.new_page(width=150, height=150)

    page.insert_htmlbox(page.rect, text, css=css, archive=arch, rotate=90)

    doc.subset_fonts(verbose=True)
    doc.ez_save(__file__.replace(".py", ".pdf"))

.. image:: images/img-htmlbox5.*

.. include:: footer.rst
