[librsvg: 2/4] Detail the text roadmap
- From: Marge Bot <marge-bot src gnome org>
- To: commits-list gnome org
- Cc:
- Subject: [librsvg: 2/4] Detail the text roadmap
- Date: Mon, 25 Oct 2021 16:40:41 +0000 (UTC)
commit 46c771df0d128ed9dfcff53559eb1c8a49dfacc4
Author: Federico Mena Quintero <federico gnome org>
Date: Fri Oct 22 18:28:06 2021 -0500
Detail the text roadmap
Part-of: <https://gitlab.gnome.org/GNOME/librsvg/-/merge_requests/620>
devel-docs/text-layout.md | 137 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 137 insertions(+)
---
diff --git a/devel-docs/text-layout.md b/devel-docs/text-layout.md
index 9f26652c..fa3f9de2 100644
--- a/devel-docs/text-layout.md
+++ b/devel-docs/text-layout.md
@@ -76,8 +76,145 @@ Features that will not be implemented:
* SVG1.1 features like `<font>` and the `glyph-orientation-horizontal` property, that were
deprecated for SVG2.
+## Roadmap summary
+
+Since librsvg 2.52.1 I have been systematically improving text support. Many thanks to
+Behdad Esfahbod, Khaled Ghetas, and Matthias Clasen for their advice and inspiration.
+
+First, I want to get **bidi** to a state where it is reliable, at least as much as LTR
+languages with Latin text are reliable right now:
+
+* Implement `unicode-bidi`. See the detailed roadmap below.
+
+* Add tests for the different combinations of `text-anchor` and `direction`; right now
+ there are only a few tested combinations.
+
+* Even if white-space handling remains semi-broken, I think it's more important to have
+ "mostly working" bidi than completely accurate white-space handling and layout.
+
+Second, actually overhaul librsvg's text engine by implementing the SVG2 text layout algorithm:
+
+* Implement the `text-orientation` property, and implement fallbacks from the deprecated
+ `glyph-orientation-vertical` to it. If this turns out to be hard with the current state
+ of the code, I will defer it until the SVG2 text layout algorithm below.
+
+* Implement the SVG2 text layout algorithm and `white-space` handling at the same time.
+ See the detailed roadmap below.
+
+Third, implement all the properties that are not critical for the text layout algorithm,
+and things like `@font-face`. Those can be done gradually, but I feel the text layout
+algorithm has to be done all in a single step.
+
+## Detailed roadmap
+
+### Implement `unicode-bidi`
+
+The property parser currently accepts only the SVG1.1 values; adding the SVG2 values is a trivial change.
+Supporting this property involves looking at both `direction` and `unicode-bidi` and
+inserting Unicode control characters at the start and end of each text span, so that the
+bidi and shaping engines know what to do.
+
+### Add tests for combinations of `text-anchor` and `direction`
+
+These are easy to add now that librsvg's tests make use of the Ahem font, in which each
+glyph is a 1x1 em square.
+
+### Implement the `text-orientation` property
+
+This may just be the property parser and hooking it up to the machinery for properties.
+Actual processing may be easier to do in the SVG2 text layout algorithm, detailed below.
+
+### Implement the SVG2 text layout algorithm and `white-space` handling
+
+**Shaping:** One thing librsvg does wrong is that for each `<tspan>`, or for each
+synthesized text span from a `<text>` element, it creates a separate `pango::Layout`.
+This means that text shaping is not done across element boundaries, which SVG2 requires.
+Implementing this can be done by creating a string by recursively concatenating the
+character content of each `<text>` element and its children, and adding
+`pango::Attribute`s with the proper indexes based on each child's character length. This
+creates an un-shaped string in logical order with all the characters inside the `<text>`,
+to be used in the next steps.
+
+Pango details: create a single `pango::Layout` per `<text>` element, with
+`pango::Attribute`s for each text span. Set the layout to `set_single_paragraph_mode()` so
+it does not break paragraphs at newlines. Pango will then translate them to U+2028 (LINE SEPARATOR) characters in the
+`Layout`, and the white-space handling and SVG2 text layout algorithm below can detect
+them.
+
+**Bidi control:** The `unicode-bidi` property requires adding control characters at the
+start and end of each span's text contents. For example, `<tspan direction="rtl"
+unicode-bidi="bidi-override">foo</tspan>` should get rendered as `oof`. The CSS Writing
+Modes 3 spec has a [table of control
+codes](https://www.w3.org/TR/css-writing-modes-3/#unicode-bidi) for each combination of
+`direction` and `unicode-bidi`. Implementing this involves adding the control characters
+while recursively building the string from each child of `<text>` as in the "Shaping"
+point above.
+
+**White-space handling:** SVG2 has a new `white-space` property that obsoletes `xml:space`
+from SVG1.1. Implementing this depends on the concatenated string from the steps above,
+so that white-space can be collapsed on the result. Maybe this needs to be done before
+inserting bidi control characters, or maybe not, if the state machine is adjusted to
+ignore the control characters.
+
+**SVG2 text layout algorithm:** This is the big one. The spec has pseudocode. It depends
+on the shaping results from Pango, and involves correlating "typographic characters"
+(Pango clusters) with the un-shaped string in logical order from the "Shaping" step, and the
+information about discarded white-space characters. The complete text layout algorithm
+would take care of supporting multi-valued `x/y/dx/dy/rotate`, `textPath` (see below),
+plus bidi and vertical text.
+
+### Text rendering
+
+Librsvg is moving towards a "render tree" or "display list" model, instead of just
+rendering everything directly while traversing the DOM tree.
+
+Currently, the text layout process generates a `layout::Text` object, which is basically
+an array of `pango::Layout` with extra information.
+
+It should be possible to explode these into `pango::GlyphItem` or `pango::GlyphString` and
+annotate these with `x/y/rotate` information, which will be the actual results of the SVG2
+text layout algorithm.
+
+Although currently Pango deals with underlining, it may be necessary to do that in librsvg
+instead - I am not sure yet how `textPath` or individually-positioned `x/y/dx/dy/rotate`
+interact with underlining.
+
+### Wrapped text in a content area
+
+This roadmap does not consider the implementation of wrapped text yet.
+
+### User-provided fonts, `@font-face` and WOFF
+
+This involves changes to the CSS machinery, to parse the `@font-face` at-rule. Librsvg
+would also have to obtain the font and feed it to FontConfig. I am not sure if FontConfig
+can deal with WOFF just like with normal `.ttf` files.
+
## Issues
https://gitlab.gnome.org/GNOME/librsvg/-/issues/795 - Implement the unicode-bidi property.
https://gitlab.gnome.org/GNOME/librsvg/-/issues/795 - Implement SVG2 white-space behavior.
+
+https://gitlab.gnome.org/GNOME/librsvg/-/issues/599 - Something is wrong with text scaled
+with a transformation; this is not critical but it bothers me a lot.
+
+### Issues that have not been filed yet
+
+From the spec: "It is possible to apply a gradient, pattern, clipping path, mask or filter
+to text." We need better tests for the objectBoundingBox of the whole `<text>`; I think
+they are wrong for vertical text, and this shows up when filling its spans with gradients
+or patterns. Clip/mask/filter do not work on individual spans yet.
+
+
+
+
+
+## Glossary so I don't have to check the Pango docs every time
+
+PangoItem - A range within the user's string that has the same
+language/script/direction/level/etc. (Logical order).
+
+PangoLayoutRun - same as PangoGlyphItem - a pair of PangoItem and the PangoGlyphString it
+generated during shaping. (Visual order).
+
+PangoGlyphString - The glyphs generated for a single PangoItem.
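
The "Shaping" and "Bidi control" steps in the roadmap above can be sketched in plain Rust as follows. This is only an illustration, not librsvg's code: `Node`, `Direction`, `UnicodeBidi`, `bidi_controls` and `flatten` are hypothetical names, and the control-code mapping is my reading of the CSS Writing Modes 3 table for inline boxes, so check it against the spec. The recorded ranges are byte offsets into the UTF-8 string, which is also the unit Pango attribute ranges use.

```rust
/// Values of the `direction` property.
#[derive(Clone, Copy)]
enum Direction { Ltr, Rtl }

/// SVG2 values of the `unicode-bidi` property.
#[derive(Clone, Copy)]
enum UnicodeBidi { Normal, Embed, Isolate, BidiOverride, IsolateOverride, Plaintext }

/// Hypothetical, simplified stand-in for a `<text>` element and its children.
enum Node {
    Chars(String),
    Span { direction: Direction, unicode_bidi: UnicodeBidi, children: Vec<Node> },
}

/// Returns the (opening, closing) bidi control characters for a span.
fn bidi_controls(d: Direction, u: UnicodeBidi) -> (&'static str, &'static str) {
    use Direction::*;
    use UnicodeBidi::*;
    match (u, d) {
        (Normal, _) => ("", ""),
        (Embed, Ltr) => ("\u{202a}", "\u{202c}"),        // LRE ... PDF
        (Embed, Rtl) => ("\u{202b}", "\u{202c}"),        // RLE ... PDF
        (Isolate, Ltr) => ("\u{2066}", "\u{2069}"),      // LRI ... PDI
        (Isolate, Rtl) => ("\u{2067}", "\u{2069}"),      // RLI ... PDI
        (BidiOverride, Ltr) => ("\u{202d}", "\u{202c}"), // LRO ... PDF
        (BidiOverride, Rtl) => ("\u{202e}", "\u{202c}"), // RLO ... PDF
        (IsolateOverride, Ltr) => ("\u{2068}\u{202d}", "\u{202c}\u{2069}"), // FSI LRO ... PDF PDI
        (IsolateOverride, Rtl) => ("\u{2068}\u{202e}", "\u{202c}\u{2069}"), // FSI RLO ... PDF PDI
        (Plaintext, _) => ("\u{2068}", "\u{2069}"),      // FSI ... PDI
    }
}

/// Appends the text of `node` to `out` in logical order, wrapping each span in
/// its control characters. For every span, pushes its (start, end) byte range
/// in `out` onto `ranges`, in document (pre-order) order.
fn flatten(node: &Node, out: &mut String, ranges: &mut Vec<(usize, usize)>) {
    match node {
        Node::Chars(s) => out.push_str(s),
        Node::Span { direction, unicode_bidi, children } => {
            let (open, close) = bidi_controls(*direction, *unicode_bidi);
            let idx = ranges.len();
            ranges.push((out.len(), 0)); // end offset is filled in below
            out.push_str(open);
            for child in children {
                flatten(child, out, ranges);
            }
            out.push_str(close);
            ranges[idx].1 = out.len();
        }
    }
}
```

For the roadmap's example, `<tspan direction="rtl" unicode-bidi="bidi-override">foo</tspan>`, this sketch produces RLO, `foo`, PDF in logical order; the bidi engine is then what reorders it to `oof` for display. Each recorded range could then be turned into the start and end indexes of the `pango::Attribute`s for that span.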