declarative charts - components and concepts #6

heckj · 2022-03-22T00:47:25Z

heckj
Mar 22, 2022
Maintainer

Design notes - I'm starting with a discussion, but as this evolves, I'll move it into an explicit markdown file in the technotes directory, and ultimately I intend to include portions of this within the documentation for the library.

I'm starting from the core idea of how to frame a declarative structure, taken from an interview on building intuitive data visualization tools, and layering that on what's been done in Vega-lite (Dominic, the interviewee, is one of the co-authors of Vega-lite) and Observable's Plot, which is a similar take on declarative charting from the folks who originally created D3.js.

Quoting that interview:

And the core idea is that visualization is not just a particular type, so it's not just a horizontal bar chart, or a bubble chart, or a radar plot. But instead, a visualization is described as a combination of basic building blocks, kind of, like in language, we have words that we combine using rules, which is grammar. And so, the words in the grammar of graphics are two things. One is marks and the other one is visual encodings.

So, a mark is, for instance, a bar or a line or a point and encoding is a mapping from data properties to visual properties of that mark. So, for instance, a bar chart is a bar mark that maps some category to X and some continuous variable to Y. And that's how you describe a bar chart. And now I think what's cool about this is, if you want to change from a horizontal to a vertical bar chart or some column or vote chart, you don't have to change the type. You just swap the channels in the encoding.

These other libraries declaratively map values from data provided into the visual properties of marks. Some use the term "visually encoded" to reference transforming the original data, or specific fields from it, into visual elements within a chart. There's also an element of transformation, potentially pre-processing the data provided to the declaration to build out aggregates, such as mean, median, or max - or bin to count values within a data set. And finally, some of them also describe declaratively the interactive aspects of a chart - brush, selection, etc.

Because Swift has a lot of functional aspects built into the language, I'm currently uncertain how much it makes sense to add in terms of declarative transformations over what Swift already provides, or whether something like map and reduce, which are already available on any sequence, does most of that work already. There are two types of transformation that aren't directly represented, and which I'm personally interested in using: aggregate and bin. (I find - and use - a histogram view of data to show its distribution, which bin provides.)
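
To make the aggregate and bin idea concrete, here is a minimal sketch built only on what any Swift sequence already offers. The function names `bin` and `mean` are hypothetical, chosen for illustration, and equal-width bins are an assumption rather than a settled design.

```swift
// Hypothetical helper, not part of any library: counts values into equal-width bins.
func bin(_ values: [Double], count: Int) -> [Int] {
    guard count > 0, let lo = values.min(), let hi = values.max() else { return [] }
    // Avoid a zero width when every value is identical.
    let width = hi > lo ? (hi - lo) / Double(count) : 1
    return values.reduce(into: Array(repeating: 0, count: count)) { counts, value in
        // The maximum value would land one past the end, so fold it into the last bin.
        let index = min(Int((value - lo) / width), count - 1)
        counts[index] += 1
    }
}

// An aggregate such as the mean needs nothing beyond reduce.
func mean(_ values: [Double]) -> Double? {
    values.isEmpty ? nil : values.reduce(0, +) / Double(values.count)
}

let latencies: [Double] = [1, 2, 2, 3, 9, 10]
let histogram = bin(latencies, count: 3)   // [4, 0, 2]
let average = mean(latencies)              // 4.5
```

The bin counts are what a histogram's bar marks would display, so a declarative `bin` transform could be a thin wrapper over something like this.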

For the mapping of data to visual properties of marks, once you have a raw value from the data, it potentially needs to be mapped or transformed into a value that can represent the property visually. Often this means scaling the value so that it fits within the available rendering area. That's accomplished with a scale. A scale has a domain (the range of its input values) and a range (where it maps to), and variations of the scale do the mapping linearly, logarithmically, using a power scale, etc. Additionally, scales can be configured to have different handling techniques - dropping, clamping, or processing values that are outside of either the input "domain" or the output "range". You can also derive useful visual aspects from scales - in particular, they make up the key information needed to display a visual axis for a chart, including labelling that axis to show input values.
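
A minimal sketch of a linear scale with a domain, a range, and two of the out-of-domain strategies described above (dropping and clamping). The `LinearScale` type and its members are hypothetical names for illustration and are not the SwiftViz API.

```swift
// Hypothetical type for illustration; not the SwiftViz API.
struct LinearScale {
    var domain: ClosedRange<Double>   // expected input values
    var range: ClosedRange<Double>    // output positions, e.g. points on a canvas
    var clamps = false                // pin out-of-domain inputs to the nearest edge

    // Returns nil when the input is outside the domain and clamping is off (the "drop" strategy).
    func scale(_ value: Double) -> Double? {
        var v = value
        if !domain.contains(v) {
            guard clamps else { return nil }
            v = min(max(v, domain.lowerBound), domain.upperBound)
        }
        let span = domain.upperBound - domain.lowerBound
        // A degenerate domain maps everything to the start of the range.
        guard span > 0 else { return range.lowerBound }
        let t = (v - domain.lowerBound) / span
        return range.lowerBound + t * (range.upperBound - range.lowerBound)
    }
}

let xScale = LinearScale(domain: 0...100, range: 0...400)
let clampedScale = LinearScale(domain: 0...100, range: 0...400, clamps: true)

let inside = xScale.scale(25)          // 100
let dropped = xScale.scale(150)        // nil
let pinned = clampedScale.scale(150)   // 400
```

Returning an optional keeps the dropped case visible to the caller, which is one possible way to surface values that should not be drawn.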

I'd like to start off with a Linear and a Logarithmic scale, and potentially expand to enable a scale that maps from data values to color as well, using a few different color schemes. I want to be able to declare specific scales and their configuration, but also infer good default scales based on just the input data. I've previously done some work creating a scale within another repository (swiftviz/swiftviz, hosted DocC docs). As we develop this library, the API for scales may need to be updated or changed, but it's usable today. The scale implementation in the SwiftViz library also includes nice value implementations to round scales to nice visual number representations, based on the implementation suggested in “Nice Numbers for Graph Labels” in the book “Graphics Gems, Volume 1”, edited by Andrew Glassner.
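
As a rough sketch of the two ways of getting a scale, explicit declaration and inference from data, here is a logarithmic scale with both initializers. `LogScale` and `init(inferringFrom:range:)` are hypothetical names; inferring the domain from the minimum and maximum of the data is an assumption about what a "good default" might mean.

```swift
import Foundation

// Hypothetical sketch; names are illustrative, not an existing API.
struct LogScale {
    var domain: ClosedRange<Double>   // must be strictly positive
    var range: ClosedRange<Double>

    // Explicit declaration of the domain.
    init(domain: ClosedRange<Double>, range: ClosedRange<Double>) {
        self.domain = domain
        self.range = range
    }

    // Infers the domain from the data; fails for empty, constant, or non-positive data.
    init?(inferringFrom values: [Double], range: ClosedRange<Double>) {
        guard let lo = values.min(), let hi = values.max(), lo > 0, hi > lo else { return nil }
        self.init(domain: lo...hi, range: range)
    }

    // Maps a positive value into the range; values <= 0 have no logarithm and are not handled here.
    func scale(_ value: Double) -> Double {
        let t = (log10(value) - log10(domain.lowerBound))
              / (log10(domain.upperBound) - log10(domain.lowerBound))
        return range.lowerBound + t * (range.upperBound - range.lowerBound)
    }
}

let explicitScale = LogScale(domain: 0.1...100, range: 0...300)
let inferredScale = LogScale(inferringFrom: [1, 10, 1000], range: 0...300)

let explicitPosition = explicitScale.scale(10)   // two thirds of the way along the range
```

The failable initializer is one way to make "could not infer a sensible scale" an explicit outcome rather than a silent fallback.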

There are a wide variety of marks to potentially display. I think a good starting set includes Bar, Dot, and Line. I propose that we start by developing those three, then expand into others such as Area and Arc, and add a Stacked variation.

When describing how a mark is displayed, it's often relevant to know the kind of data that it's displaying. These types are described in research literature as one of:

quantitative (represented in Swift by Double, and possibly Float)
ordinal (represented in Swift by Int)
nominal (or categorical) (represented in Swift by String)
temporal (represented in Swift by Date, or mapped down into TimeInterval, which is a Double)
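
One way to carry these four kinds through the type system is sketched below. The `DataKind` enum and `ChartValue` protocol are hypothetical names introduced only for illustration; the mapping of Swift types to kinds follows the list above.

```swift
import Foundation

// Hypothetical enum for illustration: tags a channel with the kind of data it carries.
enum DataKind {
    case quantitative, ordinal, nominal, temporal
}

// Hypothetical protocol tying a Swift type to a kind at compile time.
protocol ChartValue {
    static var kind: DataKind { get }
}

extension Double: ChartValue { static var kind: DataKind { .quantitative } }
extension Int: ChartValue { static var kind: DataKind { .ordinal } }
extension String: ChartValue { static var kind: DataKind { .nominal } }
extension Date: ChartValue { static var kind: DataKind { .temporal } }

// Looks up the kind for any conforming value.
func kind<T: ChartValue>(of _: T) -> DataKind { T.kind }

let latencyKind = kind(of: 12.5)     // .quantitative
let nodeKind = kind(of: 3)           // .ordinal
let nameKind = kind(of: "cache")     // .nominal
let timeKind = kind(of: Date())      // .temporal
```

A mark could switch on the kind to decide, for example, whether an x position comes from a continuous scale or from the sort order of categories.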

The visual properties of some marks are interpreted differently, depending on the type of data. For example, if the x value for a bar chart is coming from a quantitative value, that could describe a fairly specific position within a range, whereas if it were coming from a categorical value, then its position is inferred based on how you sort the available categories to display. Additionally, some marks' visual properties aren't amenable to all types of data - for example the x and y position for a dot mark don't make a lot of sense if both are categorical or ordinal data.

For rendering the chart with SwiftUI, it seems pretty obvious this would be best done with SwiftUI's Canvas. I tried some experiments with creating charts using just shapes and offsets from the original SwiftUI offerings, but drawing, and aligning, axis values were very complicated and limited. The SwiftUI Canvas offers an API that provides a context (GraphicsContext) that can be updated and layered, as well as an explicit size so that positions can be explicitly determined up front.

I'd like to use iterative development and advancement to build this API, building ultimately to where the declaration can include the concept of charts that facet or stack from multiple declarations. The idea being that multiple charts could merge and align scales for the data mapping to provide nicely aligned views for exploring differences, progress, or sequences.

For mapping data from objects, using KeyPath seems to make a lot of sense. But handling, and leveraging, type safety while managing this declarative mapping is something I don't have a firm vision on how to achieve. The inspiring libraries are built in dynamically typed languages, so failure modes when mapping doesn't work need to be sorted out and handled. In addition to mapping values from objects (structs or classes), it's worthwhile to support a raw sequence of values, and a sequence of tuples, as input data for a chart.
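
A minimal sketch of how KeyPath could provide that type safety: a channel that only accepts key paths producing a Double, so mapping a String property is a compile-time error rather than a runtime failure. `Sample`, `QuantitativeChannel`, and the `value:` initializer are hypothetical names for illustration, and the closure-based initializer is one assumed way to cover raw values and tuples.

```swift
// Hypothetical element type for illustration.
struct Sample {
    let node: Int
    let latency: Double
}

// A channel restricted to Double-valued properties of Element.
struct QuantitativeChannel<Element> {
    let extract: (Element) -> Double

    // Mapping from objects: the key path's value type is checked by the compiler.
    init(_ keyPath: KeyPath<Element, Double>) {
        extract = { $0[keyPath: keyPath] }
    }

    // Mapping from raw values or tuples, where no key path is convenient.
    init(value transform: @escaping (Element) -> Double) {
        extract = transform
    }
}

let samples = [Sample(node: 1, latency: 0.5), Sample(node: 2, latency: 12)]
let fromObjects = QuantitativeChannel<Sample>(\.latency)
let fromTuples = QuantitativeChannel<(Int, Double)> { $0.1 }

let latencyValues = samples.map(fromObjects.extract)                // [0.5, 12.0]
let tupleValues = [(1, 0.5), (2, 12.0)].map(fromTuples.extract)     // [0.5, 12.0]
```

With this shape, `QuantitativeChannel<Sample>(\.node)` would not compile, because `node` is an Int rather than a Double.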

There's a common set of visual attributes that many marks use, and we may want to define additional channels as the other libraries have done. The starting attributes that are common include position (x, y), size, color, and possibly shape. I'm uncertain if it makes sense to describe the opacity of a mark separately from its color.

I don't have a full design worked out. My initial sense is the declaration will describe a pipeline of how to transform the data provided into the visual properties for one or more kinds of marks, and potentially configuration of the chart, or the chart components, within. The scales, if not explicitly defined, get inferred from the values in the data, and the range values that the scales map into are provided by the size of the canvas, adjusted for insets, margins, and additional visual elements. These additional elements include axes, ticks, and labels - and potentially a legend and/or caption for a chart.

For values that can be explicitly declared, or inferred with good defaults, the declaration looks like a good match for a modifier style. An example of which is scale - inferred from data by default, but potentially explicitly stated as a modifier to the visual property for a mark.

As a rough guide for what I'm envisioning for the format, I'm hoping to see something akin to:

Chart {
  BarMark(data: sourceData) {
    VisualChannel(x, \.node)    // an ordinal/Int value
    VisualChannel(y, \.latency) // a quantitative/Double value
      .logScale(0.1, 100)       // an explicit logarithmic scale for latency that maps from 0.1 to 100
  }
}

Foundation includes units and measurements (Measurement and Dimension) that may be interesting to leverage, especially for their pre-built locale-aware details of how to describe those different kinds of values. I don't want to constrain a chart to requiring a type of unit measurement for the labeling, but we might be able to add it as a declaration when mapping to take advantage of locale-aware converting and displaying of values.
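
For reference, a small sketch of what Foundation's measurement types provide today: unit-safe conversion plus a unit symbol that could feed an axis label. The values are made up for illustration. On Apple platforms, MeasurementFormatter can additionally produce locale-aware strings from a Measurement; it is left out here to keep the sketch portable.

```swift
import Foundation

// Measurement and UnitLength are Foundation types; the values are illustrative.
let distance = Measurement(value: 1500, unit: UnitLength.meters)

// Conversion is checked by the type system: only other UnitLength units are accepted.
let inKilometers = distance.converted(to: .kilometers)

// A plain label assembled from the converted value and the unit's symbol.
let axisLabel = "\(inKilometers.value) \(inKilometers.unit.symbol)"
```

A channel declaration could accept a unit such as `UnitLength.kilometers` as optional metadata, leaving charts without units unaffected.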

I think it's key to provide effective accessibility for the charts as well - although what that means may be quite specific to the chart itself. At the very minimum, the chart should support dynamic type, but I don't have a full sense of the opportunities for making the data displayed in a chart more accessible.

heckj · 2022-03-23T20:45:45Z

heckj
Mar 23, 2022
Maintainer Author

When I made a visualization of a histogram in another project, the visual parts of a chart (or at least an initial set of such) became reasonably clear - along with variables that I used to adjust the display of the chart. I drew this and took a picture, working out some of those specifics:

The hashed-out core is the area where the marks are displayed: bars, points, lines, etc. Around that rectangular region, there's a gap, the length of which I'm calling the chart's inset. Between the outside edge of the view and that inset line is the chart's margin.
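
The relationship between the view, the margin, and the inset can be sketched as two successive shrink operations. The `Rect` type and `shrunk(by:)` are hypothetical helpers for illustration, and applying the same margin and inset on all four sides is a simplifying assumption.

```swift
// Hypothetical geometry helper; margin and inset follow the description above.
struct Rect: Equatable {
    var x, y, width, height: Double

    // Shrinks the rectangle by the same amount on all four sides, never below zero size.
    func shrunk(by amount: Double) -> Rect {
        Rect(x: x + amount, y: y + amount,
             width: max(0, width - 2 * amount),
             height: max(0, height - 2 * amount))
    }
}

let view = Rect(x: 0, y: 0, width: 300, height: 200)
let margin = 20.0   // space between the view's edge and the inset line
let inset = 5.0     // gap between that line and the area where marks are drawn

let insetLine = view.shrunk(by: margin)
let plotArea = insetLine.shrunk(by: inset)   // x: 25, y: 25, width: 250, height: 150
```

The width and height of `plotArea` are what the x and y scales would use as their output ranges.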

There might be additional visual elements around the core of the chart - most frequently one (or more) visual axes, possibly with ticks and labels. For left-to-right (LTR) languages, a typical combination would be an axis on the left, and another on the bottom - but I think it makes sense to allow the chart to have an axis on any of the four sides, and for the chart symbols to be oriented against any edge.

If you're displaying an axis, you might have a rule across the length that it spans - or not. And you might have ticks, which mark recurring (typically regular) distances. If you have ticks, you start with 2 - one on each end - and go up in number from there. The exact number is half art, but a useful algorithm I've found works off the idea that stepping in values of 1, 2, or 5, or variations of those over powers of 10, makes for easily understandable - but not horribly dense - axis marks. If you're displaying categorical data, then you might have a tick half-way under the bar/line/etc that represents that category - or just a label. If you're drawing a tick, it has a length, tickLength, and could be directed inward towards the mark symbols, or outward. Any labels associated with the tick have a labelOffset from the rule, and an anchoring position (leading, center, trailing - or maybe one of the 9 unit locations, to accommodate any combination). Labels can also be rotated. The depth of the margin needs to be at least the space needed to display the axis, ticks, and labels - if they're displayed.
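
The 1, 2, 5 stepping idea can be sketched along the lines of the "Nice Numbers for Graph Labels" approach from Graphics Gems. This is an illustrative reimplementation, not the SwiftViz code; `niceNumber`, `niceTicks`, and the default of five desired ticks are hypothetical choices.

```swift
import Foundation

// Rounds a positive value to 1, 2, 5, or 10 times a power of ten.
func niceNumber(_ value: Double, rounding: Bool) -> Double {
    let exponent = floor(log10(value))
    let fraction = value / pow(10, exponent)   // between 1 and 10
    let nice: Double
    if rounding {
        nice = fraction < 1.5 ? 1 : fraction < 3 ? 2 : fraction < 7 ? 5 : 10
    } else {
        nice = fraction <= 1 ? 1 : fraction <= 2 ? 2 : fraction <= 5 ? 5 : 10
    }
    return nice * pow(10, exponent)
}

// Produces evenly spaced tick values that cover lo...hi using a nice step size.
func niceTicks(min lo: Double, max hi: Double, desiredCount: Int = 5) -> [Double] {
    guard hi > lo, desiredCount > 1 else { return [lo] }
    let span = niceNumber(hi - lo, rounding: false)
    let step = niceNumber(span / Double(desiredCount - 1), rounding: true)
    let first = (lo / step).rounded(.down) * step
    let last = (hi / step).rounded(.up) * step
    let count = Int(((last - first) / step).rounded())
    // Multiply rather than accumulate to avoid drift from repeated addition.
    return (0...count).map { first + Double($0) * step }
}

let ticks = niceTicks(min: 3, max: 97)   // [0, 20, 40, 60, 80, 100]
```

Note that the result extends the data's extent outward to the nearest nice values, so the scale's domain would typically be widened to match the first and last tick.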

The ticks of an axis can be extended across the chart itself, creating repeating rules to make it easier to see where values land - especially when the distance is wide.

In addition to the ticks, there could also be a legend - commonly used for showing multiple categories together as separate series using a line, area, or dot mark. Putting the legend outside the chart and its margins seems like the best bet, as you don't want it to obscure any detail within the chart, and it's hard to know "a priori" where chart data will be displayed.

Likewise a chart often has either a title or a caption - the difference there being primarily focus, size of text, and potentially the location of such text. I've also drawn that outside the margins of the chart.

This is just my initial noodling on the topic - whatever ends up being configurable needs to be completely identified, documented, and would become part of a technote I think, as well as included within the documentation.
Some of these - such as chart margin, inset, and legend, are attributes of the chart - influenced by the marks within the chart. Others (axis, labels, what ticks to use, etc) are specific to the mark, in that they're derived or defined by the scale you declare and set (or that the library infers when you provide the data to map to the mark).


heckj · 2022-03-24T19:29:53Z

heckj
Mar 24, 2022
Maintainer Author

Some of this thread has been pulled out and refined into a tech note: DesignGoals

1 reply

heckj May 26, 2022
Maintainer Author

A baseline is now operational within the project:

point, line, and bar marks are roughly implemented
support for margin and inset
optional axes (X and Y) are available and can be rendered, defined on each mark, with controls to position things.

Examples are being included in snapshot testing, which I'm using to capture and store PNGs of the rendered images, which turns out to be one of the more effective ways to debug combinations of things. I expect that snapshot library to grow quite a bit over time.

To include a visual example:

Chart(margin: 10) {
    LineMark(data: self.data,
             x: QuantitativeVisualChannel(\.xValue),
             y: QuantitativeVisualChannel(\.yValue))
    .xAxis(tickLength: 5, tickPadding: 8)
    .yAxis(tickLength: 5, tickPadding: 8)
}

results in:
