type
Post
Created date
Jul 28, 2022 01:28 PM
category
Data Science
tags
Decision Making
status
Published
TOB by week
TOB by topic
Week 1 (Textbook chapters 1.1 to 1.6, 2, 5)
Data visualisation definition
data types
Week 2
Table idioms 1
Textbook chapter 7 (pages 146–148, 150–153, 155–157, 168–170), scans from Kirk 2019 on Moodle.
Week 3
Required reading by National Geographic on Moodle
Week 4
Colour,
Required reading about gestalt principles on Moodle and blog posts by Lisa
Charlotte Rost about use of colour for data visualisation.
layout,
typography (including label placement)
Week 5 Idioms for networks and trees
Textbook chapter 9 and scans from Kirk 2019 on Moodle.
Week 6 Table idioms 2 and repeating patterns
Required reading about radar charts and repeating patterns and scans from Kirk
2019 on Moodle.
Week 7
map idioms
(dot maps, proportional symbol maps, choropleth
maps, area cartograms, flow maps).
Required readings by axismaps and scans from Kirk 2019 on Moodle.
Week 8
Scalar field visualisation/terrain visualisation
(contour lines, shaded relief, colour mapping, line integral convolution), web maps
Week 9
Data classification, interactive visualisation
Textbook chapter sections 6.5 and 6.7,
required readings by axismaps and Lisa Charlotte Muth on classification on Moodle.
Week 10
Animation for data visualisation and tools for creating visualisations
Week 11 Immersive data visualisation
Week 1
In week one, I learnt about the fundamental data types, marks and channels.
Data visualisation definition (There is no universal definition!)
definition 1
"Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively." - Munzner, Visualization Analysis & Design
definition 2
Bernie uses the framework of what, why and how to introduce them.
Data types (What)
Generally, you need to answer three questions:
What are the types of attributes?
1. Categorical, 2. Ordered → (Ordinal, Quantitative)
If the data attributes can be ordered, what are the types of order direction?
What are the types of dataset?
- Table (easy to explain: cells, columns and rows)
- Tree (with nodes and links)
- Spatial
- Fields
- Unlike a table, a field stores continuous values.
- Geometry
- Items have explicit positions in space, and different positions can hold different values.
More detail can be found via this link.
Analyse and Action (Why)
Your action needs a clear goal. You want to:
- analyse,
- search,
- or query.
Each action (any of the three above) also needs a target to act on, which comes from:
One of the most important actions is Analyse.
By asking "why" repeatedly, we get to the first principles of analysing things:
1. Why analyse?
Essentially, you want to analyse to get the insight.
2. Why bother getting the insight?
Two reasons: to consume and to produce.
3.1. Why consume?
- (Discover) new information that is not yet understood, that is, to verify or generate a hypothesis
- Or (present) the situation (like making a dashboard)
3.2. Why produce?
- Generate new output
- Derive (like taking the average of some values) or transform the data (making new attributes)
Mark and Channels (How)
Marks :
Note :
- Bar chart uses Line, not Area.
- Area chart and Treemap use Area
Week 2
This week is to understand Visualisation idioms for table data.
Essentially, you want to know which type of chart to use; chart types are called IDIOMs here. The choice of idiom is based on:
- attribute type and table layout
- dataset type
- action and target
- shape
As mentioned, we analyse each type of chart according to the What, Why, How framework.
As an example, we examine a scatterplot.
What:
- Find out how many attributes there are and what types they are
Why:
- The purpose of the graph
How :
- How do you present the chart based on Marks and Channels
Here is the table for looking up how many attributes correspond to each kind of chart
What Why How framework
Scatterplot matrix (SPLOM)
Document 1
stacked bar chart
Document 2
Pie chart
Document 3 & 4
Week 5 Idioms for networks and trees
Using Gestalt : Connection
Using Gestalt : Enclosure
Week 6 Content : Idioms for Tables
Venn diagram
Radar chart
Parallel Coordinate Plot (High-dimensional)
Streamgraph
ISOTYPE
Without What Why How
Proportional Symbol Chart
Word Cloud
Dot Matrix Chart
Waffle Chart
Histogram
Heat Map
Box Plot
Slope Chart
Bump Chart
Spiral plot
Week 7 Map projections
Week 3
Data-Ink Ratio vs. ChartJunks
The ratio cannot really be calculated in practice; it generally serves as a guideline:
- A high ratio is good: it means most of the ink is used to describe the data.
- A low data-ink ratio means many elements do not describe the data directly and are somewhat distracting.
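For reference, the data-ink ratio is usually written in the form below, following Tufte's definition. It is meant as a way of thinking about a chart rather than a number to measure precisely.

```latex
\text{data-ink ratio} = \frac{\text{data-ink}}{\text{total ink used to print the graphic}}
```

The value lies between 0 and 1; the closer to 1, the larger the share of ink that presents data.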
Information Density
What is ChartJunk?
- Redundant elements of a chart that distract from understanding
- Elements that do not add value to understanding the data
ChartJunk Elements
- heavy or dark grid lines,
- unnecessary text,
- inappropriately complex or gimmicky font faces,
- ornamented chart axes and display frames,
- pictures, backgrounds or icons within data graphs,
- ornamental shading, and
- unnecessary dimensions.
Example 2 of ChartJunk
BUT ChartJunk is not necessarily bad, as it can:
- increase memorability, and
- create a positive attitude towards an artifact.
In conclusion
We need to find a balance between decorative graphical design and a high data-ink ratio.
Elements of Storytelling
How to Lie with Visualisations
As readers, our perception is often manipulated by the content we consume, and visualisations are one medium for this.
There are several ways to avoid being tricked:
5. Problems of charts with two axes
Five Design Sheet Methodology
- Helps us structure our approach to ideation.
Stages : Brainstorm → Initial Design → Realisation Design
Realisation Design
Take the best of the previous designs and explore in greater detail.
Focus on:
- Description of algorithms / techniques
- Dependencies: e.g. software libraries, compatibility, etc.
- Estimate time and effort to build the solution
- Specific requirements of materials, hardware (desktop, tablet, phone, etc.)
Week 4
Color
Colour Models
There are two common alternative representations of the RGB colour model:
- HSV (for hue, saturation, value)
- HSL (for hue, saturation, lightness)
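To see how the same RGB colour looks in both representations, here is a small sketch using Python's standard `colorsys` module. The helper name `describe` and the sample colours are my own choices for illustration.

```python
import colorsys

def describe(r, g, b):
    """Convert an RGB triple (0-255 per channel) to HSV and HSL."""
    rf, gf, bf = r / 255, g / 255, b / 255
    h, s, v = colorsys.rgb_to_hsv(rf, gf, bf)
    # Note: colorsys returns HLS order (hue, lightness, saturation).
    h2, l, s2 = colorsys.rgb_to_hls(rf, gf, bf)
    return {
        "hsv": (round(h * 360), round(s, 2), round(v, 2)),
        "hsl": (round(h2 * 360), round(s2, 2), round(l, 2)),
    }

pure_red = describe(255, 0, 0)      # hsv (0, 1.0, 1.0), hsl (0, 1.0, 0.5)
pale_red = describe(255, 128, 128)  # hsv (0, 0.5, 1.0), hsl (0, 1.0, 0.75)
```

Both models share the hue, but "saturation" means something different in each: the pale red is half saturated in HSV yet fully saturated in HSL, where the paleness shows up as higher lightness instead.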
Colour Spaces
How to effectively use color in Data Visualisation
Rules are referenced from here
- Rule : If you need more than seven colors in a chart, consider using another chart type or to group categories together.
- Rule : Consider using the same color for the same variables
- Rule : Make sure to explain to readers what your colors encode
- Rule : Consider the color grey as the most important color in Data Visualisation
- Rule : Make sure your contrasts are high enough
- Rule : Consider where your colors appear in relation to each other.
- Rule : Use intuitive colors
- Rule : Use light colors for low values and dark colors for high values
- Rule : Don’t use a gradient color palette for categories and the other way round
- Rule : Use lightness to build gradients, not just hue
- Rule : Consider using two hues for a gradient, not just one
- Rule : Consider using diverging color gradients.
- Rule : Consider color-blind people
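For the "contrasts are high enough" rule, one common way to put a number on contrast is the WCAG contrast ratio. That measure is my assumption here, not something the referenced rules prescribe, and the function names are made up for this sketch.

```python
def _linear(channel):
    """Convert one sRGB channel (0-255) to linear light, per WCAG 2.x."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    r, g, b = (_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(colour_a, colour_b):
    """Ratio from 1 (identical) to 21 (black against white)."""
    l1, l2 = relative_luminance(colour_a), relative_luminance(colour_b)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)

black_on_white = contrast_ratio((0, 0, 0), (255, 255, 255))
grey_on_white = contrast_ratio((200, 200, 200), (255, 255, 255))
```

Black on white gives the maximum ratio of about 21, while light grey on white lands well below the 4.5 that WCAG asks for normal text, so a light grey label on a white background would be hard to read.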
Gestalt Principles of Visual Perception
What is Gestalt?
Gestalt is German for "form" or "shape".
Why Gestalt Principles of Visual Perception
When identifying which visual elements are signal (the information we want to communicate) and which might be noise (clutter), consider these principles.
There are 6 Gestalt principles to follow :
Visual Hierarchy with Figure-Ground
- Graphical representation in which elements are ranked according to their importance.
- Important elements are graphically emphasised and less important elements are de-emphasised.
- Representation :
- Bold, Italic, Saturation, color text
- Visual depth for accentuating one object over another
- Perception : one object stands in front of another and appears to be closer to the reader.
What does Figure-Ground mean ?
- Figures:
- important objects, which become objects of attention and stand out from the background
- Grounds:
- things less important, the background.
Layout
There are a few rules for designing a layout; they sound simple but are essential.
Typography
Things to evaluate Typography
- Is it READABLE?
- Is it Aesthetically appealing?
Before diving into other concepts, a number of terms need to be understood first:
Text = what you typed
Character = an abstract unit of text (a letter, digit or symbol), identified by a numerical code
Glyph = a visual symbol that represents a character
Character encoding = the mapping between characters and their numerical codes
Style = Italic, Bold and so on
Font = a digital file containing the geometry of all glyphs in one style
Typeface = a set of related fonts (often mistakenly called a font)
Typography = the art of setting text (READABLE & AESTHETIC)
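The difference between a character, its numerical code and an encoding is easier to see in code. This is a small sketch with a sample word of my own choosing; glyphs and fonts cannot be shown this way, because they depend on how the text is drawn.

```python
text = "Caf\u00e9"  # the word "Café"; \u00e9 is the character é

# Each character has a numerical code (its Unicode code point).
code_points = [ord(ch) for ch in text]

# A character encoding turns those codes into bytes for storage.
utf8_bytes = text.encode("utf-8")      # é takes two bytes here
latin1_bytes = text.encode("latin-1")  # é takes one byte here
```

The text has four characters in both cases, but the stored bytes differ with the encoding: five bytes in UTF-8 and four in Latin-1.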
How Many Typefaces?
• Generally use a single typeface, but vary weight, size, case, italic/regular, and colour to create a visual hierarchy / figure-ground.
• If you really need to, use a maximum of two typefaces, but make sure they go well together (this is difficult to get right!).
- Generally combine one serif and one sans serif type family.
Alignment
Hierarchy
- Important notes can be bold.
- Other text and colour-coded annotations can use a normal font, and special terms can use italic style.
Week 5 Idioms for networks and trees
How to Arrange Networks and Trees
Using Gestalt : Connection
Node-link diagram
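Force-directed graph drawing is a family of layout algorithms that makes a node-link diagram more readable and visually appealing: linked nodes are pulled together while all nodes push each other apart.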
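To make the idea concrete, here is a toy force-directed layout in plain Python. The forces follow the Fruchterman-Reingold style (repulsion k^2/d, attraction d^2/k), which is my assumption since the notes do not name a specific algorithm. The function name, parameters and the small path graph are all made up for illustration.

```python
import math

def force_directed_layout(nodes, edges, iterations=200, k=1.0):
    """Toy force-directed layout; returns {node: (x, y)}."""
    n = len(nodes)
    # Deterministic start: spread the nodes evenly on a unit circle.
    pos = {v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, v in enumerate(nodes)}
    temperature = 0.1  # maximum distance a node may move per iteration
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsion: every pair of nodes pushes apart with force k^2 / d.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f
        # Attraction: linked nodes pull together with force d^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f
        # Move each node along its net force, capped by the temperature.
        for v in nodes:
            dx, dy = disp[v]
            d = math.hypot(dx, dy)
            if d > 0:
                step = min(d, temperature)
                pos[v][0] += dx / d * step
                pos[v][1] += dy / d * step
        temperature *= 0.99  # cool down so the layout settles
    return {v: tuple(p) for v, p in pos.items()}

nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]  # a simple path a-b-c-d
layout = force_directed_layout(nodes, edges)
```

In the resulting layout the linked pair a-b ends up closer together than the unlinked ends a and d, which is the visual effect the algorithm is after. Real tools use more refined versions of the same idea.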
Alluvial diagram
A series of stacked bars connected with curved lines
Sankey diagram
Nodes can be placed anywhere
Chord diagram
Shows flows between multiple nodes, which are arranged in a circle
Chord Diagram with Bundling
Using Gestalt : Enclosure
Shows hierarchical relationship
hierarchical relationships with multiple categories,
Good for medium size network
Overall
Excel Version
Week 6
Resources:
- Visualizing Patterns on Repeat
- All the Biomass of Earth, in One Graphic
- Radar: More eval than pie?
Techniques for repeating things
Periodic line chart
OVERLAYING TIME FRAMES
AGGREGATION
SMALL MULTIPLES
ANIMATION
Week 7
▪ Use Projection Wizard to select a map projection
Map Projections
Map Projection Distortion
- The relative areas of objects and/or the angles are distorted.
There are 2 types of projection
1. Mercator
Example of an angle-preserving / conformal projection.
Advantage
- Used for naval navigation, where bearings are measured on a map showing a small section of the world.
Disadvantages
- Area is hugely inflated towards the poles.
- Not useful for showing the entire world.
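How much Mercator inflates area towards the poles can be checked with a few lines. This sketch assumes the spherical form of the projection; the function names are my own.

```python
import math

def mercator_y(lat_deg, radius=1.0):
    """Vertical map coordinate of a latitude on a spherical Mercator map."""
    phi = math.radians(lat_deg)
    return radius * math.log(math.tan(math.pi / 4 + phi / 2))

def mercator_area_scale(lat_deg):
    """Factor by which areas are inflated relative to the equator."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

scale_at_equator = mercator_area_scale(0)
scale_at_60 = mercator_area_scale(60)
scale_at_80 = mercator_area_scale(80)
```

At 60° latitude areas are drawn about four times too large, and at 80° more than thirty times, which is why Greenland looks so big on a Mercator world map.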
2. Map Projections for World Maps
Example of an area-preserving (or equal-area) projection.
Advantages
- Useful for showing the entire world.
- Useful when the size of areas is compared.
Disadvantages
- Angles (and shapes) are increasingly distorted towards the border of the map.
Types of Map Projections for World Maps
Map Idioms
Dot maps
Alternatives
– choropleth or bin map by counting points per area
– convert to scalar field and use isocontours or colour mapping
- Don't work well with strongly varying density
Proportional symbol maps
Design Principles for PSM
Choropleth Maps
Color tips for quantitative attributes
- Primarily luminance changes: the greater the value, the darker.
- Slight change in hue is possible, to increase the number of distinguishable colours. Change hue for diverging distributions.
- Change in saturation is possible, but should not be main variable.
Delusion
Granularity
Remember to NORMALISE data!
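A quick sketch of what normalising means for a choropleth map: colour the regions by a rate instead of a raw count. The regions and numbers below are invented purely for illustration, and "per 100,000 people" is just one possible choice of denominator.

```python
# Hypothetical data: raw counts and population per region.
regions = {
    "A": {"cases": 500, "population": 1_000_000},
    "B": {"cases": 200, "population": 100_000},
}

def per_100k(count, population):
    """Rate per 100,000 people."""
    return count * 100_000 / population

rates = {name: per_100k(r["cases"], r["population"])
         for name, r in regions.items()}
highest_count = max(regions, key=lambda name: regions[name]["cases"])
highest_rate = max(rates, key=rates.get)
```

Region A has the higher raw count, but region B has the higher rate, so a map of raw counts would mostly show where more people live.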
Week 8 - Visualisation of geographic fields
Isolines or Contour Lines
Iso = Equal (Greek)
Isocontours: contour interval
Isocontours = contour lines = isopleth (interval)
Comparison of all mapping of scalar field/terrain
Week 10 - Animation for Data Visualisation and Tool for Creating Visualisations
“Overview first, zoom and filter, then details-on-demand.” - Ben Shneiderman (1996)
Visualisation tools :
Link 1 : The Chartmaker Directory
Animation for data visualisation
Latin animare = “to bring to life”
A sequence of static graphic depictions (frames) whose content, when shown in rapid succession, appears to move fluidly.
There are 2 types of animation :
Potential Pitfalls of Animated Visualisations
- Directing user attention
- change blindness
- Difficulty in detecting changes in scenes.
- cognitive load
- length of animation (running time)
Week 11 - Immersive Data Visualisation
Mixed reality
VR for immersive visualisation
Monash Immersive Analytics Lab
Dunno
Week 3
FAQ
W1
Difference between Table and Field
difference between color channels
Which channel is shared by the identity and magnitude channels?
There is a channel that appears in both of the two ranked lists of channels:
position.
In the magnitude channels, we have "position on a common scale", or "position on an unaligned scale". In the identity channels, we have "spatial region".
They are all related to the "spatial position" in the graph, which means spatial position is the only channel that is both a magnitude channel and an identity channel. It is also the most effective one.
How to identify the type of dataset
Your example - "For example, a network can be represented as an adjacency matrix, or a field can be represented as a high-dimensional table, etc. If so, how do we definitively distinguish among them?"
We do not distinguish dataset types by how the data is stored or represented. We should consider the structure of the data.
Why not the format?
- 1) Most dataset types can be stored in a table or in a relational database, e.g. a network dataset can be stored as two variables: "user_1", "user_2". Each row of data with two users means they are connected.
- 2) A dataset stored in a json or xml format could also be table data.
Why not the representation?
- for example, the adjacency matrix you mentioned is basically a 2D heatmap. So a heatmap could possibly be used to represent a table dataset or a network dataset.
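The two points above can be seen in one small sketch: the same network stored as table rows of ("user_1", "user_2") and then shown as an adjacency matrix. The user names are invented, and treating the links as undirected is my assumption.

```python
# Hypothetical table rows: each row (user_1, user_2) means the two are connected.
rows = [("alice", "bob"), ("bob", "carol")]

users = sorted({user for row in rows for user in row})
index = {user: i for i, user in enumerate(users)}

# Build the adjacency matrix; links are assumed to be undirected.
matrix = [[0] * len(users) for _ in users]
for user_1, user_2 in rows:
    matrix[index[user_1]][index[user_2]] = 1
    matrix[index[user_2]][index[user_1]] = 1
```

The storage is a table and the picture is a matrix, yet the dataset is still a network, because what the rows describe are links between items.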
So how to differentiate the data types?
- you need to first understand the data types: items, attributes, links, positions, and grids.
- then based on what data types are in the dataset, you can decide the dataset type: e.g., if there is a link between the items in the dataset, then this might be a network or tree dataset; if there are geo-positions, then this might be a geometry or a field dataset.
What are Fields, spatial fields and grid
Grid is related to cells:
Suppose that we want to show the distribution of NBA shot locations - shown below. To get this, we first need to divide the space into a set of cells (either square cells, i.e. a grid, or hexagonal cells); then we can count how many shots players have made inside each cell.
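The cell-counting step can be sketched like this for square cells. The shot coordinates and cell size are made up; real data would come from a shot log.

```python
from collections import Counter

def bin_points(points, cell_size):
    """Count points per square grid cell; returns {(col, row): count}."""
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical shot locations (x, y) on the court.
shots = [(0.5, 0.5), (0.9, 0.2), (1.5, 0.1), (2.2, 3.7)]
counts = bin_points(shots, cell_size=1.0)
```

Each cell then gets a colour based on its count, which turns the scattered points into a heatmap-like picture.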
Fields and spatial fields:
These are the names for dataset types that are based on cells.
What is part-to-whole relation (in pie chart) ?
A part-to-whole relation means the parts together make up one whole, e.g. these four types of ingredients all belong to the pancake.
- Author:Jason Siu
- URL:https://jason-siu.com/article%2F77924d0f-292a-4b1b-896f-f212467cfd4f
- Copyright:All articles in this blog, except for special statements, adopt BY-NC-SA agreement. Please indicate the source!