Have you heard of the term "onomatopoeia viz"? According to the Merriam-Webster dictionary, onomatopoeia is "the naming of a thing or action by a vocal imitation of the sound associated with it (such as buzz, hiss)." In data visualization, Mike Freeman coined the phrase "onomatopoeia viz" to describe data visualizations that resemble the actual topic. Inspired by Mike Freeman, as well as by Alli Torban's beeswarm plot about... bees, I decided to journey into the world of onomatopoeia vizes myself and create a violin plot about violinists.
Violin Plots Explained
At a high level, violin plots show the shape of the entire distribution in a dataset. Each "violin" represents a group. The width of each "violin" at a given value shows how frequently data points occur around that value: wider sections mean higher frequency, while thinner sections mean lower frequency. The length of each "violin," on the other hand, shows the range that the group's values span. Easy, right?
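To make the "width equals frequency" idea concrete, here is a minimal sketch of one common way the outline of a violin is computed: a Gaussian kernel density estimate (the approach used by libraries such as matplotlib and Seaborn). Whether RAWGraphs computes its violins this way is an assumption on my part, and the values, the bandwidth, and the function name below are made up purely for illustration.

```python
import math


def gaussian_kde(values, bandwidth):
    """Return a function that estimates the density of `values` at a point x."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Each data point contributes a small bell curve centered on itself;
        # the violin's width at x is proportional to the sum of those curves.
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


# Made-up values: most cluster around 2, one sits far away at 9.
values = [1.5, 1.8, 2.0, 2.1, 2.2, 2.5, 9.0]
density = gaussian_kde(values, bandwidth=0.5)  # bandwidth is an arbitrary choice

# Relative width of the violin at three positions along the value axis.
for x in (2.0, 5.0, 9.0):
    print(f"value {x}: relative width {density(x):.4f}")
```

The violin comes out widest near 2, where the points cluster, almost vanishes around 5, where there is no data, and shows a small bump at 9 for the lone outlier. A larger bandwidth would smooth the outline, while a smaller one would make it bumpier.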
If you've ever created a box plot or a beeswarm plot, you might immediately spot the similarities between these three chart types. Box plots, beeswarm plots, and violin plots all show how the data points in a group are distributed. Any of these three graph types can be the right choice if you're conducting exploratory analysis. Violin plots and beeswarm plots are great at showing the overall distribution of a dataset, as well as the position of each individual point. But unlike a beeswarm plot, where the dots are nudged apart so that they don't overlap, a violin plot lets the data points overlap, and the distribution of the data takes the shape of a "violin."
The Data
The data file that I used to create my violin plot about violinists (or my onomatopoeia viz) came from Adaptistration, a website about music and orchestra management. The data captures the compensation of Music Directors, Executives, and Concertmasters in the US in 2020, alongside Total Expenditure figures. A total of 48 US symphonies and philharmonics were included.
Below is a view of the original data set.
Source: Adaptistration
As with most data sets, this one was not perfect. I had to spend 2-3 hours finding, cleaning, and re-organizing my data before I was ready to move to the next step: creating the actual violin plot.
You can download the final data file here.
How I Built My Violin Plot
To create my violin plot, I used RAWGraphs and Figma. If you're not familiar with RAWGraphs, it is an intuitive and versatile data visualization tool that was conceived "for designers and vis geeks." RAWGraphs includes templates for violin plots, as well as templates for other more advanced graph types (e.g., beeswarm plots, Sankey diagrams, circle packing). Figma is a vector graphics editor and prototyping tool. You can export the graph in .svg (scalable vector graphics) format from RAWGraphs and then further edit it in Figma.
Here's what my original graph from RAWGraphs looked like versus my final graph, edited in Figma.
Original violin plot created in RAWGraphs:
Final violin plot, edited in Figma:
RAWGraphs is not the only tool that you could use to create a violin plot. Flourish and Tableau are two other non-coding tools that I often recommend. Flourish requires zero coding skills. In Tableau, you'd need to create a few calculations, as violin plots are not available as a standard template. You can find a tutorial here.
If you are a skilled coder, then here are a few tutorials on how to create a violin plot in matplotlib (Python), d3.js (JavaScript), Plotly (Python), and Seaborn (Python).
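For a taste of the coding route, here is a minimal sketch using matplotlib's `violinplot`. It assumes matplotlib is installed. The three groups are randomly generated placeholder values, not the Adaptistration data, and the helper names, group labels, and output file name are all hypothetical.

```python
import random


def make_synthetic_groups(n=48, seed=0):
    """Return three right-skewed samples of made-up values (NOT real salary data)."""
    rng = random.Random(seed)
    # Hypothetical (mu, sigma) pairs, chosen only so the violins differ in shape.
    params = {"Group A": (0.0, 0.9), "Group B": (0.0, 0.5), "Group C": (-0.5, 0.4)}
    return {
        name: [rng.lognormvariate(mu, sigma) for _ in range(n)]
        for name, (mu, sigma) in params.items()
    }


def draw_violins(groups, path="violins.svg"):
    """Draw one violin per group and save the figure to `path`."""
    # Imported here so the data helper above works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no display needed
    import matplotlib.pyplot as plt

    labels = list(groups)
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.violinplot([groups[name] for name in labels], showmedians=True)
    ax.set_xticks(range(1, len(labels) + 1))  # violins sit at positions 1..k
    ax.set_xticklabels(labels)
    ax.set_ylabel("Value (synthetic units)")
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `draw_violins(make_synthetic_groups())` should write `violins.svg` to the working directory. Saving as SVG mirrors the RAWGraphs workflow described above, so the file could then be opened in a vector editor such as Figma for styling. To plot real data, replace the dictionary with one list of values per group.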
What We Can Learn From This Graph
My violin plot shows the annual salaries of Music Directors, Executives, and Concertmasters. What we can spot right away is that most values are clustered around the median across all three professions. However, when looking at Music Directors, some of them (likely from top US symphonies and philharmonics) can earn millions of dollars per year. That's not the case with Executives and especially not with Concertmasters.
While data points overlap in a violin plot, my main goal was to show the shape of the data (or of the violin) for each group. For Music Directors, we can see that the violin shape is elongated, suggesting a wider distribution and a few outliers. For Executives, the violin shape is wider than for the other two groups, in particular below $0.5MM, which means that most Executives earn below $0.5MM. Finally, for Concertmasters, we can see that there are fewer outliers than for Music Directors, as Concertmasters are unlikely to earn millions of dollars.
Closing Thoughts
Next time you find yourself working with a data set that shows the distribution, I encourage you to explore violin plots. I really enjoyed exploring this graph type. Violin plots are an intuitive alternative to box plots and can help engage your audience with the data being presented. Whether you're choosing RAWGraphs, Flourish, Tableau, Python, or JavaScript to create a violin plot, try out Figma as well. It's a great, easy-to-learn tool that can make your visualizations stand out and shine in the crowd.
A violin plot about violinists: that's a definite way of catching the audience's attention!