
Data Visualization Basics

A well-chosen chart compresses thousands of numbers into a single image your brain can read in seconds. A bad chart hides patterns or, worse, suggests patterns that are not there. Visualization is one of a data scientist's most important skills, both for thinking through a problem and for explaining results to a non-technical audience. The choice of chart depends on what kind of data you have and what question you are asking.

A short field guide. Use a histogram or density plot for one numeric variable to see its distribution. Use a bar chart for one categorical variable to compare counts. Use a scatter plot for two numeric variables to see if they are related. Use a line chart for a numeric variable that changes over time. Use a box plot to compare distributions across groups. Avoid pie charts for anything other than a few categories with very different sizes; humans are bad at comparing angles.
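The field guide above can be read as a small lookup from variable types to chart types. The sketch below is one way to write that down in Python; the function name `suggest_chart` and its parameters are hypothetical, chosen for illustration, and the rules are only the ones listed in the guide.

```python
from typing import Optional


def suggest_chart(first: str, second: Optional[str] = None, *,
                  over_time: bool = False) -> str:
    """Suggest a chart type following the field guide (illustrative helper).

    `first` and `second` are variable kinds: "numeric" or "categorical".
    `over_time` marks a single numeric variable measured across time.
    """
    kinds = {"numeric", "categorical"}
    if first not in kinds or (second is not None and second not in kinds):
        raise ValueError("kinds must be 'numeric' or 'categorical'")

    if second is None:
        if first == "numeric":
            # One numeric variable: a trend over time, or a distribution.
            return "line chart" if over_time else "histogram or density plot"
        # One categorical variable: compare counts.
        return "bar chart"

    if first == "numeric" and second == "numeric":
        # Two numeric variables: look for a relationship.
        return "scatter plot"

    if {first, second} == {"numeric", "categorical"}:
        # A numeric variable split by group: compare distributions.
        return "box plot"

    # Two categorical variables are not covered by the guide.
    return "no suggestion in this guide"


print(suggest_chart("numeric"))                  # histogram or density plot
print(suggest_chart("numeric", "numeric"))       # scatter plot
print(suggest_chart("numeric", over_time=True))  # line chart
```

A helper like this is no substitute for judgment, but writing the rules out shows that each suggestion follows from two things: the kinds of variables involved and the question being asked.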

You want to compare the test score distributions of three different classes. Which chart is most appropriate?

Three quick rules for honest visualization. Start the y-axis at zero for bar charts to avoid exaggerating differences. Label axes clearly with units. Avoid 3D effects, which distort the data your viewer is trying to read. The deeper rule is intent: ask yourself what question the chart is meant to answer, and design the chart to answer that one question as cleanly as possible. Anything else is decoration.
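To see why the zero-baseline rule matters for bar charts, it helps to compare the ratio of the values with the ratio of the bar heights actually drawn. The sketch below uses made-up values (52 and 50) and a hypothetical helper named `apparent_ratio`; it is a small arithmetic illustration, not a plotting routine.

```python
def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn heights of two bars when the y-axis starts at `baseline`.

    A bar for value v is drawn with height v - baseline, so a baseline
    above zero shrinks both bars by the same amount and inflates their ratio.
    """
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


# Hypothetical values for two groups.
honest = apparent_ratio(52, 50)                  # y-axis starts at 0
truncated = apparent_ratio(52, 50, baseline=48)  # y-axis starts at 48

print(honest)     # 1.04
print(truncated)  # 2.0
```

With the axis starting at zero, the taller bar is 4% taller, which matches the data. With the axis starting at 48, the same bar is drawn twice as tall as its neighbor, even though the underlying values differ by only 4%.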

Improve a Bad Chart

Find a chart in a recent news article that strikes you as misleading or hard to read. Identify three specific problems, such as missing axis labels, a truncated axis, unnecessary 3D effects, or too many colors. Redraw it on paper using a chart type and design that answers the underlying question more honestly. Compare your version to the original and explain each change you made.

Edward Tufte, a statistician known for his books on chart design, argued that a good visualization should above all show the data. Any visual element that does not help the viewer understand the data is noise. Strong data scientists keep their charts honest, focused, and unflashy.
