Machine Learning & Deep Learning

⏱ About 10 min · 10 XP

Sounds and Words as Data

Close your eyes for a moment and listen. Maybe you hear traffic outside, a fan humming, someone talking. Your brain knows what each sound is because you have heard similar sounds many times before. Machines can learn to recognize sounds and words too — but first those sounds and words have to be saved as data.

Saving Sounds as Data

Sound is made of vibrations in the air — waves that travel into your ears. A microphone can capture those waves and turn them into numbers that a computer can save and read. Think of a recording of your voice. The microphone listens very fast — thousands of times per second — and writes down a number each time. That long list of numbers IS the recording. It is sound data. A machine can study millions of voice recordings and learn to understand spoken words. That is how voice assistants work — they have been trained on enormous amounts of sound data.
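To see what "a long list of numbers" can look like, here is a small Python sketch. It does not use a real microphone. Instead it pretends to listen to one steady tone and writes down a number each time it listens. The names `SAMPLE_RATE`, `FREQUENCY`, and `record_tone` are made up for this example, and the values 8000 and 440 are just assumptions chosen to keep the list short.

```python
import math

SAMPLE_RATE = 8000  # assumed: how many times per second we "listen"
FREQUENCY = 440     # assumed: how many times per second the tone vibrates


def record_tone(seconds):
    """Return a list of numbers describing a pretend pure tone."""
    count = int(SAMPLE_RATE * seconds)
    return [
        math.sin(2 * math.pi * FREQUENCY * n / SAMPLE_RATE)
        for n in range(count)
    ]


samples = record_tone(0.5)
print(len(samples))   # 4000 numbers for half a second of sound
print(samples[:3])    # the first three measurements
```

Real recordings usually listen even faster than this, so a whole song becomes millions of numbers.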

The Big Idea

Sound is turned into numbers by a microphone. Those numbers are saved as data. Machines study millions of sound recordings to learn how to understand voices and music.

Words written on a page are data too — text data. Every word you type in a message, every sentence in a book, every caption under a photo — all of that is text data that a machine can study. Machines trained on billions of words can learn a lot about language. They learn that 'happy' and 'joyful' mean similar things. They learn that a sentence usually ends with a period. They learn what kinds of words go together. The machine does not understand words the way you do — but it finds patterns in the data, and those patterns are very useful.
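One simple way to picture "finding which words go together" is to count pairs of neighboring words. The sketch below does that for one made-up sentence. Real language systems are far more complicated, so treat this only as an illustration; the sample text and the variable names are assumptions made for this example.

```python
from collections import Counter

# A tiny made-up piece of text data.
text = "the cat sat on the mat the cat ate the fish"
words = text.split()

# Pair each word with the word that comes right after it, then count the pairs.
pairs = Counter(zip(words, words[1:]))

print(pairs.most_common(1))  # [(('the', 'cat'), 2)]
```

The pair "the cat" shows up twice, so a program could guess that "cat" often follows "the" in this text. With billions of words, patterns like this become much richer.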

Match each type of data to a real-world example of it.

Terms

Sound data
Text data
Image data
Number data

Examples

The daily temperature written on a chart
A recording of someone singing a lullaby
Every page in a library book
A photograph of a sunset


Here is something cool: machines can work with all these kinds of data together. A video is image data AND sound data at the same time. A social media post might have text data, image data, and even the time it was posted — number data. The more kinds of data a machine can read, the more it can learn about the world.
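As a rough sketch, a social media post could be stored as one record that holds several kinds of data side by side. The `Post` class and its field names below are invented for this example and do not come from any real app.

```python
from dataclasses import dataclass


@dataclass
class Post:
    caption: str      # text data
    pixels: list      # image data: a tiny grid of brightness values (0 to 255)
    hour_posted: int  # number data: the hour of the day, 0 to 23


post = Post(
    caption="Sunset at the beach",
    pixels=[[255, 200], [120, 40]],
    hour_posted=19,
)

print(post.caption)      # the words
print(post.pixels)       # the picture, as numbers
print(post.hour_posted)  # when it was posted
```

A program that reads all three fields has more clues to work with than one that reads only the caption.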

Your Favorite Song Is Data

Any song you love has been saved as sound data — a long list of numbers that your phone or speaker turns back into sound waves for your ears. Music and machines have more in common than you might think!
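Here is a small sketch of that round trip using Python's built-in `wave` module. It saves a one-second pretend tone in the WAV format (one common way to store sound), then reads the numbers back. The tone, the sample rate of 8000, and the variable names are assumptions for this example. The "file" lives in memory, so nothing is written to disk.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # assumed: measurements per second

# One second of a pretend tone, stored as whole numbers.
tone = [
    int(20000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
    for n in range(SAMPLE_RATE)
]

# Save the numbers as sound data in an in-memory WAV file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)            # one channel (mono)
    out.setsampwidth(2)            # each number takes 2 bytes
    out.setframerate(SAMPLE_RATE)
    out.writeframes(struct.pack("<%dh" % len(tone), *tone))

# Read the same numbers back, the way a music player would before playing them.
buffer.seek(0)
with wave.open(buffer, "rb") as song:
    frame_count = song.getnframes()
    raw = song.readframes(frame_count)
    restored = list(struct.unpack("<%dh" % frame_count, raw))

print(restored == tone)  # True: the saved numbers come back unchanged
```

A speaker does the last step that this sketch leaves out: it moves back and forth according to those numbers, which makes sound waves in the air.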

How does a machine save a sound?

Which of these is an example of text data?

Be a Sound Recorder

  1. Find a quiet place and close your eyes for thirty seconds. Really listen.
  2. Write down or draw every sound you hear — traffic, birds, a clock ticking, voices.
  3. Now look at your list. You just made your own record of the sounds around you, and that record is data!
  4. Pick one sound from your list and describe it: Is it high or low? Loud or soft? Long or short?
  5. Imagine you are a machine that has never heard that sound before. What patterns would help you recognize it next time?
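The questions in step 4 (high or low, loud or soft, long or short) can each be turned into a number. The sketch below measures all three for two pretend sounds. The functions `make_tone` and `describe`, and the two example sounds, are made up for this illustration. Counting how often the numbers switch between positive and negative is only a rough stand-in for how high a sound is.

```python
import math

SAMPLE_RATE = 8000  # assumed: measurements per second


def make_tone(frequency, volume, seconds):
    """Build a pretend sound as a list of numbers."""
    count = int(SAMPLE_RATE * seconds)
    return [
        volume * math.sin(2 * math.pi * frequency * n / SAMPLE_RATE)
        for n in range(count)
    ]


def describe(samples):
    """Measure how long, how loud, and roughly how high a sound is."""
    seconds = len(samples) / SAMPLE_RATE
    loudness = max(abs(s) for s in samples)
    # Higher sounds wiggle faster, so their numbers change sign more often.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return {
        "seconds": seconds,
        "loudness": loudness,
        "crossings_per_second": crossings / seconds,
    }


whistle = describe(make_tone(1000, 0.2, 0.5))  # high, soft, short
hum = describe(make_tone(100, 0.8, 2.0))       # low, loud, long

print(whistle)
print(hum)
```

Comparing the two results, the whistle has more sign changes per second (higher), a smaller loudness value (softer), and fewer seconds (shorter) than the hum. Numbers like these are the kind of pattern a machine could use to tell sounds apart.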