A New Type of Story

Why we created a visual podcast


As a student I read a book on film sound, Audio-Vision, which had a big impact on me. Well, the book itself is hardcore seventies academia style, but it had a beautiful introduction by Walter Murch. And if you have never heard of the guy, you surely know his work. He is the absolute genius who shaped the sound and editing of iconic films like The Godfather and Apocalypse Now.

These films are packed with brilliant montage. Remember the helicopter scene in Apocalypse Now, where we enter the mind of a Vietnam soldier through the sound of a propeller, with images of palm trees, fire and a hotel room? Or the iconic restaurant murder by Michael Corleone in The Godfather, where the subway sound comes screaming through the room? We don’t even notice it, because we are completely focused on Michael’s face.


I soon learnt that sound and voices work on a subconscious, intimate level. And there’s a good reason for that.

In his introduction, Walter Murch notes that:

“We begin to hear before we are born, four and a half months after conception. From then on, we develop in a continuous and luxurious bath of sounds: the song of our mother’s voice, the swash of her breathing, the trumpeting of her intestines, the timpani of her heart. Throughout the second four-and-a-half months, Sound rules as solitary Queen of our senses: the close and liquid world of uterine darkness makes Sight and Smell impossible, Taste monochromatic, and Touch a dim and generalized hint of what is to come.”

As Murch explains, we start out in sound. It’s the earliest input we learn to process, when our brains are just a glob of neurons. We hear cars, music, laughter and voices before we know what they are. People are introduced to us by their voices before we see them.

But then we are born.

A sudden wealth of visuals is presented to us. We see cars, trees, bikes and birds, but above all: people. We stare unceasingly at people’s faces, unable to process the information, but transfixed by the magic. We look at our mother’s face for days on end, and at all the other faces looking at us. And even though we learn over the years that it’s impolite to stare at someone’s face for very long, we continue to be fascinated by people for the rest of our lives.

As an avid podcast listener since the very beginning, I have consumed thousands of hours of human voices. I discovered well-crafted shows like This American Life, Serial, S-Town, Radiolab and Reply All. I listened to startup founders being interviewed (“deconstructed”) in podcasts like The Tim Ferriss Show and Recode Decode, and I laughed out loud while I followed the misadventures of StartUp. I learned about meditation through Gil Fronsdal and Sam Harris and got a bit more enlightened with every episode. Over the years I immersed myself in a wide range of subjects, sometimes by discovery, but most often by recommendation. I learnt about philosophy, history, caliphism, economics, Shakespearean politics, relationship dynamics and prison life in a way I never could have without podcasts.

Lately I realised that I probably listen more than I read. 

Yet I noticed that I often searched for images online while I was listening. Sometimes I was just curious to see the person who was being interviewed: what did he or she look like? Or I wondered what the location was like where they were talking. Or I wanted to see what they were talking about: a book, an app, a photograph, a place on the map. I googled pictures of Othello when they talked about black identity in theatre, I searched for the labyrinth of John McLemore, and I found the picture of the guys behind a successful scamming company in India when I listened to Reply All. I love listening, but I now realise, in retrospect, that I was in utero all the time; I was listening, but was kept in the dark. I wanted to see.

To build something new

A good time to build something is when you wish it were there, but it’s not. Of course you first have to find out if that’s the case. 

When I researched online, I did find a number of interesting projects that added images to podcasts. For example, the Guardian ran an experiment called Strange Bird, where they showed strings of images, links and articles in a text message format, which was kind of cool. A new platform, Entale, was introduced as a separate app on your phone to show images on audio timelines. And then Spotify announced Spotlight, which promised something similar (although we haven’t heard from them since).

The problem with these apps was that they didn’t totally convince me: they mostly put images on a timeline, where they appear at the moment a conversation turns to a topic. That requires you either to wait for them to appear (which we don’t like) or to take out your phone at the most inconvenient moments (e.g. while riding a bike or cooking a recipe). At best this becomes a sort of elaborate PowerPoint, which is not how we wanted stories to feel. We wanted something that is free-flowing, unattached to the timeline of the audio: a visual space.

So we just decided to build something different, to design it from the ground up. 


We first built a simple slider on a phone where you could swipe through images while you listen to an audio track. It was cool because it immediately introduced a new story logic. How do things combine? We started making stories with it, just using the phone. We recorded a lot of interviews on the street, took pictures and uploaded them to the player.

I loved the result: it felt genuine and personal, and it reminded me of the beloved Humans of New York, but in a different format. You were listening to people talk about their lives, their work, their ideas, choices they made or difficulties they faced, while interacting with the interviewer. It felt very human; I actually got to know these people.

We produced a ton of stories, without much prior knowledge about interviewing or podcasting. We learned everything along the way. We got to meet many, many interesting characters, and this experience instilled a new sense of curiosity in all of us. You suddenly look at the world differently: with every person you see, you think, “What’s their story?”

What happens is you start to pay attention to all the stuff under the surface, beyond the attitude and appearance. We explored that level beneath the everyday things people do and proclaim: people’s inner world, which is of infinite size and consequence. When you make these stories you soon realise that the really interesting stuff is right there in front of us, within the people we meet every day. Stories just waiting to be invited to come to the surface.


We developed our language with every new story we made. Pictures and audio became a universe in itself. Then, at some point, we introduced movement: video. We felt that although photography is a beautiful medium, it is also a frozen aesthetic, while we wanted to get as close to the real thing as possible. This was a big change. Mixing the two felt natural, and suddenly seeing people behave naturally made it much easier to relate to them. Characters came alive; they became people in all their complexity. Things started moving and suddenly the storytelling format fell into place. It became a natural form, somewhere in between a documentary and a podcast.

The Storypix player, the result of this journey, is a podcast you can listen to while videos play fullscreen. At any time you can go back and forth through the visuals without interrupting the audio. This means that you can put your phone in your pocket whenever you want and just listen to the story, like you would with a podcast. And when you step off your bike or finish cooking and open your phone, the visuals are there.

Taking a walk

A metaphor I like for the Storypix format is listening to someone during a walk. You sometimes look at them while you listen to them; you can see the person, how they move, what they look like, their expressions. But you can also enjoy the environment around you while you walk. You don’t need to look at the person the whole time; after all, what’s important is what they tell you along the way. But when you do look (read: open your phone), they are there.

We built Storypix for the love of human stories. It has given us a lot: new encounters with strangers, new insights and probably a richer understanding of the diversity of people around us. We hope it will do the same for you.