Visualizing the world's Twitter data - Jer Thorp
Transcriber: Andrea McDonough
Reviewer: Bedirhan Cinar
A couple of years ago, I started using Twitter, and one of the things that really charmed me about Twitter is that people would wake up in the morning, and they would say, "Good morning!" I'm a Canadian, so I liked that politeness. I'm also a giant nerd, so I wrote a computer program that would record 24 hours of everybody on Twitter saying, "Good morning!" And then I asked myself my favorite question, "What would that look like?"
Well, as it turns out, I think it would look something like this. Right, so we'd see this wave of people saying, "Good morning!" across the world as they wake up. Now, the green people, these are people that wake up at around 8 o'clock in the morning. Who wakes up at 8 o'clock or says, "Good morning!" at 8? And the orange people, they say, "Good morning!" around 9. And the red people, they say, "Good morning!" around 10. Yeah, more 10s than 8s.
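The coloring described here can be illustrated with a minimal Python sketch. It is not the speaker's actual program: the record layout, the UTC offsets, and the helper names `local_hour` and `color_for` are assumptions made for illustration, and the sample data is made up. It converts each greeting's timestamp to the sender's local hour and assigns green, orange, or red for 8, 9, or 10 o'clock.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Colour scheme as described in the talk: green ~8, orange ~9, red ~10.
HOUR_COLORS = {8: "green", 9: "orange", 10: "red"}

def local_hour(utc_time, utc_offset_hours):
    """Convert a UTC timestamp to the sender's local hour (0-23)."""
    return (utc_time + timedelta(hours=utc_offset_hours)).hour

def color_for(utc_time, utc_offset_hours):
    """Return the map colour for a greeting, or None outside the 8-10 window."""
    return HOUR_COLORS.get(local_hour(utc_time, utc_offset_hours))

# Made-up sample records: (UTC time of the tweet, assumed UTC offset of the sender).
sample = [
    (datetime(2012, 1, 1, 13, 5, tzinfo=timezone.utc), -5),   # 08:05 local
    (datetime(2012, 1, 1, 14, 30, tzinfo=timezone.utc), -5),  # 09:30 local
    (datetime(2012, 1, 1, 18, 10, tzinfo=timezone.utc), -8),  # 10:10 local
    (datetime(2012, 1, 1, 18, 45, tzinfo=timezone.utc), -8),  # 10:45 local
]

# Count how many greetings fall into each colour bucket.
counts = Counter(color_for(t, offset) for t, offset in sample)
```

Plotting each colored point at the sender's location over the 24 hours would give the kind of wave the talk describes.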
And actually, if you look at this map, we can learn a little bit about how people wake up in different parts of the world. People on the West Coast, for example, they wake up a little bit later than those people on the East Coast. But that's not all that people say on Twitter, right? We also get these really important tweets, like, "I just landed in Orlando!! [plane sign, plane sign]" Or, "I just landed in Texas [exclamation point]!" Or "I just landed in Honduras!" These lists, they go on and on and on, all these people, right?
So, on the outside, these people are just telling us something about how they're traveling. But we know the truth, don't we? These people are show-offs! They are showing off that they're in Cape Town and I'm not. So I thought, how can we take this vanity and turn it into utility? So using an approach similar to the one I used with "Good morning," I mapped all those people's trips because I know where they're landing, they just told me, and I know where they live because they share that information on their Twitter profile.
So what I'm able to do with 36 hours of Twitter is create a model of how people are traveling around the world during those 36 hours. And this is kind of a prototype because I think if we listen to everybody on Twitter and Facebook and the rest of our social media, we'd actually get a pretty clear picture of how people are traveling from one place to the other, which actually turns out to be a very useful thing for scientists, particularly those who are studying how disease is spread.
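As a rough illustration of the trip mapping described above, the following Python sketch pairs the destination named in an "I just landed in ..." tweet with the home location from the sender's profile. It is only a sketch under stated assumptions: the regular expression, the `text` and `home` field names, and the `destination` and `trips` helpers are hypothetical, and the home locations in the sample are invented.

```python
import re

# Assumed pattern: the place name follows "just landed in" and runs until
# punctuation such as "!" or "[" that is not part of a name.
LANDED = re.compile(r"\bjust landed in ([A-Za-z][A-Za-z .'-]*)", re.IGNORECASE)

def destination(text):
    """Return the place named after 'just landed in', or None if absent."""
    match = LANDED.search(text)
    if match is None:
        return None
    # Drop trailing spaces or sentence punctuation captured by the pattern.
    return match.group(1).rstrip(" .'-")

def trips(tweets):
    """Build (home, destination) pairs for tweets that state both places."""
    result = []
    for tweet in tweets:
        dest = destination(tweet["text"])
        home = tweet.get("home")
        if dest and home and dest != home:
            result.append((home, dest))
    return result

# Tweet texts follow the talk; the home locations are made up.
sample = [
    {"text": "I just landed in Orlando!!", "home": "Toronto"},
    {"text": "I just landed in Texas!", "home": "New York"},
    {"text": "Good morning!", "home": "Vancouver"},
    {"text": "I just landed in Honduras!", "home": None},
]

routes = trips(sample)
```

A real pipeline would also need to turn place names into coordinates before drawing them on a map, which this sketch leaves out.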
So, I work upstairs in the New York Times, and for the last two years, we've been working on a project called "Cascade," which in some ways is kind of similar to this one. But instead of modeling how people move, we're modeling how people talk. We're looking at what a discussion looks like.
Well, here's an example. This is a discussion around an article called "The Island Where People Forget to Die." It's about an island in Greece where people live a really, really, really, really, really, really long time. And what we're seeing here is we're seeing a conversation that's stemming from that first tweet down in the bottom left-hand corner. So we get to see the scope of this conversation over about 9 hours; right now, we're going to creep up to 12 hours here in a second.
But we can also see what that conversation looks like in three dimensions. And that three-dimensional view is actually much more useful for us. As humans, we are really used to things that are structured in three dimensions. So, we can look at those little off-shoots of conversation, we can find out what exactly happened. And this is an interactive, exploratory tool so we can go through every step in the conversation.
We can look at who the people were, what they said, how old they are, where they live, who follows them, and so on, and so on, and so on. So, the Times creates about 6,500 pieces of content every month, and we can model every single one of the conversations that happen around them. And they look somewhat different.
Depending on the story and depending on how fast people are talking about it and how far the conversation spreads, these structures, which I call conversational architectures, end up looking different. So, these projects that I've shown you, I think they all involve the same thing: we can take small pieces of data and by putting them together, we can generate more value; we can do more exciting things with them.
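One way to picture a conversation that stems from a first tweet is as a tree, where each later tweet points back to the tweet it responds to. The Python sketch below is an illustration only and not how Cascade itself works: the `(tweet_id, parent_id)` pairs and the `build_children` and `shape` helpers are assumptions. It measures two simple properties of such a tree: how many tweets it contains and how long its longest chain is.

```python
from collections import defaultdict

def build_children(replies):
    """Map each tweet id to the ids of the tweets that respond to it.

    replies is a list of (tweet_id, parent_id) pairs (assumed layout).
    """
    children = defaultdict(list)
    for tweet_id, parent_id in replies:
        children[parent_id].append(tweet_id)
    return children

def shape(root, children):
    """Return (size, depth) for the conversation that starts at root.

    size counts every tweet including the root; depth is the number of
    tweets on the longest chain from the root.
    """
    size, depth = 0, 0
    stack = [(root, 1)]
    while stack:
        node, level = stack.pop()
        size += 1
        depth = max(depth, level)
        for child in children.get(node, []):
            stack.append((child, level + 1))
    return size, depth

# Made-up conversation: "a" is the first tweet; "b" and "c" respond to it,
# "d" responds to "b", and "e" responds to "d".
replies = [("b", "a"), ("c", "a"), ("d", "b"), ("e", "d")]
size, depth = shape("a", build_children(replies))
```

A wide, shallow tree and a narrow, deep one would give different size and depth values, which is one simple way to see that conversations take different shapes.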
But so far we've only talked about Twitter, right? And Twitter isn't all the data. We learned a moment ago that there are tons and tons more data out there. And specifically, I want you to think about one type of data because all of you guys, everybody in this audience, me as well, are data-making machines. We are producing data all the time. Every single one of us, we're producing data.
Somebody else, though, is storing that data. Usually, we put our trust into companies to store that data, but what I want to suggest here is that rather than putting our trust in companies to store that data, we should put the trust in ourselves because we actually own that data. Right, that is something we should remember. Everything that someone else measures about you, you actually own.
So, it's my hope, maybe because I'm a Canadian, that all of us can come together with this really valuable data that we've been storing, and we can collectively launch that data toward some of the world's most difficult problems because big data can solve big problems, but I think it can do it the best if it's all of us who are in control. Thank you.