Redefining the dictionary - Erin McKean
Now, have any of y'all ever looked up this word, you know, in a dictionary? Yeah, that's what I thought. How about this word? Here, I'll show it to you: lexicography, the practice of compiling dictionaries. Notice, we're very specific, that word "compile." The dictionary is not carved out of a piece of granite, out of a lump of rock. It's made up of lots of little bits, little discrete (that's spelled D-I-S-C-R-E-T-E) bits, and those bits are words.
Now, one of the perks of being a lexicographer, besides getting to come to TED, is that you get to say really fun words like lexicographical. Lexicographical has this great pattern; it's called a double dactyl. And just by saying double dactyl, I've sent the geek needle all the way into the red. But lexicographical is the same pattern as higgledy-piggledy, right? It's a fun word to say, and I get to say it a lot.
Now, one of the non-perks of being a lexicographer is that people don't usually have a kind of warm, fuzzy, snuggly image of the dictionary. Right? Nobody hugs their dictionaries. But what people really often think about the dictionary is, they think more like this. Just to let you know, I do not have a lexicographical whistle. But people think that my job is to let the good words make that difficult left-hand turn into the dictionary and keep the bad words out.
But the thing is, I don't want to be a traffic cop. For one thing, I just do not do uniforms. And for another, deciding what words are good and what words are bad is actually not very easy, and it's not very fun. And when parts of your job are not easy or fun, you kind of look for an excuse not to do them.
So, if I had to think of some kind of occupation as a metaphor for my work, I would much rather be a fisherman. I want to throw my big net into the deep blue ocean of English and see what marvelous creatures I can drag up from the bottom. But why do people want me to direct traffic when I would much rather go fishing? Well, I blame the Queen.
Well, why do I blame the Queen? Well, first of all, I blame the Queen because it's funny. But secondly, I blame the Queen because dictionaries have really not changed; our idea of what a dictionary is has not changed since her reign. The only thing that Queen Victoria would not be amused by in modern dictionaries is our inclusion of the f-word, which has happened in American dictionaries since 1965.
So there's this guy, right? Victorian era, James Murray, first editor of the Oxford English Dictionary. I do not have that hat; I wish I had that hat. So he's really responsible for a lot of what we consider modern in dictionaries today. When a guy who looks like that in that hat is the face of modernity, you have a problem. And so James Murray could get a job on any dictionary today; there would be virtually no learning curve.
And of course, people are saying, "Okay, computers, computers! What about computers?" The thing about computers is I love computers. I mean, I'm a huge geek; I love computers. I would go on a hunger strike before I let them take away Google Book Search from me. But computers don't do much else other than speed up the process of compiling dictionaries. They don't change the end result because what a dictionary is, is it's Victorian design merged with a little bit of modern propulsion—it's steampunk!
What we have is an electric velocipede, you know? We have Victorian design with an engine on it; that's all. The design has not changed. And okay, what about online dictionaries, right? Online dictionaries must be different. This is the Oxford English Dictionary online, one of the best online dictionaries. This is my favorite word, by the way: erinaceous, pertaining to the hedgehog family; of the nature of a hedgehog. Very useful word.
So look at that. Online dictionaries right now are paper thrown up on a screen. This is flat! Look how many links there are in the actual entry, right? Those little buttons, I have them all expanded except for the date chart, so there's not very much going on here. There's not a lot of clickiness, and in fact, online dictionaries replicate almost all the problems of print except for searchability. And when you improve searchability, you actually take away the one advantage of print, which is serendipity.
Serendipity is when you find things you weren't looking for because finding what you are looking for is so damn difficult. Now, when you think about this, what we have here is a ham butt problem. Does everybody know the ham butt problem? A woman's making a ham for a big family dinner. She goes to cut the butt off the ham and throw it away, and she looks at this piece of ham, and she's like, "This is a perfectly good piece of ham! Why am I throwing this away?"
She thinks, "Well, my mom always did this." So she calls up Mom, and she says, "Mom, why'd you cut the butt off the ham when you're making a ham?" She says, "I don't know. My mom always did it." So they call Grandma, and Grandma says, "My pan was too small!" So it's not that we have good words and bad words; we have a pan that's too small.
You know, that ham butt is delicious; there's no reason to throw it away. The bad words, see, when people think about a place and they don't find a place on the map, they think, "This map sucks!" When they find a nightspot or a bar, and it's not in the guidebook, they're like, "Ooh, this place must be cool; it's not in the guidebook!" When they find a word that's not in the dictionary, they think, "This must be a bad word!"
Why? It's more likely to be a bad dictionary. Why are you blaming the ham for being too big for the pan? So, you can't get a smaller ham. The English language is as big as it is. So if you have a ham butt problem and you're thinking about the ham butt problem, the conclusion that it leads you to is inexorable and counterintuitive: paper is the enemy of words.
How can this be? I mean, I love books; I really love books! Some of my best friends are books, but the book is not the best shape for the dictionary. They're like, "Oh my, people are going to take away my beautiful paper dictionaries!" No, there will still be paper dictionaries. When we had cars, when cars became the dominant mode of transportation, we didn't round up all the horses and shoot them. You know, there are still going to be paper dictionaries, but it's not going to be the dominant dictionary. The book-shaped dictionary is not going to be the only shape dictionaries come in. And it's not going to be the prototype for the shapes dictionaries come in.
So, think about it this way: if you have an artificial constraint, artificial constraints lead to arbitrary distinctions and a skewed worldview. What if biologists could only study animals that made people go "aww"? Right? What if we made aesthetic judgments about animals, and only the ones we thought were cute were the ones that we could study? We'd know a whole lot about charismatic megafauna and not very much about much else, and I think this is a problem. I think we should study all the words because when you think about words, you can make beautiful expressions from very humble parts.
Lexicography is really more about material science. We are studying the tolerances of the materials that you use to build the structure of your expression: your speeches and your writing. And then often people say to me, "Well, okay, how do I know that this word is real?" They think, "Okay, if we think words are the tools that we use to build the expressions of our thoughts, how can you say that screwdrivers are better than hammers?"
How can you say that a sledgehammer is better than a ball-peen hammer? They're just the right tool for the job. And some people say to me, "How do I know if a word is real?" You know, anybody who's read a children's book knows that love makes things real. If you love a word, use it—that makes it real.
Being in the dictionary is an artificial distinction; it doesn't make a word any more real than any other way. If you love a word, it becomes real. So if we're not worrying about directing traffic, if we've transcended paper, if we are worrying less about control and more about description, then we can think of the English language as being this beautiful mobile.
And anytime one of those little parts of the mobile changes, is touched—anytime you touch a word, you use it in a new context, you give it a new connotation—you verb it, you make the mobile move. You didn't break it; it's just in a new position, and that new position can be just as beautiful.
Now, if you're no longer a traffic cop, the problem with being a traffic cop is that there can only be so many traffic cops in any one intersection or the cars get confused, right? But if your goal is no longer to direct the traffic, but maybe to count the cars that go by, then more eyeballs are better. You can ask for help. If you ask for help, you get more done, and we really need help.
Library of Congress: 17 million books, of which half are in English. If only one out of every 10 of those books had a word that's not in the dictionary in it, that would be equivalent to more than two unabridged dictionaries. And I find an un-dictionaried word, a word like "un-dictionaried," for example, in almost every book I read.
What about newspapers? The newspaper archive goes back to 1759: 58.1 million newspaper pages. If only one in 100 of those pages had an un-dictionaried word on it, it would be an entire other OED. That's more than 500,000 more words. So that's a lot!
And I'm not even talking about magazines. I'm not talking about blogs, and I find more new words on Boing Boing in a given week than I do in newspapers. There's a lot going on there, and I'm not even talking about polysemy, which is the greedy habit some words have of taking more than one meaning for themselves.
So, if you think of the word "set," a set can be a badger's burrow; a set can be one of the pleats in an Elizabethan ruff. And there's one numbered definition in the OED. The OED has 33 different numbered definitions for "set." Tiny little word, 33 numbered definitions. One of them is just labeled "miscellaneous technical senses." You know, do you know what that says to me? That says to me it was Friday afternoon, and somebody wanted to go down the pub. That's a lexicographical cop-out, to say "miscellaneous technical senses!"
So, we have all these words, and we really need help. And the thing is, we can ask for help. Asking for help's not that hard! I mean, lexicography is not rocket science! See, I just gave you a lot of words and a lot of numbers, and this is more of a visual explanation.
If we think of the dictionary as being the map of the English language, these bright spots are what we know about, and the dark spots are where we are in the dark. If that was the map of all the words in American English, we don't know very much. And we don't even know the shape of the language. If this was the dictionary—if this was the map of American English, look! We have a kind of lumpy idea of Florida, but there's no California. We're missing California from American English. We just don't know enough, and we don't even know that we're missing California. We don't even see that there's a gap on the map.
So, again, lexicography is not rocket science. But even if it were, rocket science is being done by dedicated amateurs these days. You know, it can't be that hard to find some words! So now, scientists in other disciplines are really asking people to help, and they're doing a good job of it. For instance, there's eBird, where amateur bird watchers can upload information about their bird sightings and then ornithologists can go and help track populations, migrations, etc.
And there's this guy, Mike Oates. Mike Oates lives in the UK; he's a director of an electroplating company. He's found more than 140 comets! He's found so many comets, they named a comet after him! It's kind of out past Mars; it's a hike! I don't think he's getting his picture taken there anytime soon, but he found 140 comets without a telescope. He downloaded data from the NASA SOHO satellite, and that's how he found them.
If we can find comets without a telescope, shouldn't we be able to find words? Now, y'all know where I'm going with this because I'm going to the Internet, which is where everybody goes! And the Internet is great for collecting words because the Internet's full of collectors. And this is a little-known technological fact about the Internet, but the Internet is actually made up of words and enthusiasm! And words and enthusiasm actually happen to be the recipe for lexicography! Isn't that great?
So there are a lot of good word-collecting sites out there now, but the problem with some of them is that they're not scientific enough. They show the word, but they don't show any context: where it came from, who said it, what newspaper it was in, what book. Because a word is like an archaeological artifact. If you don't know the provenance or the source of the artifact, it's not science; it's a pretty thing to look at.
So, a word without its source is like a cut flower. You know, it's pretty to look at for a while, but then it dies. It dies too fast. So this whole time I've been saying "the dictionary, the dictionary, the dictionary," not "a dictionary" or "dictionaries." And that's because, well, people use the dictionary to stand for the whole language. They use it synecdochically.
And one of the problems of knowing a word like "synecdochically" is that you really want an excuse to say "synecdochically!" And so this whole talk has just been an excuse to get me to the point where I could say "synecdochically" to all of you. So, I'm really sorry, but when you use a part of something, like "the dictionary" is a part of the language or "a flag" stands for the United States, it's a symbol of the country, then you're using it synecdochically.
But the thing is, we could make the dictionary the whole language! If we get a bigger pan, then we can put all the words in! We can put in all the meanings! Doesn't everybody want more meaning in their lives? And we can make the dictionary not just be a symbol of a language; we can make it be the whole language!
And you see, what I'm really hoping for is that my son, who turns seven this month, I want him to barely remember that this is the form factor that dictionaries used to come in. This is what dictionaries used to look like. I want him to think of this kind of dictionary as an 8-track tape. It's a format that died because it wasn't useful enough; it wasn't really what people needed.
And the thing is, if we can put in all the words, we no longer have that artificial distinction between good and bad. We can really describe the language like scientists. We could leave the aesthetic judgments to the writers and the speakers. If we can do that, then I can spend all my time fishing, and I don't have to be a traffic cop anymore. Thank you very much for your kind attention!