James Ker-Lindsay is Visiting Professor at the London School of Economics. His research focuses on conflict, peace and security in South East Europe (Western Balkans, Greece, Turkey and Cyprus), European Union enlargement, and secession and recognition in international politics.
He has created many YouTube videos explaining his subject.
*****
The Financial Times recently made a change to their style guide. The style guide of a newspaper is a list of rules about how articles should be written. All big newspapers have them, to keep the language, spelling and so on consistent across the whole publication.
So most style guides specify, for example, that numbers up to ten should be spelled out, and numbers from 11 onwards are written as digits. They also say what spelling or what grammar style to use if there is more than one version considered correct.
The change that the Financial Times made was to treat the word ‘data’ as a singular rather than as a plural. That means instead of writing ‘the data are showing’ something, they will use English the way most of us do and write ‘the data is showing’ whatever.
There is a concept in linguistics … yes, I did study it, in case you didn’t notice that before … the concept is in sociolinguistics, which is about the interaction between language and society, and it is called hypercorrection.
Correct language is pretty easy to understand for the most part. If I say ‘the dog are barking’ that’s easy to spot as being incorrect grammar; if I say ‘the dog is barking’ you all know that’s correct. But it’s important to understand that those rules aren’t made up by some professor in an academy somewhere. The correct use of a language is the way it is spoken by its speakers; dictionaries and grammar books describe the language, they don’t prescribe it.
So ‘the dog are barking’ is clearly wrong, but if, instead of saying ‘the dog is always barking’, I say ‘the dog does be barking all the time’, that is not incorrect grammar. It’s not an error, it’s not wrong, it’s not nearly incorrect, it’s not a mistake at all. ‘The dog does be barking all the time’ is not standard English, it’s a Hiberno-English dialect, but saying ‘the dog does be barking all the time’ conforms with the rules of that dialect, and that’s why it is emphatically not wrong and not a mistake.
Now, it’s true that standard English is sometimes considered to have a higher status than some dialects of Hiberno-English, so you mightn’t want to use that in a job interview, or if you were reading the news, but that’s a different argument.
Lots of people speak different languages, and even the people who don’t can often speak several different dialects, or registers, of their own language, and it’s very common to signal the type of interaction that you’re having with someone by what register you speak to them in. In a lot of European languages this is formally marked out, there is a distinction between tu and vous in French, Du and Sie in German. In English, that distinction is not so formalised, but it does exist, you hear it in things like the switch between using ‘I haven’t got’ and ‘I don’t have’.
But the thing about the distinction being more subtle in English is that it can get misunderstood, and people can arbitrarily make up rules that miss the point.
I’m sure everyone has in their lives that Maude Flanders figure who wags her finger and tells us ‘Don’t say ‘who’, say ‘whom’’, but I’m sure that if you asked them why you should say whom instead of who, they wouldn’t be able to do much more than mutter that ‘it’s just wrong’.
And the next time that you get a finger wagged in your face, you can tell them that they are not just wrong, but they are wrong in two separate ways, either of which alone would invalidate what they are saying.
The first is that who and whom are two different words, they aren’t a wrong and right version of the same word. They are the equivalent of he and him. One is used as the subject of a sentence, one is used as the object. So you would say ‘He gave it to me’, but ‘I gave it to him’. In English, this only crops up in pronouns, it’s invisible if you say ‘John gave it to me’, ‘I gave it to John’.
So to switch that into a question, you can say ‘Who did you give it to?’, or if you wanted to follow a very particular outdated grammar format, you could say ‘To whom did you give it?’, but there’s absolutely nothing in historical grammar to say that the first of those is not correct, so Maude is wrong on that level straight off.
But even if she was right, she’d still be wrong, because there is nothing written anywhere that just because something in language was correct in the past, it is still correct now. Language just doesn’t work that way. Languages change, all languages change, unless they are dead, and they change for a load of different reasons, but one of them is necessity, and our language just wouldn’t work if we tried to keep it the same as it was hundreds of years ago.
But that’s got nothing to do with Maude wagging her finger at you and making badly-grounded complaints about your grammar. What’s going on there is that someone is trying to win a game of social one-upmanship, to show off that they are better educated than you and put you down in the process.
This type of behaviour usually comes from people who are insecure about their own level of education and trying to cover for that by displaying superiority, and that insecurity is probably why so many people doing that are actually making silly grammar mistakes themselves.
Probably the richest vein of excuses for finger-wagging comes from Latin. There was a collective gasp of horror when newer versions of Star Trek changed their introductory line from ‘To boldly go where no man has gone before’ to ‘To boldly go where no one has gone before’. So they fixed the potentially sexist bit, but didn’t fix the supposedly bad grammar of a split infinitive, ‘to boldly go’.
In case you didn’t get that chiding, the infinitive is the base form of a verb, before it is changed for tense or for who it relates to. Someone somewhere decided that since Latin infinitives are one single word, they can’t be split. And since the one-word Latin infinitives are not split, then English two-word infinitives – to go – also should not be split, so you shouldn’t put the word boldly in between them to make to boldly go.
What they didn’t explain was why. The only justification was because that’s the way it’s done in Latin, rather missing the point that we’re not speaking Latin. The Romans did a whole load of things that we don’t do, like crucify people, but that alone isn’t a good reason to do the same.
As I said, languages change over time, they usually tend towards simplification and regularity, but that isn’t always the case. English has many regular verbs, where you add ‘ed’ to make the past tense; look / looked, walk / walked. But English still has a lot of what are called strong verbs, mostly inherited from its Germanic roots, where a vowel change makes the past tense, like bleed / bled, forget / forgot, begin / began.
Languages tend towards regularity, but not always. One exception in English is hang. The original past tense was regular, hanged, but the word was very close to some strong verbs like sing, sink, ring, which go to sang, sank, rang, and over time what was originally a mistake, the word hung, became accepted.
But there was a catch. In the past, when handing down a death sentence, judges had a prepared text to read, which included saying ‘you will be hanged by the neck until dead’, and this didn’t change because it was part of an accepted legal formula, so by staying the same, judges found themselves out of step with modern usage, which had shifted.
Now the finger-waggers couldn’t say that someone as important and educated as a judge got their grammar wrong, but did that stop them? Not at all. Out of nowhere, they just invented a new rule saying ‘pictures are hung, people are hanged’. This has absolutely no basis in anything other than a desire to tell people off, without risking offending someone in authority. And that brings me back to the data are.
The justification for this, like the imaginary rule against split infinitives, is Latin. Data, we are scolded, is a Latin plural and therefore should have a plural verb – are, not is – applied to it. It’s true, in Latin, the word data is the plural of datum, but this is still wrong on so many levels.
The first, and to me the most important, is that it sounds terrible. To me, that’s enough – if it sounds bad, ditch it. You might think that’s subjective, but it’s actually not, I’ll explain in a minute.
Another reason it’s wrong is that we aren’t speaking Latin, so there is no reason to follow Latin grammar rules. And in particular, when words transfer from one language to another it is common for their grammatical properties to change.
There is a direct analogy, from a language closely related to Latin. Spaghetti in Italian is a plural; it’s the plural of spaghetto, but nobody ever says ‘the spaghetti are hot’. That’s because in English spaghetti is singular, it’s uncountable like water or air.
The difference is that spaghetti is an everyday word, while the word data is one that is strongly associated with research and universities, where I think you might get an oversupply of people who are anxious to show off how educated they are and how well they fit into their chosen academic clique, and they do this by using this awkward phrasing that distinguishes them from the great unwashed. It sounds odd, it catches in the ear, and that’s exactly what it is meant to do. It is intended to make you pay attention to how clever the speaker is.
I said that it sounds terrible, and that’s not a subjective judgement, and I said I’d say why. I mentioned uncountable nouns, things like liquids or powders, that are treated as singular in every language that I know of – the water is cold, the flour is in the cake. But English also treats very small things that are difficult to count as singular, like rice, sand, even spaghetti. These are treated as plurals in other languages, notably Romance languages descended from Latin.
Data – particularly in the context that researchers refer to it – is typically a large number of individual points of information. It fits very well into the conceptual frame that applies to rice and sand, that even if the individual components could be counted, they are not considered separately and are too small to be practically counted out individually.
Objectively speaking, the word data in English fits very well into that category, and aside from the snobbery and finger-wagging, that’s why it sounds bad to say ‘the data are’.
So I’m glad that the Financial Times are making that change. And I may have to tag Dan McClellan, a guest we had on the podcast a while back, on this.