Attention readers: This blog has moved to a new home at https://chenghlee.wordpress.com/.

Thursday, August 14, 2008

"You keep using that word."

"I do not think it means what you think it means." Meghan Daum writes in the Chicago Tribune about misuse of the word "nonplussed", which in the common parlance, has come to mean "unfazed, unperturbed or unconcerned", even though its proper definition (as provided by the Oxford English Dictionary) is "surprised and confused". Similarly, the word "peruse" is commonly used in the sense of "to skim or to read over quickly" even though its proper definition is "to read thoroughly or carefully."

This phenomenon, of course, is simply the evolution of language through what linguist Mark Liberman calls "change by mistake", though, embarrassingly, I have to admit to misusing both of these words.

I bring this up because (1) it's amusing and (2) it serves as a reminder of how difficult processing language can be, especially from a computational (read "natural language processing") standpoint. In a world where words can acquire meanings quite contrary to what appears in a dictionary, how are humans, let alone computers, expected to truly understand the semantics of an ambiguous statement such as "He was nonplussed by the situation"? In a world where we are allowed to take a Humpty Dumpty-esque approach to language ("When I use a word, it means just what I choose it to mean -- neither more nor less."), I'm not entirely convinced that we will ever reach a point where NLP has anything but domain-specific applications.
