Adjective ordering is a big ugly problem in English, but at least it's not an ugly big problem. Recently I saw a claim by Mark Forsyth that adjectives are ordered, "opinion-size-age-shape-colour-origin-material-purpose." I was interested to see how well adjectives really can be ordered.
To approach this problem, I looked at empirical usage in Google's ngram dataset, which contains n-word phrases from a couple million books.
First, I identified the fifty most common adjectives in the corpus, sorted by the number of volumes in which they appear. For this analysis, I used volume count rather than the total number of occurrences ("match count"). I wanted to avoid biases caused by the many highly technical books in the Google ngrams dataset, such as soil quality reports that contain the word "loamy" thousands of times.
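The difference between volume count and match count can be sketched in a few lines of Python. This is an illustration only, not necessarily the code used for this analysis. It assumes the tab-separated layout of the 2012 ngram files (`ngram`, `year`, `match_count`, `volume_count`) and part-of-speech suffixes such as `_ADJ`; the function name `top_adjectives` and the sample rows are made up for the example.

```python
from collections import Counter

ADJ_SUFFIX = "_ADJ"  # assumed part-of-speech tag suffix in the 1-gram files


def top_adjectives(lines, n=50):
    """Sum per-year volume counts for each adjective and return the top n.

    Each line is assumed to look like: ngram<TAB>year<TAB>match_count<TAB>volume_count
    """
    volumes = Counter()
    for line in lines:
        ngram, _year, _match_count, volume_count = line.rstrip("\n").split("\t")
        if ngram.endswith(ADJ_SUFFIX):
            word = ngram[: -len(ADJ_SUFFIX)]
            volumes[word] += int(volume_count)
    return volumes.most_common(n)


# Made-up rows: "loamy" has a huge match count but appears in a single volume.
sample = [
    "good_ADJ\t1990\t10\t3",
    "good_ADJ\t1991\t5\t2",
    "loamy_ADJ\t1990\t9000\t1",
    "run_VERB\t1990\t7\t7",
]
ranking = top_adjectives(sample, n=2)
```

Ranking by match count would put "loamy" first in this sample; ranking by volume count puts "good" first, which is the behavior I wanted.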
Some of these words, like "many," could be categorized as quantifiers or determiners instead of adjectives. This doesn't worry me too much; there are many competing ways of categorizing parts of speech, and this one seems as good to me as any.
Here's the distribution of these counts. Each bar shows the total count in that range. They're all clustered in a small range since I've selected the most common ones.
Next, I looked at all bigrams, or ordered pairs of words. Here are the twenty most common bigrams.
Below is the distribution over bigrams. On the right side, you can easily pick out some of these common bigrams, such as "many other."
Here are uncommon adjective pairs: pairs of adjectives that don't commonly occur together in either order. "Able" and "better" dominate this list; I guess they just don't like other adjectives.
Since I'm interested in adjective ordering, here is a list of adjective pairs that never occur switched. In millions of volumes, the phrases "more few" and "better little" never appear. This supports the claim that English has rigid rules about adjective order.
I wanted to get a more precise idea of how strict adjective ordering rules are in English. For each pair of adjectives, I measured the ratio of how often the pair occurred in one order to how often it occurred in the other. Below is the distribution of ratios. The large spike on the left shows that most pairs of adjectives only occur in one order. However, there is a tail of pairs that can occur in either order.
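Here is a minimal sketch of one way to compute such a ratio. It assumes the ratio is the count of the rarer order divided by the count of the more common order, so that 0 means "only ever seen in one order" and 1 means "equally common both ways"; that definition, the function name `switch_ratios`, and the `min_total` parameter are assumptions for illustration. Only the "few more" count is a real figure from this analysis; the "good little" and "little good" counts are invented.

```python
def switch_ratios(counts, min_total=1):
    """Return {(a, b): rarer_count / commoner_count} for each unordered pair.

    `counts` maps ordered pairs (first_word, second_word) to volume counts.
    Pairs whose combined count is below `min_total` are skipped; keep
    `min_total` at 1 or more so the division is always defined.
    """
    unordered = {tuple(sorted(pair)) for pair in counts if pair[0] != pair[1]}
    ratios = {}
    for a, b in unordered:
        c_ab = counts.get((a, b), 0)
        c_ba = counts.get((b, a), 0)
        if c_ab + c_ba >= min_total:
            ratios[(a, b)] = min(c_ab, c_ba) / max(c_ab, c_ba)
    return ratios


toy_counts = {
    ("few", "more"): 529771,   # real figure: volume count of "few more"
    ("more", "few"): 0,        # "more few" never appears
    ("good", "little"): 30,    # invented
    ("little", "good"): 10,    # invented
}
toy_ratios = switch_ratios(toy_counts)
```

A histogram of `toy_ratios.values()` over all pairs would then have its spike at 0 for the pairs that never switch.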
Next, I made a table of the most commonly switched adjectives. I only considered pairs that have a total count above 20,000. Below, each pair is shown in its less common order. This table shows some weaknesses of the data: in the bigram "present many," "present" is probably a verb incorrectly tagged as an adjective in the Google ngram dataset. Many of these pairs make sense, though, which calls into question the idea that it's possible to come up with a reliable rule for adjective order without considering the larger context.
|# This order||# Switched|
Next, I tried to come up with a rule for adjective order without considering the larger context. I tried to organize all the adjectives into a strict order. This type of rule is easy for a computer to remember, but probably not for a human.
I created this ordering by trying to minimize the total number of out-of-order counts. For instance, if the word "more" is before "few" in the list, this adds a penalty of 529,771 (the volume count of "few more"). Below is the order. Surprisingly, no obvious clusters jump out at me; for example, "first" and "last" are not next to each other, nor are "small" and "large."
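To make the penalty concrete, here is a rough sketch of the objective together with a very simple local search that swaps neighboring words whenever the swap lowers the penalty. This is a stand-in heuristic chosen for brevity, not necessarily the optimization used to produce the list below, and it can get stuck in a local minimum. The function names are hypothetical, and every count except the "few more" figure is invented.

```python
def penalty(order, counts):
    """Total count of bigrams that contradict `order`.

    `order` lists words from "comes first" to "comes last", so the bigram
    (a, b) is out of order when b sits before a in the list.
    """
    pos = {word: i for i, word in enumerate(order)}
    return sum(c for (a, b), c in counts.items() if pos[a] > pos[b])


def improve_order(order, counts):
    """Swap adjacent words until no single swap lowers the penalty."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            a, b = order[i], order[i + 1]
            # Swapping neighbors only changes the penalty for this one pair.
            if counts.get((b, a), 0) > counts.get((a, b), 0):
                order[i], order[i + 1] = b, a
                improved = True
    return order


toy_counts = {
    ("few", "more"): 529771,  # real figure: volume count of "few more"
    ("more", "few"): 0,       # "more few" never appears
    ("many", "more"): 300,    # invented
    ("more", "many"): 10,     # invented
    ("many", "few"): 50,      # invented
    ("few", "many"): 5,       # invented
}
start = ["more", "few", "many"]
best = improve_order(start, toy_counts)
```

Each swap strictly lowers the penalty, so the loop always terminates. For fifty adjectives a stronger search (for example, moving one word to any position, or restarting from several random orders) would likely find better orders than adjacent swaps alone.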
To evaluate how good this order was, I checked what proportion of bigrams were in the right order, weighted by volume count. A randomly chosen order would be expected to score 50% on this metric. This list got 39 million correct and 3.9 million incorrect, about 91%.
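The score is just the weighted share of bigrams that agree with the list. A small sketch, with a hypothetical function name and a tiny invented example, plus a check of the arithmetic on the rounded totals quoted above:

```python
def order_score(order, counts):
    """Fraction of bigram volume whose word order agrees with `order`."""
    pos = {word: i for i, word in enumerate(order)}
    correct = sum(c for (a, b), c in counts.items() if pos[a] < pos[b])
    incorrect = sum(c for (a, b), c in counts.items() if pos[a] > pos[b])
    total = correct + incorrect
    return correct / total if total else 0.0


# Invented two-word example: 9 volumes agree with the list, 1 disagrees.
toy_score = order_score(["few", "more"], {("few", "more"): 9, ("more", "few"): 1})

# Arithmetic check on the rounded totals: 39 million right, 3.9 million wrong.
reported_score = 39_000_000 / (39_000_000 + 3_900_000)
```

With the rounded totals, `reported_score` comes out just under 0.91, which matches the "about 91%" figure.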
Here are the most prevalent bigrams which aren't explained by the ordering above. Many of these are set phrases which are difficult to describe in general terms. "Good little" and "great little" are similar, as are "early next" and "early last."
I also tried to create a mapping from the adjectives onto a range of real numbers, so that there would be a more well-defined notion of "distance." I needed to get pretty hacky to try to keep the adjectives from clumping up. Eventually I got it to work, but I didn't really trust the output.
Using empirical analysis to learn grammar rules is a difficult problem for reasons that I'm not qualified to explain. One interesting thought experiment is Noam Chomsky's "colorless green ideas sleep furiously," a sentence that wouldn't show up in empirical usage because it's nonsensical, despite being grammatically correct.
Here are some future directions for exploration:
- What happens when we consider a larger set of adjectives?
- How can considering the larger context predict adjective order? Will we just end up needing to build a complete model of English grammar?
- The ordering I proposed is not easy for a human to remember. Can we come up with some human-memorable rules that are almost as good?