How do people really order adjectives?

06 Nov 2016
Paul Kernfeld dot com

Adjective ordering is a big ugly problem in English, but at least it’s not an ugly big problem. Recently I saw a claim by Mark Forsyth that adjectives are ordered, “opinion-size-age-shape-colour-origin-material-purpose.” I was interested to see how well adjectives really can be ordered.

To approach this problem, I looked at empirical usage in Google’s ngram dataset, which contains n-word phrases from a couple million books.

First, I identified the fifty most common adjectives in the corpus, sorted by the number of volumes in which they appear. For this analysis, I used volume count rather than the total number of occurrences (“match count”). I wanted to avoid biases caused by the many highly technical books in the Google ngram dataset, such as soil quality reports that contain the word “loamy” thousands of times.
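The ranking step can be sketched roughly as follows. This is only an illustration, not the code behind this post: it assumes 1-gram rows in a tab-separated layout of ngram, year, match count, and volume count, with adjectives marked by an `_ADJ` suffix, and it assumes that summing the per-year volume counts is a reasonable stand-in for “number of volumes.” The sample rows and the `top_adjectives` helper are made up.

```python
from collections import Counter

# Hypothetical sample rows (made-up numbers): ngram, year, match_count, volume_count.
SAMPLE_ROWS = [
    "other_ADJ\t1990\t500\t120",
    "other_ADJ\t1991\t450\t110",
    "loamy_ADJ\t1990\t9000\t3",
    "new_ADJ\t1990\t300\t100",
    "soil_NOUN\t1990\t9500\t40",
]

def top_adjectives(rows, n):
    """Rank adjectives by summed volume count, ignoring match count."""
    volumes = Counter()
    for row in rows:
        ngram, _year, _match_count, volume_count = row.split("\t")
        # Split "other_ADJ" into the word and its part-of-speech tag.
        word, _, tag = ngram.rpartition("_")
        if tag == "ADJ":
            volumes[word.lower()] += int(volume_count)
    return [word for word, _ in volumes.most_common(n)]

print(top_adjectives(SAMPLE_ROWS, 2))
```

In this toy data, “loamy” has by far the highest match count but appears in only three volumes, so ranking by volume count keeps it out of the top of the list.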

Some of these words, like “many,” could be categorized as quantifiers or determiners instead of adjectives. This doesn’t worry me too much; there are many competing ways of categorizing parts of speech, and this one seems as good to me as any.

other more first many new same such long different small good high much few own little most important last large second only next best great full possible least right short able several better whole free old less clear open certain special early true single strong common real difficult third present

Here’s the distribution of these counts. Each bar shows the total count in that range. They’re all clustered in a small range since I’ve selected the most common ones.

Distribution of word counts

Next, I looked at all bigrams (ordered pairs of words) made up of two of these adjectives. Here are the twenty most common bigrams.

Bigram              # Volumes
many other          1896916
many different      1112353
several other       1064290
many more           1011133
little more         971626
last few            791315
next few            681235
first few           662956
several different   649970
other important     590660
few more            529771
few other           510301
great many          496622
many new            466728
many such           427247
most other          424131
only other          404452
whole new           363142
good old            361629
several important   353381

Below is the distribution over bigrams. On the right side, you can easily pick out some of these common bigrams, such as “many other.”

Distribution of bigram counts

Here are uncommon adjective pairs: pairs of adjectives that don’t commonly occur together in either order. “Able” and “better” dominate this list; I guess they just don’t like other adjectives.

Pair                # Volumes
able difficult      0
better early        0
better strong       0
less early          0
better difficult    22
able true           26
able important      28
better short        29
able different      34
better special      36
most only           36
able clear          44
better free         45
better full         53
strong difficult    54
better clear        55
able short          58
better true         64
better important    69
clear difficult     78

Since I’m interested in adjective ordering, here is a list of adjective pairs that never occur in the reverse order. In millions of volumes, the phrases “more few” and “better little” never appear. This supports the claim that English has rigid rules about adjective order.

Bigram              # Volumes
few more            529771
little better       216561
other better        47697
much better         29010
only full           22803
only clear          21965
own short           20869
third less          12178
own better          9970
few better          8986
many better         8707
only early          7419
few less            6966
own difficult       5651
same difficult      3999
own most            3529
only difficult      3386
least early         3204
own less            3164
whole better        2690

I wanted to get a more precise idea of how strict adjective ordering rules are in English. For each pair of words, I measured the ratio of how often it occurs in its less common order to how often it occurs in its more common order. Below is the distribution of ratios. The large spike on the left shows that most pairs of adjectives occur in only one order. However, there is a tail of pairs that can occur in either order.

Distribution of ratios
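Here is a minimal sketch of the ratio calculation. It assumes the ratio is the count of the less common order divided by the count of the more common order, which fits the spike near zero for pairs that occur in only one order. The `order_ratios` helper and the dictionary layout are hypothetical; the counts are volume counts quoted in this post.

```python
def order_ratios(pair_counts):
    """Map each unordered pair to (less common count) / (more common count)."""
    ratios = {}
    for (first, second), count in pair_counts.items():
        key = tuple(sorted((first, second)))
        if key in ratios:
            continue  # already handled via the reversed bigram
        switched = pair_counts.get((second, first), 0)
        high = max(count, switched)
        if high > 0:
            ratios[key] = min(count, switched) / high
    return ratios

# Volume counts quoted in this post; a missing reversed bigram counts as zero.
PAIR_COUNTS = {
    ("few", "more"): 529771,
    ("present", "many"): 19426,
    ("many", "present"): 20655,
    ("long", "last"): 137801,
    ("last", "long"): 225530,
}

for pair, ratio in sorted(order_ratios(PAIR_COUNTS).items(), key=lambda kv: kv[1]):
    print(pair, round(ratio, 3))
```

A ratio of 0 means the pair was only ever seen in one order, and a ratio near 1 means both orders are about equally common.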

Next, I made a table of the most commonly switched adjectives. I only considered pairs that have a total count above 20,000. Below, each pair is shown in its less common order. This table shows some weaknesses of the data: in the bigram “present many,” “present” is probably a verb incorrectly tagged as an adjective in the Google ngram dataset. Many of these pairs make sense, though, which calls into question the idea that it’s possible to come up with a reliable rule for adjective order without considering the larger context.

Bigram              # This order   # Switched
present many        19426          20655
new second          15881          17531
good many           214851         243742
first right         9635           11308
new best            16702          20125
new single          21243          25997
full many           11423          14120
little good         48100          61019
third such          9978           13206
best single         31185          41493
real high           12373          16961
high good           10817          15599
third new           10182          14921
possible such       15190          22326
important first     72241          108048
small single        18930          28332
little special      8709           13078
first good          34522          53308
long last           137801         225530
first best          9853           16361

Next, I tried to come up with a rule for adjective order without considering the larger context. I tried to organize all the adjectives into a strict order. This type of rule is easy for a computer to remember, but probably not for a human.

I created this ordering by trying to minimize the total number of out-of-order counts. For example, if the word “more” comes before “few” in the list, this adds a penalty of 529,771 (the volume count of “few more”). Below is the order. Surprisingly, no obvious clusters jump out at me; for example, “first” and “last” are not next to each other, nor are “small” and “large.”

own same least only last certain many whole next most first few several much other present second such little different true third important best possible less more single large great able difficult new real common special small free good better strong clear early old long full high short open right
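One way to search for such an ordering is sketched below. The post does not say which optimization method was used, so treat this as an assumption: a simple local search that repeatedly moves a single word to whichever position lowers the penalty. The `penalty` and `improve_order` names are hypothetical, and a heuristic like this can get stuck in a local minimum rather than finding the best possible order.

```python
def penalty(order, pair_counts):
    """Total volume count of bigrams that appear against the given order."""
    position = {word: i for i, word in enumerate(order)}
    return sum(
        count
        for (first, second), count in pair_counts.items()
        if position[first] > position[second]
    )

def improve_order(words, pair_counts):
    """Local search: move one word at a time while the penalty keeps dropping."""
    order = list(words)
    best = penalty(order, pair_counts)
    improved = True
    while improved:
        improved = False
        for word in list(order):
            rest = [w for w in order if w != word]
            # Try re-inserting the word at every position.
            for i in range(len(rest) + 1):
                candidate = rest[:i] + [word] + rest[i:]
                score = penalty(candidate, pair_counts)
                if score < best:
                    order, best, improved = candidate, score, True
    return order, best

# Volume counts quoted in this post.
COUNTS = {
    ("few", "more"): 529771,
    ("little", "more"): 971626,
    ("many", "more"): 1011133,
    ("little", "better"): 216561,
    ("few", "better"): 8986,
}
start = sorted({word for pair in COUNTS for word in pair})
order, remaining = improve_order(start, COUNTS)
print(order, remaining)
```

Each accepted move strictly lowers the penalty, so the loop always terminates. With fifty words, every pass tries a few thousand candidate orders, which is small enough to run in plain Python.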

To evaluate how good this order was, I checked what proportion of bigrams were in the right order, weighted by volume count. Choosing a random order would produce a score of 50% according to this metric. This list got 39 million correct and 3.9 million incorrect, about 91%.
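The score is just the correct volume count divided by the total, so 39 million correct and 3.9 million incorrect gives 39 / 42.9, or roughly 0.909. A small sketch, with hypothetical helper names and two counts quoted in this post:

```python
def weighted_score(correct, incorrect):
    """Share of volume-weighted bigrams that agree with the ordering."""
    return correct / (correct + incorrect)

def score_order(order, pair_counts):
    """Score an ordering against bigram volume counts."""
    position = {word: i for i, word in enumerate(order)}
    correct = sum(
        count
        for (first, second), count in pair_counts.items()
        if position[first] < position[second]
    )
    incorrect = sum(pair_counts.values()) - correct
    return weighted_score(correct, incorrect)

# The totals reported above.
print(round(weighted_score(39_000_000, 3_900_000), 3))

# A two-word example: "many" precedes "present" in the ordering above.
EXAMPLE_COUNTS = {("many", "present"): 20655, ("present", "many"): 19426}
print(round(score_order(["many", "present"], EXAMPLE_COUNTS), 3))
```

A pair that is switched almost as often as not, like “many present” and “present many,” scores close to 50% no matter where the two words are placed, which limits how well any fixed ordering can do.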

Here are the most prevalent bigrams which aren’t explained by the ordering above. Many of these are set phrases which are difficult to describe in general terms. “Good little” and “great little” are similar, as are “early next” and “early last.”

Bigram              # Volumes
great many          372076
right next          166553
other first         42086
early next          33063
early second        19933
present several     19043
good first          18786
great little        18349
early third         17082
open new            14529
old single          12988
good little         12919
old common          10409
single best         10308
good new            9673
early last          9565
best little         9306
strong common       8742
more such           8503
different first     8256

I also tried to create a mapping from the adjectives onto a range of real numbers, so that there would be a more well-defined notion of “distance.” I needed to get pretty hacky to try to keep the adjectives from clumping up. Eventually I got it to work, but I didn’t really trust the output.

Using empirical analysis to learn grammar rules is a difficult problem for reasons that I’m not qualified to explain. One interesting example is Noam Chomsky’s “colorless green ideas sleep furiously,” a sentence that wouldn’t show up in empirical usage because it’s nonsensical, despite being grammatically correct.

Here are some future directions for exploration: