How do people really order adjectives?

Adjective ordering is a big ugly problem in English, but at least it’s not an ugly big problem. Recently I saw a claim by Mark Forsyth that adjectives are ordered, “opinion-size-age-shape-colour-origin-material-purpose.” I was interested to see how well adjectives really can be ordered.

To approach this problem, I looked at empirical usage in Google’s ngram dataset, which contains n-word phrases from a couple million books.

First, I identified the fifty most common adjectives in the corpus, sorted by the number of volumes in which they appear. For this analysis, I used volume count rather than the total number of occurrences (“match count”). I wanted to avoid biases caused by the many highly technical books in the Google ngrams dataset, such as soil quality reports that contain the word “loamy” thousands of times.

Some of these words, like “many,” could be categorized as quantifiers or determiners instead of adjectives. This doesn’t worry me too much; there are many competing ways of categorizing parts of speech, and this one seems as good to me as any.

other more first many new same such long different small good high much few own little most important last large second only next best great full possible least right short able several better whole free old less clear open certain special early true single strong common real difficult third present
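
The selection step could look roughly like the sketch below. It is not the code used for this post: the record layout (word, part-of-speech tag, match count, volume count), the "ADJ" tag value, and the function name top_adjectives are assumptions for illustration, and the sample numbers are made up.

```python
from collections import defaultdict

def top_adjectives(records, n=50):
    """Rank adjectives by total volume count (hypothetical record layout).

    records: iterable of (word, pos_tag, match_count, volume_count) rows,
    for example one row per word per year.
    """
    volumes = defaultdict(int)
    for word, pos_tag, _match_count, volume_count in records:
        if pos_tag == "ADJ":
            # Sum volume counts and ignore match counts, so a word repeated
            # thousands of times in a few books does not dominate.
            volumes[word.lower()] += volume_count
    # Highest volume count first; ties broken alphabetically.
    return sorted(volumes, key=lambda w: (-volumes[w], w))[:n]

# Made-up rows: "loamy" has a huge match count but appears in few volumes.
records = [
    ("other", "ADJ", 900, 500),
    ("loamy", "ADJ", 5000, 3),
    ("new", "ADJ", 700, 400),
    ("run", "VERB", 800, 450),
]
print(top_adjectives(records, n=2))  # ['other', 'new']
```

Ranking by match count instead would put “loamy” first in this toy example, which is the bias the volume count is meant to avoid.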

Here’s the distribution of these counts. Each bar shows the total count in that range. They’re all clustered in a small range since I’ve selected the most common ones.

Distribution of word counts

Next, I looked at all bigrams, or ordered pairs of adjacent words, made up of these adjectives. Here are the twenty most common bigrams.

Adjective 1	Adjective 2	# Volumes
many	other	1896916
many	different	1112353
several	other	1064290
many	more	1011133
little	more	971626
last	few	791315
next	few	681235
first	few	662956
several	different	649970
other	important	590660
few	more	529771
few	other	510301
great	many	496622
many	new	466728
many	such	427247
most	other	424131
only	other	404452
whole	new	363142
good	old	361629
several	important	353381

Below is the distribution over bigrams. On the right side, you can easily pick out some of these common bigrams, such as “many other.”

Distribution of bigram counts

Here are uncommon adjective pairs: pairs of adjectives that don’t commonly occur together in either order. “Able” and “better” dominate this list; I guess they just don’t like other adjectives.

		# Volumes
able	difficult	0
better	early	0
better	strong	0
less	early	0
better	difficult	22
able	true	26
able	important	28
better	short	29
able	different	34
better	special	36
most	only	36
able	clear	44
better	free	45
better	full	53
strong	difficult	54
better	clear	55
able	short	58
better	true	64
better	important	69
clear	difficult	78

Since I’m interested in adjective ordering, here is a list of adjective pairs that never occur in the reverse order. In millions of volumes, the phrases “more few” and “better little” never appear. This supports the claim that English has rigid rules about adjective order.

Adjective 1	Adjective 2	# Volumes
few	more	529771
little	better	216561
other	better	47697
much	better	29010
only	full	22803
only	clear	21965
own	short	20869
third	less	12178
own	better	9970
few	better	8986
many	better	8707
only	early	7419
few	less	6966
own	difficult	5651
same	difficult	3999
own	most	3529
only	difficult	3386
least	early	3204
own	less	3164
whole	better	2690
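
A list like this can be produced with a simple filter. The sketch below assumes the bigram volume counts sit in a dictionary keyed by (first word, second word); that layout and the function name never_switched are illustrative choices, not something taken from the original analysis. The counts are copied from tables in this post.

```python
def never_switched(bigram_volumes):
    """Return (first, second, volumes) rows whose reverse order never occurs."""
    rows = [
        (a, b, n)
        for (a, b), n in bigram_volumes.items()
        if n > 0 and bigram_volumes.get((b, a), 0) == 0
    ]
    # Most common one-way pairs first.
    return sorted(rows, key=lambda row: -row[2])

bigram_volumes = {
    ("few", "more"): 529771,
    ("little", "better"): 216561,
    ("good", "many"): 214851,
    ("many", "good"): 243742,
}
for row in never_switched(bigram_volumes):
    print(row)
# ('few', 'more', 529771)
# ('little', 'better', 216561)
```

“Good many” is dropped because “many good” also occurs.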

I wanted to get a more precise idea of how strict adjective ordering rules are in English. For each pair of words, I computed the ratio between how often the pair occurs in one order and how often it occurs in the other. Below is the distribution of ratios. The large spike on the left shows that most pairs of adjectives only occur in one order. However, there is a tail of pairs that can occur in either order.

[Figure: Distribution of ratios]
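
Here is one way the ratios and their histogram could be computed. The post does not spell out the exact definition, so this sketch assumes the ratio is the less common order divided by the more common order, which puts one-way pairs at 0 and evenly split pairs at 1. The function names and the ten-bin histogram are also assumptions; the counts come from tables in this post.

```python
from itertools import combinations

def order_ratios(words, bigram_volumes):
    """Map each unordered pair to min(count) / max(count) over its two orders."""
    ratios = {}
    for a, b in combinations(sorted(words), 2):
        ab = bigram_volumes.get((a, b), 0)
        ba = bigram_volumes.get((b, a), 0)
        if ab + ba > 0:  # skip pairs that never occur together
            ratios[(a, b)] = min(ab, ba) / max(ab, ba)
    return ratios

def histogram(values, bins=10):
    """Count values in equal-width bins over [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts

bigram_volumes = {
    ("few", "more"): 529771,
    ("little", "better"): 216561,
    ("good", "many"): 214851,
    ("many", "good"): 243742,
    ("long", "last"): 137801,
    ("last", "long"): 225530,
}
words = {w for pair in bigram_volumes for w in pair}
ratios = order_ratios(words, bigram_volumes)
print(round(ratios[("good", "many")], 3))  # 0.881
print(histogram(ratios.values()))          # [2, 0, 0, 0, 0, 0, 1, 0, 1, 0]
```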

Next, I made a table of the most commonly switched adjectives. I only considered pairs that have a total count above 20,000. Below, each pair is shown in its less common order. This table shows some weaknesses of the data: in the bigram “present many,” “present” is probably a verb incorrectly tagged as an adjective in the Google ngram dataset. Many of these pairs make sense, though, which calls into question the idea that it’s possible to come up with a reliable rule for adjective order without considering the larger context.

Adjective 1	Adjective 2	# This order	# Switched
present	many	19426	20655
new	second	15881	17531
good	many	214851	243742
first	right	9635	11308
new	best	16702	20125
new	single	21243	25997
full	many	11423	14120
little	good	48100	61019
third	such	9978	13206
best	single	31185	41493
real	high	12373	16961
high	good	10817	15599
third	new	10182	14921
possible	such	15190	22326
important	first	72241	108048
small	single	18930	28332
little	special	8709	13078
first	good	34522	53308
long	last	137801	225530
first	best	9853	16361
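
A table like this can be built by filtering on the combined count and sorting by how evenly the two orders are split. This is a sketch under assumptions: the dictionary layout and the name most_switched are hypothetical, and sorting by the less-common-to-more-common ratio is inferred from the row order in the table rather than stated in the post.

```python
def most_switched(bigram_volumes, min_total=20_000):
    """Rows of (first, second, this_order, switched), most evenly split first."""
    rows = []
    seen = set()
    for (a, b), ab in bigram_volumes.items():
        key = frozenset((a, b))
        if a == b or key in seen:
            continue
        seen.add(key)
        ba = bigram_volumes.get((b, a), 0)
        if ab + ba <= min_total:
            continue  # too rare to say anything reliable
        if ab > ba:
            # Show each pair in its less common order.
            a, b, ab, ba = b, a, ba, ab
        rows.append((a, b, ab, ba))
    return sorted(rows, key=lambda r: r[2] / r[3], reverse=True)

bigram_volumes = {
    ("few", "more"): 529771,
    ("present", "many"): 19426,
    ("many", "present"): 20655,
    ("good", "many"): 214851,
    ("many", "good"): 243742,
}
for row in most_switched(bigram_volumes):
    print(row)
# ('present', 'many', 19426, 20655)
# ('good', 'many', 214851, 243742)
# ('more', 'few', 0, 529771)
```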

Next, I tried to come up with a rule for adjective order without considering the larger context. I tried to organize all the adjectives into a strict order. This type of rule is easy for a computer to remember, but probably not for a human.

I created this ordering by trying to minimize the total number of out-of-order counts. For example, if the word “more” comes before “few” in the list, this adds a penalty of 529,771 (the volume count of “few more”). Below is the order. Surprisingly, no obvious clusters jump out at me; for example, “first” and “last” are not next to each other, nor are “small” and “large.”

own same least only last certain many whole next most first few several much other present second such little different true third important best possible less more single large great able difficult new real common special small free good better strong clear early old long full high short open right
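
The post does not say which optimization method produced this list. As one possible approach, the sketch below uses a simple local search: it repeatedly moves a single word to whichever position lowers the penalty, until no move helps. The function names are hypothetical, and this heuristic can get stuck in a local minimum, so it is an illustration of the objective rather than a reconstruction of the original method. The counts are taken from tables in this post.

```python
def penalty(order, bigram_volumes):
    """Total volume count of bigrams whose words appear reversed in `order`."""
    pos = {w: i for i, w in enumerate(order)}
    return sum(
        n
        for (a, b), n in bigram_volumes.items()
        if a in pos and b in pos and pos[a] > pos[b]
    )

def improve(order, bigram_volumes):
    """Greedy local search: move one word at a time while the penalty drops."""
    order = list(order)
    best = penalty(order, bigram_volumes)
    improved = True
    while improved:
        improved = False
        for word in list(order):
            i = order.index(word)
            rest = order[:i] + order[i + 1:]
            for j in range(len(order)):
                candidate = rest[:j] + [word] + rest[j:]
                score = penalty(candidate, bigram_volumes)
                if score < best:
                    order, best, improved = candidate, score, True
    return order, best

bigram_volumes = {
    ("few", "more"): 529771,
    ("many", "more"): 1011133,
    ("many", "good"): 243742,
    ("good", "many"): 214851,
}
start = ["more", "good", "many", "few"]
print(penalty(start, bigram_volumes))  # 1784646
order, best = improve(start, bigram_volumes)
print(best)  # 214851
```

The remaining penalty of 214,851 is unavoidable in this toy example: “good many” and “many good” both occur, so any strict order has to get one of them wrong.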

To evaluate how good this order was, I checked what proportion of bigrams were in the right order, weighted by volume count. Choosing a random order would produce a score of 50% according to this metric. This list got 39 million correct and 3.9 million incorrect, about 91%.
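
The metric can be written down in a few lines. This is a sketch with a hypothetical function name and a toy order, using counts from tables in this post; the last line just redoes the arithmetic for the figures quoted above.

```python
def order_score(order, bigram_volumes):
    """Share of bigram volume whose word order agrees with `order`."""
    pos = {w: i for i, w in enumerate(order)}
    correct = incorrect = 0
    for (a, b), n in bigram_volumes.items():
        if a == b or a not in pos or b not in pos:
            continue
        if pos[a] < pos[b]:
            correct += n
        else:
            incorrect += n
    return correct / (correct + incorrect)

bigram_volumes = {
    ("few", "more"): 529771,
    ("many", "more"): 1011133,
    ("many", "good"): 243742,
    ("good", "many"): 214851,
}
print(round(order_score(["many", "good", "few", "more"], bigram_volumes), 3))  # 0.893

# The figures quoted in the text: 39 million correct, 3.9 million incorrect.
print(round(39 / (39 + 3.9), 3))  # 0.909
```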

Here are the most prevalent bigrams that aren’t explained by the ordering above. Many of these are set phrases that are difficult to describe in general terms. “Good little” and “great little” are similar, as are “early next” and “early last.”

Adjective 1	Adjective 2	# Volumes
great	many	372076
right	next	166553
other	first	42086
early	next	33063
early	second	19933
present	several	19043
good	first	18786
great	little	18349
early	third	17082
open	new	14529
old	single	12988
good	little	12919
old	common	10409
single	best	10308
good	new	9673
early	last	9565
best	little	9306
strong	common	8742
more	such	8503
different	first	8256

I also tried to create a mapping from the adjectives onto a range of real numbers, so that there would be a more well-defined notion of “distance.” I needed to get pretty hacky to try to keep the adjectives from clumping up. Eventually I got it to work, but I didn’t really trust the output.

Using empirical analysis to learn grammar rules is a difficult problem for reasons that I’m not qualified to explain. One interesting example is Noam Chomsky’s sentence “colorless green ideas sleep furiously,” which wouldn’t show up in empirical usage because it’s nonsensical, despite being grammatically correct.

Here are some future directions for exploration:

What happens when we consider a larger set of adjectives?
How can considering the larger context predict adjective order? Will we just end up needing to build a complete model of English grammar?
The ordering I proposed is not easy for a human to remember. Can we come up with some human-memorable rules that are almost as good?