This paper investigates how Cantonese tones evolved from Middle Chinese phonology, using multinomial logistic regression in the open-source statistical software R. The study first built a database of Chinese characters that records, for each character, its Cantonese tone and 10 items of Middle Chinese phonological information, including the upper speller in fanqie (反切上字), the lower speller in fanqie (反切下字), initials (字母), voiced or not (清濁), quanci (全次), group (攝), Guangyun rhymes (韻), four categories (等), openness of mouth (呼), hongxi (洪細), Pingshui rhymes (平水韻目), four tones (四聲) and level and oblique tone configurations (平仄). It then used multinomial logistic regression to identify which items of Middle Chinese phonological information yield statistically effective predictions of modern Cantonese tones, in order to generalize the conditions and patterns by which Middle Chinese phonology evolved into Cantonese tones.
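The regression step described above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, and it is not the authors' code: the eight rows, the romanized labels, and the function names `encode`, `softmax`, `fit` and `predict` are all invented for illustration. It fits a multinomial logistic (softmax) model to two categorical predictors, Middle Chinese tone and voicing, and reports the share of rows whose Cantonese tone is predicted correctly. Reading the study's "predictive power" as this share is an assumption.

```python
import math

# Toy rows invented for illustration (NOT the study's data):
# (Middle Chinese tone, voicing) -> Cantonese tone category.
ROWS = [
    (("ping", "voiceless"), "yin-ping"),
    (("ping", "voiced"), "yang-ping"),
    (("shang", "voiceless"), "yin-shang"),
    (("shang", "voiced"), "yang-shang"),
    (("qu", "voiceless"), "yin-qu"),
    (("qu", "voiced"), "yang-qu"),
    (("ru", "voiceless"), "yin-ru"),
    (("ru", "voiced"), "yang-ru"),
]


def encode(features, vocab):
    """One-hot encode categorical predictors and append a bias term."""
    return [1.0 if features[i] == v else 0.0 for i, v in vocab] + [1.0]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit(rows, epochs=300, lr=0.5):
    """Fit a multinomial logistic model by batch gradient ascent."""
    classes = sorted({label for _, label in rows})
    vocab = sorted({(i, v) for feats, _ in rows for i, v in enumerate(feats)})
    xs = [encode(feats, vocab) for feats, _ in rows]
    ys = [classes.index(label) for _, label in rows]
    weights = [[0.0] * len(xs[0]) for _ in classes]
    for _ in range(epochs):
        grad = [[0.0] * len(xs[0]) for _ in classes]
        for x, y in zip(xs, ys):
            scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
            probs = softmax(scores)
            for k in range(len(classes)):
                # Gradient of the log-likelihood: observed minus predicted.
                err = (1.0 if k == y else 0.0) - probs[k]
                for j, xi in enumerate(x):
                    grad[k][j] += err * xi
        for k in range(len(classes)):
            for j in range(len(xs[0])):
                weights[k][j] += lr * grad[k][j] / len(xs)
    return classes, vocab, weights


def predict(model, features):
    """Return the class with the highest score for one character."""
    classes, vocab, weights = model
    x = encode(features, vocab)
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return classes[scores.index(max(scores))]


model = fit(ROWS)
# Share of rows whose Cantonese tone category is predicted correctly.
accuracy = sum(predict(model, f) == label for f, label in ROWS) / len(ROWS)
print(accuracy)
```

On real data the predictors would be the full set of Middle Chinese attributes in the database, and accuracy would normally be measured on characters held out from fitting rather than on the training rows as done here.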
The results of the study show that 1) the four tones of Middle Chinese determine the four-way tonal distinction in Cantonese; 2) voicing in Middle Chinese determines the difference between the upper and lower (陰陽) tones in Cantonese; 3) the group of Middle Chinese determines the difference between the upper entering (陰入) and middle entering (中入) tones in Cantonese; 4) the quanci of Middle Chinese determines the difference between the lower rising (陽上) and lower departing (陽去) tones in Cantonese. The predictive power of these four factors is 0.86, meaning that these four items of Middle Chinese phonological information effectively predict 86% of the Cantonese tones of the Chinese characters. In descending order of predictive power, the four factors are four tones, voiced or not, group and quanci.
The significance of the study lies in the following five points: 1) it provides scientific data that validate the traditional view on Cantonese tones, especially results 1), 2) and 4) above; 2) it demonstrates that group is a condition determining the difference between the upper entering and middle entering tones in Cantonese; 3) it suggests an ordering of the four factors by the strength of their impact on Cantonese tones; 4) it establishes R as a tool that academics can apply further in investigating other relationships between ancient Chinese phonology and tones; 5) it advances the statistical study of Chinese characters by applying contemporary statistics to their investigation.