The morphology entry contains four fields: root form, surface form, type of conversion, and a flag for using the morphology during recognition.
The root and surface forms are patterns that the recognize and generate functions use. The generate function matches a pattern in the root form and converts it to the surface form. The recognize function goes from the surface form to the root form. As an example, the following rule converts a word ending in "s" to its plural form ending in "ses":
|root ||surface ||type ||use
|*s ||*ses ||P ||1 |
Note that the "*" in the root form matches all characters up to, but not including, the "s". This set of characters is copied to the output of the generate function, replacing the "*". The "s" after the "*" in the surface form is required; without it, the "s" would not be copied during the conversion.
The "used in recognition" flag controls whether the rule is considered during the recognition process. Some rules must be excluded. For example, the default rule for first person verbs is * -> *, which would match any word to itself. Also, the third person singular verb form is identical to the default noun plural form, which would produce too many matches.
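The "*" substitution described above can be sketched as follows. This is a minimal illustration, not the system's actual code: the function name `generate` and its signature are assumptions, and the sketch only handles patterns made of a leading "*" followed by a literal suffix.

```python
def generate(word, root_pattern, surface_pattern):
    """Apply one rule in the root -> surface direction.

    Hypothetical helper: both patterns must look like '*' + literal suffix.
    Returns the surface form, or None if the word does not match the root pattern.
    """
    if not (root_pattern.startswith("*") and surface_pattern.startswith("*")):
        raise ValueError("this sketch only handles patterns of the form '*suffix'")

    root_suffix = root_pattern[1:]
    surface_suffix = surface_pattern[1:]

    if not word.endswith(root_suffix):
        return None  # the rule does not apply to this word

    # The characters matched by '*' are everything before the literal suffix.
    stem = word[:len(word) - len(root_suffix)]

    # Copy the '*' characters to the output, then append the surface suffix.
    return stem + surface_suffix


print(generate("bus", "*s", "*ses"))  # buses
print(generate("bus", "*s", "*es"))   # bues
```

The second call shows why the surface form must repeat the "s": only the characters matched by "*" are carried over, so leaving the "s" out of the surface pattern drops it from the result.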
Rules are used when trying to locate a word in the database. If the actual word does not appear, the rules
list is checked for any matches, and the root is looked up. Also, when a word will be used in a response to
the user, it is sent through the cirmGenerate function to ensure the word has the correct person and tense.
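The lookup fallback described above might look roughly like this. Everything here is assumed for illustration: the `lookup` function, the tuple layout of a rule, the use of a Python set as the database, and the form type code "V1" for the identity rule. As in the rule format, only "*" followed by a literal suffix is handled.

```python
# (root pattern, surface pattern, form type, used in recognition)
RULES = [
    ("*s", "*ses", "P", True),
    ("*", "*", "V1", False),  # "V1" is a made-up code; identity rule is skipped
]


def lookup(word, lexicon, rules=RULES):
    """Return (root, form_type) for a word, or None if nothing is found.

    form_type is None when the word itself is already in the lexicon.
    """
    if word in lexicon:
        return word, None

    for root_pattern, surface_pattern, form_type, use_in_recognition in rules:
        if not use_in_recognition:
            continue  # rule is flagged as generation-only

        surface_suffix = surface_pattern[1:]
        if not word.endswith(surface_suffix):
            continue

        # Undo the rule: strip the surface suffix, restore the root suffix.
        stem = word[:len(word) - len(surface_suffix)]
        root = stem + root_pattern[1:]
        if root in lexicon:
            return root, form_type

    return None


lexicon = {"bus", "walk"}
print(lookup("buses", lexicon))  # ('bus', 'P')
print(lookup("xyz", lexicon))    # None
```

Because the identity rule has its recognition flag turned off, an unknown word such as "xyz" is not matched to itself.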
Rules for English conversion from surface form to root form
The file is in the following format:
<root form> <surface form> <form type> <used in recognition>
The root form gives the base of the word. The surface form gives the result of
applying the surface rule to the base word. The form type is one of:
|Form Type Desc||Example
|plural||churches, puppies, cars|
|past||(I) was, (I) walked|
|past participle||(I) walked, (I) sung|
|present participle||(I am) walking|
|first singular||(I) am, (I) walk|
|second singular||(you) are, (you) walk|
|third singular||(he) is, (it) walks|
|first plural||(we) are, (we) walk|
|third plural||(they) are, (they) walk|
During matching, a * will match any number of characters.
A 'C' will match a single consonant, and a 'V' will match any single vowel.
The above two entries describe how the system will convert a word into the
present participle verb form (VG).
- The first entry states that if the word ends
with two vowels, then a consonant, simply tack on an 'ing'. The '*' will
match any number of characters. This would convert the word "look" to "looking".
- The second entry states that if the word ends with a vowel then a consonant,
double the last consonant and append
'ing'. The rule does not need to state "*CVC" because the
rule above already caught all "*VVC" words, so the only pattern left
would be "*CVC". This rule would convert the word "hit" to "hitting".
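The effect of rule order on the two present participle rules can be sketched as below. The root patterns "*VVC" and "*VC" follow the description above, but the exact entries in the rules file are not shown here, so treat them as assumed. The surface side is written as a Python regex template (`\1`, `\2`, ...) purely for illustration; it is not the file's notation. The sketch also assumes lowercase input and treats "y" as a consonant.

```python
import re

VOWELS = "aeiou"                      # assumption: 'y' is a consonant
CONSONANTS = "bcdfghjklmnpqrstvwxyz"


def root_pattern_to_regex(pattern):
    """Translate '*', 'V', 'C' and literals into a regex, one group per token."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("(.*)")                    # any number of characters
        elif ch == "V":
            parts.append("([%s])" % VOWELS)         # a single vowel
        elif ch == "C":
            parts.append("([%s])" % CONSONANTS)     # a single consonant
        else:
            parts.append("(%s)" % re.escape(ch))    # literal character
    return re.compile("".join(parts))


# Ordered list: the first matching rule wins.
VG_RULES = [
    ("*VVC", r"\1\2\3\4ing"),   # two vowels + consonant: just append 'ing'
    ("*VC", r"\1\2\3\3ing"),    # vowel + consonant: double the consonant
]


def present_participle(word, rules=VG_RULES):
    """Return the VG form from the first matching rule, or None."""
    for root_pattern, template in rules:
        match = root_pattern_to_regex(root_pattern).fullmatch(word)
        if match:
            return match.expand(template)
    return None


print(present_participle("look"))  # looking
print(present_participle("hit"))   # hitting
print(present_participle("look", list(reversed(VG_RULES))))  # lookking
```

The last call reverses the rule order: "*VC" also matches "look", so without the earlier "*VVC" rule the consonant would be doubled incorrectly.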