Tuesday, January 09, 2007

 

Considerates do as considerates are


In a recent post, I ruminated about a problem with ETC's poems, specifically its inability to distinguish between abstract and concrete nouns and its trouble with syntactically specific words. The sample poem I used illustrates other problems. One of these is the inappropriate substitution of an adjective for a noun:
Considerates in mathematics

This malapropism is a result of the way I programmed the system to initialize itself. The poem object accepts a set of "seed" words in its constructor (nouns, verbs, and adjectives) and then uses those to build up contextual sets of related words. The programming is quite strict about getting what it wants and assumes that if it's been told a given word is a noun, then it must be a noun. The Web interface prompts a user for lists of nouns, verbs, and adjectives. "Considerates" came to be by way of a user's keying the adjective "considerate" into the noun text field. Thinking it had a noun, the system composed the line to be structurally similar to "Lessons in mathematics," making a plural noun from the erroneous singular.
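To make the failure concrete, here is a minimal sketch in Python of a poem object that trusts its seed lists. It is not ETC's actual code: the class name `Poem`, the methods `pluralize` and `line_in`, and the naive pluralization rules are all assumptions made for illustration.

```python
class Poem:
    """Hypothetical stand-in for the poem object; it trusts its seed lists."""

    def __init__(self, nouns, verbs, adjectives):
        # No validation: whatever arrives in `nouns` is treated as a noun.
        self.nouns = list(nouns)
        self.verbs = list(verbs)
        self.adjectives = list(adjectives)

    @staticmethod
    def pluralize(word):
        # Deliberately naive English pluralization, for illustration only.
        if word.endswith(("s", "x", "z", "ch", "sh")):
            return word + "es"
        if word.endswith("y") and len(word) > 1 and word[-2] not in "aeiou":
            return word[:-1] + "ies"
        return word + "s"

    def line_in(self, noun, field):
        # Line template modeled on "Lessons in mathematics".
        return f"{self.pluralize(noun).capitalize()} in {field}"


# The user keyed the adjective "considerate" into the noun field.
poem = Poem(nouns=["lesson", "considerate"], verbs=[], adjectives=[])
lines = [poem.line_in(noun, "mathematics") for noun in poem.nouns]
print(lines)
```

Nothing in the constructor ever asks whether "considerate" can be a noun, so the template pluralizes it as happily as it pluralizes "lesson."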

That's how the mistake happened. But the cause is more complex. "Seed words" is a technical concept, an initiator (first cause?). You can't get poetry from technical concepts, only from literary concepts. What we really want is for the poem to be about something, to have a topic, e.g.: "The death of a beautiful woman."

Now the software eventually has to figure out the grammatical categories into which the words in the topic fall, but it should figure that out on its own. After all, if we asked a "real" poet to write about the death of a beautiful woman, he/she might quite conceivably compose a line containing something like "sorrow for the lost," without our telling him/her that death is a noun and beautiful an adjective.

An improvement in ETC3 over its predecessors is that it actually takes a topic and parses it on its own, which of course means that it has to have a decent parser. It does. Every entry in its lexicon contains all of that word's inflections, so if it's a word ETC3 knows about, it will find it, just like a "real" poet.
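One way such a lookup could work is sketched below in Python. The lexicon layout, the tiny word list, and the names `LEXICON`, `FORM_INDEX`, and `parse_topic` are assumptions for illustration, not ETC3's real data structures. The sketch also gives each word a single part of speech, which a real lexicon could not get away with.

```python
# Hypothetical lexicon: each entry lists its part of speech and every inflected form.
LEXICON = {
    "death": {"pos": "noun", "forms": ["death", "deaths"]},
    "woman": {"pos": "noun", "forms": ["woman", "women"]},
    "beautiful": {"pos": "adjective", "forms": ["beautiful"]},
    "considerate": {"pos": "adjective", "forms": ["considerate"]},
    "the": {"pos": "determiner", "forms": ["the"]},
    "a": {"pos": "determiner", "forms": ["a", "an"]},
    "of": {"pos": "preposition", "forms": ["of"]},
}

# Reverse index from any inflected form to (base word, part of speech).
FORM_INDEX = {
    form: (lemma, entry["pos"])
    for lemma, entry in LEXICON.items()
    for form in entry["forms"]
}


def parse_topic(topic):
    """Tag each word of a topic as (word, base word, part of speech)."""
    tagged = []
    for token in topic.lower().split():
        word = token.strip(".,;:!?\"'")
        if not word:
            continue
        # Words missing from the lexicon are marked "unknown" rather than guessed at.
        lemma, pos = FORM_INDEX.get(word, (word, "unknown"))
        tagged.append((word, lemma, pos))
    return tagged


tagged = parse_topic("The death of a beautiful woman.")
nouns = [lemma for _, lemma, pos in tagged if pos == "noun"]
adjectives = [lemma for _, lemma, pos in tagged if pos == "adjective"]
print(nouns, adjectives)
```

With a lookup like this, "considerate" comes back tagged as an adjective no matter which text field it was typed into, so it would never be handed to a noun template.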

Software that thinks it's software will never be a threat, but software that thinks it's a writer just might be.
