Tuesday, June 27, 2006

 

Why ALG is hard (cont'd)


To get a poem that appears original, we need to step away from template-based generation (The Instant Muse Poetry Generator, for example). We need a grammar of English that lets us range over its full (well, maybe not full, but a lot of its) syntactic possibilities. Should be easy, right? Just bone up on our phrase-structure grammar or, for the truly brave, a tree-adjoining grammar. But instead of using the tree structure to parse already-written texts, we'll just reverse the process, using the grammar as the rules for a transducer. So far, so good. And intuitive. Until we try to implement one.

Say we have a very simple phrase-structure grammar:

S -> { NP, VP, [NP] } - A sentence expands to a noun phrase, a verb phrase, and an optional noun phrase.

NP -> { det, [adj]+, N, [PP] } - A noun phrase expands to a determiner, an optional list of adjectives, a noun, and an optional prepositional phrase.

PP -> { prep, NP } - A prepositional phrase expands to a preposition and a noun phrase.

See the problem? If we set the grammar free to build utterance trees, sooner or later (actually sooner) we'll get sentences like "Mary sold marigolds under the tree beside the road along the creek by the house of the saints of the house of Peter in time for a nap." Which won't do at all. The recursion is the culprit: every NP may contain a PP, and every PP contains another NP, and around we go.
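Concretely, here is a minimal Java sketch of such a transducer (the lexicon, the 60% probability, and the class name are all invented for illustration; this is not etc's code). Any fixed chance of taking the optional PP is enough to send the expansion wandering:

import java.util.Random;

public class NaiveGrammar {
    static final Random rng = new Random();

    // NP -> det [adj]+ N [PP] (adjectives omitted to keep the sketch short)
    static String np() {
        String s = pick("the", "a") + " " + pick("tree", "road", "creek", "house");
        // The PP is "optional," but a fixed coin flip means NP -> PP -> NP
        // can recurse indefinitely.
        if (rng.nextDouble() < 0.6) s += " " + pp();
        return s;
    }

    // PP -> prep NP
    static String pp() {
        return pick("under", "beside", "along", "by", "of") + " " + np();
    }

    static String pick(String... words) {
        return words[rng.nextInt(words.length)];
    }

    public static void main(String[] args) {
        System.out.println("Mary sold marigolds " + np());
    }
}

Run it a few times and the prepositional phrases pile up exactly as above.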

Controlling this, while maintaining the integrity of an object hierarchy, is very difficult. We don't want superclasses controlling their subclasses, nor do we want containers to know anything about how their members work.

The trick is to develop a grammar in which no right-hand side contains any left-hand-side constituent. That's how etc works, though still imperfectly. And the more comprehensive the grammar becomes, the harder it is to control what is essentially recursion.
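For instance, a stratified version of the grammar above might split the noun phrase in two (NPflat is an invented name for the flattened constituent, not etc's actual rule):

NP -> { det, [adj]+, N, [PP] } - A full noun phrase, as before.

PP -> { prep, NPflat } - The prepositional phrase now takes a flattened noun phrase.

NPflat -> { det, [adj]+, N } - No PP on the right side, so the NP/PP cycle is broken and expansion must terminate.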

Comments:
Generative constituent-based grammars are great for aesthetic text generation in that they are really easy to understand and work with, especially for novice poet-programmers. But they are prone to the extreme recursive expansions you describe.

One way to deal with this is by including a contextual state object with each recursive call, to help control recursive meandering. Each expansion of the grammar can touch base with the context to see where it is and how much or how little to expand. Essentially, you're applying constraints.
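A minimal Java sketch of that idea (the Context class, its depth cap, and the PP budget are invented for illustration):

import java.util.Random;

public class ContextGrammar {
    static final Random rng = new Random();

    // The contextual state threaded through every expansion.
    static class Context {
        int depth = 0;     // current depth in the utterance tree
        int ppBudget = 2;  // prepositional phrases still permitted

        boolean mayExpandPP() {
            return depth < 4 && ppBudget > 0;
        }
    }

    static String np(Context ctx) {
        ctx.depth++;
        String s = "the " + pick("tree", "road", "creek", "house");
        // Touch base with the context before recursing.
        if (rng.nextDouble() < 0.6 && ctx.mayExpandPP()) {
            ctx.ppBudget--;
            s += " " + pick("under", "beside", "by") + " " + np(ctx);
        }
        ctx.depth--;
        return s;
    }

    static String pick(String... words) {
        return words[rng.nextInt(words.length)];
    }

    public static void main(String[] args) {
        // At most two PPs, however the coin flips land.
        System.out.println("Mary sold marigolds " + np(new Context()));
    }
}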

Another way is to expand the capability of the grammar by making expansions essentially like function calls that can take parameters. If you include conditional statements and global/local variables, you can create a grammar capable of controlling itself as it expands. This is the technique used by Bulhak's Dada Engine (e.g. the Postmodernism Generator).

http://dev.null.org/dadaengine/manual-1.0/dada_toc.html

This way you wouldn't need the restriction you suggest: a recursion-free grammar in which no right-hand side contains any left-hand-side constituent.
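Sketched in Java rather than the Dada Engine's own rule syntax (the names and the policy are invented for illustration), a parameterized expansion might look like this:

public class ParamGrammar {
    // The caller, not the grammar, owns the recursion policy.
    static String np(boolean allowPP) {
        String head = "the house";
        return allowPP ? head + " " + pp() : head;
    }

    static String pp() {
        // The nested NP is called with allowPP = false, cutting the cycle
        // without banning recursion from the grammar outright.
        return "by " + np(false);
    }

    public static void main(String[] args) {
        System.out.println("Mary sold marigolds " + np(true));
    }
}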

The more semantically, syntactically, narratively, and rhetorically specific and interrelated the desired outcome, the more complex the generative system needs to be to meet all those constraints. The great thing about poetry generation in particular is that you can achieve quite a bit with a very loose system.

Ideally, it would be great to have software that allows you this level of control when you want it. The ALAMO group (Paul Braffort in particular) developed some advanced systems (in the '80s!) and continues research in this area, but little is known about their work in North America.

http://www.paulbraffort.net/langage/struct_ling_et_signif/malta_pages.html?pagenum=1
http://www.paulbraffort.net/langage/struct_ling_et_signif/literalgo_pages.html?pagenum=1


Also, Pablo Gervás has developed some generative systems using evolutionary algorithms, intelligent agents, and case-based reasoning. Interesting stuff:

http://nil.fdi.ucm.es/nilweb/projects/frogs/index.html
http://nil.fdi.ucm.es/nilweb/projects/Wasp/index.html
http://www.gtrlabs.com/files/active/0/modeling-literary-style-for.pdf

There is also Hisar Manurung's thesis work on natural-language generation architectures for poetry, but I've found it too restrictive poetically, given the strong formalisms he applies. Interesting still:

http://www.gtrlabs.com/files/active/0/poetryArchitecture.pdf

I think anyone who has seriously delved into aesthetic text generation like you have will bump into this problem of how best to generate text. For the GTR Language Workbench, I've created a grammar system (called 'bp') similar to Bulhak's Dada Engine, but with more data, conditional, and looping structures. It can also call standard Java functions from the grammar so that it can, for example, fetch a particular rhyming word for a part of speech, fetch a random quote from the internet, misspell something, etc. I think the ability to access external functions like this from within the grammar is important.

Persistence is also a neat concept. Say you had a grammar rule that created a new name for a recurring character in a narrative. Wouldn't it be great if you could save the state of that rule evaluation in a variable and have it be present the next time you run the script (say, the next day)? The grammar would then have a memory it could draw upon in future expansions. In the bp grammar, you can build up complex objects, so the grammar could construct a new "person" with properties like a name, age, etc., which can then be used from within the grammar: e.g. S -> $person.name VP NP.
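Here is a rough Java sketch of that persistence idea (java.util.Properties, the file name, and the character's fields are stand-ins for illustration; this is not bp's actual storage code):

import java.io.*;
import java.util.Properties;

public class PersistentCharacter {
    public static void main(String[] args) throws IOException {
        Properties person = new Properties();
        File store = new File("person.properties");

        if (store.exists()) {
            // A later run: the grammar "remembers" the character.
            try (FileReader in = new FileReader(store)) { person.load(in); }
        } else {
            // First run: invent the character and persist it.
            person.setProperty("name", "Wanda");
            person.setProperty("age", "34");
            try (FileWriter out = new FileWriter(store)) { person.store(out, "bp state"); }
        }

        // Expanding something like S -> $person.name VP NP:
        System.out.println(person.getProperty("name") + " sold marigolds by the creek.");
    }
}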

The trick is creating a system that is usable by the 'average' poet-programmer (I'm not quite sure what an average poet-programmer would look like). The system has to be simple enough that working with it is a hands-on, iterative, interactive, and above all enjoyable process.
 