Friday, June 30, 2006
Speaking of constraints - a detour
I actually have a progression in mind for the series of Why is ALG hard? posts. But David Ayre's insightful comments, here and here, are worth pausing for. Especially in regard to the notion of constraints.
(BTW, ETC 1 controlled recursion in the way David suggests via a control structure that certain nodes passed down the utterance tree. But I just didn't like it--it was aesthetically displeasing (there's no accounting for taste), hard to change, and clumsy (which of course could be due to competency problems on my part). ETC 2 controls recursion by removing its possibility from the grammar. I'm still deciding about ETC 3. )
But back to the general notion of constraints. Aesthetic language generation presents certain interesting classes of problems in regard to constraints. For example, suppose you want to develop a system that generates sonnets (setting aside for now why you would ever want to do that). There are at least two categories of constraints that will have to be coded to solve this problem. First the text must rhyme and scan in a very specific way. Second, the text must present as meaningful expression.
These two sets of constraints will inevitably find themselves in contention with each other. Consider Shakespeare's Sonnet 101:
O truant Muse, what shall be thy amends
For thy neglect of truth in beauty dyed?
Both truth and beauty on my love depends;
So dost thou too, and therein dignified.
Make answer, Muse: wilt thou not haply say
'Truth needs no colour, with his colour fix'd;
Beauty no pencil, beauty's truth to lay;
But best is best, if never intermix'd?'
Because he needs no praise, wilt thou be dumb?
Excuse not silence so; for't lies in thee
To make him much outlive a gilded tomb,
And to be praised of ages yet to be.
Then do thy office, Muse; I teach thee how
To make him seem long hence as he shows now.
It doesn't take long to see the problem. The sentences may or may not coincide with lines, and in fact take up to three lines to complete. If an evaluation function decides that an utterance is subpar, and either asks for or provides a replacement (depending on the designer's philosophy about such things), meter and rhyme will eventually be disrupted. Another evaluation function decides that that portion of the text no longer scans or rhymes within acceptable limits, and makes another change, and now one or more utterances aren't well-formed English. We can't just let these guys duke it out until one gets tired and gives up--that only works in the humanities seminar. We have to cast this as an optimization problem and that makes me tired just thinking about it.
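One way to read "cast this as an optimization problem" is to score each candidate on both sets of constraints at once and keep the best compromise, rather than letting two evaluators take turns undoing each other. Here is a minimal Python sketch of that idea. The candidate texts, the scores, and the weighting are all invented for illustration; this is not how etc does it.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    form: float   # hypothetical 0.0-1.0 score for how well the line scans and rhymes
    sense: float  # hypothetical 0.0-1.0 score for how well it reads as English

def combined_score(candidate, form_weight=0.5):
    """Weighted sum of the two competing constraint scores."""
    return form_weight * candidate.form + (1.0 - form_weight) * candidate.sense

def best(candidates, form_weight=0.5):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda c: combined_score(c, form_weight))

# Made-up candidates standing in for three possible replacement lines.
candidates = [
    Candidate("perfect meter, no sense", form=1.0, sense=0.2),
    Candidate("good sense, limping meter", form=0.3, sense=0.9),
    Candidate("decent on both counts", form=0.7, sense=0.7),
]

print(best(candidates).text)
```

The weight is where the designer's taste sneaks back in: push form_weight toward 1.0 and the machine writes doggerel that scans, push it toward 0.0 and it writes prose with line breaks.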
And if you are a diligent little OOP programmer, there is another, less theoretically interesting, but nasty problem to solve. What's the collection of objects that make up this piece? A collection of utterances? A collection of lines? Something else?
We'll leave that discussion for another day. For now, it might be interesting to observe how a human generates a form-centric text. (Click on "Play back the poem").
Tuesday, June 27, 2006
Why ALG is hard (cont'd)
To get a poem that appears original, we need to step away from template-based generation, this one for example: The Instant Muse Poetry Generator. We need a grammar of English that lets us range over its full (well, maybe not full, but a lot of its) syntactic possibilities. Should be easy, right? Just bone up on our phrase-structure grammar, or for the truly brave, a tree adjoining grammar. But instead of using the tree structure to parse already written texts, we'll just reverse the process, using the grammar as the rules for a transducer. So far, so good. And intuitive. Until we try to implement one.
Say we have a very simple phrase-structure grammar:
S -> { NP, VP, [NP] } - Sentence expands to a noun phrase, a verb phrase, and an optional noun phrase.
NP -> { det, [adj]+, N, [PP] } - Noun phrase expands to a determiner, an optional list of adjectives, a noun, and an optional prepositional phrase.
PP -> { prep, NP } - Prepositional phrase expands to a preposition and a noun phrase.
See the problem? If we set the grammar free to build utterance trees based on the grammar, sooner or later (actually sooner) we'll get sentences like "Mary sold marigolds under the tree beside the road along the creek by the house of the saints of the house of Peter in time for a nap." Which won't do at all.
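The runaway is easy to reproduce. The sketch below is a rough Python transcription of just the NP and PP rules (adjectives left out), with the optional PP taken at a fixed probability. The word lists, the probability p_pp, and the max_depth safety cap are my own assumptions for the demonstration, not part of any etc version.

```python
import random

DETS = ["the", "a"]
NOUNS = ["tree", "road", "creek", "house"]
PREPS = ["under", "beside", "along", "by"]

def noun_phrase(rng, p_pp, max_depth=25, depth=0):
    """NP -> det, N, [PP]; the optional PP is taken with probability p_pp."""
    words = [rng.choice(DETS), rng.choice(NOUNS)]
    if depth < max_depth and rng.random() < p_pp:
        # PP -> prep, NP: this call back into noun_phrase is the recursion
        # that produces the endless tail of prepositional phrases.
        words.append(rng.choice(PREPS))
        words.extend(noun_phrase(rng, p_pp, max_depth, depth + 1))
    return words

rng = random.Random(2006)
print(" ".join(noun_phrase(rng, p_pp=0.7)))
```

As long as the cap isn't hit, each noun phrase drags along p_pp / (1 - p_pp) nested prepositional phrases on average, so at 0.7 that is a bit over two per noun phrase, and the long ones are not rare.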
Controlling this, while maintaining the integrity of an object hierarchy, is very difficult. We don't want super classes controlling the bases, nor do we want containers to know anything about how their members work.
The trick is to develop a grammar in which no right-side value contains any left-side constituent. That's how etc works, but still imperfectly. And the more comprehensive the grammar becomes, the harder it is to control what is essentially recursion.
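A grammar can at least be checked mechanically for this. The Python sketch below uses a looser reading of the rule than the sentence above states: it only flags symbols that can eventually expand back into themselves, rather than forbidding every left-side symbol on every right side. The dictionary encoding of the toy grammar and the function name are my own; optional markers are dropped because they don't change what can reach what.

```python
# The toy grammar from above, with optional markers removed.
# Symbols that have no rule of their own (det, N, VP, ...) count as terminals here.
GRAMMAR = {
    "S": ["NP", "VP", "NP"],
    "NP": ["det", "adj", "N", "PP"],
    "PP": ["prep", "NP"],
}

def recursive_symbols(grammar):
    """Return the left-side symbols that can eventually expand to themselves."""
    def reachable(start):
        seen, stack = set(), list(grammar.get(start, []))
        while stack:
            symbol = stack.pop()
            if symbol in seen:
                continue
            seen.add(symbol)
            stack.extend(grammar.get(symbol, []))
        return seen
    return {lhs for lhs in grammar if lhs in reachable(lhs)}

print(sorted(recursive_symbols(GRAMMAR)))
```

For the toy grammar this reports NP and PP, which is the loop behind the marigolds sentence. An empty result means the grammar cannot recurse at all.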
Thursday, June 22, 2006
Erica has a moment of self doubt
Yesterday, when I checked up on Erica's work, I found this poem in which she asks, "Am I succinct shit?" Alas, I know just how she feels!
Wednesday, June 21, 2006
If the singular of buses is bus, why isn't the singular of fuses, fus?
Because stemming isn't a natural process--it's the reversing of an inflection. And inflection rules are based on phonetics not orthography. It's the long u sound that says the base form of fuses is fuse. (And don't get me started on silent letters!)
How to get this done? With this, a weapon that should be in every ALG arsenal: The CMU Pronouncing Dictionary.
Now that we're informed, answer me this: How come the base form of binding isn't binde? After all, the base of tithing is tithe.
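To make the bus/fuse case concrete, here is a small Python sketch that picks the base form by comparing pronunciations instead of spellings. The four entries are typed by hand in the style of the CMU Pronouncing Dictionary (ARPAbet phones with stress digits) and should be checked against the real file; the function name and the two-candidate guess are my own assumptions.

```python
# Hand-typed entries in CMUdict style; an assumption to verify against the real dictionary.
PRONUNCIATIONS = {
    "bus":   ["B", "AH1", "S"],
    "buses": ["B", "AH1", "S", "IH0", "Z"],
    "fuse":  ["F", "Y", "UW1", "Z"],
    "fuses": ["F", "Y", "UW1", "Z", "IH0", "Z"],
}

def singular_of(plural, pron=PRONUNCIATIONS):
    """Pick the spelling whose phones are a prefix of the plural's phones."""
    plural_phones = pron[plural]
    # Two orthographic guesses for an -es plural: strip "es", or strip just "s".
    for candidate in (plural[:-2], plural[:-1]):
        phones = pron.get(candidate)
        if phones is not None and plural_phones[:len(phones)] == phones:
            return candidate
    return None

print(singular_of("buses"), singular_of("fuses"))
```

Spelling only proposes the candidates; the pronunciation table decides between them. That still leaves the binding/tithing question open, since both of those bases have to be looked up rather than derived.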
Tuesday, June 20, 2006
Why aesthetic language generation (ALG) is hard
A lot of the problems of natural language generation (NLG) have been solved. NLG systems are almost always concerned with specific application domains whose vocabularies are small and about which relatively few kinds of messages are to be articulated. But aesthetic language generation wants to range over an entire language (in etc's case, English). So we need statistics. A small domain language can be completely analyzed for its grammatical and idiomatic conventions. They can be identified and articulated in rules specific to that domain. But it's impossible to analyze all of English. So instead, we model it. Etc's model is pretty much the distribution of the frequencies of various kinds of bigrams.
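For readers who haven't built one, a bigram frequency model at its barest looks something like the Python sketch below. It counts only plain adjacent-word pairs in a made-up sentence; the "various kinds" of bigrams that etc tracks aren't specified here, so treat the function names and the tokenizing as my assumptions.

```python
from collections import Counter

def bigram_counts(text):
    """Count adjacent word pairs in a lowercased, whitespace-split text."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

def followers(counts, word):
    """Relative frequency of each word seen immediately after `word`."""
    following = {b: n for (a, b), n in counts.items() if a == word}
    total = sum(following.values())
    return {b: n / total for b, n in following.items()}

counts = bigram_counts(
    "the sun will rise and the sun will set and the moon will rise"
)
print(followers(counts, "will"))
```

Generation then amounts to picking the next word in proportion to those frequencies, which is why a word seen in only one context keeps dragging the same neighbor along with it.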
But "all words are rare." For the model to have any shot at reasonably depicting the language, it needs to be able to determine that various forms of a given word have the same base. For example, the model should identify rise and rising as the same word. Easy, you say, just stem it. Right. Sounds like a solved problem.
It isn't. Information retrieval relies heavily on stemming, to compare a vector of search terms to some set of vectors in candidate documents. Rise in the search terms should line up with rising in candidates. But these stemming routines would stem both rise and rising to ris. Doesn't matter that ris is not a word. IR doesn't care. It's only interested in the simplest thing that matches.
But not in ALG. Riddle me this, poet programmers--how would you code a general method to properly stem these participles?
rising
betting
kissing
dying
panicking
Just one of scores of problems a poetry machine has to solve.
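Not an answer to the riddle, but a Python sketch of the obvious first attempt: generate spelling-based guesses for the base and keep the first one found in a word list. The seven-word lexicon and the order of the rules are assumptions made up for this example.

```python
# Hypothetical mini-lexicon of base forms; a real system would need a full word list.
LEXICON = {"rise", "bet", "kiss", "die", "panic", "bind", "tithe"}

def candidate_bases(participle):
    """Yield possible base forms for an -ing participle, most literal first."""
    if not participle.endswith("ing"):
        return
    stem = participle[:-3]
    yield stem                          # kissing -> kiss
    if len(stem) >= 2 and stem[-1] == stem[-2]:
        yield stem[:-1]                 # betting -> bet
    if stem.endswith("y"):
        yield stem[:-1] + "ie"          # dying -> die
    if stem.endswith("ck"):
        yield stem[:-1]                 # panicking -> panic
    yield stem + "e"                    # rising -> rise

def base_form(participle, lexicon=LEXICON):
    """Return the first candidate base that appears in the lexicon, else None."""
    for candidate in candidate_bases(participle):
        if candidate in lexicon:
            return candidate
    return None

for word in ["rising", "betting", "kissing", "dying", "panicking"]:
    print(word, "->", base_form(word))
```

It handles the five words above only because the lexicon is tiny. With a full English word list, hoping would come back as hop, since hop is a perfectly good word and the spelling alone can't tell the two apart.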
Monday, June 19, 2006
Does (the) language matter?
Last week I talked a little about starting a new etc version. This is etc's 4th iteration. I wrote the first version (as a course project) in Python and used Jane Austen's novels as the source text for the language model. Python lends itself quite well to NLP. It has really strong string handling and regular expression capabilities. And it has enormously powerful collections and indexing features. But it's slow. It's interpreted and typeless (which sounds crazy but is actually quite useful), both of which contribute to degraded performance.
This first etc could only compose single-line poems. And it took forever to load (about 30 minutes) because I rebuilt the language model each time and maintained it in memory. For all its flaws, however, it established the basic architecture for all the etc's to come: A statistical language model attached to an analytic transformational grammar.
For Version 2, for my thesis, I used C++ and stored the language model (based on the Brown Corpus and WordNet) in SQLServer. C++ was my native language at the time and SQLServer came free to me (as part of the University's site license). C++ makes for wicked-fast executables. This version showed that it was possible to develop software that generated cohesive compositional structures. (Most of Erica's published works were edited versions of these poems.) But the work was very rough: No stemming (which also made the problem of word rarity even worse) and surface realization was primitive at best. And I learned that the performance problems were not in the code, but in the DB--etc is I/O bound, not CPU bound. But I was on the right track.
The current version's language model uses the British National Corpus and the grammar is in C#. C# because I wanted to learn it. The BNC solved problems with repetitiveness but at the price of even worse performance and the stylistic problems I outlined in my last post. C# is an excellent language for aesthetic text generation. Excellent string handling and a ton of useful collection and indexing libraries. (And Microsoft's IDE is the gold standard.) But in spite of that I'm writing this latest version in Java, with MySql as the DB. Lots of reasons for this.
My peers in electronic writing tend to shy away from MS and sharing code and thoughts embodied in code is a lot more difficult when programmers are speaking different languages. I want to get better at Java. But, if the truth be told, MS is starting to annoy me. (Full disclosure: When I was a consultant, I wanted things to be hard; otherwise people wouldn't need consultants. The harder the better. And I was quite grateful to MS for their buggy code--again, folks needed competent consultants to build workarounds and the hourly rate just kept on going up. That's how I got my motorcycle.) But now that I'm retired from consulting, I want things to be easy (or easier). And though the non-Microsofts of the world haven't yet figured out what MS has (that the GUI interface is just about the only thing most users care about), there are fewer bugs and much less nonsense.
Java is not quite as good a language as C# (expected since C# is a revised clone of Java and MS could correct weaknesses) and Sun's documentation is bad (Javadoc is a really flawed concept). But Sun's IDE is catching up to MS and Java is as much a philosophy as a language, so I'm quite encouraged. And JDBC is a dream come true.
Java is interesting in its connotation. Whereas C# phrasing connotes lightness and agility, Java code (to me at least) connotes an odious, contemplative, and dark set of mysteries. Kind of cool.
What I aim to find out is if Java leads to better poetry. Wouldn't that make Scott McNealy wish he hadn't quit quite so soon!
Wednesday, June 14, 2006
Once more into the breach
I've begun iteration four of the poetry engine. The current version is OK, but decidedly flawed. It grew as a response to a couple of problems in its immediate ancestor. First, its (the ancestor's) poetry became repetitive, a problem I traced to the paucity of words from which it could work. I used the Brown corpus as the text source in that version. With "only" a million words, the engine soon began reusing some of them--a great many bigrams appeared only once in that corpus, so selecting by the context of one of the words usually got the other word.
The current version uses the British National Corpus. With 85,000,000 usable words, it rarely repeats itself, at least semantically. But now the engine is woefully repetitive in structure and style. And the context of the works tends to a kind of high-level quotidian, a result of so much of the text's being drawn from journalistic sources where the practicalities of government and business dominate. So there are lots of references to monetary amounts and parliament.
And performance suffered. The datasets are huge and take a long time to retrieve.
But worst is the weakness of structure. This is a serious design problem. Though the current system implements a "structure" class that attempts to stitch together the pieces of a poem into a compositional whole, it is only and completely semantically based. No question and answer. No hinges. No shifting speaker. And the machine only generates one type of poem, a mildly abstract sort of free verse. No MFA specials, no radical abstractions, no sonnets.
My understanding of how these monsters can best be designed has followed a kind of geodesic pattern. From a design hypothesis I've developed a system, then examined its output. Discovering weakness in content and form leads to a better design hypothesis, which I then test as a functioning system. The problem is that the time it takes to go from recognizing basic flaws to delivering a working system grows longer and longer. The last go round took over a year. I anticipate about that much time for this new version.
I welcome any reader comments on where Erica goes wrong and suggestions on how she could do better.
Tuesday, June 06, 2006
affectatiyuns can be dangertus
This past Friday I attended a performance by the computational artists Franziska Baumann and Matthew Ostrowski at the Slought gallery in Philadelphia. Baumann's solo performance was notably compelling. Using her voice and a complement of computational devices, she delivered a performance (with her co-star, the CyberGlove) that ranged from the haunting to the comic.
The glove (developed in collaboration with the technologists at STEIM) is a mechanism for holding a set of electronic sensors that allow Baumann to manipulate the sounds of her voice in a number of different ways. The glove itself is an above-the-elbow, sheer black evening accessory. Thin, segmented, metallic strips run from the glove's fingertips to its cuff, nearly to Baumann's shoulder. But these are cosmetic only--the "real" wires are openly visible, connecting the sensors in the glove's fingers to a junction box strapped to her waist. There are three main sets of sensors. Those in the fingers respond to flex and relaxation motions. Those in the wrist respond to rapid axial motions of the lower arm. A proximity sensor responds to changes in distance between the palm of her hand and another sensor, which at various times in the performance, she kept attached to her belt or held in her other hand. Finally, a set of three buttons on a small panel, which she held against her microphone, gave her additional control over the characteristics of the sounds all of this allowed her to generate.
Most interesting is that none of this equipment generates its own sounds. All of the sounds begin with her voice. The glove lets her not only manipulate the sounds of her voice but also sustain them for extended periods. So she can strike a note and then elaborate that note electronically while striking new notes in an eerily effective duet with herself. In effect her voice was its own accompaniment.
All of this is interesting, but what made the evening so memorable was the quality of her performance. She began by striking a single very high note and holding it for a very long time. She let the audience know at the outset that before she is anything else, she is a singer. During the course of her 45-minute set, she sang avocally, in German, and in English. She voiced clicks and gutturals and she puffed whispers, all the while recombining these various voicings into altogether new sounds.
There's a lesson here for us working in computational poetics. Baumann's success is possible only through her collaboration, as an accomplished musical artist, with accomplished technologists. It takes the best efforts of both sides of the collaboration to make a piece of art that audiences will want more of.
Take Brian Kim Stefans's Kludge as an example. (BTW: The Kludge just linked is a first draft; Brian is working on a new version of Kludge as hinted here.) This is an altogether remarkable text. Never the same twice (a virtue computational writers, including me, often claim for their work), the text is nevertheless compelling. It's simply something one would want to read. (The easiest way to get a sense for the text is to keep the mouse away from the flash window when it appears, then move it around the text to the & icon. Then click.) However the various texts come into being after these clicks, regardless of how they stutter and restart, they are unified with a serious singleness of purpose and a steadfast adherence to the influence of Gertrude Stein. Wonderful stuff!
But the technology is so 2005. Actually more like 1985. The work works because of the use of the fixed-width font, which essentially allows the text to be constructed within a fixed coordinate system. Letters can appear at discrete positions with easily computed x,y coordinates. The letter immediately beneath (10, 11) will be in position (10, 12). The one immediately preceding it will be at position (9, 11). This way letters never overlap. This would be much harder using proportional fonts. Kludge is effective because Brian is a gifted writer, a really gifted writer. The technology adds dimensions that will never be available to plain text and Brian is pretty good at that too, but he's not as good a technologist as he is a writer.
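The coordinate arithmetic really is that simple, which is the point. A Python sketch of a fixed-width grid, with cell dimensions and function names that are my own guesses and have nothing to do with how Kludge is actually built:

```python
CELL_WIDTH = 8    # pixels per character column (assumed value)
CELL_HEIGHT = 16  # pixels per text row (assumed value)

def pixel_origin(col, row):
    """Top-left pixel of the character cell at grid position (col, row)."""
    return col * CELL_WIDTH, row * CELL_HEIGHT

def below(col, row):
    """Grid position of the letter immediately beneath (col, row)."""
    return col, row + 1

def preceding(col, row):
    """Grid position of the letter immediately before (col, row)."""
    return col - 1, row

print(below(10, 11), preceding(10, 11), pixel_origin(10, 11))
```

Every cell is the same size, so two letters overlap only if they share a grid position. With a proportional font the x offset of a letter depends on the widths of everything to its left, and that one multiplication turns into a running sum.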
Compare Kludge with Aya Karpinska and Daniel Howe's Open-ended, a remarkable piece that they showed at E-fest 2006. This kind of programming demands a great deal from the programmer, from the complex math of the spinning cubes to getting the text to align with the moving sides to keeping the display from flickering. Very tough stuff.
When asked during their seminar presentation if they had considered expanding the program to allow other sorts of polyhedrons and more than two such shapes, their response was that to do so would require writing more texts and that the difficulty of producing quality texts was the biggest constraint working against such an elaboration. This in effect is the obverse of Kludge. Where Kludge is the work of a gifted writer with a flair for technology, we now have gifted programmers with a flair for writing.
And then there's Judd Morrissey's collaboration with Lutz Hamel. Their Error Engine would not be possible without the talents of the gifted writer paired with the gifted computer scientist. The Error Engine is text that one does want to read presented in ways impossible without AI emulation.
As computational poets seek to be heard and to bring into the world a new polemic, they would be well advised to learn from the examples of Franziska Baumann and Judd Morrissey and recognize that the very best work will have deep roots in both its disciplines.
Saturday, June 03, 2006
What if I cared?
Today Erica received a small package from an online literary magazine, containing a CD of the current issue, her complimentary copy, and a note thanking her for her submission. The problem is that we sent the poems to the journal over three years ago and had never heard a thing from them.
This kind of thing happened a lot when I was sending out machine-aided poetry to little magazines and literary journals. Some editors took months and months to respond. A surprising number never responded at all (and Erica was obsessive about including the standard SASE).
One editor wanted to publish a piece, but asked for some changes first. We made the changes (what did we care?--all we wanted was the publication credit) and sent it back, with a copy of their note. Back it came to us with the standard rejection slip. So we sent it back, with one of our clever letters asking what had happened. Finally, back came an apology and an acceptance.
Another editor accepted two pieces and warned about a 12- to 15- month delay before they would appear in his magazine. It's been three years and nothing.
Early on in the project, I wrote a few sonnets as a way of learning about the process of building up a form, necessary to conceptualizing a design for poetic text generation, which is only form. On a whim, I sent a couple to a new-formalist publication and got back a rather snippy little rejection slip informing me that they only accepted traditional forms and that I should try someplace else that published open forms. They hadn't even read the pieces (which really did scan and rhyme).
Now, I have only been interested in finding out if machine-written and machine-aided poems could find spots "out there," and enough of them did to answer the question. (It's a numbers game, really. I learned that, on average, Erica would be accepted every eighth submission. And since she has hundreds of poems, it actually was fairly easy to achieve that kind of submission rate--it was just a question of getting the process right and that's what we teach at Wharton.)
But what if I cared? What if I were a young writer trying to establish a foothold in the literary world and emotionally invested in the work and its reception? How disheartening such treatment must be! How many talented young folks just give up?
Hey editors! If you are going to start a literary magazine, don't do it Judy Garland and Mickey Rooney style--to save the orphanage or the college or the world. Do it because you think there's a need and an audience for your editorial practices. And for Pete's sake, if you solicit open submissions, show your writers some courtesy and respect--without them, all you've got are books of blank pages.
Friday, June 02, 2006
Pure product and the strain of imagination
Last week I posted a couple of meditations on the C library's strcpy() function: its elegance and its risk. These two ways of examining a small function illustrate an essential tension in software development. On the one hand, system design and implementation is well established as a set of engineering practices. NATO (of all organizations) decided in 1968 that a piece of software is an engineered artifact. On the other hand, actually composing a program is a compositional exercise.
The IEEE Standard Glossary of Software Engineering Terminology defines software engineering as "The application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software; that is, the application of engineering to software." One can't argue that large-scale systems development can be anything other than "systematic" and "disciplined" and be successful. Software is just too complex. But quantifiable? Barry Boehm has applied economic principles to software development in order to enable better schedule and budget estimation. His Constructive Cost Model (COCOMO) does a pretty good job. Studies show that COCOMO estimation results in estimates that are within 20% of actual results 70% of the time. On a $500,000 project, users of COCOMO can expect to be no more than $100,000 off in their estimates most of the time. But 30% of the time, they will be off by even more. The best estimator out there is wrong nearly a third of the time, hardly a good argument in support of software engineering's being quantifiable.
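The arithmetic behind that $100,000 figure, spelled out in Python. This follows the paragraph's own simplification of taking 20% of the project figure, and the function name is invented for the illustration.

```python
def estimate_band(project_cost, tolerance_pct=20):
    """Dollar range that counts as 'within tolerance' of the project figure."""
    miss = project_cost * tolerance_pct // 100
    return project_cost - miss, project_cost + miss

low, high = estimate_band(500_000)
print(low, high)
```

So seven projects in ten land somewhere between $400,000 and $600,000, and the other three land outside even that.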
But the strcpy() example shows how much software development does embrace practices that approximate, in the abstract, engineering practices. The programmer has to know strcpy()'s limitations as a functioning artifact in a way similar to how an electrical engineer has to know how much current a given size wire can support. There's nothing creative in that; it's the application of empirical observation to real-world problems. And without it, systems break and break hard.
But there's that nagging little insight that strcpy() is an extraordinarily elegant piece of programming, so much so that reading it forces an informed reader into contemplation of the C programming language--reminiscent of Barthes's observation (Critical Essays) that literary texts are always, in some way, reflections on themselves and their use of language.
Nobody ever says "I have to develop a program this afternoon" or "I have to engineer a program." We say "I have to write a program."
Different languages have not only different vocabulary, but different affects. Just compare a Java program to its C# equivalent. Though C# closely models Java, the Java text somehow seems more deliberate, even stolid. C# texts seem lighter and somehow, risky. C is poetic; Python is prose.
Programs are texts, keyed one character at a time. Characters form key words that form statements that form functions that form programs that form systems. Much like word, sentence, paragraph, chapter, book. And doesn't the successful writing of a book require a systematic, disciplined approach?
I am convinced that failure in application development is a systemic problem rooted in misguided allegiance to a set of developmental practices that privilege engineering and failure to understand the process and practices of writing (the non-quantifiable) and how doing so would improve software quality.
What I don't know is what the balance should be: How much of engineering and how much of writing are the right mix? How can we (can we at all) measure the weight of writing on software quality? How can we better train programmers to include writing practices in their programming? Can the practices of literary analysis be applied to software texts as ways of discovering flaws in old programs and better ways of composing new ones?
Would a good start be to begin teaching computer programming in the English Department as well as the Computer Science Department?