Thursday, May 25, 2006

 

Erica's first epigraph!


Erica has been epigraphed here, a first as far as we know. In ETC's amalgamating spirit, this jennifer person used two poems from our Web site for the epigraph, one for the title and one for the text. We think we like jennifer.

Since making this discovery, Erica has taken to speaking of herself as Wonder Woman and in the 3rd person: "Wonder Woman will do some e-composition a little later." "Wonder Woman needs time to think." "Wonder Woman is a cyborg!" (Actually, that last may have been "Wonder Woman is so bored." We're not sure.)

But our interest is piqued: Authority drawn from a split quote generated from software in support of an incomplete triptych that questions political and artistic balance and its own authority. Can un-ekphrastic text from the machine be far behind?

Wednesday, May 24, 2006

 

Under construction


My posts have been just too long for the narrow text portion the original template allowed. So I've changed it and, of course, have to redo all the links and so on. I hope to have that all taken care of over the next day or two.

Tuesday, May 23, 2006

 

Between essence and descent


In our last post, we looked at the elegant compositional simplicity of the C strcpy() function. Today, we'll talk about just how dangerous that bit of poetry really is.

strcpy()'s successful operation depends on the destination buffer's being large enough to contain the contents of the source buffer, determined by the presence of a zero-value null terminator somewhere in the source buffer. Consider this code fragment:

char dest_buffer[16];
strcpy(dest_buffer,"Hello world!");

This works just fine because the compiler-supplied null terminator, which falls at position 12, immediately after the ! character at position 11, lands well short of the destination buffer's final position, 15 (offsets in C are zero indexed).

But what about this?:

char dest_buffer[16];
strcpy(dest_buffer,"This is the way the world ends");


A seasoned and grizzled old timer would take one look at this and shiver and, should the snippet be his own code, remark, "That is not what I meant at all."

As we saw in the last post, strcpy() just copies each character from the string literal to the dest_buffer array, but it doesn't stop at position 16. It just keeps right on copying--to position 30. Positions 16 to 30 exist, immediately following the memory space allocated to contain dest_buffer. But that space is probably being used for something else, another variable perhaps, or an argument list on the stack, or (shudder) executable code. And whatever was there gets overwritten. If the gods are in a good mood, the program crashes. If they are in their usual demonic phase, this bit of sloppy code actually changes the program!

It gets worse. Consider some typical startup code:

int main(int argc, char** argv, char** env) {
    char* pc = NULL;
    if ( argc > 1 ) {
        pc = (char*)malloc(12);   // 12 bytes: 11 characters plus the terminator
        strcpy(pc, argv[1]);      // copies however many bytes argv[1] holds
    }
    // Do stuff
    if ( pc != NULL )
        free(pc);
    return 0;
}


At first glance, this code's author seems to have been cautious. She checks that there are indeed command line arguments before attempting to process one (argc represents the number of command line parameters passed to the program). And she makes sure that the call to malloc() (which asks the OS for a block of memory) was successful before trying to release the memory via the call to free(). Nice defensive code.

BUT…..

How does she know that argv[1]'s length is less than 12 and will fit into the allocated buffer pointed to by pc? She doesn't, and when (not if) it is too long to fit, this program will crash and burn (itself or its user or its user's machine).

BTW: This is the cause of all of those buffer-overrun security flaws in Windows, where clever hackers have figured out where the code buffers are in a program or OS function and simply send that code to an accommodating function that, blissfully ignorant of incoming buffer sizes, copies the code right into a place where it can execute and do whatever damage it wants.

Most of these security holes could have been avoided by the simple and very common practice that says "Never, ever, ever use strcpy()." (How sad that such beauty must wilt, unseen and ever innocent.)

If you need to copy strings, use strncpy(), which takes a third argument, the maximum number of bytes to copy. You can control that value by using whatever mechanism you used to allocate the destination buffer--common programming wisdom at least since I started using C about 20 years ago.

There's a disconnect here between the elegance of strcpy()'s composition and its practical application, and a host of questions to be invented about the roles and utility of radiant program texts and about the relative balance of aesthetic and engineering practices in the development of useful applications.

Next: poetic software vs. engineered texts.

Sunday, May 21, 2006

 

strcpy() - An analysis


Back when we still taught C and C++, I always told my students that when they could pick up the C library function strcpy() and understand it without thinking about it, they were on their way to becoming C programmers. strcpy() is a deceptively simple function whose purpose is to copy data in character format from one memory location to another. Here's its source (the real source uses p1 and p2, rather than pDest and pSrc, but I've altered the code just a touch to make it a bit more understandable to any novice readers):
char* strcpy(char* pDest, char* pSrc) {
    char* pRet = pDest;
    while ( *pDest++ = *pSrc++);
    return pRet;
}
There's a lot going on here that will be obscure to the beginning C programmer: pointers to start with, pointer arithmetic, the post-increment operator, a no-op statement, and the definition of true in C.

pSrc points to the original copy of the string data the programmer wants to copy and pDest to the new copy. The real work happens in the while() "loop." Character by character and via the dereference (*) operator, the truth expression copies a byte from the source buffer to the destination buffer, until it finds a zero in the source buffer. (Since there is no string data type in C, it conceptualizes strings as arrays of ASCII encoded bytes, concluding with a value of zero, the "null terminator.")

This is downright strange--one expects a truth expression to be just that, a statement that evaluates to true or false, not an expression that performs actual work. A beginner would expect something more like this:
while ( *pSrc != 0) {
    *pDest = *pSrc;
    ++pDest;
    ++pSrc;
}
*pDest = 0;    // the loop stops before the terminator, so copy it explicitly
Now the truth expression performs an action more recognizable as a truth test. "*pSrc != 0" says to look at the character stored at address pSrc and, if it's not zero, evaluate to true. If we're not at the end of the buffer, i.e., the current value that pSrc points to is not 0, copy the byte. Then increment both pointers and test again. Continue until we find that damn null terminator.

The ++ increment operator is actually pretty amazing all by itself. Pointers are addresses. Incrementing one adds to the address a value equal to the size of the data type to which the pointer has been declared to point, one mechanism of pointer arithmetic. Pointer arithmetic works on addresses, not referenced values. It's what allows the programmer to smoothly traverse a buffer. And this is only possible because C stipulates that arrays, which are tightly integrated with pointers in C, must be contiguous in memory. Now there is no physical reason why this has to be, and there are all kinds of reasons we might want to relax this linguistic constraint (less wasted buffer space being the first obvious candidate for some kind of managed memory system). But it allows pointer arithmetic.

There are two versions of the increment operator, pre and post. The student version uses the pre-increment operator. The real strcpy() uses the post-increment operator. Why? When the C compiler finds a pre-increment operator affixed to a term in an expression, it first increments that value and then evaluates the expression. When it finds a post-increment operator, it evaluates the expression and then increments the term. What the original strcpy() does is copy a byte and then advance the two pointers. Very cool. Since the student version uses only a single term for each expression, pre and post have the same effect, but students just plain prefer pre.

But what on earth is the truth expression evaluating? This too is very cool. In C, 0 is false. Anything that is not false is true (that much, at least, makes sense). So in C, any numeric value, and therefore any expression that evaluates to a numeric value (actually all expressions) other than 0 is true. Assignment (=) operations in C evaluate right to left, so the entire expression evaluates to the final result of the term on the left side of the equals sign. *pDest = *pSrc, as a full expression, assumes the final value of the dereferenced pDest. If the assignment was from a *pSrc that contained the null terminator, then the byte just written through pDest is also 0, the entire expression is false (zero), and the while() loop terminates. If *pSrc contains any other ASCII code at all, the assigned value is not zero and is therefore true (not false), allowing the loop to continue.

But what loop? A loop implies a body containing at least one executable statement. That's why the student version performs its assignments and increments in a block following the truth expression. The original appears not to have a body. But it does--it's that timid little semicolon following the truth expression. The semicolon is the C statement terminator. while() statements are not executable statements and so are not terminated in this way. (In fact, a semicolon in this position almost always is a bug.) But C also allows an empty statement, indicated by just the semicolon. This is a null operation, a leftover from early assembler programs where program code had a way of getting ahead of the hardware. Programmers would place null statements, mnemonically NOOP, for "no op" in critical spots to slow the program down. NOOP persists in C and C++ as the null statement.

The net result is the simple line "while ( *pDest++ = *pSrc++);" conflates work and truth test into a single syntagm. How cool is that?

A final word: Why return the original value of the destination buffer? Hasn't the work been performed already? Of what use could that be in a program? C library functions try to follow the UNIX idiom of allowing commands to be chained together and for the output of one command to be used as input to another (pipes and redirection operators are among the things that make one suspect that UNIX thinks you are bothering it). Returning the original destination address allows the programmer to use the output of strcpy() in another C string function. E.g.: strcat(strcpy(p1,p2)," Add this to the copied array.").

So there you have it, compositional elegance in a simple function, an elegance that demands a lot in terms of reading competency from the programmer. An elegance so tight, it becomes poetic.

But in spite of all of the technical background information, writing such code is not a technical but a compositional competency. The programmer no more thinks of the history of C and its esoteric notions of truth when writing such a function than the poet writing in English thinks of the etymology of his words and the evolution of English syntax. He may stop for a moment to remember that the subject of an infinitive clause ought to be in objective case, but for the most part he just writes.

Next up: strcpy()'s dark side.

Thursday, May 11, 2006

 

Gone fishin'


Actually, gone riding. Even though we just took a few days off to do some new computational art, we still need a real vacation. We're going to try for our Iron Butt certificate, riding 1,000 miles in less than 24 hours, an achievement we'll be very proud of. We plan to ride from my home in South Jersey to Tampa, FL, beginning tomorrow evening and finishing Saturday afternoon. Interstate 95 is an especially fine road for this, more so through the southern states where the speed limits are more civilized than they are in the rude, cold, and very anal North.

Wednesday, May 10, 2006

 

Power to the arts


I posted another Ream Appropriation this afternoon, a big one--an 82-meg mp3. All the appropriations are pretty big. That's what computer programming is good for--making and managing things we can't with our naked selves. Computational technology lets us store unimaginable amounts of data and get back any particular piece of it in seconds. It allows us to record entire books in minutes, books algorithmically composed in hours.

So far, artists haven't really availed themselves of the deep possibilities available in the tools and techniques for building large-scale applications. And large is really large. Rule of thumb: Any application with fewer than 20,000 lines of code should be considered trivial. Another way to think about that: 20,000 lines is 300 pages of single-spaced text. Output equivalent to a fair-sized novel doesn't even get you started.

Until artists and technologists forge the kinds of collaborations that saddle up the real power of computational systems, we're just dabbling. Ream Appropriations is a case in point.

Tuesday, May 09, 2006

 

Ream appropriations


We've been on a bit of a hiatus, partly because of the demands of the semester's end, but also because of a new project we've undertaken, Ream Appropriations.

For background: http://grandtextauto.gatech.edu/2006/04/06/notes-on-ream/

And here: http://grandtextauto.gatech.edu/2006/05/02/the-ream-goes-on/#more-1175

Our deft thefts of Ream: http://etc.wharton.upenn.edu/Ream/

Tuesday, May 02, 2006

 

Excellence in all things


Business Week has published its first-ever undergraduate business school rankings and Wharton heads the list. Business Week's print edition rankings have Wharton as academically number 1, with an A+ teaching quality grade. Anyone associated with Wharton will be unsurprised. The faculty there inspire not only their students, but each other. There's remarkably little academic jealousy and petty petulance. These guys are deeply respectful of each other's work and contributions, and they care just as deeply about their students.

I've done the adjunct circuit. From the department chair who advised me to handle the fact that there was insufficient lab space to accommodate all of the students in my class by making lab sessions optional ("That way no one will come") to the adjunct coordinator who suggested that I hold my required office hours in one of the campus cafes to the really absurd, I've seen, if not "it all," at least quite a bit of what's wrong in undergraduate education. I'm in a position to know that Wharton just plain gets it and just plain gets it right.

Even more interesting to me is how much more value Wharton faculty find in the work I've been doing with texts than humanities faculty find in my core IT and teaching competencies. The assumption seems to be that my very affiliation with Wharton must make me a genocidal, WTO-mongering, colonial robber baron--inhabiting the land of the status quo and on the very wrong side of the class struggle--and that such an intellect couldn't ever understand poetry or literary theory--that's for the experts.

I'm trying to figure out why they work Wallace Stevens into so many of their discussions of theories of texts and the role of the avant-garde.
