Tag Archives: tokenx

the tradeoff: elegance vs. performance

Oh snap – I just fixed this by turning on caching in the Cocoon sitemap. Thanks to Brian Pytlik Zillig for pointing out that this is where that functionality is useful! And note to self (and all of us): asking questions when you’re torn between solutions can lead to a third solution that works much better than either of the ones you came up with.

With programming or web design, “clean and elegant” is a satisfaction for me second only to “it’s working by god it’s finally doing what it’s supposed to.”  So what am I to do when I’ve got a perfectly clean and elegant solution – one that involves zero data entry and only takes up a handful of lines in my XSLT stylesheets – that crunches browser speed so hard that it takes nearly a minute to load the homepage of my application?

I’ve got a choice here. One option is two XML files (one for each problem area) that list all of the data I’d otherwise be grabbing dynamically out of every file sitting in a certain directory. Building them is time-consuming and not very elegant (although it certainly could be worse). The worst part is that it requires explicit maintenance on the part of the user. Wouldn’t it be nice to be able to give my application to anyone who has a directory of XML files without any need for them to hand-customize it, even just a small part?

On the other hand, I can’t expect Web users to sit there and wait at least 30 seconds for TokenX to dynamically generate its list of texts, an action that would take a split second if it were only loading the data out of an XML file. I already have all the site menu data stored in XML for retrieval, meaning that modifications need only take place once and that nested menus can be easily entered without having to worry about the algorithm I’m using to make them appear nested on the screen in the final product.

You can tell from reading my thought process here what the solution is going to be. It’s too bad, because aiming for elegance often ends up leading you to better performance at the same time. Practicality vs. idealism: the eternal question to which we already know the answer.
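For anyone curious what the caching fix from the update at the top of this post might look like, here is a minimal sketch of a Cocoon 2.1-style sitemap. It is not the actual TokenX sitemap: the `index.html` pattern, the `texts/` directory, and `xslt/text-list.xsl` are all made-up names, and I’m assuming the list of texts comes from Cocoon’s directory generator and that the usual default components are declared or inherited from a parent sitemap.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: every path, pattern, and stylesheet name here is illustrative. -->
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <!-- type="caching" lets Cocoon reuse the pipeline's output while it
         considers the inputs unchanged, instead of rebuilding on every request. -->
    <map:pipeline type="caching">
      <map:match pattern="index.html">
        <!-- The directory generator emits an XML listing of the files under src. -->
        <map:generate type="directory" src="texts/"/>
        <!-- Assumed stylesheet that turns that listing into the list of texts. -->
        <map:transform src="xslt/text-list.xsl"/>
        <map:serialize type="html"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

The appeal is that the dynamic approach stays (no hand-maintained lists) while the slow work only happens when the cache has to be rebuilt. One thing to watch for: whether edits to the underlying files are noticed depends on how each component in the pipeline reports its validity, so a cached page may lag behind the directory.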

What I’m doing this summer at CDRH: overview

I’ve been here at CDRH (The Center for Digital Research in the Humanities) at the University of Nebraska-Lincoln since early May, and the time went by so quickly that I’m only writing about what I’m doing a few weeks before my internship ends! But I’m in the thick of things now, in terms of my main work, so this may be the perfect time to introduce it.

My job this summer is (mostly) to update TokenX for the Willa Cather Archive (you can find it from the “Text Analysis” link at http://cather.unl.edu). I’m updating it in three senses:

  1. Redesigning the basic interface. This means starting from scratch with a list of functions that TokenX performs, organizing them for user access, figuring out which categories will form the interface (and what to put under each), and then making a visual mockup of where all this will go.
  2. Taking this interface redesign to a new Web site for TokenX at the Cather Archive.* The site redesign mostly involves adapting the new interface for the Web. Concretely, I’m doing everything from the ground up with HTML5 and styles separated into CSS (and aiming for modularity, I’m using multiple stylesheets that style at different levels of functionality – for example, the color scheme, etc., is separated from the rest of the site to be modified or switched out easily). The goal is to avoid JavaScript completely, and I think we’re going to make it happen. We’re also aiming for text rather than images (for example, in menus) and keeping the site as basic and functional as possible. After all, this is an academic site and too much flashy will make people suspicious. 😀
  3. The exciting part: Implementing as much of TokenX with the new interface as I can in the time I’m here. Why is it exciting?
    • TokenX is written in XSLT, which tends to be mentioned in a cursory way as “stylesheets for XML” as though it’s like CSS. It’s not. It’s a functional programming language with syntax devised by a sadist. XSLT has a steep learning curve and I have had 9 weeks this summer to try my best to get over it before I leave CDRH. I’m doing my best and it’s going better than I ever imagined.
    • I’m also getting to learn how XSLT is often used to generate Web sites (which is what I’m doing): using Apache Cocoon. Another technology that I had no idea existed before this summer, and which is coming to feel surprisingly comfortable at this point.
    • I have never said so many times in a span of only two months, “I’m glad I had that four years of computer science curriculum in college. It’s paying off!” Given that I never went into software development after graduating, and haven’t done any non-trivial programming in quite a long time, I had mostly dismissed the idea that my education could end up being so relevant to my work now. And honestly, I miss using it. This is awesome.

I’m getting toward the end of implementing all of the functionality of TokenX in XSLT for the new Web site, hooking that up with the XSLT that then generates the HTML that delivers it to the user. (To be more concrete for those interested in Cocoon, I’m writing a sitemap that first processes the requested XML file with the appropriate stylesheet for transformation results, then passes those results on to a series of formatting stylesheets that eventually produce the final HTML product.) And I’m about midway through the process of going from Web prototype to basic site styling to more polished end result. I’ve got 2.5 weeks left now, and I’m on track to have it up and running pretty soon! I’ll keep you updated with comments about the process – both XSLT, and crafting the site with HTML5 and CSS – and maybe some screenshots.
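To make that parenthetical a little more concrete, a sitemap along those lines could be sketched roughly like this. Again, this is an illustration rather than the real thing: the URL pattern and all of the file names (`analysis.xsl`, `format-results.xsl`, `page-layout.xsl`) are invented, and it assumes the standard Cocoon 2.1 defaults, where a generator with no type reads an XML file and a transformer with no type runs XSLT.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: the pattern and all file names are illustrative. -->
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- For a request like analyze/some-text.html, {1} is whatever * matched. -->
      <map:match pattern="analyze/*.html">
        <!-- Step 1: read the requested XML file. -->
        <map:generate src="texts/{1}.xml"/>
        <!-- Step 2: run the stylesheet that produces the transformation results. -->
        <map:transform src="xslt/analysis.xsl"/>
        <!-- Step 3: pass those results through a series of formatting stylesheets. -->
        <map:transform src="xslt/format-results.xsl"/>
        <map:transform src="xslt/page-layout.xsl"/>
        <!-- Step 4: serialize the final result as HTML for the browser. -->
        <map:serialize type="html"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Each transform receives the output of the step before it, which is what keeps the analysis logic separate from the presentation stylesheets.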

* TokenX can be, and is, used for more than this collection at CDRH. Notably it’s used at the Walt Whitman Archive in basically the same way as Cather. But we have to start somewhere for now, and expand as we can later.