Practical OCaml

Suppose you were trying to run some experiments on L1 D-caches. (You may also suppose that this is a homework problem, but that’s life.) You’re given a trace of loads and stores at certain addresses. These addresses are 32 bits wide, and the trace is in a textual format:
1A2B3C4D L
DEADBEEF S
1B2B3C4D L
represents a load from 0x1a2b3c4d, followed by a store to 0xdeadbeef, followed by a load from 0x1b2b3c4d. (You might notice that the two loads may conflict, depending on the block size, cache size, and degree of associativity. In that case, you might be in my computer architecture class…)

Naturally, you’d like to process the trace in OCaml. This is where it gets problematic. Did I mention that the trace is rather large — some 600MB uncompressed? That some of the addresses require all 32 bits? That some of the statistics you need to collect require 32 bits (or more)? OCaml could process the entire trace in under a minute, but the boxing and unboxing of int32s and int64s adds more than twenty minutes (even with -unsafe). I felt bad about this until a classmate using Haskell reported a runtime of about two and a half hours. Yeesh. C can do this in a minute or less. And apparently the traces that real architecture researchers use are gigabytes in size. Writing the simulator in OCaml was a joy; testing and running it was not.

There were some optimizations I didn’t do. I was reading memop-by-memop rather than in blocks of memops, and I ran all of my simulations in lockstep: read in a memop, simulate it in each type of cache, repeat. I could have improved cache locality by reading in a block of memops and then simulating each cache in sequence; I’m also not sure how the compiler laid out my statistics structures. I could’ve written the statistics functions in C on unboxed unsigned longs, but didn’t have the patience. I’d still have to pay for boxing and unboxing the C structure every time, though. Still: one lazy summer week, I may give the code generation for boxed integers a glance.

JS2/ES4

After reading Brendan Eich’s annotated slides from @media Ajax, I’ve come around a bit. I was formerly of two minds: on the one hand, I’d started to feel that JavaScript was hopeless, BASIC with closures and a dynamic object system that precludes efficient compilation; on the other, I’d started to feel the JS FP elitism that Brendan so acutely calls out. The structured type fixture example in Brendan’s talk is particularly convincing — I could use that, definitely.

Then again, I’m not sure I’ll ever get the chance. It’s interesting that PL — and many other fields — is often defined more by the tools it happens to use (or happened to use at some point in the past) than by the problems of interest. What circumstances determine which features of a programming language actually get used? How can feature use be encouraged (or discouraged)?

C# GC Leaks

Reading this experience report from the DARPA challenge via Slashdot, I wondered: if event registration caused object retention for them, how do we deal with these memory issues in Flapjax?

Worrying about memory leaks in JavaScript is a losing battle — the browsers have different collectors. But given functional reactive programming in other settings (e.g., Java/C#), how can we solve this kind of problem? We looked at deletion briefly, but never had time to address the issue in detail. The complexity in our case is that the event registration encodes the entire application — how do you decide that a certain code path is dead? It may be helpful to view a functional-reactive program as a super-graph of data producing, splitting, merging, and consuming nodes; the application state is the subgraph induced by the active nodes reaching the consumers. Then there’s a certain static overhead for the program, plus the dynamic overhead of its active components.
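As a rough sketch of that view (an illustration only, not Flapjax code: assume each node records its upstream sources, and that the consumers are known), liveness is just reachability, walked upstream from the consumers:

// Sketch: a node is live iff some consumer is reachable downstream of it.
// Equivalently, walk upstream from the consumers, marking everything reached;
// unmarked nodes are dead and are candidates for unregistration.
// ('sources' and the '__live' mark are assumed fields, not real Flapjax API.)
function markLive(consumers) {
    var stack = consumers.slice();
    while (stack.length > 0) {
        var node = stack.pop();
        if (node.__live) { continue; }   // already visited
        node.__live = true;
        for (var i = 0; i < node.sources.length; i++) {
            stack.push(node.sources[i]);
        }
    }
}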

Most of the Flapjax code I’ve written has been for the user interface, so binding and unbinding isn’t much of an issue: if the interface changes temporarily (e.g., tabs), the older behavior is usually recoverable and shouldn’t be collected. For more complex models with longer-lived state, a deletion policy matters more. Shriram has been working on larger Flapjax applications with application logic as well as presentation — I wonder if he’s run into serious GC issues.

Lifting in Flapjax

In the Flapjax programming language, ‘lifting’ is the automatic conversion of operations on ordinary values into operations on time-varying values. Lifting gets its name from an explicit operation used with Flapjax-as-a-library; we use the transform method call or the lift_{b,e} functions. To better understand lifting, we’ll walk through a simple implementation of the Flapjax library.

I consider the (excellent) Flapjax tutorial prerequisite reading; it will be hard to follow along if you’re not at least vaguely familiar with Flapjax.

The following code is all working JavaScript (in Firefox, at least), so feel free to follow along.
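To give the flavor before diving in, here is a minimal sketch of the idea (an illustration only: a toy Behavior holding a current value and a list of observers, nothing like the real library’s internals), with lift_b turning an ordinary function into a function over such behaviors.

// Toy Behavior: a current value plus observers to notify when it changes.
// (A sketch only; the real Flapjax Behavior does much more.)
function Behavior(init) {
    this.value = init;
    this.observers = [];
}
Behavior.prototype.valueNow = function () { return this.value; };
Behavior.prototype.addObserver = function (f) { this.observers.push(f); };
Behavior.prototype.send = function (v) {
    this.value = v;
    for (var i = 0; i < this.observers.length; i++) { this.observers[i](v); }
};

// lift_b: convert a function on ordinary values into a function on behaviors.
// The resulting behavior recomputes f whenever any of its inputs changes.
function lift_b(f /*, behaviors... */) {
    var inputs = Array.prototype.slice.call(arguments, 1);
    var recompute = function () {
        var vals = [];
        for (var i = 0; i < inputs.length; i++) { vals.push(inputs[i].valueNow()); }
        return f.apply(null, vals);
    };
    var result = new Behavior(recompute());
    for (var i = 0; i < inputs.length; i++) {
        inputs[i].addObserver(function () { result.send(recompute()); });
    }
    return result;
}

So lift_b(function (x, y) { return x + y; }, b1, b2) yields a behavior whose value tracks the sum of b1 and b2, updating whenever either input receives a new value.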


The price of cloneNode

So the function fix_dom_clone described in my last post isn’t exactly cheap. In fact, it’s far and away where my lens library is spending most of its time.

Function         Calls   Percentage of time
fix_dom_clone     4920   47.22%
deep_clone        6022    5.57%
get               3513    5.51%
dom_obj          20842    5.34%

I’ve implemented a few optimizations to reduce the number of times it needs to be called, but its recursion is brutal. The DOM TreeWalker might be more efficient than what I have now, though it can’t matter much in practice: according to this website, IE and Safari don’t support it.
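Still, for the curious, here is roughly what a TreeWalker-based version might look like (a sketch only, untested, reusing the dom_events list from the last post; the hypothetical fix_dom_clone_tw relies on cloneNode(true) producing a structurally identical tree, so two walkers advanced in lockstep visit corresponding elements):

// Sketch: fix up handlers and values with TreeWalkers instead of explicit
// recursion. cloneNode(true) yields a structurally identical tree, so
// advancing both walkers in lockstep visits corresponding elements.
function fix_dom_clone_tw(o, copy) {
    var o_walk = document.createTreeWalker(o, NodeFilter.SHOW_ELEMENT, null, false);
    var c_walk = document.createTreeWalker(copy, NodeFilter.SHOW_ELEMENT, null, false);

    var o_node = o_walk.currentNode;
    var c_node = c_walk.currentNode;
    while (o_node && c_node) {
        for (var i = 0; i < dom_events.length; i++) {
            var event = dom_events[i];
            if (event in o_node) { c_node[event] = o_node[event]; }
        }
        if ('value' in o_node) { c_node.value = o_node.value; }

        o_node = o_walk.nextNode();
        c_node = c_walk.nextNode();
    }
}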

Attack of the cloneNodes

So the bug I had yesterday was fixed with a call to element::cloneNode to avoid aliasing. This introduced, to my great consternation, another bug — some DOM nodes were reverting to their default values. Had I written this down in my (as yet hypothetical) bug journal, it might have become clearer. Instead, I slaved away in Firebug for a few hours without results.

Thinking about it clearly, the problem had to be in cloneNode. I ended up having to write the following recursive fix-up function:

/**
 * List of all DOM event handler names.
 */
var dom_events = 
['onblur', 'onfocus', 'oncontextmenu', 'onload',
'onresize', 'onscroll', 'onunload', 'onclick',
'ondblclick', 'onmousedown', 'onmouseup', 'onmouseenter',
'onmouseleave', 'onmousemove', 'onmouseover',
'onmouseout', 'onchange', 'onreset', 'onselect',
'onsubmit', 'onkeydown', 'onkeyup', 'onkeypress',
'onabort', 'onerror']; // ondasher, onprancer, etc.

/**
 * Fixes copy errors introduced by {@link element#cloneNode}, e.g. failure to
 * copy classically-registered event handlers and the value property.
 * Modifies copy in place.
 *
 * @param {element} o The original DOM element
 * @param {element} copy The result of o.cloneNode()
 */
function fix_dom_clone(o, copy) {
    if (!(dom_obj(o) && dom_obj(copy))) { return; }
    
    for (var i = 0; i < dom_events.length; i++) {
        var event = dom_events[i];
        if (event in o) { copy[event] = o[event]; }
    }
    if ('value' in o) { copy.value = o.value; }
    
    // recur
    var o_kids = o.childNodes;
    var c_kids = copy.childNodes;
    for (i = 0; i < o_kids.length; i++) {
        fix_dom_clone(o_kids[i], c_kids[i]);
    }
}

Oof. Unsurprisingly, there are a few efficiency issues.
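In use, though, it’s simple enough; assuming orig is the element being copied:

var copy = orig.cloneNode(true);   // deep structural copy
fix_dom_clone(orig, copy);         // re-attach handlers and the value property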

My bug was weird and unexpected, and the W3C DOM level 2 spec doesn't allude to problems like this, but looking at a Mozilla bug report on the topic, it seems that the W3C DOM level 3 spec says that "[u]ser data associated to the imported node is not carried over". I guess if that's true for event handlers, it's also true for the value property. Oh well. I'd feel better about this irritating API "feature" if they said "associated with".

Another nasty bug — and an idea

I spent about two hours tracking down a DOM node sharing bug — nodes were being put into a new structure outside of the document before the salient data had been read out. While there was no information in these nodes, the lens system insisted that they still be there. (More on that — eventually.)

Everything worked after I finally tracked it down and wrote a version of cloneNode that also copies event handlers. Between this and the last prototype aliasing bug I had, I got an idea: a programmer could keep a “bug journal”, a list of bugs found, described first by their behavior, then by their solution (and, if those two aren’t descriptive enough, by the underlying problem as well). For example, two days ago I ran into my first genuine typing bug in JavaScript — a type checker would have rejected my program, and from the errors generated it wasn’t obvious where the problem was.

This practice could be useful in a few ways. First, the process of writing down the description can help the programmer find the solution. Tedious, but perhaps worthwhile. Surely some bugs would end up being described post facto, since it’s not worth the time when the fix is fairly clear. Second, the solution may add to a ‘bag of tricks’ at the programmer’s/team’s disposal. Third, the bug and solution tease out invariants in the program, and so the bug journal could be gleaned for inter-module and system-level documentation.

Fourth, and dearest to my heart, I think it’s an interesting way to evaluate programming languages. The bug log of a programmer writing an e-mail interface in Java and that of one writing the same interface in JavaScript would make for some interesting contrasts. To provide more than anecdotal evidence, you’d need a much larger sample of programmers and kinds of programs.

I think the bug journal would differ profoundly from examination of bug tracking logs. Only the truly mysterious bugs and the large, architectural shortcomings make it into the tracker. On a daily basis, programmers struggle with making buggy code workable — before it ever hits version control. So bug tracking logs can highlight difficulties in design and with the team, but a bug journal shows exactly what a programmer has to deal with.