Sunday, June 19, 2016

Reflections on PLDI

My original reason for attending the conference was to meet with Fabrice Rastello to discuss our upcoming project on performance debugging and auto-tuning. While he is of course very busy at this kind of venue (and everywhere else, I think!) we did manage to find an hour for a focused discussion about the state of current work on the project, and the basic components to be added during my internship. Our project relies substantially on finding and/or creating a flexible emulation platform that supports custom ISA definitions along with stable execution of concurrent programs. Our top candidates are gem5 and QEMU, though both suffer from chronic disorder in the codebase, and neither offers all the functionality we need. While it would be ideal to apply some of the fancy tools that were presented this week at PLDI, such as Heule's work on automatically learning the x86-64 instruction set, we may be stuck manually implementing a lot of instructions in one of these messy emulators.

Since my advisor Brian Demsky and I recently submitted a security paper to CCS, we thought it would be advantageous to enter the same work in the PLDI student research competition. This turned out to be a bit of a squeeze: our CCS submission was already somewhat premature (one benchmark was still missing, for example), and we had not thought much about presenting the ideas in short form. But the effort paid off--several people offered interesting feedback about our tool during the poster session, which led to a few important additions to my presentation for the second round. I've asked the judges for their feedback as well, though I think the competition has been more than enough extra work for them already :-)

One thing I always enjoy at conferences is talking with people whom I rarely see anywhere else. This week I caught up with Matt Brown from UCLA, who is continuing his work on self-typed and self-optimizing interpreters in lambda calculus. He also told me about an interesting project at Viewpoints Research where he did an internship recently--combining copy-on-write immutable semantics with traditional programming language features to make that paradigm more widely accessible. Kirshanthan Sundararajah from Purdue is working on a cache locality optimization for dual tree traversals--i.e., traversing the same tree twice simultaneously, for example as nested subtrees. This technique automatically inverts the traversal at the point where the inner subtree becomes fully cache-bound. He claims the optimization can be done entirely at compile time, based on static analysis, though I'm waiting to see the paper!
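From what Kirshanthan described (I haven't seen the paper yet, so this is just my guess at the idea), the inversion might be sketched like this. All the names here (`Node`, `visit`, `CACHE_NODES`) and the size-based cache test are my own stand-ins, not the actual technique:

```python
# Hypothetical sketch of traversal inversion for dual tree traversals, as I
# understood it from our conversation; the size threshold standing in for a
# cache-capacity test is my own simplification.

CACHE_NODES = 4  # pretend any subtree of <= 4 nodes stays cache-resident

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right
        self.size = 1 + (left.size if left else 0) + (right.size if right else 0)

def visit(point, node, log):
    log.append((point, node.value))  # the real interaction computation goes here

def walk(point, node, log):
    # ordinary single-point traversal of a (sub)tree
    if node is None:
        return
    visit(point, node, log)
    walk(point, node.left, log)
    walk(point, node.right, log)

def naive(points, root, log):
    # point-at-a-time: each point streams the whole tree through cache
    for p in points:
        walk(p, root, log)

def inverted(points, node, log):
    # once a subtree is small enough to stay cache-resident, flip the loop
    # order: run every point against that subtree before descending further
    if node is None:
        return
    if node.size <= CACHE_NODES:
        for p in points:
            walk(p, node, log)  # subtree is now reused across all points
        return
    for p in points:
        visit(p, node, log)
    inverted(points, node.left, log)
    inverted(points, node.right, log)
```

Both orders perform exactly the same (point, node) visits; only the order changes, which is what buys the cache reuse.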

Of course I enjoyed discussing research with the other student contestants as well, though you can read about their work in the official abstracts. Three of the finalists had techniques that I may be able to use in various aspects of my security work. It was especially convenient that my poster was placed in between two of them, so I had plenty of opportunities to discuss the details.

Papers and Presentations


There were 10 papers that I found especially interesting and relevant to my current and recent work, or that I believe are especially important for the future of programming language research. The presentations were mostly disappointing--in most cases I learned more about the technique from less than one page of the paper than from the entire 30-minute talk. It makes me wonder whether there should be a separate acceptance phase for presentations, such that some accepted papers would not be presented during the conference sessions. This would free up significant time for other activities, which might be more beneficial for the attendees anyway.

  1. End-to-end verification of information-flow security for C and assembly programs 
    • Verification of low-level implementations is miserably technical, yet is handled in an elegant manner by Costanzo et al., and was also presented in a very clear and understandable way.
  2. On the complexity and performance of parsing with derivatives 
    • This paper stands out for me only because the talk was excellent--despite the speaker's distracting habit of getting stuck on a phrase and taking several attempts to finally say all the words.
  3. A design and verification methodology for secure isolated regions 
    • Isolated regions provide great security benefits, but only up to the limits of our verification techniques. While I'm no expert on the subject, this seemed like an excellent improvement over existing approaches. 
    • My only complaint about the presentation is that the author completely neglected to introduce the Intel SGX instruction set, or explain anything about secure isolated regions. 
  4. Transactional data structure libraries
    • Transactional software components are one of my favorite topics, and I believe the efficiency of software development in large, business-critical applications will come to depend more and more on them. 
    • The presentation covered all the main points, but was a bit confusing on just about every aspect. Several members of the audience asked basic questions like, "xyz... what exactly is that?"
  5. Data-driven precondition inference with learned features
    • Learning preconditions is a fundamental step towards many aspects of reliable computing. This technique significantly reduces the manual effort required to establish accurate preconditions.
    • However, the presentation was not useful at all. The time would have been much better spent reading the paper.
      • The basic problem was never established: i.e., that inference operates on a fixed set of atomic predicates which must be manually specified or derived from somewhere.
      • Focused mainly on the PIE workflow, rather than explaining how it infers the aforementioned set of atomic predicates.
      • The transition to the subtopic of inferring loop invariants was totally abrupt, entirely lacking any attempt to relate the two applications of the inference technique.
  6. Input responsiveness: using canary inputs to dynamically steer approximation
    • Sophisticated tools often miss opportunities to greatly improve their performance by taking a few simple, intuitive steps to learn things that a concrete analysis couldn't resolve in years of runtime. This approach is a great example.
    • The talk really needs to focus on a basic example, showing exactly what kind of approximations are tolerated by the conventional approach, and how the irrelevant options are pruned by the canary tests.
  7. Stratified synthesis: automatically learning the x86-64 instruction set
    • Everyone knows the details of hardware instructions (or lack thereof) are a constant source of trouble in low-level implementation. After just a few weeks working with DynamoRIO, I found a bug in the encoding of a multimedia instruction that had gone undetected despite years of regular use in dozens of research projects. This technique could be a game changer for all kinds of bare-metal tools.
    • The presentation never bothered to introduce the most basic aspects of the technique. It dove directly into comparisons of various things, and salient details about something or other, but never explained how the thing actually works. The first paragraph of section 3.1 is far more useful than this entire 30-minute talk.
  8. Fast synthesis of fast collections
    • Well, what more could you want from a synthesis tool? But the presentation didn't give me a clear idea of its capabilities, and the paper seems to relate everything in terms of prior work, which is never adequately summarized. 
    • The basic mechanics of the tool were also completely elided from the talk. Reading one page of the introduction (paragraph 3 through the end) filled in the blanks for me in less than two minutes.
  9. Just-in-time static type checking for dynamic languages
    • Definitely a must-have for any type-sloppy language. I can't critique the talk because I missed it :-)
  10. FlexVec: auto-vectorization for irregular loops
    • An interesting approach to improving vectorization in a cost-effective way, but I'll have to read the paper to say anything more about it (missed the talk). 
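Back on #7: the flavor of learning an instruction's semantics by testing candidates against a hardware oracle can be illustrated with a toy. The real system runs stochastic search over actual x86-64 programs; everything below (the black-box lambda standing in for hardware, the tiny brute-force grammar, all the names) is my own stand-in for illustration only:

```python
import random

# Toy illustration of the learn-by-testing idea: treat one "unknown"
# instruction as a black box (a lambda standing in for executing it on real
# hardware) and search for a formula over an already-understood base set
# that agrees with it on random inputs. Not the paper's actual algorithm.

MASK = (1 << 64) - 1
BASE_OPS = {
    'add': lambda a, b: (a + b) & MASK,
    'and': lambda a, b: a & b,
    'or':  lambda a, b: a | b,
    'xor': lambda a, b: a ^ b,
}

def unknown_instruction(a, b):
    # black-box stand-in for an unmodeled instruction: ANDN-like, b & ~a
    return b & ~a & MASK

def synthesize(oracle, trials=200, seed=0):
    rng = random.Random(seed)
    tests = [(rng.getrandbits(64), rng.getrandbits(64)) for _ in range(trials)]
    # depth-2 candidates over a tiny grammar: op2(op1(a, b), a-or-b)
    for n1, f1 in BASE_OPS.items():
        for n2, f2 in BASE_OPS.items():
            for x in ('a', 'b'):
                def cand(a, b, f1=f1, f2=f2, x=x):
                    return f2(f1(a, b), a if x == 'a' else b)
                if all(cand(a, b) == oracle(a, b) for a, b in tests):
                    return f'{n2}({n1}(a, b), {x})'
    return None  # grammar too small to express the oracle
```

Random testing only gives evidence of equivalence, of course; the paper goes well beyond this by bootstrapping from a formally specified base set, which is the "stratified" part.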


I was only able to attend half a day of ISMM, and while I found all the topics interesting, many of the presentations were missing the key points. My favorite paper is Liveness-based garbage collection for lazy languages, not because I thought it was the most important idea, but because it was the only one I could reasonably understand without reading the paper! I'll read the others later--but as for promoting their work at the conference, those authors certainly missed their chance with me.


The talks in the "Worst-Case Analysis and Error Handling" session were all very interesting, but none provided significant information beyond the published abstract. Fortunately I brought my laptop and was able to use the time to make slides for my presentation.


The Fess Parker hotel was luxurious, and the meals and coffee breaks were well supplied with excellent fare. Back home in Orange County, most of the beachfront property is private, so it was especially nice to enjoy waterfront views from the lunch table. However, I was slightly annoyed that the hotel room rates were totally unaffordable, even for a shared room. Parking was $19 per day, making it unreasonable to drive from another location as well. It turned out ok though--I found an Airbnb up on one of the hills and packed my trusty road bike to facilitate the commute. Santa Barbara is a nice town to bike in, once you know the pleasant way across the freeway and some efficient routes through town. It was also a relief to get away from the sterile confines of the hotel in the evenings and enjoy the more organic setting of a neighborhood full of trees, birds, fresh breezes, and the inimitable tones of wild coyotes.
