The Icon Programming Language

•September 7, 2006 • 2 Comments

The threads on closures and continuations for Java remind me of my erstwhile favorite language: Icon.
Icon is an Algol-family language with a twist. Functions and expressions can do much more than just return values: they can “fail”, “suspend” values, or return values. “Suspended” function calls/expressions can be resumed where they suspended so that they can try again; resumption is driven by downstream failure.
Under the hood the Icon compiler used CPS conversion and closures of dynamic extent (meaning: allocated on the stack, not on the heap) to implement continuations of dynamic extent (meaning: you could call a continuation only once and only while it remained in dynamic scope). Resumption was just a matter of calling the nearest suspended expression’s continuation instead of the current continuation, whereas suspending and returning both involved calling the current continuation.
All of this makes Icon natural for implementing depth-first backtracking search algorithms.
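For flavor, here is a rough Java sketch of the suspend/fail/resume protocol, using modern lambda syntax; the names are mine, and this is nothing like the code the Icon compiler actually generated. A generator either “suspends” its next value or “fails,” and downstream failure drives resumption of the nearest upstream generator:

import java.util.OptionalInt;

// A generator either "suspends" a value (present) or "fails" (empty).
interface Gen { OptionalInt next(); }

public class IconSketch {
    // Icon's "1 to n": suspends 1, 2, ..., n, then fails.
    static Gen upto(int n) {
        int[] i = {0};
        return () -> ++i[0] <= n ? OptionalInt.of(i[0]) : OptionalInt.empty();
    }

    // Icon: every i := 1 to 3 do every j := 1 to 3 do
    //           if i + j = 4 then write(i, " + ", j)
    public static void main(String[] args) {
        Gen is = upto(3);
        for (OptionalInt i = is.next(); i.isPresent(); i = is.next()) {
            Gen js = upto(3);                 // fresh inner generator
            for (OptionalInt j = js.next(); j.isPresent(); j = js.next()) {
                // The comparison "failing" resumes js; js failing
                // resumes is: depth-first backtracking.
                if (i.getAsInt() + j.getAsInt() == 4)
                    System.out.println(i.getAsInt() + " + " + j.getAsInt());
            }
        }
    }
}
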
If you wanted breadth-first backtracking, you could always use “co-expressions” (a form of co-routine adapted to Icon concepts), but it was a bit artificial; doing it properly would require continuations of indefinite extent under the hood.
I can’t do Icon justice with some examples — it’s been too long since I’ve used it, but its website has lots of references and tutorials, and the main book about Icon is available in PDF.
Incidentally, there’s an Icon-in-Java implementation: Jcon :)

Musings on Java Closures and Continuations

•September 7, 2006 • 4 Comments

I see lots of posts on closures and continuations (more, and more)
lately, many relating to Java. See also lambda-the-ultimate.org.

Must I re-hash what these are? No, not really.

Now, I do find the evolution of Java here interesting.

There is a proposal to add closures to Java.

And these closures will support non-local exits to lexical exit points
of dynamic extent! Which means that, Java being a safe language, a
runtime exception has to result from attempting a non-local exit whose
exit point is no longer in scope. How would one implement this? My
guess: before leaving the scope of each exit label, write a magic
value to a hidden variable closed over by every closure that closes
that label; an attempted exit then checks that variable first. But if
Java had continuations then non-local exits from closures could be
made to have indefinite extent: closures that close labels would
implicitly grab the associated continuations.
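
Here is a hand-written sketch of that guess (names hypothetical; this
is not the proposal’s actual translation scheme, and the compiler
would generate the label object, the hidden flag, and the checks):

// The hidden "magic value" is the live flag on the label object.
final class NonLocalExit extends RuntimeException {}

final class ExitLabel {
    boolean live = true;              // false == label out of extent

    void jump() {
        if (!live)
            throw new IllegalStateException("exit label out of extent");
        throw new NonLocalExit();     // unwinds to the labeled frame
    }
}

public class ExitSketch {
    static Runnable escape;           // keeps the closure past its extent

    static void labeledMethod() {
        ExitLabel label = new ExitLabel();
        try {
            escape = label::jump;     // closure closing over the label
            escape.run();             // in extent: unwinds to the catch
        } catch (NonLocalExit e) {
            System.out.println("non-local exit landed at its label");
        } finally {
            label.live = false;       // leaving scope: invalidate
        }
    }

    public static void main(String[] args) {
        labeledMethod();
        try {
            escape.run();             // out of extent now...
        } catch (IllegalStateException e) {
            System.out.println("...so: " + e.getMessage());
        }
    }
}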

But no continuations for Java. At least not in that proposal.

Why not? Given Java as it stands today it should be easy enough (ha!
something a non-expert like me would say) to compile Scheme to Java,
just as Scheme can be compiled to C; just don’t expect to be able to
read the result, and don’t expect efficiency. One way would be to
create closures out of objects (though function application then
wouldn’t be the same for Scheme-compiled-to-Java as for plain Java),
then perform CPS conversion on the Scheme code, reify continuations,
and get “rid” of the Java stack: each Scheme function application
would return a continuation to a Java driver function whose job is to
run a call-receive-continuation-call-continuation-… loop. This would
be wasteful, as the Java stack and associated machinery (think of
stack and frame pointers in C) would continue to exist and be used,
but not for anything useful. And such continuations could not capture
native Java continuations, only Scheme ones.
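
Here’s a minimal sketch of that loop (a trampoline), with made-up
types and modern lambda syntax for brevity; real Scheme-to-Java
compilers differ in the details:

// A continuation is "the rest of the computation"; step() does a
// bounded amount of work and returns the next continuation (null when
// the computation is finished).
interface K { K step(); }

public class Trampoline {
    // The driver loop: no Scheme call ever grows the Java stack.
    static void run(K k) {
        while (k != null)
            k = k.step();
    }

    // (define (count n)
    //   (if (> n 0) (begin (display n) (newline) (count (- n 1)))))
    // CPS-converted by hand: the "recursive call" returns a
    // continuation to the trampoline instead of calling itself.
    static K count(int n, K rest) {
        if (n <= 0)
            return rest;
        return () -> { System.out.println(n); return count(n - 1, rest); };
    }

    public static void main(String[] args) {
        run(count(1000000, null));    // deep "recursion," shallow stack
    }
}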

Given native Java support for closures this gets easier — now there’d
be only the CPS conversion and the getting-rid-of-the-stack bit to do.

Given that this can be done, and given that the JVM has much better
control over its stack than C, why not just implement continuations in
Java? One way would be to use stack copying, another would be to put all
the function call frames on the heap — these being the canonical
methods of implementing call/cc. With stack copying the impact on Java
apps that don’t use continuations should be near (but probably not
quite) zero, whereas putting call frames on the heap makes the cost of
grabbing and calling continuations near zero. Stack copying burdens
users of continuations with most of the runtime cost of continuations,
whereas putting call frames on the heap relies more on the garbage
collector, thus more evenly spreading the cost of making continuations
available across all users (though I suspect optimizations to lighten
the burden on call stacks where continuations are not grabbed are
possible).
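
In the heap-frames world everything is already in CPS, so a
continuation is just an object reference: capturing one is O(1), and
so is invoking one, which simply abandons the current continuation. A
toy, hypothetical sketch reusing the K type from the trampoline above
(and eliding the values that real continuations would carry):

public class CaptureSketch {
    interface K { K step(); }

    static K saved;                   // a captured continuation

    public static void main(String[] args) {
        K after = () -> { System.out.println("after"); return null; };
        saved = after;                // "call/cc": capture is one store
        K work = () -> { System.out.println("working"); return saved; };
        for (K k = work; k != null; k = k.step())
            ;
        // Invoke it again: indefinite extent, at no per-invocation
        // cost -- unlike stack copying, where each capture or
        // invocation pays for a copy of the stack.
        for (K k = saved; k != null; k = k.step())
            ;
    }
}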

It is said that LISP programmers know the value of everything and the
cost of nothing.

What’s the benefit of having native support for continuations? First, it
would make the job of a Scheme- or Ruby-on-Java implementor a lot
easier. Second, it would improve the performance of such
implementations. Third, native continuations would have the ability to
capture the state of a native Java call stack (via a closure passed to
some Java API that takes closures as arguments and calls them or passes
them to other APIs that might call them).

And the cost? There’s the cost of specifying continuations for Java,
implementing them (see above), and the runtime cost of making
continuations available (see above). Also there is the possibility that
native continuations that capture native Java state may represent a
subtle but non-trivial change to the language, particularly if there are
existing APIs that could be passed closures that can capture
continuations to be used and re-used later. It might be easier to deal
with such subtle changes if closures and continuations were introduced
at the same time.

My stab at answering my question, then: native Java support for
continuations would, at best, likely have some non-zero, negative impact
on the performance of existing Java applications, but few users would
reap the benefits of continuations, thus continuations aren’t worth it.
Such an answer, of course, ignores the things that native Java users
could, and probably would, do with native continuations, including
building all manner of backtracking algorithms, light-weight
multi-threading for event-driven applications, etc… I don’t really
know how to judge the cost/value of native continuations, and I have
nothing to do with any of these efforts — I am a lurker — so take all
this with at least some salt, or perhaps lots of salt.

ZFS and the automounter

•April 25, 2006 • Leave a Comment

So, you have ZFS, and you create lots and lots of filesystems, all the
time, any time you like.

Problem: new filesystems don’t show up in /net (the -hosts special
automount map) once a host’s /net entry has been automounted.

Solution #1: wait for OpenSolaris to gain support for client-side
crossing of NFSv4 server-side mount-points.

Solution #2: replace the -hosts map with an executable automount map that
generates hierarchical automount map entries based on showmount -e output.

The script is pretty simple, but there is one problem: the automounter
barfs at entries longer than 4095 characters (sigh).

#!/bin/ksh
# Executable automount map: invoked by automountd with a hostname as
# its sole argument; emits one hierarchical map entry covering every
# filesystem that host exports.
exec 2>/dev/null
if [[ $# -eq 0 ]]
then
    logger -p daemon.error -t "${0##*/}" "Incorrect usage by automountd!"
    exit 1
fi
# Start at the root of the server's exported namespace...
entry="/ $1:/"
# ...and add one offset mount per exported filesystem.  In ksh the
# last element of a pipeline runs in the current shell, so the
# assignments to entry below survive the loop.
showmount -e "$1" | sort | grep '^/' | while read p junk
do
    entry="$entry $p $1:$p"
done
print "$entry"
logger -p daemon.debug -t "${0##*/}" "$entry"

Comparing ZFS to the 4.4BSD LFS

•December 7, 2005 • Leave a Comment

Remember the 4.4BSD Log Structured Filesystem? I do. I’ve been thinking
about how the two compare recently. Let’s take a look.

First, let’s recap how LFS works. Open your trusty copy of “The Design
and Implementation of the 4.4BSD Operating System” to chapter 8,
section 3 — or go look at the Wikipedia entry for LFS, or read any of
the LFS papers referenced therein. Go ahead, take your time, this blog
entry will wait for you.

As you can see, LFS is organized on disk as a sequence of large
segments, each segment consisting of smaller chunks, the whole forming
a log file of sorts. Each small chunk consists of data and meta-data
blocks that were modified or added at the time the chunk was written.
LFS, like ZFS, never overwrites data, except for superblocks, so LFS
does copy-on-write by definition, just like ZFS. In order to be able
to find inodes whose block locations have changed, LFS maintains a
mapping of inode numbers to block addresses in a regular file called
the “ifile.”

That should be enough of a recap. As for ZFS, I assume the reader has
been reading the same ZFS blog entries I have been reading. (By the way,
I’m not a ZFS developer. I only looked at ZFS source code for the first
time two days ago.)

So, let’s compare the two filesystems, starting with generic aspects of
transactional filesystems:

  • LFS writes data and meta-data blocks every time it needs to
    fsync()/sync()
  • Whereas ZFS need only write data blocks and entries in its intent log
    (ZIL)

This is very important. The ZIL is a compression technique that allows
ZFS to safely defer many writes that LFS could not. Most LFS meta-data
writes are very redundant, after all: writing to one block of a file
implies writing new indirect blocks, a new inode, a new data block for
the ifile, new indirect blocks for the ifile, and a new ifile inode —
but all of these writes are easily summarized as “wrote block X to
replace the Nth block of the file whose inode # is I.”

Of course, ZFS can’t ward off meta-data block writes forever, but it can
safely defer them with its ZIL, and in the process stands a good chance
of being able to coalesce related ZIL entries.
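
To make the “summary” point concrete, here is a hypothetical sketch
(this is not ZFS’s actual ZIL record format) of such a logical log
record; replaying it regenerates, and a later transaction-group commit
coalesces, all of the meta-data writes it implies:

// Hypothetical logical intent-log record: one small record stands in
// for the data write plus every meta-data write it implies (indirect
// blocks, dnode, meta-dnode updates, ...).
final class IntentLogRecord {
    final long objectId;     // which object (file/dnode) was written
    final long blockOffset;  // which block of that object
    final byte[] newData;    // the new block contents

    IntentLogRecord(long objectId, long blockOffset, byte[] newData) {
        this.objectId = objectId;
        this.blockOffset = blockOffset;
        this.newData = newData;
    }
}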

  • LFS needs copying garbage collection, a process involving both
    searching for garbage and relocating the live data that surrounds it
  • Whereas where LFS sees garbage, ZFS sees snapshots and clones

The Wikipedia LFS entry says this about LFS’ lack of snapshots and
clones: “LFS does not allow snapshotting or versioning, even though both
features are trivial to implement in general on log-structured file
systems.” I’m not sure I’d say “trivial,” but, certainly easier than in
more traditional filesystems.

  • LFS has an ifile to track inode number to block address mappings. It
    has to, else how to COW inodes?
  • ZFS has a meta-dnode; all objects in ZFS are modelled as a “dnode,”
    and so dnodes underlie inodes, or rather, “znodes,” and znode numbers
    are dnode numbers. Dnode-number-to-block-address mappings are kept in
    a ZFS filesystem’s object set’s meta-dnode, much as
    inode-to-block-address mappings are kept in the LFS ifile.

It’s worth noting that ZFS uses dnodes for many purposes besides
implementing regular files and directories; some such uses do not
require many of the meta-data items associated with regular files and
directories, such as ctime/mtime/atime, and so on. Thus “everything is a
dnode” is more space efficient than “everything is a file” (in LFS the
“ifile” is a regular file).

  • LFS does COW (copy-on-write)
  • ZFS does COW too

But:

  • LFS stops there
  • Whereas ZFS uses its COW nature to improve on RAID-5 by avoiding the
    need to read-modify-write, so RAID-Z goes faster than RAID-5 (see the
    sketch below). Of course, ZFS also integrates volume management.
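
To see what is being avoided, consider the parity arithmetic: a
partial-stripe write under RAID-5 must first read the old data and the
old parity just to compute the new parity, while full-stripe writes
(which a COW allocator can always arrange) need no reads at all. A toy
illustration with one-byte “blocks”:

public class Raid5Rmw {
    public static void main(String[] args) {
        byte oldData = 0x21, oldParity = 0x5c, newData = 0x37;
        // Read-modify-write: two reads (old data, old parity) and two
        // writes, because P' = P xor D_old xor D_new.
        byte newParity = (byte) (oldParity ^ oldData ^ newData);
        System.out.printf("new parity: 0x%02x%n", newParity);
    }
}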

Besides features associated with transactional filesystems, and besides
volume management, ZFS provides many other features, such as:

  • data and meta-data checksums to protect against bitrot
  • extended attributes
  • NFSv4-style ACLs
  • ease of use
  • and so on

OpenSolaris Opening Day — Where’s ssh?

•June 10, 2005 • Leave a Comment

Yay! OpenSolaris is out. The amount of work, energy and dedication that the OpenSolaris team has put into the project is fantastic. They got 132 engineers to write 215,000 words’ worth of blog entries for opening day (not enough, I suppose, given how many engineers we have, but that’s a large novel’s worth of postings written in a few short weeks — readers will be busy digesting all that material!). Crucial to this effort was easy blogging tech; blogging is here to stay, you’ll only see more of it.

O.K., but, where’s the Solaris Secure Shell? Ah, well, it will be there. It’s not there now, and that’s partly my fault: I didn’t notice that it would be missing, and why, until it was almost too late, and my initial stab at a build system fix that would have allowed ssh to be in OpenSolaris was incomplete, so I missed the deadline. Please accept my apologies, readers and OpenSolaris team alike.

For more info on OpenSolaris and missing crypto code, see Darren’s blog entry on the topic.

An Ode to the Linker

•June 10, 2005 • 12 Comments

I’m no poet, but…

The Solaris linker is one of my favorite parts of Solaris. Long before coming to Sun I acquainted myself with the excellent Linker and Libraries Guide at docs.sun.com and used various linker tricks for fun and profit.

I see the linker as a set of extensions to the C language:

  • The control over the namespace afforded by object grouping amounts to a form of packages, in the Lisp or Perl5 sense.
  • The dlopen(3C) family of functions provides for run-time extensibility.
  • Symbol interposition provides a rudimentary form of “around methods,” if you’ll allow the abuse of terminology.
  • Filters provide a facility for coping with moves of interfaces from one object to another, and more.
  • Map files, together with header files, provide more control over the definition of an object’s ABI than just the headers alone.

Developers would be well advised to read the Solaris Linker and Libraries Guide, as well as Rod Evans’ and Michael Walker’s blogs.

Introduction

•June 8, 2005 • Leave a Comment

Since coming to Sun, almost three years ago, I’ve worked on the Solaris Secure Shell (ssh) and, to a lesser degree, on Solaris’ implementation of Kerberos V. Now I’m working on PAM and GSS-API issues, some involving standardization efforts at the IETF, particularly at the KITTEN Working Group.

I should have been blogging quite a lot by now, but for one reason or another I have not.
The advent of OpenSolaris will see me blogging more, I’m sure. Subject matter for blog entries does abound; I could blog on ssh, Kerberos, the GSS-API, the IETF, PAM, Least Privilege, and much more. Suggestions are welcome.