Another CIFS server topic to blog about: Filesystem I18N

•November 7, 2007

Late last year I wrote a blog entry about filesystem internationalization. It did not discuss the CIFS server project, but that was definitely on my mind. We had recently had several discussions about the various filesystem I18N problems, and that blog post more or less summarized what we’d decided to do. Now you can see that the Solaris I18N engineers (e.g., Ienup Sung) have delivered several major components:

So now we have codeset conversion APIs, Unicode encoding conversion APIs and Unicode normalization and string preparation APIs in the kernel and in user-land, including Unicode case-folding, and case- and normalization-insensitive Unicode string comparison. That’s impressive! Ienup and his team did an excellent job of that. I did code reviews for Ienup, and feel that the code is quite good. Congrats Ienup!

It’s time to update that ASCII art picture in my fs I18N blog entry to include SMB…

Did I mention that these components are open source?

Dealing with Windows SIDs in Solaris, part 2

•November 7, 2007

As described in my first post on this subject, Solaris can now map SIDs to POSIX UIDs/GIDs and back, and it can store SIDs in ZFS. The identity models of Windows and Solaris are now unified, and the ACL model of ZFS has been extended. And we have unified administration of SMB and NFS shares. Wow. I find this exciting, and not just because I’ve worked on parts of this story.

In this post I want to walk through the identity mapping facility’s design. Next I’ll talk about implementation, and then about how to use the facility.


Design of the Solaris ID mapping facility

The salient points of the design of the ID mapping facility are:

  • A door server daemon provides the ID mapping service (idmapd, svc:/system/idmap:default)
  • The service can be called via libidmap and the idmap kernel module
  • Given the need to map many SIDs at logon time and when dealing with ACLs, the idmap door protocol (an ONC RPC protocol), the kernel and user-land APIs, and the idmap service are designed to batch operations, to reduce latency by taking advantage of parallelism in the directory server and the network
  • There are two SQLite 2.x databases:
    • /var/idmap/idmap.db — contains persistent name-based ID mapping rules
    • /var/run/idmap/idmap.db — caches Windows name<->SID lookups, ID mappings and ephemeral ID allocations
  • There are two types of ID mapping today:
    • ephemeral ID mapping, where we dynamically allocate the next available UID or GID from the erstwhile negative uid_t/gid_t namespace (uid_t and gid_t are now unsigned), but we forget these mappings on reboot (see part 1 for more)
    • name-based ID mapping, where the sysadmin provides rules for mapping Windows users and groups to Solaris users and groups; these rules use names and wildcards, not SIDs and UIDs/GIDs.
  • There are three private system calls for idmapd:
    • idmap_reg() to register idmapd’s door; besides a door fd argument there’s a boolean that tells the kernel whether idmapd was unable to open/recover /var/run/idmap/idmap.db (see below)
    • idmap_unreg(), which is called when idmapd exits cleanly
    • allocids(), which allocates a number of ephemeral UIDs and GIDs for idmapd to use for dynamic ID allocation
  • The kernel tracks which ranges of ephemeral IDs are valid. Initially there are no (or very few) valid ephemeral IDs, and the kernel begins allocating IDs at 2^31. If idmapd crashes in such a way that the ephemeral DB is unrecoverable, idmapd tells the kernel so when it next registers, and the kernel moves the low end of the valid ephemeral ID range up to match the top end, so that no previously allocated ephemeral ID remains valid.
  • idmapd uses asynchronous LDAP searches of the Active Directory Global Catalog to do name<->SID resolution
  • idmapd uses DNS SRV RR lookups and LDAP searches of the Active Directory configuration partition to auto-discover Active Directory domain, forest and site names, and domain controller and global catalog directory server names
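
To make the range-tracking bullet concrete, here is a minimal sketch of the bookkeeping it describes. It is a toy model, not the Solaris code: the class and method names are made up for illustration, it tracks a single ID space (the real allocids() hands out UIDs and GIDs), and the upper limit of 2^32-2 is taken from part 1.

```python
EPHEMERAL_BASE = 2**31        # where the kernel begins allocating, per the post
EPHEMERAL_LIMIT = 2**32 - 2   # top of the reserved namespace, per part 1


class EphemeralIdRange:
    """Hypothetical model of the kernel's valid ephemeral ID range [low, high)."""

    def __init__(self):
        self.low = EPHEMERAL_BASE
        self.high = EPHEMERAL_BASE   # empty range: no valid ephemeral IDs yet

    def allocids(self, count):
        """Give idmapd `count` fresh IDs and extend the valid range to cover them."""
        if self.high + count - 1 > EPHEMERAL_LIMIT:
            raise OverflowError("ephemeral ID space exhausted")
        first = self.high
        self.high += count
        return range(first, self.high)

    def register(self, db_lost):
        """Model of idmap_reg(): db_lost means the cache DB could not be recovered."""
        if db_lost:
            # Move the low end up to the top end: every old ID becomes invalid.
            self.low = self.high

    def is_valid(self, ident):
        return self.low <= ident < self.high
```

In this model, IDs handed out before a lost-DB registration stop being valid, and later allocations continue above them, so an old ephemeral ID is never reused for a different SID before a reboot.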

I’ve run out of time. I’ll cover implementation details next.

    Dealing with Windows SIDs in Solaris

    •November 6, 2007

    The CIFS server project integrated into ONNV / OpenSolaris in build 77. This is a very important milestone for Solaris, which now has a fully integrated native SMB server running in the kernel.

    “Fully integrated” implies many good things:

    • proper interaction of SMB share modes, file locks and oplocks with NFSv4 equivalents
    • proper integration with Solaris administration utilities (e.g., sharemgr(1M))
    • support for case- and Unicode normalization-insensitive but case- and normalization-preserving filesystems (yes, we now have Unicode normalization code in the kernel!)
    • integration of the Solaris and Windows identity models
    • filesystem support for the integrated identity model, as well as extended ACLs to support Windows ACL features
    • etcetera

    That is a very significant list! A lot of work went into this project and related sub-projects.

    I’ll be blogging about the integration of the Solaris and Windows identity models, both in this post and subsequent ones.


    Solaris has distinct, small, flat user and group identity namespaces (POSIX UIDs and GIDs). Windows has a unified, practically unlimited, and non-flat namespace for user and group identities (SIDs). There’s a very high impedance mismatch there!

    We knew we’d need to map between these two models, so we started a project to do that. We needed to be able to map any valid SID in an AD forest to a Unix UID and/or GID, as needed. And we needed such a system to be low-configuration, easy to use, and safe. Mapping between these models isn’t hard; it’s the other requirements that were challenging to tackle.

    Initially we pursued a notion of persistent dynamic mappings within each Unix (NIS/NIS+/native LDAP) domain, but Mike Shapiro helped us simplify things greatly with an outside-the-box idea: use the heretofore unused “negative” UID/GID namespaces for ephemeral dynamic ID mapping, thus removing two big problems with our earlier design (the need to configure a pool of IDs and the reliability issues associated with having to persistently store important mappings).

    “Negative UID/GID namespaces”, you ask? Until now uid_t and gid_t have been signed 32-bit integers in Solaris, but the relevant standards (POSIX, SUS) require UIDs and GIDs to be positive integers, which means that we wasted almost half of the uid_t/gid_t namespace. Mike’s insight was that we could use that wasted ID namespace as a pool of IDs that we can dynamically allocate IDs from, resetting the pool at boot time, and that this wouldn’t be too expensive in terms of incompatibility (more on that below). So we changed the uid_t and gid_t types, and we reserved the 2^31..2^32-2 ID namespace for Solaris-driven allocation (i.e., customers cannot assign these IDs directly).
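
The arithmetic behind the "negative" namespace can be checked with a few lines. This is only an illustration of the numbers quoted above; the function names are hypothetical and nothing here is a Solaris API.

```python
EPHEMERAL_MIN = 2**31       # first ID reserved for Solaris-driven allocation
EPHEMERAL_MAX = 2**32 - 2   # last ID in the reserved range, per the text


def is_ephemeral(ident):
    """True if an unsigned 32-bit ID falls in the reserved 2^31..2^32-2 range."""
    return EPHEMERAL_MIN <= ident <= EPHEMERAL_MAX


def as_signed32(ident):
    """What the same 32 bits would have meant when uid_t was a signed integer."""
    return ident - 2**32 if ident >= 2**31 else ident


# Every ID in the reserved range reads as negative under the old signed type.
print(as_signed32(EPHEMERAL_MIN))   # -2147483648
print(as_signed32(EPHEMERAL_MAX))   # -2
```

The reserved range holds 2^31 - 1 IDs, which is the "almost half" of the namespace that went unused while the types were signed.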

    ID mapping then works as follows:

    • there is an ID mapping service, svc:/system/idmap:default
    • the idmap service is accessed via RPC over doors only (i.e., it’s a local service)
    • by default the idmap service validates SIDs and maps them to the next available ‘ephemeral’ UID or GID, and this mapping persists until the system reboots (more on this below)
    • the mapping service also offers name-based ID mapping, where you can map Windows domain users and groups to Unix users and groups by name
    • the consumers are: the SMB server in the kernel and in user-land utilities, the NFSv4 user-land nfsmapid daemon, the kernel ksid*() functions (which are called from cr*() kernel functions that deal with cred_t), and the idmap(1M) utility
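
A rough sketch of the flow in the list above, under stated assumptions: the class, the rule format, the example SIDs and names are all hypothetical, name-based rules are tried before ephemeral allocation (an ordering I am assuming for illustration), and SID validation against the directory is left out.

```python
from fnmatch import fnmatchcase
from itertools import count


class ToyIdMapper:
    """Hypothetical stand-in for the idmap service; not the real implementation."""

    def __init__(self, rules, unix_users):
        self.rules = rules                     # [(windows name pattern, unix name)]
        self.unix_users = unix_users           # unix name -> UID
        self.cache = {}                        # SID -> UID, lost at "reboot"
        self.next_ephemeral = count(2**31)     # next available ephemeral UID

    def map_sid(self, sid, windows_name):
        if sid in self.cache:
            return self.cache[sid]
        uid = None
        for pattern, unix_name in self.rules:
            # Rules use names and wildcards, never raw SIDs or UIDs.
            if fnmatchcase(windows_name.lower(), pattern.lower()):
                uid = self.unix_users.get(unix_name)
                break
        if uid is None:
            # No usable rule: fall back to the next ephemeral ID.
            uid = next(self.next_ephemeral)
        self.cache[sid] = uid
        return uid

    def map_batch(self, requests):
        """Map many (sid, name) pairs in one call, as happens at logon time."""
        return [self.map_sid(sid, name) for sid, name in requests]
```

For example, with the single rule `("alice@example.com", "alice")` and a Unix user alice with UID 1001, that Windows user maps to 1001, while any other SID gets 2^31, 2^31+1, and so on, stable until the mapper is recreated.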

    Now, using ephemeral IDs in the erstwhile negative ID space has some implications. First and foremost: ephemeral IDs must not be persistently stored anywhere, including in filesystem objects. Because that is far too restrictive the Solaris VFS and one filesystem, ZFS, have been modified to support storing SIDs instead of ephemeral IDs (the other filesystems simply reject any attempt to store an ephemeral ID). You read that right: ZFS can now use SIDs in ACL entries! Most applications will already do the Right Thing — either reject or pass through ephemeral IDs — and those core Solaris apps that needed modification have been modified. C++ mangled symbols for methods that take uid_t or gid_t arguments will change on recompile (this was deemed acceptable). For more information you should see the ARC case that covers ephemeral IDs (which will be available soon, as I understand it).

    By the way, I think Solaris may now be the first non-Windows implementation of NFSv4 that supports the use of user/group names from many domains on the wire!


    Next up: current limitations of ID mapping, ongoing sub-projects, and a guided tour of the source. The impatient can start by looking at:

    You can find calls to various ID mapping and related functions using the OpenSolaris source code browser, of course.

    Phishing as a man-in-the-middle kind of attack

    •October 11, 2007

    I gave a presentation to the Liberty Staff yesterday about Phishing as an MITM attack, and what can be done about it. I think it went very well, and I’m very excited that I’ll be meeting people I didn’t know who are working in this space, and that we could have a significant impact on the future of web authentication and in ridding us of phishers.

    I don’t have enough time to give this topic a complete treatment in this blog entry, so I’ll stick to a very short summary. The Internet-Drafts and links to them that are relevant here are linked to in earlier blog entries of mine. Rest assured though, I will be writing more link-rich blog entries about this topic soon enough, and I’ll post my presentation once I add a few more slides (mostly to include relevant links and to give credit where it’s due — I had to limit myself to two slides for the presentation itself!).

    The gist of this presentation was: phishing is not about stealing passwords, it is about stealing our money; passwords are gravy to a phisher. If we replace cleartext passwords in HTML forms POSTed over https as our predominant method of web authentication, but aren’t careful enough to defeat MITM attacks, then phishers will still be in business, and they’ll still steal our money. Note that there are practical MITM attacks that phishers can and do mount that are not on-path attacks (i.e., the phisher need not be in the route path from the client to the server). Think of URLs like “http://www.yourbank.tld:sdfjkfgsdf@clever-phisher.tld/login.php”.
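
To see why that URL is deceptive, it helps to parse it the way a client would. The short example below uses Python's standard urllib.parse; the helper name is mine. Everything before the "@" is userinfo, so the bank's name ends up as a username and the real host is the phisher's.

```python
from urllib.parse import urlsplit


def real_host(url):
    """Return the host a client would actually connect to for this URL."""
    return urlsplit(url).hostname


url = "http://www.yourbank.tld:sdfjkfgsdf@clever-phisher.tld/login.php"
parts = urlsplit(url)
print(parts.username)   # www.yourbank.tld
print(parts.password)   # sdfjkfgsdf
print(real_host(url))   # clever-phisher.tld
```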

    It’s crucial that we understand that neither DNS registrars nor certificate authorities care to help, nor are they in a position to help us defeat phishing.

    Here’s where Project Liberty comes in: federations, by dint of being much smaller than the Internet as a whole, can help. Federations can help by providing a way to authenticate relying parties to user agents, or at least by providing a relying party authorization function for user agents — given this then federations can act as whitelists and keep the phishers out. There is a crucial UI element here though: users need to be able to know what identity they used to authenticate to some server, what the server’s name is and what the name of the federation is that mediated mutual authentication.

    Besides the message that federated mutual authentication provides a mechanism to keep phishers out, there’s also the issue of ensuring that there are no practical MITM attacks left to phishers. This is where channel binding comes in. If authentication happens above the HTTP/TLS layer, then we need to make sure that the server we think we’re talking to at that layer is the same as the one at the HTTP/TLS layer, or we have to make sure that all messages to the server are additionally protected above the HTTP/TLS layer (this last is never going to happen). So either we push authentication down the stack, to the HTTP/TLS layer, or we need to provide some way to bind web authentication to the HTTP/TLS “channel.”
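
Here is a toy illustration of the binding idea, not any of the actual proposals. It assumes the upper-layer authentication (Kerberos, say) has already given the client and the real server a shared session key, and it uses a hash of the server certificate seen on the TLS connection as the channel binding data. All names are hypothetical.

```python
import hashlib
import hmac


def channel_binding(server_cert_der):
    """Stand-in channel binding data: a hash of the cert seen on this TLS channel."""
    return hashlib.sha256(server_cert_der).digest()


def auth_proof(session_key, cb_data):
    """Client side: prove knowledge of the session key *for this channel*."""
    return hmac.new(session_key, b"channel-binding:" + cb_data,
                    hashlib.sha256).digest()


def server_accepts(session_key, own_cert_der, proof):
    """Server side: recompute the proof over its own cert and compare."""
    expected = auth_proof(session_key, channel_binding(own_cert_der))
    return hmac.compare_digest(expected, proof)


key = b"session key from the upper-layer authentication"
bank_cert = b"DER bytes of the real server's certificate"
phisher_cert = b"DER bytes of the man-in-the-middle's certificate"

# Direct connection: the client saw the bank's cert, so the proof checks out.
direct = server_accepts(key, bank_cert, auth_proof(key, channel_binding(bank_cert)))
# Relayed through a MITM: the client saw the phisher's cert, so it does not.
relayed = server_accepts(key, bank_cert, auth_proof(key, channel_binding(phisher_cert)))
```

The phisher can relay the proof unchanged, but cannot recompute it for the bank's channel without the session key, which is the whole point of binding the two layers together.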

    I described several ways to do the channel binding and mutual authentication.

    Credit for these ideas, by the way, goes to Sam Hartman, Leif Johansson, and the IETF usual suspects who helped refine them (Jeff Hutzelman, Jeff Altman, Love Hörnquist, RL “Bob” Morgan, Lisa Dusseault, Chris Newman, and many others).

    Improving Web Security: AJAX + GSS/SAML/other authentication + channel binding to TLS

    •September 18, 2007

    Authentication

    Leif Johansson has a couple of Internet-Drafts (proposals for RFCs) [1][2] that would provide for a novel way to deal with authentication in web applications. This is all related to Sam Hartman’s Internet-Draft on dealing with phishing. The idea in Leif’s I-Ds is to:

    1. provide a way to do multi-round-trip user/server authentication over HTTP 1.0 (and 1.1), starting with the GSS-API (but applicable to SAML profiles and other schemes),
    2. bind this authentication to the TLS sessions used between the client and the server,
    3. and use cookies to bind all those sessions together with the same authentication event.

    Imagine that you have multiple identities, which you may have enrolled for in a variety of ways (such as at a brick&mortar location, or online via another identity (e.g., via your ISP), or online much like when you sign up for free e-mail accounts).

    Now imagine that you can authenticate these identities using a strong network authentication mechanism (say, Kerberos V), so you don’t have to type passwords into forms anymore. And imagine that there are authentication federations, so that you can use one identity in many places without having to expose your passwords to many servers. Lastly, imagine that you can authenticate to some website using an HTTP URL (as opposed to HTTPS) and that immediately you’re redirected to the HTTPS URL for the same without having to click through the “give your money away to the nice attacker?” dialog box. Oh, and one more thing: this happens with some website designer UI control.

    And all that with protection against MITMs, without necessarily depending on DNSSEC, nor on a true PKI.

    That’s what we’re talking about here.

    The components of this are, then:

    The XMLHttpRequest extensions I have in mind are, roughly:

    • a method for requesting GSS-API (or other) authentication with a given {mechanism,
      identity, [federation]} tuple as an argument;
    • a method for requesting GSS-API auth but with the browser displaying
      an identity selection dialog or, better yet, with a DOM object
      representing where on the page the browser should prompt for identity
      (but the browser should not make its UI elements for this available to
      the calling script via the DOM, so there’s no way to leak the set of identities the user has to any script on the page);

    When either of these methods is used the browser will then do the HTTPS+GSS+channel binding dance,
    even if the page where the script is running came from plain HTTP (not HTTPS).
    And then it will, for the length of the session, accept the server’s
    certificate as valid for the server’s name from the URL being fetched
    (which, while we’re at it, must not be cacheable).

    The script should set a cookie when the XMLHttpRequest succeeds —
    we can’t be doing the HTTPS+GSS+cb dance for every URL, just once a
    session, thank you (or until the cookie expires).

    We’ll want another host object to help with online enrollment. This
    object/class (prototype) should have a method to set the credentials
    for a {mechanism, federation, identity} tuple. When this method is
    called the user should be prompted in browser chrome as to whether to
    accept this credential, and whether to accept it only for this host,
    this session, whatever.

    Putting it all together:

    1. The user visits an http:// URL (no https; using https is OK, but it
      requires a server cert valid to some trust anchor).
    2. The page at that site loads a script that uses an XMLHttpRequest
      object to do GSS+channel binding, authenticating the user to the site, the server
      to the client, and binding the server’s cert to this authentication.
      The script will set a secure-only cookie for fast user
      re-authentication and it will ensure that the selected ID is remembered
      subsequently.
    3. The UI will look like this: there will be an “authenticate” button,
      and maybe an “authenticate as…” button — click there and it all
      goes. But also there will be another kind of lock icon to go with the
      TLS one, or perhaps a different shape and color for the existing one,
      to indicate that stronger authentication has been done (we want users
      to want this).

    Enrollment

    So you travel and sometimes use kiosks or other people’s ‘puters.
    Which means you may not have your long-lived credentials with you. All you have
    then are plain old username+password credentials, and, perhaps, the ability to use your cell phone. So you do
    traditional username+password (preferably a temporary one obtained via your cell phone) form authentication, and a script
    enrolls new GSS creds for you and your browser. But there may be a
    field in the HTML form for limiting the lifetime of the resulting
    credentials (described, perhaps, as a session?).

    When you enroll for an identity for the first time there’d be no
    username+password, just captchas/whatever, of course.

    Actually, this too might be a good candidate for design as a new
    method of XMLHttpRequest…

    So we can’t get rid of passwords in all circumstances —
    preferably you could carry your non-password credentials on a token, but
    we’ll assume you can’t. But perhaps we can get the number of
    passwords you need to remember down to a manageable few, corresponding
    to how many federations you have identities in.

    I think some parts of this might be very easy to prototype with
    Mozilla, particularly if we stick to building only extensions to the
    XMLHttpRequest object (then we don’t have to learn how to write
    plug-ins, new host objects, etc…).

    SOAP indigestion continued

    •September 18, 2007

    Ah, a friend tells me that negotiation of all security-related things, such as key derivation, is left to application profiles too. Even such crucial aspects of a security system as protection against replay attacks are left to application profiles. I see no discussion of reflection attacks in SOAP, so I hope that there’s some indication of directionality in the parts of SOAP messages that are signed, or that app profiles always use different session keys for each direction. *sigh*
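
For readers who haven't met reflection attacks, here is a small sketch of the "different session keys for each direction" defense. It is not taken from any SOAP profile; the labels and function names are assumptions for illustration only.

```python
import hashlib
import hmac


def direction_key(session_key, label):
    """Derive a per-direction key from the shared session key (illustrative only)."""
    return hmac.new(session_key, label, hashlib.sha256).digest()


def sign(key, message):
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    return hmac.compare_digest(sign(key, message), tag)


session_key = b"shared session key"
c2s = direction_key(session_key, b"client-to-server")
s2c = direction_key(session_key, b"server-to-client")

# The client sends a signed request; an attacker bounces it straight back.
request = b"transfer 100 to account 42"
tag = sign(c2s, request)

# With per-direction keys the client checks incoming messages against s2c,
# so its own reflected message is rejected.
reflected_ok = verify(s2c, request, tag)

# With a single key used in both directions, the reflection would verify.
naive_tag = sign(session_key, request)
naive_reflected_ok = verify(session_key, request, naive_tag)
```

Frameworks like TLS handle this for you; the complaint above is that with SOAP every application profile has to remember to.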

    In the IETF we have security frameworks like TLS and SASL that could be as profiled as SOAP is, but by and large all such frameworks come with required-to-implement functionality that makes it possible to have off-the-shelf implementations that Just Work ™. The IETF mostly does not produce standard APIs and specific programming language APIs for them (exceptions: the GSS-API, SCTP), so each off-the-shelf implementation of, say, TLS, may have its own APIs, but at least implementations of application protocols that use TLS just work and just interop without having to have per-application profiles of TLS. TLS handles most of the security-sensitive aspects of a serious security framework, such as negotiation of key exchange and authentication mechanisms, ciphers, MACs, key-derivation, re-keying, replay protection, reflection protection, etc…

    Application developers are usually not cryptographic protocol designers. They shouldn’t have to know too much about mundane (to them) issues like key derivation, damn it. Complicating a security analyst’s job by pushing so much of a security specification to every application profile doesn’t help either. The OASIS WSS TC has done SOAP developers a disservice here, to say nothing of what it’s done for SOAP users.

    Of course, SOAP can use TLS. But when it does there is no binding between the use of SOAP Security token profiles for authentication and the TLS sessions used. But at least that’s no worse than the state of authentication for web applications. And there is hope for them still (I’ll blog about that sometime).

    SOAP, Security, Kerberos V, indigestion

    •September 17, 2007

    Twice in the past 5 or so years I’ve sent the OASIS WSS TC some comments on the Kerberos V Token Profile 1.1 for SOAP Message Security 1.1. You can probably go find the mails in the archives. I never got answers to the questions that really mattered, and then I dropped the matter — I’m not a SOAP implementor, after all — though I had cared out of fear that someday the Solaris krb5 team would be asked to make it possible to implement a problematic spec.

    A colleague asked me about this profile today. So I went to find answers, again.

    Nowadays the Kerberos V Token Profile 1.1 is no longer a draft.

    Now, reading the two specs (SOAP Message Security 1.1 and the Kerberos V Token Profile 1.1) I note that:

    • The TC added support for using the initial context token from the Kerberos V GSS-API mechanism, probably in response to a question from me as to why they didn’t. I so wish they hadn’t. The right thing to do would have been to use all of the GSS-API mechanism, including its per-message token services, or to use the GSS_Pseudo_random() function to extract keys from the mechanism’s security contexts. As it is I think we’ll need GSS-API extensions to get at mechanism-specific internal contents: the krb5 session and sub-session keys.
    • Key derivation is still unspecified, in the token profile and in the base spec, which means that you can only use XMLenc ciphers and XMLdsig signature algorithms whose key sizes are compatible with the enctype used in the Kerberos V AP-REQ Authenticator for the sub-session keys (or in the Ticket for the session key), or you must come up with some profile that specifies key derivation, or you can choose not to interop.
    • Negotiation of things like XMLenc ciphers is also out of scope (in the base SOAP security spec; perhaps it’s specified elsewhere?).

    *sigh*

    I think SOAP would benefit from using TLS for transport security and PKI/Kerberos/… for authentication with channel binding to the TLS session. Of course, that would require further profiling, and I gather everything SOA is political, so getting there seems unlikely. But I think I’ll try, particularly when On Channel Bindings is published as an RFC (it’s in the RFC Editor’s queue!).

    In Austin for ACL? Don’t miss Gotan Project

    •September 14, 2007

    I saw Gotan Project last night. Wow. What a show. They play again tonight, don’t miss them. The crowd loved them, and they even loved it when they played a straight Tango tune, Canaro en París (a guitar duo version can be heard here), a very difficult piece (it gets very fast at the end), both to play and to dance. Gotan Project’s version of Canaro en París was one of the fastest I’ve heard yet. The crowd (mostly Austinites) even screamed “¡Otra! ¡Otra! …,” demanding an encore.

    Unfortunately, there is no suitable dance floor at Stubb’s, and what there is was crowded.

    Gotan Project had:

    • one cellist
    • three violinists
    • one piano player
    • one bandoneonista
    • one guitarist
    • one DJ
    • one keyboardist
    • one singer
    • and a heck of a sound

    Some things were canned: drums, and vocals for at least one song.

    IDNAbis — must not miss I18N presentation

    •August 1, 2007

    At last week’s 69th IETF meeting (in Chicago) there was a presentation on the IDNAbis effort at the SAAG meeting on Thursday that anyone with an interest in I18N should look at and listen to (the presentation starts about 28 minutes into the recording).

    For me the biggest takeaway is this: if you want Unicode version agnosticism, and you *should* want that, then you need to think carefully about where unassigned codepoints will be dealt with. In particular, IF you use ACE encoding on the wire in your protocols then you need only worry about Unicode versions supported at the client end — a very important point. Of course, administrative authorities must be the ones to enforce rules about Unicode version use, and about use of codepoints heretofore considered dangerous, the latter in context- and language-specific ways (another crucial, and brilliant, insight in the IDNAbis presentation).

    One of the attributes of the IDNAbis proposal is that a lot of constraints from stringprep would be relaxed significantly, to the point where we can, and should, consider the use of A-labels (meaning, the output of toASCII(), that is, punycoded strings), on the wire in critical protocol elements. In particular I’m thinking that Kerberos V I18N should just shove ACE into all instances of GeneralString in the protocol, augmented with UTF8String and OCTET STRING (for legacy names from just-send-8 deployments) aliases of principals and realm names to support migrations.
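
For anyone who hasn't seen an A-label, this is what toASCII() output looks like. The example uses Python's built-in "idna" codec, which implements the original IDNA (RFC 3490) rules rather than the IDNAbis proposal, so treat it only as an illustration of ACE on the wire; the helper names and the sample name are mine.

```python
def to_ace(name):
    """Convert a Unicode name to its ASCII-compatible (punycoded) form."""
    return name.encode("idna").decode("ascii")


def from_ace(ace):
    """Convert an ACE name back to Unicode for display."""
    return ace.encode("ascii").decode("idna")


print(to_ace("bücher.example"))           # xn--bcher-kva.example
print(from_ace("xn--bcher-kva.example"))  # bücher.example
```

A protocol that carries only the ACE form never sees a non-ASCII codepoint on the wire, which is why only the client end has to care which Unicode version it supports.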

    C with continuations?! A report on a conversation in Cancún

    •June 1, 2007

    I shared my thoughts on async, async everywhere with my friend Sam while on a bus ride to some ruins in Cancún this past weekend, and his reaction was, “sure, but why not continuations everywhere, why not just add continuations to C?” This floored me: I’d never considered such a thing — it seems so… foreign, out of place, yet so clever. Needless to say, we proceeded to have a lively conversation about this.
    What would it mean to add something like Scheme’s call/cc to C? We thought it’d mean this: that all function call frames would be allocated from a heap and would be garbage collected, and that there would be no stack, plus all attendant calling conventions changes. (Such a C might as well also have closures, while we’re at it.)
    Such a C would have to deal with alloca() (e.g., by turning alloca()s into heap allocations and recording those in the activation record so they can be freed when the activation record is garbage collected), and setjmp() and longjmp() (which, actually, become much simpler in implementation!), and such things. And probably would have to have a “foreign function call” facility for calling normal C code, to make it easier to start using this weird C.
    What would break? Anything that assumes that the addresses of automatic variables in different function call frames in the current code path are ordered in the same way as the function call frames — i.e., code that assumes a stack. And any C code that uses asm() to get at the stack pointer would likely break (a stack pointer could be provided, but its values couldn’t be changed by asm() code). That’s not a whole lot of code, probably, as a percentage of an entire OS, but it’d be a largish amount of code in absolute terms, perhaps even excluding code implementing other languages in C.
    The OS would still have to provide async versions of all synchronous system calls, unless, my friend cleverly points out, the kernel itself were built with such a C compiler!
    We’re talking about dirt cheap threading here, so cheap that it might outweigh the cost of garbage collection and heap fragmentation (there’s no way to do a copying garbage collector for a language like C, I’m afraid, and any GC would have to be very conservative) — it would have to in order to succeed. With threads this cheap you can start one any time you want, for any reason, and avoid having to write CPS-like code atop lots and lots of async interfaces, each implemented in CPS too.
    No, this will never happen. It’s a fantasy of people who wish we could write procedural-looking code with tiny amounts of syntactic sugar to enable lots of cheap parallelization, while retaining source-level compatibility with millions of lines of code.
    In reality programmers will always just have to deal with writing async, CPS-like code; we’ll just have to do what should have been the compiler’s job. Scheme lost. Tough beans.
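
To show what "doing the compiler's job" looks like, here is the same two-step task written in CPS-like style and in the direct style that cheap threads or continuations would allow. It is in Python rather than C purely to keep it short, and the function names are made up.

```python
def read_then_parse_cps(read_async, parse_async, on_done):
    """CPS-like style: every step is handed 'what to do next' as a callback."""
    def after_read(data):
        # The rest of the computation has to be packaged up by hand.
        parse_async(data, on_done)
    read_async(after_read)


def read_then_parse_direct(read, parse):
    """Direct style: the call stack itself remembers what comes next."""
    return parse(read())


# Fake "async" operations that invoke their callback immediately.
results = []
read_then_parse_cps(lambda k: k("42"),
                    lambda data, k: k(int(data)),
                    results.append)

direct_result = read_then_parse_direct(lambda: "42", int)
```

Two steps are bearable; the pain grows with every loop, error path and nested call that has to be turned inside out by hand.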
    Still, if we want to parallelize code cheaply we’ll need thread creation to be cheap as well, dirt, dirt cheap. What other ways are there to make it so? Pre-creation of threads?