Programmer's Guide for libpkix

Version 0.4, July 14, 2004
Authors: Steve Hanna, Yassir Elley, Steve Weis

  1. Introduction
   1.1 Document Purpose
   1.2 Prerequisites
   1.3 Next Steps
   1.4 Purpose of libpkix

  2. What libpkix Can Do For You

  3. Fitting libpkix Into Your Application

  4. Top Level Functions

  5. Objects
   5.1 Reference Counting
   5.2 Immutable Objects
   5.3 Modifying Objects

  6. Threading
   6.1 Functions that are Thread-safe
   6.2 Functions that Modify Objects without Protection
   6.3 Functions that are Not Reentrant

  7. Security

  8. Platform-specific Issues
   8.1 NSS/NSPR

1. Introduction

1.1 Document Purpose

This document is a guide for programmers who will be using the libpkix library. It should also be read by anyone writing code that will be part of the libpkix Portable Code or libpkix Portability Layer.

1.2 Prerequisites

Before reading this document, you should have read the libpkix Architecture document. That document describes the purpose of libpkix, its overall architecture, and how it fits into larger systems. This Programmer's Guide is more technical. It assumes a strong knowledge of C.

1.3 Next Steps

If you have comments on this document, please send them to the project administrator. If you want more information about libpkix, visit our Home Page.

1.4 Purpose of libpkix

The purpose of the libpkix library is to provide a widely useful C library for building and validating chains of X.509 certificates, compliant with the latest standards.

When we say "widely useful", we mean that we hope libpkix will be used by many people in many products on many different platforms. When we say "the latest standards", we mean IETF RFC 3280 or its successors (developed by the IETF's PKIX working group) and the latest version of ITU X.509 (ISO/IEC 9594-8).

Through this effort, we hope to help address two problems that have slowed PKI deployment: poor interoperability due to non-standard certificate chain validation, and lack of application support for PKI. These are not the only obstacles to PKI usage, but they are substantial ones.

libpkix makes it much easier for application developers to include high-quality certificate chain validation and building in their applications. And this Programmer's Guide explains how to do it.

2. What libpkix Can Do For You

Any secure application that relies on public-key cryptography may use the libpkix library to build and validate X.509 certificate chains. Example applications include digital signatures and the Secure Sockets Layer (SSL) protocol built into most web browsers. These applications must securely determine a user's public key to protect against impersonation attacks.

Secure applications are assumed to know the public key of a "trust anchor" (TA). Web browsers often include the public keys of trust anchors such as Verisign and Thawte. An application may use libpkix to build certificate chains from its TA to a subject's public key, transitively establishing trust from the TA to the subject. The libpkix library may also validate chains or collect credentials to establish trust.
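
The idea of building a chain from a trust anchor to a subject can be sketched with a toy model. This is not libpkix code: ToyCert, toy_find_by_subject and toy_build_chain are hypothetical names, certificates are reduced to subject and issuer strings, and no signatures, validity dates or policies are checked. It only shows the transitive walk from the target back to the TA.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy certificate: only names, no keys or signatures. */
typedef struct {
    const char *subject;
    const char *issuer;
} ToyCert;

/* Return the first certificate in the pool with the given subject, or NULL. */
const ToyCert *
toy_find_by_subject(const ToyCert *pool, size_t poolLen, const char *subject)
{
    size_t i;
    for (i = 0; i < poolLen; i++) {
        if (strcmp(pool[i].subject, subject) == 0) {
            return &pool[i];
        }
    }
    return NULL;
}

/*
 * Walk from the target certificate toward the trust anchor by following
 * issuer names. Stores the path (target first) in chain[] and returns its
 * length, or 0 if no path to the anchor is found within maxLen certificates.
 */
size_t
toy_build_chain(const ToyCert *pool, size_t poolLen,
                const char *anchorName, const char *targetName,
                const ToyCert **chain, size_t maxLen)
{
    size_t len = 0;
    const ToyCert *cur;

    assert(pool != NULL && chain != NULL);
    assert(anchorName != NULL && targetName != NULL);

    cur = toy_find_by_subject(pool, poolLen, targetName);
    while (cur != NULL && len < maxLen) {
        chain[len++] = cur;
        if (strcmp(cur->issuer, anchorName) == 0) {
            return len;     /* issued directly by the trust anchor */
        }
        if (strcmp(cur->issuer, cur->subject) == 0) {
            return 0;       /* self-issued and not anchored */
        }
        cur = toy_find_by_subject(pool, poolLen, cur->issuer);
    }
    return 0;
}
```

A real chain builder must also verify each signature and apply the validation rules of the relevant standards. That is the work libpkix takes off the application's hands.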

3. Fitting libpkix Into Your Application

libpkix is a user-level library that may be called by application developers or built into larger systems. There are two major components to libpkix: the portable code (PC) providing certificate chain functionality and a portability layer (PL). The PL insulates libpkix from the details of the underlying platform. The PL also provides memory management, concurrent access and support for reference-counted objects. The system hierarchy is as follows:

+-------------------------+
|          Caller         |
+--------------+          |
|  libpkix PC  |          |
+--------------+--+       |
|  libpkix PL     |       |
+-----------------+-------+
|   Underlying Platform   |
+-------------------------+
    

The libpkix public API (defined in pkix.h and the files it includes) provides access to certificate chain construction, validation and collection functions. Note that libpkix does not provide direct access to its data structures. Rather, it defines abstract, reference-counted "object" data types and provides functions to access their fields.

4. Top Level Functions

libpkix has three main functions for handling certificate chains: PKIX_BuildChain, PKIX_ValidateChain and PKIX_CollectCerts.

PKIX_BuildChain receives a set of parameters as its input and, if successful, returns a certificate chain. Additional information, such as policy information, may be returned along with the certificate chain.

PKIX_ValidateChain validates a certificate chain subject to given parameters. If successful, PKIX_ValidateChain may return additional information such as a policy tree or the target's public key.

PKIX_CollectCerts collects certificates or CRLs that may be useful in validating a particular certificate. The collected certificates and CRLs may be sent to another party to aid in validation.

The interface to these three functions is as follows:

PKIX_Error*
PKIX_ValidateChain(
	PKIX_ValidateParams *params,
	PKIX_ValidateResult **pResult,
	void *plContext);
    
PKIX_Error*
PKIX_BuildChain(
	PKIX_BuildParams *params,
	PKIX_BuildResult **pResult,
	void *plContext);
    
PKIX_Error*
PKIX_CollectCerts(
	PKIX_CollectParams *params,
	PKIX_CollectResult **pResult,
	void *plContext);
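
All three functions share one calling shape: parameters go in, the result comes back through an out-pointer, and the return value is an error object. The sketch below imitates that shape with hypothetical Demo_* stand-ins so that it compiles on its own. It assumes that a NULL return means success. That convention is an assumption of this sketch, so check the API documentation for the actual contract.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types are opaque and come from pkix.h. */
typedef struct { const char *message; } Demo_Error;
typedef struct { int chainLength; } Demo_ValidateParams;
typedef struct { int valid; } Demo_ValidateResult;

/*
 * Mirrors the shape of PKIX_ValidateChain: returns an error object
 * (NULL meaning success in this sketch) and delivers the result through
 * pResult. plContext is accepted but not used here.
 */
Demo_Error *
Demo_ValidateChain(Demo_ValidateParams *params,
                   Demo_ValidateResult **pResult,
                   void *plContext)
{
    static Demo_Error emptyChain = { "chain is empty" };
    static Demo_ValidateResult ok = { 1 };

    (void)plContext;
    assert(params != NULL && pResult != NULL);

    *pResult = NULL;
    if (params->chainLength <= 0) {
        return &emptyChain;
    }
    *pResult = &ok;
    return NULL;
}

/* Caller-side pattern: check the returned error before touching the result. */
int
demo_is_chain_valid(Demo_ValidateParams *params)
{
    Demo_ValidateResult *result = NULL;
    Demo_Error *error = Demo_ValidateChain(params, &result, NULL);

    if (error != NULL) {
        return 0;
    }
    return result->valid;
}
```

In real code the result and any error are reference-counted objects (see Section 5.1), so the caller would release them when done. The sketch uses static storage only to stay short.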
    

5. Objects

The libpkix library makes extensive use of reference-counted "objects" (actually just opaque data types). In fact, all data types are reference-counted objects unless explicitly stated otherwise. This is unusual for C, but it provides several advantages:

  1. Memory management is simpler and more efficient. Instead of copying data (such as a name from a certificate) into a caller-supplied buffer, we just return a reference (pointer) to the data. This avoids buffer copies, saving time and space.
  2. By keeping the details of our data structures hidden from the caller, we can easily change those data structures without requiring the caller to change any of their code. Just relink with our code.
  3. By using accessor functions instead of direct access to data structures, we can more easily ensure thread-safety.

The main downside is that we have a lot of accessor functions (like PKIX_PL_Cert_GetVersion). But at least our type definitions are simple and short!

Note that opaque data structures are not a new idea, even to C. The FILE structure defined in stdio.h is opaque. Programs are not supposed to poke around inside it.

Note also that we are not trying to recreate an entire object-oriented environment with polymorphism and all. We are only trying to manage references to our data structures in an organized fashion. Still, we refer to our opaque data structures as "objects" because that's a short and easily understood term.
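
The opaque-type-plus-accessor pattern described above can be sketched in plain C. The names Demo_Cert, Demo_Cert_Create, Demo_Cert_GetVersion and Demo_Cert_Destroy are hypothetical and are not libpkix functions. The sketch also uses a plain destroy function where libpkix would use reference counting.

```c
#include <assert.h>
#include <stdlib.h>

/* --- What a public header would expose (hypothetical names) --- */
typedef struct Demo_CertStruct Demo_Cert;   /* incomplete type: layout hidden */

Demo_Cert *Demo_Cert_Create(int version);
int Demo_Cert_GetVersion(const Demo_Cert *cert);
void Demo_Cert_Destroy(Demo_Cert *cert);

/* --- What only the library's own source file would see --- */
struct Demo_CertStruct {
    int version;
};

Demo_Cert *
Demo_Cert_Create(int version)
{
    Demo_Cert *cert = malloc(sizeof *cert);
    if (cert != NULL) {
        cert->version = version;
    }
    return cert;
}

/* Accessor: the only way for a caller to read the field. */
int
Demo_Cert_GetVersion(const Demo_Cert *cert)
{
    assert(cert != NULL);
    return cert->version;
}

void
Demo_Cert_Destroy(Demo_Cert *cert)
{
    free(cert);
}
```

Because callers only ever see the incomplete type, the struct definition can gain or lose fields without any caller needing to change their code.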

5.1 Reference Counting

Once a reference to an object is returned to the caller, it's hard for us to know when we can free that data structure. But we can't have memory just lying around forever without getting freed. So we use reference counts.

Each object has a reference count, an integer that tracks how many references to the object are outstanding. When an object is created, the reference count is set to one. When one function returns a reference to an object to another function, it increments the reference count. And when a reference is no longer needed, the reference count is decremented. As long as everyone does their job properly, the reference count will be non-zero until the object is no longer needed. Then it will drop to zero and the object will be freed.
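
The life cycle just described can be sketched as follows. This is a minimal illustration, not the libpkix implementation: Demo_Object and its functions are hypothetical, the counter is not protected against concurrent access, and demo_live_objects exists only so the effect of the final release can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object layout; not the libpkix layout. */
typedef struct {
    unsigned int refCount;
    int payload;
} Demo_Object;

int demo_live_objects = 0;      /* for illustration only */

Demo_Object *
Demo_Object_Create(int payload)
{
    Demo_Object *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->refCount = 1;      /* creator holds the first reference */
        obj->payload = payload;
        demo_live_objects++;
    }
    return obj;
}

void
Demo_Object_IncRef(Demo_Object *obj)
{
    assert(obj != NULL && obj->refCount > 0);
    obj->refCount++;
}

void
Demo_Object_DecRef(Demo_Object *obj)
{
    assert(obj != NULL && obj->refCount > 0);
    if (--obj->refCount == 0) {
        demo_live_objects--;
        free(obj);              /* last reference gone: release the memory */
    }
}
```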

Most applications deal with reference counts only when they call PKIX_PL_Object_DecRef to release a reference they no longer need. An application that wants to pass a reference to another piece of code while retaining its own may find it convenient to call PKIX_PL_Object_IncRef first. It can then call PKIX_PL_Object_DecRef when it is done with the object, and the other code can do the same when it is done.
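
Here is a sketch of that hand-off, again using hypothetical stand-ins (Demo_Shared, Demo_Consumer and their functions) and not the real PKIX_PL_Object calls. The caller adds a reference for the second component, and each side releases its own reference independently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not a libpkix type. */
typedef struct {
    int refCount;
} Demo_Shared;

/* Hypothetical second component that keeps the object for later use. */
typedef struct {
    Demo_Shared *kept;
} Demo_Consumer;

Demo_Shared *
Demo_Shared_Create(void)
{
    Demo_Shared *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->refCount = 1;
    }
    return obj;
}

void
Demo_Shared_IncRef(Demo_Shared *obj)
{
    assert(obj != NULL && obj->refCount > 0);
    obj->refCount++;
}

/* Returns 1 if this call released the last reference and freed the object. */
int
Demo_Shared_DecRef(Demo_Shared *obj)
{
    assert(obj != NULL && obj->refCount > 0);
    if (--obj->refCount == 0) {
        free(obj);
        return 1;
    }
    return 0;
}

/* The caller adds a reference on the consumer's behalf before handing over. */
void
demo_hand_over(Demo_Consumer *consumer, Demo_Shared *obj)
{
    Demo_Shared_IncRef(obj);
    consumer->kept = obj;
}

/* The consumer drops its reference when it is done. */
int
demo_consumer_finish(Demo_Consumer *consumer)
{
    int freed = Demo_Shared_DecRef(consumer->kept);
    consumer->kept = NULL;
    return freed;
}
```

Whichever side finishes last frees the object, so neither needs to know what the other is doing.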

Please be careful to not call PKIX_PL_Object_DecRef or PKIX_PL_Object_IncRef too many times. One too many DecRefs and you're writing into freed memory (perhaps corrupting the heap). One too many IncRefs and the memory will never be freed.

5.2 Immutable Objects

Some objects are immutable. That means that they should never change. And all the objects returned by their getters are also immutable. This is convenient because you never have to worry about anybody modifying them. And all of their getter functions are always thread-safe. Here is a list of immutable objects. Note that * is used as a wildcard in names here and throughout this document.

PKIX_CertChain, PKIX_Error, PKIX_*Result, PKIX_TrustAnchor,
PKIX_PolicyNode, PKIX_PL_ByteArray, PKIX_PL_BigInt,
PKIX_PL_String, PKIX_PL_OID, PKIX_PL_Date, PKIX_PL_Cert,
PKIX_PL_CRL, PKIX_PL_CRLEntry, PKIX_PL_GeneralName,
PKIX_PL_X500Name, PKIX_PL_X500RDN, PKIX_PL_CertNameConstraints,
PKIX_PL_CertBasicConstraints, PKIX_PL_PublicKey,
PKIX_PL_CertPolicies, PKIX_PL_CertPolicyInfo,
PKIX_PL_CertPolicyQualifier, PKIX_PL_CertPolicyMap
    

NOTE: Because C doesn't provide any memory protection, we can't prevent anyone from overwriting an immutable object and changing its value. Please don't do this! It's bad behavior and might cause things to crash.

IMPLEMENTORS' NOTE: These immutable objects are not immutable while they're being created, of course, only after they are returned by one of our external APIs.

5.3 Modifying Objects

Because we don't copy objects passed to or returned by a function (unless otherwise noted), the general rule is that the caller should not modify such objects after the function returns. Otherwise, thread-safety problems and other problems may ensue. However, there are some exceptions.

The following functions don't mind if you modify the objects they return:

The following functions return objects that are immutable:

The following functions return objects that you should not change: PKIX_PL_HashTableLookup

The following functions return objects that may well change after they have been returned to you. Be careful to only access them when you know that nobody else could be changing them. Otherwise, you risk thread-safety problems.

And all of the PKIX_*_Get*Context functions return a void *. Our code will never read or write through this pointer. It's only for the caller's convenience in communicating with, configuring, or providing state information for their callback. Therefore, it's up to the caller to decide who can read or write the information pointed to by this pointer, when, how to ensure thread safety, etc.
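
The context-pointer convention can be sketched like this. Everything here is hypothetical (Demo_Checker, Demo_CheckCallback, Demo_CallerState and the functions around them). The point is only that the library side stores and forwards the void * without ever reading or writing through it, while the caller's callback gives it meaning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; not a libpkix signature. */
typedef int (*Demo_CheckCallback)(int certIndex, void *context);

/* Hypothetical library-side object that stores the callback and context. */
typedef struct {
    Demo_CheckCallback callback;
    void *context;              /* opaque to the library */
} Demo_Checker;

void
Demo_Checker_Init(Demo_Checker *checker, Demo_CheckCallback cb, void *context)
{
    assert(checker != NULL && cb != NULL);
    checker->callback = cb;
    checker->context = context;
}

/* Accessor in the spirit of PKIX_*_Get*Context: hands the pointer back. */
void *
Demo_Checker_GetContext(const Demo_Checker *checker)
{
    assert(checker != NULL);
    return checker->context;
}

/* Library-side loop: forwards the context without reading through it. */
int
Demo_Checker_Run(const Demo_Checker *checker, int certCount)
{
    int i;
    for (i = 0; i < certCount; i++) {
        if (!checker->callback(i, checker->context)) {
            return 0;
        }
    }
    return 1;
}

/* Caller-owned state and callback. */
typedef struct {
    int seen;
    int limit;
} Demo_CallerState;

int
demo_count_certs(int certIndex, void *context)
{
    Demo_CallerState *state = context;
    (void)certIndex;
    state->seen++;
    return state->seen <= state->limit;
}
```

Since only the caller's code touches Demo_CallerState, any locking it needs is entirely the caller's decision.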

6. Threading

Here are our current thoughts about which of the public API functions should be thread-safe and why.

6.1 Functions that are Thread-safe

These functions are thread-safe. That is, it's OK for a thread to call one of these functions even if another thread has already called that function and that function invocation has not returned yet. But you must make sure that while one of these functions is executing, nobody modifies any of its arguments or anything referred to by its arguments. If this happens, the behavior is undefined.

6.2 Functions that Modify Objects without Protection

These functions modify an object in a manner that is not thread-safe. So nobody can read from that object when one of these functions is executing. But functions that don't read from the specified object should be fine.
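
One way for a caller to live with such functions is to serialize access itself. The sketch below assumes a POSIX platform and uses a pthread mutex purely for illustration. Demo_Params and the demo_guarded_* wrappers are hypothetical, and on a given platform you should use whatever lock type that platform requires (see Section 8).

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical mutable object with unprotected accessors. */
typedef struct {
    int depth;
} Demo_Params;

/* Stand-ins for a setter that modifies without protection and a getter. */
void Demo_Params_SetDepth(Demo_Params *p, int depth) { p->depth = depth; }
int Demo_Params_GetDepth(const Demo_Params *p) { return p->depth; }

/* Caller-owned wrapper that pairs the object with a lock. */
typedef struct {
    pthread_mutex_t lock;
    Demo_Params params;
} Demo_GuardedParams;

/* Returns 0 on success, like pthread_mutex_init. */
int
demo_guarded_init(Demo_GuardedParams *g, int depth)
{
    assert(g != NULL);
    g->params.depth = depth;
    return pthread_mutex_init(&g->lock, NULL);
}

/* Writers hold the lock so that no reader sees a half-finished update. */
void
demo_guarded_set_depth(Demo_GuardedParams *g, int depth)
{
    pthread_mutex_lock(&g->lock);
    Demo_Params_SetDepth(&g->params, depth);
    pthread_mutex_unlock(&g->lock);
}

/* Readers take the same lock, so they never overlap with a writer. */
int
demo_guarded_get_depth(Demo_GuardedParams *g)
{
    int depth;
    pthread_mutex_lock(&g->lock);
    depth = Demo_Params_GetDepth(&g->params);
    pthread_mutex_unlock(&g->lock);
    return depth;
}

void
demo_guarded_destroy(Demo_GuardedParams *g)
{
    pthread_mutex_destroy(&g->lock);
}
```

This only works if every access to the object goes through the wrappers. A single unguarded read brings the problem back.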

6.3 Functions that are Not Reentrant

These functions are not reentrant and therefore not thread-safe. Nobody should call these functions if somebody has called them and they have not returned yet. And nobody should modify their arguments (or anything referred to by their arguments) while they are executing.

7. Security

Our libpkix library is not a system library. It has no special privileges and cannot do anything that the caller couldn't do themselves with a little work. Also, we don't have any way to protect our data structures from the caller overwriting them. Therefore, we assume an attitude that says "If the caller messes with us (feeding us bad data or some such), they're only hurting themselves." This attitude is somewhat dangerous. It means that the caller is responsible for checking data before they give it to us. It also means that we count on the caller to not violate thread safety rules or other rules spelled out in this Programmer's Guide.

On the upside, it means that we don't need to copy arguments and return values, as long as we tell the caller not to change them (or clearly specify when it is OK for the caller to change them). And we can avoid mutex overhead in many places by simply telling the caller what they must do to avoid threading problems. For more information about these points, see the sections on Threading and Modifying Objects in this document.

However, there are some exceptions to our complete trust in the caller. It is not practical for the caller to check certificates and CRLs before passing them to us. So the caller can assume that we will handle incorrectly encoded certificates and CRLs without any problem. We, in turn, assume that the libpkix Portability Layer will handle this properly.

8. Platform-specific Issues

We do not insulate developers from platform-specific issues entirely. That is not our aim. So you should read the subsection here that corresponds to the platform you're developing on.

8.1 NSS/NSPR

NSS and NSPR functions are not thread-safe with respect to native threads, only with respect to NSPR threads. This applies to libpkix functions too, since we depend on the NSPR functions. So don't call libpkix functions from multiple native threads.