Implementation Guidelines for libpkix

Version 0.5, July 14, 2004
Authors: Steve Hanna, Yassir Elley

  1. Introduction
   1.1 Document Purpose
   1.2 Prerequisites
   1.3 Purpose of libpkix
   1.4 Requirements

  2. Broad Issues
   2.1 Common Sense
   2.2 Consultation
   2.3 Intellectual Property

  3. Development Process
   3.1 Development Model
   3.2 Implementation Process

  4. Coding Guidelines
   4.1 C Language Version
   4.2 Identifiers
   4.3 Common Macros
   4.4 Public APIs
   4.5 Comments
   4.6 Work Left For Later
   4.7 Inside Objects
   4.8 Threading
   4.9 Security
   4.10 Portable Code
   4.11 Portability Layer
   4.12 Reliability

  5. Development Environment

  6. Performance Analysis and Optimization

1. Introduction

1.1 Document Purpose

This document is a set of guidelines that must be followed by anyone writing code that will be part of the libpkix Portable Code or the libpkix Portability Layer.

1.2 Prerequisites

Before reading this document, you should have read the libpkix Architecture document and the Programmer's Guide for libpkix. The libpkix Architecture describes the purpose of libpkix, its overall architecture, and how it fits into larger systems. The Programmer's Guide explains how the public APIs of libpkix work. And this document explains the guts of libpkix.

1.3 Purpose of libpkix

The purpose of the libpkix library is to provide a widely useful C library for building and validating chains of X.509 certificates, compliant with the latest standards.

When we say "widely useful", we mean that we hope libpkix will be used by many people in many products on many different platforms. When we say "the latest standards", we mean IETF RFC 3280 or its successors (developed by the IETF's PKIX working group) and the latest version of ITU X.509 (ISO/IEC 9594-8).

Through this effort, we hope to help address several problems that have slowed PKI deployment: poor interoperability due to non-standard certificate chain validation and lack of application support for PKI. These are not the only obstacles to PKI usage, but they are substantial ones.

1.4 Requirements

The libpkix Architecture defines a set of requirements that libpkix must meet. Please review that list. In the rest of this document, you will see how we work to meet those requirements.

2. Broad Issues

2.1 Common Sense

If something in this document (or in any other one) doesn't make sense, ask about it. You may well have discovered a problem. Or maybe you didn't understand properly. Either way, it's best to ask.

2.2 Consultation

Feel free to consult with other members of the team. We should make use of all our resources.

2.3 Intellectual Property

libpkix uses a modified BSD license. This is a true Open Source license with no "viral" effects. Don't download any code off the Internet without getting approval from the project lead beforehand. That includes source code, object code, etc. In fact, you should consult with the project lead before downloading documentation or other information about any projects that are similar to ours. And certainly don't accept anything that might be confidential!

Note that NSS and NSPR have been approved for download.

3. Development Process

3.1 Development Model

We use a combination of the waterfall and the spiral models, similar to the Rational Unified Process. For background information on these models, see http://www.rational.com and http://www.stsc.hill.af.mil/crosstalk/1995/01/Comparis.asp. We describe our own process here.

At the top level, we will use a loose waterfall process, starting with Requirements Definition and ending with Integration and Deployment. But within the Implementation phase we will iterate, implementing a subset of the features and then adding on features in cycles. Here are the phases of the waterfall:

  1. Define and Agree on Requirements
  2. Design Solution
  3. Conduct Design Review
  4. Prepare Project Plan
  5. Implement Design (with Unit and System Tests)
  6. Integrate with Product
  7. Deploy (technology transfer)

Why use a waterfall model at the top level? Because we have a lot of experience building and refining the CertPath code (the certificate chain building and validation code in the Java(TM) 2 Platform, Standard Edition), we expect that one pass through the overall process (defined in Figure 1) should be sufficient. But we may need to go back and tweak the requirements as we go. We reserve that right.

Why use a spiral model within the Implementation phase? Because it's always a good idea to do things in small pieces, ensuring that each piece works properly before going forward. We'll break the implementation down into phases roughly 3 months in duration with each phase containing a set of tasks (about 40, each task requiring about 1-8 person-days of effort). Each person will implement their tasks one at a time, using a simple implementation process for each task.

3.2 Implementation Process

Here is the implementation process that will be followed for each task.

  1. Implementation Design
     Implementing a task requires some design (figuring out how to implement the functionality agreed to in the Prepare Project Plan phase of the waterfall model).

  2. Optional Implementation Design Review
     The cheapest time to find and fix a problem is before any code has been written. It's a good idea to talk about your implementation design with someone else before you write the code. But this is not required.

  3. Write Code
     See below for Coding Guidelines.

  4. Build Code
     See below for details in Development Environment.

  5. Initial Testing
     Before passing the code off for a code review, you should be fairly confident that it is working. Therefore, you should write some unit tests and get them working.

  6. Code Review
     All of our code (including bug fixes) must be code reviewed by another member of the libpkix team. This code review is designed to catch any sort of error: a design error, a hidden assumption, a race condition, a violation of coding guidelines, or a simple bug.

     Plan to spend as much time doing a code review as the original coder spent writing the code (although it's fine to spend less than that if you're done and have done a good job). Don't let things slide. It's much cheaper and easier to find and fix a problem in the code review than during testing or (worst of all!) in the field.

  7. Unit Testing
     We have a policy of writing unit tests for our code. These tests try to exercise all the important features and paths through the code. Why test when we're doing code reviews anyway? Because sometimes the code looks fine and it builds fine, but it just doesn't work!

     You can write your tests in parallel with the code review, in order to optimize things. But you should review your test plan (the set of tests you're planning to write) with someone (probably your code reviewer) before you start working on the tests or as soon as possible thereafter. That way, you can have two minds thinking about what should be tested. And your code reviewer may want to review your tests, too. It's easy to miss something by having an error in the test.

  8. Integrate and Putback Code and Tests
     Test your code before putting it back to the main workspace. After putting your code back, it is recommended that you run the NightlyBuild manually to make sure everything works in the master workspace. See Development Environment (Section 5) for details.

Creating system tests will generally be a separate task, since it can be quite substantial. Design review (and code review, where appropriate) is recommended for system tests since it can be hard to find all the right test cases.

4. Coding Guidelines

We are currently working on preparing a coding guidelines document. For now, you should follow the coding style of the existing code. Additionally, we use several common libpkix macros to provide centralized and consistent error-handling, null-pointer checking, and debugging. See the section on Common Macros below.

4.1 C Language Version

All of our code will be written in ISO C89 (ISO/IEC 9899:1990, almost exactly the same as "ANSI C"). This standard has been out for almost 15 years. All of the compilers now support it. And it's much better defined than Kernighan and Ritchie. So let's use it!

However, the header files that define our public APIs may be included by code that's compiled with a K&R compiler (or at least a compiler that has been configured for K&R compatibility mode). Therefore, these header files will include K&R style declarations as well as ANSI C style declarations.
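
One common way to carry both declaration styles in a single header is a macro that keeps or drops the parameter list. The following is a minimal sketch of that technique, not the actual libpkix headers; DEMO_PROTO and Demo_Add are hypothetical names introduced only for illustration.

```c
/* ---- demo_api.h (hypothetical public header) ---- */

#if defined(__STDC__) || defined(__cplusplus)
#define DEMO_PROTO(args) args   /* ANSI C: keep the parameter list */
#else
#define DEMO_PROTO(args) ()     /* K&R: drop the parameter list */
#endif

/* Expands to "long Demo_Add(long a, long b);" or to "long Demo_Add();" */
extern long Demo_Add DEMO_PROTO((long a, long b));

/* ---- demo_api.c (the library itself is compiled as ISO C89) ---- */

long Demo_Add(long a, long b)
{
    return a + b;
}
```

The double parentheses in the declaration let the whole parameter list travel through the macro as one argument. Parameters of type long are not changed by the default argument promotions, so a K&R caller that passes long values stays compatible with the ANSI definition in this sketch.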

4.2 Identifiers

All external identifiers (those declared in any header file) must start with a prefix of "PKIX_". This will help avoid namespace collisions. External identifiers that are defined by the Portability Layer must start with a prefix of "PKIX_PL_".

Our function and type identifiers are generally long, using underscores as logical separators and camel case (upper and lower) to delimit words within a logical unit. For instance, PKIX_ValidateChain. This requires a bit more typing, but it makes code more readable.

Because of these long identifiers, we require that the linker distinguish identifiers if they differ in any of the first 31 characters. This allows us to define distinct functions named PKIX_CB_PolicyQualifier_GetPolicyQualifierId and PKIX_CB_PolicyQualifier_GetQualifier. All modern linkers support this and C99 (ISO/IEC 9899:1999) requires it. But be careful! It's easy to exceed this 31 character limit.
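
To make the 31-character rule concrete, here is a small sketch of a check that could be run over a list of external identifiers. The function name and the constant are hypothetical and are not part of libpkix.

```c
#include <stddef.h>

#define DEMO_SIGNIFICANT_CHARS 31  /* C99 minimum for external identifiers */

/*
 * Returns 1 if the two identifiers differ somewhere within their first
 * DEMO_SIGNIFICANT_CHARS characters, and 0 if a linker that honors only
 * that many characters could treat them as the same name.
 */
int Demo_IdentifiersAreDistinct(const char *a, const char *b)
{
    size_t i;

    for (i = 0; i < DEMO_SIGNIFICANT_CHARS; i++) {
        if (a[i] != b[i]) {
            return 1;  /* difference found inside the significant prefix */
        }
        if (a[i] == '\0') {
            return 0;  /* identical names shorter than the limit */
        }
    }
    return 0;  /* the first 31 characters are identical */
}
```

The two names quoted above differ at their 28th character, so they pass this check. A pair such as PKIX_CB_PolicyQualifier_GetPolicyQualifierId and a hypothetical PKIX_CB_PolicyQualifier_GetPolicyQualifierSet would not, because they only differ after the 31st character.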

Names of types and functions should start with an upper case letter. Names of variables (unless they are external) should start with a lower case letter. Names for macros and constants defined with the preprocessor should be all caps, unless the same identifier is sometimes a function and sometimes a macro.

Variable names should be short. Avoid external variables (globals) if at all possible.

Pointer variables may be indicated by starting the variable name with a p. But that's often not necessary.

4.3 Common Macros

A number of common macros and a "cleanup" label are used to provide centralized and consistent error-handling, null-pointer checking, and debugging. By using these common macros for "infrastructural" stuff, the programmer (and code reviewer) can spend more time with the real guts of the function. Additionally, this decreases lines of code considerably.

PKIX_ENTER(type, funcName)
PKIX_NULLCHECK_*(...)
PKIX_CHECK(func, desc)
PKIX_ERROR(desc)
PKIX_DECREF(obj)
PKIX_FREE(obj)
"cleanup" label
PKIX_RETURN(type)
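
The general pattern behind these macros (check each call, jump to a single "cleanup" label on failure, release resources in one place) can be sketched as follows. The DEMO_ macros and helper functions are hypothetical stand-ins written for this example, and their behavior is an assumption; consult the libpkix sources for the real definitions of the PKIX_ macros.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: on failure, record the error and jump to cleanup. */
#define DEMO_CHECK(call, desc) \
    do { \
        if ((call) != 0) { \
            errorDesc = (desc); \
            goto cleanup; \
        } \
    } while (0)

/* Hypothetical stand-in: free a pointer and reset it to NULL. */
#define DEMO_FREE(ptr) \
    do { \
        free(ptr); \
        (ptr) = NULL; \
    } while (0)

/* Hypothetical helper: returns 0 on success, non-zero on failure. */
static int demo_alloc(char **pBuf, size_t size)
{
    *pBuf = (char *)malloc(size);
    return (*pBuf == NULL) ? 1 : 0;
}

/* Hypothetical helper: fails when the input is empty. */
static int demo_validate(const char *input)
{
    return (input[0] == '\0') ? 1 : 0;
}

/*
 * Copies input into a temporary buffer and stores its length in *pLen.
 * Returns NULL on success or a static error description on failure.
 */
const char *Demo_Process(const char *input, size_t *pLen)
{
    const char *errorDesc = NULL;
    char *scratch = NULL;

    if (input == NULL || pLen == NULL) {
        return "null argument";  /* nothing allocated yet, so no cleanup */
    }

    DEMO_CHECK(demo_validate(input), "input is empty");
    DEMO_CHECK(demo_alloc(&scratch, strlen(input) + 1), "out of memory");

    strcpy(scratch, input);
    *pLen = strlen(scratch);

cleanup:
    DEMO_FREE(scratch);  /* runs on both the success and the failure path */
    return errorDesc;
}
```

Because every failure path funnels through the same label, a reviewer only has to verify the release logic once per function.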
    

4.4 Public APIs

Be very careful about what's declared in pkix.h or any of the header files included by it. These are the public APIs of our library. Any changes to those files should be reviewed by our entire team and noted in a change log.

4.5 Comments

Carefully comment public APIs. For each external function, type, or other identifier, include a full description of the function or type right before its external declaration. In this comment, don't talk about implementation. Talk about the externally observable behavior of the function upon which people can depend. Consider this comment a contract. Be careful what you promise, because it will be painful to rescind this promise. If there are specific things that you *don't* want to promise, explain those in the comment. That way, callers will know what they can depend on.

Don't duplicate the description from the public API in the C file where the external identifier is defined. Instead, include a comment that points to the comment in the header file. You can also discuss implementation details in this comment.

Also provide comments for non-public functions and types (those that aren't defined in pkix.h or subsidiary header files), even if they are local to a file. You don't need to comment automatic variables and code, but you can if you have something useful to say.
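
As an illustration of the split between the contract comment in the header and the pointer comment in the C file, consider this sketch. Demo_Clamp and the comment layout are hypothetical; they show one possible shape, not the layout used in the libpkix headers.

```c
/* ---- demo.h (hypothetical public header) ---- */

/*
 * FUNCTION: Demo_Clamp
 * DESCRIPTION:
 *  Returns "value" limited to the inclusive range ["low", "high"].
 * PROMISED:
 *  The result is always within ["low", "high"] when "low" <= "high".
 * NOT PROMISED:
 *  The result when "low" > "high". Callers must not rely on it.
 * THREAD SAFETY:
 *  Thread-safe. The function uses no shared state.
 */
extern long Demo_Clamp(long value, long low, long high);

/* ---- demo.c ---- */

/*
 * Demo_Clamp: see demo.h for the contract.
 * Implementation note: the upper bound is checked first.
 */
long Demo_Clamp(long value, long low, long high)
{
    if (value > high) {
        return high;
    }
    if (value < low) {
        return low;
    }
    return value;
}
```

The header comment says nothing about the order of the comparisons, so that order can change later without breaking the contract.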

4.6 Work Left For Later

If you have to leave something unfinished (an unanswered question, a case that's not handled properly, etc.), leave a comment explaining the problem. Start the comment with XXX (that's right, three Xs). Then we can grep for this pattern to find any remaining problems before shipping.

4.7 Inside Objects

The libpkix library uses reference counted "objects" (actually just opaque data types) quite a bit. In fact, all data types are reference counted objects unless explicitly stated otherwise. This allows us to simplify memory management, maintain encapsulation, and simplify thread-safety. Section 5 in the Programmer's Guide goes into more detail about the benefits of using these "objects".

An object is associated with a specific type. System types are those provided by the system (such as String, OID, ByteArray, etc.). Users can also define their own types (such as Widget). Each type has five functions associated with it: Destructor, Equals, Hashcode, ToString, and Comparator. The first four have default implementations that are used if the type does not provide its own. All objects of the same type share the same functions.

Before an object of a given type can be allocated, the functions for that type must be "registered" with the system (using PKIX_PL_Object_RegisterType). To avoid unnecessary complexity, all system types are registered at initialization (in PKIX_PL_Initialize), but user types have to be explicitly registered by the users.
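
The idea of a per-type function table with default fallbacks can be sketched like this. Everything here (the Demo_ names, the fixed-size table, the choice of defaults, and the restriction to two of the five functions) is an assumption made for illustration and does not reflect the signature of PKIX_PL_Object_RegisterType.

```c
#include <stddef.h>

#define DEMO_MAX_TYPES 8

typedef int (*Demo_EqualsFn)(const void *a, const void *b);
typedef unsigned long (*Demo_HashFn)(const void *obj);

typedef struct {
    int registered;
    Demo_EqualsFn equals;
    Demo_HashFn hash;
} Demo_TypeEntry;

/* Zero-initialized, so no type is registered at program start. */
static Demo_TypeEntry demoTypes[DEMO_MAX_TYPES];

/* Default Equals: two objects are equal only if they are the same object. */
static int demo_default_equals(const void *a, const void *b)
{
    return a == b;
}

/* Default Hashcode: a constant placeholder, consistent with the default Equals. */
static unsigned long demo_default_hash(const void *obj)
{
    (void)obj;
    return 0UL;
}

/* Registers a type. NULL function pointers select the defaults. */
int Demo_RegisterType(unsigned int typeId, Demo_EqualsFn equals, Demo_HashFn hash)
{
    if (typeId >= DEMO_MAX_TYPES || demoTypes[typeId].registered) {
        return -1;  /* out of range or already registered */
    }
    demoTypes[typeId].registered = 1;
    demoTypes[typeId].equals = (equals != NULL) ? equals : demo_default_equals;
    demoTypes[typeId].hash = (hash != NULL) ? hash : demo_default_hash;
    return 0;
}

/* Returns the entry for a registered type, or NULL if it was never registered. */
const Demo_TypeEntry *Demo_LookupType(unsigned int typeId)
{
    if (typeId >= DEMO_MAX_TYPES || !demoTypes[typeId].registered) {
        return NULL;
    }
    return &demoTypes[typeId];
}
```

An allocator built on such a table would refuse to create an object when the lookup returns NULL, which mirrors the rule that a type must be registered first. The sketch does no locking, so it is not thread-safe as written.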

Once a type has been registered, an object of that type can be allocated. Allocating a new object creates an object header and a block of uninitialized user data. A pointer to this uninitialized data is returned to the user. The structure looks as follows:

   +--------------------+
   | MAGIC HEADER       |
   | (object header)    |
   +--------------------+
   | user data          | -- pointer returned from PKIX_PL_Object_Alloc
   +--------------------+
    
Any functions whose arguments include a (PKIX_PL_Object *) expect to receive a pointer to the user data, not the object header. In fact, the caller never need be aware of the object header, as it is only supposed to be used by the system.

The object header includes a magic number (see below), the object's type, references, and a mutex. The type is used to determine which functions should be associated with this object, the references are used for reference counting, and the mutex is used to allow object synchronization.

The magic number (which we refer to as a MAGIC HEADER) serves several functions. The magic number is verified every time the header is accessed, allowing memory corruption to be detected. It also allows us to detect too many DecRefs on an object. When the object is created, the MAGIC HEADER is set. When it is destroyed (by an eventual DecRef), we free the memory (both the object header and the user data) and clear the MAGIC HEADER. If that memory has not been reused, a subsequent DecRef on that (already-freed) object will notice the MAGIC HEADER has been cleared, and will report an error.
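
The layout and the magic-number check described above can be sketched as follows. This is a simplified illustration under stated assumptions: the Demo_ names, the magic value, and the header fields are hypothetical, the mutex is omitted, and the code is not the libpkix implementation.

```c
#include <stdlib.h>

#define DEMO_MAGIC 0xFEEDC0DEUL  /* arbitrary value chosen for this sketch */

/* The union pads the header so the user data after it stays suitably aligned. */
typedef union {
    struct {
        unsigned long magic;
        unsigned long type;
        unsigned long refs;
    } h;
    double alignDouble;
    void *alignPtr;
    long alignLong;
} Demo_Header;

/* Steps back from the user-data pointer to the header in front of it. */
static Demo_Header *demo_header_of(void *userData)
{
    return ((Demo_Header *)userData) - 1;
}

/* Allocates header plus user data and returns a pointer to the user data. */
void *Demo_Alloc(unsigned long type, size_t size)
{
    Demo_Header *hdr = (Demo_Header *)malloc(sizeof(Demo_Header) + size);

    if (hdr == NULL) {
        return NULL;
    }
    hdr->h.magic = DEMO_MAGIC;
    hdr->h.type = type;
    hdr->h.refs = 1;
    return (void *)(hdr + 1);
}

/* Returns 0 on success, -1 if the header does not carry the magic number. */
int Demo_IncRef(void *userData)
{
    Demo_Header *hdr = demo_header_of(userData);

    if (hdr->h.magic != DEMO_MAGIC) {
        return -1;
    }
    hdr->h.refs++;
    return 0;
}

/* Drops one reference. Clears the magic number and frees on the last one. */
int Demo_DecRef(void *userData)
{
    Demo_Header *hdr = demo_header_of(userData);

    if (hdr->h.magic != DEMO_MAGIC) {
        return -1;
    }
    if (--hdr->h.refs == 0) {
        hdr->h.magic = 0;  /* mark the header as dead before freeing */
        free(hdr);
    }
    return 0;
}

/* Exposed for the sketch only: reference count of a live object. */
unsigned long Demo_RefCount(void *userData)
{
    return demo_header_of(userData)->h.refs;
}
```

Note that inspecting the header of an already-freed object reads freed memory, which ISO C leaves undefined. The check is therefore a best-effort debugging aid, which matches the "if that memory has not been reused" caveat above.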

4.8 Threading

All functions must specify whether they are thread-safe and if so under what circumstances. They should also specify in their source file what assumptions this is based on.

Think carefully about how your function will be used. Thread safety may not be needed. Or it may be. Think. If in doubt, ask someone else.

4.9 Security

For good info, see this web site.

Also, know and understand the limitations of your libraries. And document the limitations of your own code.

But, most important, code defensively. Be paranoid. What could go wrong? How could someone violate your assumptions?

See the Security section in the Programmer's Guide. In fact, read all the sections of the Programmer's Guide very carefully. Together with the Architecture and the comments in our public header files, they constitute a contract with our developers. If you see a problem, don't break the contract. Change the contract now, while nobody's depending on it yet.

4.10 Portable Code

The libpkix Portable Code should only use types and functions defined by the Portable Code or public APIs declared and defined by the Portability Layer. It DEFINITELY should not use native C types (except for void and void *), stdio functions, or other functions. Otherwise, we will lose compatibility with platforms that don't provide those functions or where those types are not commonly used.

An exception to this rule can be made for the following cases. The int type can be used for local variables, as long as you don't need it to support values outside of the range -32,767 .. 32,767. And you can use a char * and/or a constant string (with only printable ASCII characters) only for passing to PKIX_PL_OID_Create or PKIX_PL_String_Create.
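
The range quoted above is the minimum that ISO C guarantees for int on every conforming platform. The sketch below shows a range check built on that guarantee. The Demo_ names are hypothetical, and long is used only to keep the example self-contained; Portable Code would use the library's own types instead.

```c
#include <limits.h>

/* ISO C guarantees that int covers at least this range on every platform. */
#define DEMO_PORTABLE_INT_MIN (-32767L)
#define DEMO_PORTABLE_INT_MAX 32767L

/* Sanity check: a conforming compiler can never trip this. */
#if INT_MAX < 32767
#error "int is narrower than ISO C allows"
#endif

/* Returns 1 if value fits in an int on every conforming implementation. */
int Demo_FitsPortableInt(long value)
{
    return value >= DEMO_PORTABLE_INT_MIN && value <= DEMO_PORTABLE_INT_MAX;
}
```

A loop counter that iterates over a short list satisfies this check. A byte count for a large buffer may not, and should not be stored in a plain int.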

4.11 Portability Layer

Do everything possible to confine platform-specific code to the Portability Layer. Otherwise, we'll have a maintenance nightmare. If you run into a place where this is not possible, consult with all other members of the libpkix team.

Code in the Portability Layer may be written to conform to the coding standards of the platform you're trying to interface to, since this may be necessary in order to interface properly.

Understand thoroughly any functions that you call. Read all available documentation on platform-specific APIs. Examine these public APIs with a careful and suspicious eye. If you have any doubts or there's anything left undefined (like information about the thread-safety of functions), ask someone who's an expert with the platform-specific APIs. Find out whether the answer you get is a contract you can count on indefinitely or only an answer that's good for now. If you have to proceed based on an assumption, mark it with an XXX comment so we can come back and check your assumption later.

XXX What if the underlying platform doesn't provide enough features for us to make our functions thread-safe? NSS seems to provide what we need. But not all platforms will, necessarily.

We should probably have a flag in PKIX_InitializeResult that indicates we can't handle multi-threading because the PL libraries can't handle it. If we set this flag, we can probably assume that the caller will respect it (not trying to call our functions from multiple threads, etc.). But we should probably do some lightweight checks for reentrance then so we can error out most of the time if they don't check that flag. We also should have a PKIX_PL_IsThreadSafe function so we can find out whether the PL libraries are thread-safe and pass that info on through PKIX_InitializeResult. Actually, maybe we should have a PKIX_PL_Initialize function so we can configure the PL libraries (like with the loggers we got through PKIX_Initialize) and the PL libraries can initialize things and return to us info about whether they're thread-safe. And a PKIX_PL_Shutdown function to do the opposite.

4.12 Reliability

There are many ways to improve reliability. Some of the primary ones are: careful design and coding, documenting your assumptions, writing and running lots of tests, and doing code reviews. Of course, we use these.

We also use assertions whenever there is a condition which should always hold. The PKIX_ASSERT macro takes one argument, which is an expression (such as (n > 1)). If the expression evaluates to zero (false), the program is terminated with an error message indicating where the assertion failed. Typically, the PKIX_ASSERT macro is disabled in non-debug builds for efficiency. The expression passed to it will not even be evaluated then, so don't count on side effects from it.

If you want to write other code that will only execute in debug builds, you can use the PKIX_DEBUG macro, which has a non-zero value in debug builds and a zero value in non-debug builds.
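
The warning about side effects is easy to overlook, so here is a sketch of an assertion macro that compiles away in non-debug builds. DEMO_ASSERT and DEMO_DEBUG are hypothetical stand-ins that follow the behavior described above; they are not the definitions of PKIX_ASSERT and PKIX_DEBUG.

```c
#include <stdio.h>
#include <stdlib.h>

#ifndef DEMO_DEBUG
#define DEMO_DEBUG 0  /* non-debug build unless the build system says otherwise */
#endif

#if DEMO_DEBUG
#define DEMO_ASSERT(expr) \
    do { \
        if (!(expr)) { \
            fprintf(stderr, "Assertion failed: %s (%s:%d)\n", \
                    #expr, __FILE__, __LINE__); \
            abort(); \
        } \
    } while (0)
#else
#define DEMO_ASSERT(expr) ((void)0)  /* expr is not evaluated at all */
#endif

static int demoCallCount = 0;

/* Hypothetical function with a side effect, used to show the pitfall. */
int Demo_Next(void)
{
    return ++demoCallCount;
}

/* Returns how many times Demo_Next ran: 1 in debug builds, 0 otherwise. */
int Demo_SideEffectPitfall(void)
{
    demoCallCount = 0;
    DEMO_ASSERT(Demo_Next() > 0);  /* wrong: the call vanishes in non-debug builds */
    return demoCallCount;
}

/* Returns 1 if the debug-only branch ran, 0 otherwise. */
int Demo_DebugOnlyWork(void)
{
    int ran = 0;

    if (DEMO_DEBUG) {
        ran = 1;  /* extra consistency checks would go here */
    }
    return ran;
}
```

The safe form is to call the function first, store its result in a variable, and assert on the variable.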

5. Development Environment

We use standard C programming tools (cc, lint, dbx, etc). All of our code must pass lint with no warnings.

To help detect memory management errors, we use dbx in RTC (Runtime checking) mode, which detects memory leaks, as well as several classes of memory access errors. In order to do this, we use the "check -all" command in dbx. Use the "help check" command in dbx for all the details. With regard to buffer overflows, RTC does support overflow checking when memory is dynamically allocated (the common case), but not when it is statically allocated. Therefore, these (few) static cases must be analyzed very carefully.

Everyone MUST read the README file in the root directory. This README file describes several pieces of information, including integration with NSS/NSPR, the directory layout, building the source, and testing the build.

6. Performance Analysis and Optimization

As C. A. R. Hoare famously said (and Donald Knuth reiterated), "premature optimization is the root of all evil in programming". This is true, but it's good to design things to allow for later optimizations. That's why we carefully design the libpkix APIs, consider threading from the start, avoid exposing our data structures, and allow objects to be shared.

After our first release, we will profile our code to look for optimization opportunities.