Quick review of how to write stable code in C++

(Just some good-practice tips to save you from wasting time wrestling with the language.)

Suppose you have an important test that you must run 100 million times. You plan to let it run all night, and in the morning you plan to collect the results. Obviously you should be really certain that it works before you leave it running, or else you'll have to do it again. Here are some things you should do to make sure it works well:

1- Use the "-g" flag to tell your compiler to include debug symbols in your binaries. If you don't do this, your debugger is nearly worthless.
2- Use the "-Wall" flag to tell g++ to warn you about things that look like bugs (and fix all the warnings it finds). You should ALWAYS have this flag on.
3- Add the "-D_DEBUG" flag so that you can write debug-only code by putting it inside a block like this:

#ifdef _DEBUG
// do some debug checking here
#endif

4- Don't use return codes. Studies show that the vast majority of bugs are found in control logic ("if" statements). Using exceptions (properly) will significantly reduce the amount of necessary control logic in your code.
5- Use asserts in your code. Lots of asserts can save you lots of debugging. You don't need to feel bad about checking for every possible suspicious condition with asserts because the asserts will disappear when you build in optimized mode. Use asserts to check that your indexes are within range. Use them to do sanity checks (check things that you already know are true, just to be certain that the code is in the state you think it is). Use them to make sure your algorithm did what you think it should have done, and so on.
6- It's a good idea to run your app inside Valgrind for a while. It will find memory leaks, buffer overruns, and other bugs automatically. It's easy to use. Just run "valgrind yourapp". (Sorry, I don't think there's a Windows version of Valgrind yet.)
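To make tip 4 a little more concrete, here's a minimal sketch of a two-step job written with exceptions instead of return codes. All of the names in it (loadValue, scale, runStep, tryStep) are hypothetical and exist only for illustration. The point is that the middle layer needs no error-checking "if" statements at all, and the decision about what to do with a failure lives in one place:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical step 1: rejects negative input by throwing
int loadValue(int raw)
{
    if(raw < 0)
        throw std::invalid_argument("raw value must not be negative");
    return raw;
}

// Hypothetical step 2: rejects values that are too large by throwing
int scale(int value)
{
    if(value > 1000)
        throw std::out_of_range("value is too large to scale");
    return value * 2;
}

// No error-handling "if" statements here. A failure in either step
// simply propagates up to whoever is prepared to handle it.
int runStep(int raw)
{
    return scale(loadValue(raw));
}

// The one place that decides what to do about failures.
// Returns "ok" on success, or the failure message otherwise.
std::string tryStep(int raw)
{
    try
    {
        runStep(raw);
        return "ok";
    }
    catch(const std::exception& e)
    {
        return e.what();
    }
}
```

With return codes, runStep would have to check the result of loadValue before calling scale, and every caller of runStep would have to check again.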

If your favorite platform doesn't already have its own assert capabilities, here's some code for doing asserts that you can put in your toolbox. This code is known to work on Linux, Mac, and Windows. It will stop the debugger when something bad happens so you can look at the stack and the variables, but you can continue running if you wish to ignore the problem:

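#ifndef GAssert
#ifdef _DEBUG
#include <stdio.h>
#ifdef WIN32

// "int 3" is the x86 breakpoint interrupt. This inline-assembly syntax is
// specific to Microsoft's compiler in 32-bit builds. The do/while(0) wrapper
// makes the macro behave like a single statement, even inside if/else.
#define GAssert(x,y)                                            \
    do                                                          \
    {                                                           \
        if(!(x))                                                \
        {                                                       \
            fprintf(stderr, "Debug Assert Failed: %s\n", y);    \
            __asm { int 3 }                                     \
        }                                                       \
    } while(0)

#else // WIN32
#include <signal.h>
#include <unistd.h>

// Sending SIGINT to ourselves makes an attached debugger stop here
#define GAssert(x,y)                                            \
    do                                                          \
    {                                                           \
        if(!(x))                                                \
        {                                                       \
            fprintf(stderr, "Debug Assert Failed: %s\n", y);    \
            kill(getpid(), SIGINT);                             \
        }                                                       \
    } while(0)

#endif // !WIN32
#else // _DEBUG

#define GAssert(x,y)    ((void)0)

#endif // else _DEBUG
#endif // GAssert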

Here's an example of how to use GAssert:

double divide(double nom, double denom)
{
    GAssert(denom != 0, "denom is expected to be non-zero");
    return nom / denom;
}
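If you'd rather lean on the standard library, <cassert> provides an assert macro that does a similar job. One difference worth knowing: the standard assert is compiled out when NDEBUG is defined, not when _DEBUG is missing, so you would add "-DNDEBUG" to your optimized build to make it disappear. Here's a small sketch of the kinds of checks described in tip 5 (an index range check, and a debug-only sanity check after an algorithm runs). The function names elementAt and sortAscending are made up for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element access with an index range check
double elementAt(const std::vector<double>& v, std::size_t i)
{
    assert(i < v.size()); // the index must be within range
    return v[i];
}

// Hypothetical algorithm: an insertion sort followed by a sanity check
void sortAscending(std::vector<double>& v)
{
    for(std::size_t i = 1; i < v.size(); i++)
    {
        double key = v[i];
        std::size_t j = i;
        while(j > 0 && v[j - 1] > key)
        {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }

#ifdef _DEBUG
    // Debug-only check that the algorithm did what we think it did
    for(std::size_t i = 1; i < v.size(); i++)
        assert(v[i - 1] <= v[i]);
#endif
}
```

The sanity check costs a full extra pass over the data, which is exactly the sort of thing you want in debug mode and not in the overnight run.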

Of course when you actually run your test, you don't want debug symbols or debug code slowing down your experiments. In optimized mode:

1- Don't use the "-g" flag.
2- Do still use the "-Wall" flag. Always use this flag.
3- Don't use the "-D_DEBUG" flag. This way, your asserts and debug code won't slow you down.
4- Use the "-O3" flag. This tells g++ to optimize for speed. (Note: this is a "capital-oh three", not a "zero three". The "O" stands for "optimize".)

So make sure you have an easy way to specify whether you are building in debug mode or optimized mode. If you use "make", have a rule for making debug mode and a rule for making optimized mode. If you don't have a decent makefile already, here's one you can use. To use this makefile, just list your .cpp and .h files in the "Source" section, and it will do the rest:

################
# Paths and Flags
################
TARGET_PATH = ../../bin
SOURCE_PATH = .
TARGET_NAME_OPT = hello
TARGET_NAME_DBG = hellodbg
OBJ_PATH = ../../obj
CFLAGS = -Wall
DBG_CFLAGS = $(CFLAGS) -g -D_DEBUG
OPT_CFLAGS = $(CFLAGS) -O3
DBG_LFLAGS =
OPT_LFLAGS =

################
# Source
################
CPP_FILES =    $(SOURCE_PATH)/main.cpp \
        $(SOURCE_PATH)/somefile.cpp \
        $(SOURCE_PATH)/yetanotherfile.cpp

HEADER_FILES =    $(SOURCE_PATH)/somefile.h \
        $(SOURCE_PATH)/yetanotherfile.h

################
# Lists
################
TEMP_LIST_1 = $(CPP_FILES:$(SOURCE_PATH)/%=$(OBJ_PATH)/%)
OBJECTS_OPT = $(TEMP_LIST_1:%.cpp=%.o)
OBJECTS_DBG = $(TEMP_LIST_1:%.cpp=%.dbg.o)

################
# Rules
# (every command line under a rule must begin with a tab, not spaces)
################
dbg : skipsomelines $(TARGET_PATH)/$(TARGET_NAME_DBG)

opt : skipsomelines $(TARGET_PATH)/$(TARGET_NAME_OPT)

skipsomelines :
	echo
	echo
	echo
	echo
	echo
	echo

usage:
	#
	# Usage:
	#  make usage   (to see this info)
	#  make clean   (to delete all the .o files)
	#  make dbg     (to build a debug version)
	#  make opt     (to build an optimized version)
	#

$(TARGET_PATH)/$(TARGET_NAME_OPT) : partialcleanopt $(OBJECTS_OPT)
	g++ -lpthread -O3 -o $(TARGET_PATH)/$(TARGET_NAME_OPT) $(OBJECTS_OPT) $(OPT_LFLAGS)

$(TARGET_PATH)/$(TARGET_NAME_DBG) : partialcleandbg $(OBJECTS_DBG)
	g++ -g -D_DEBUG -o $(TARGET_PATH)/$(TARGET_NAME_DBG) $(OBJECTS_DBG) $(DBG_LFLAGS)

$(OBJECTS_OPT) : $(OBJ_PATH)/%.o : $(SOURCE_PATH)/%.cpp $(HEADER_FILES)
	g++ $(OPT_CFLAGS) -c $< -o $@

$(OBJECTS_DBG) : $(OBJ_PATH)/%.dbg.o : $(SOURCE_PATH)/%.cpp $(HEADER_FILES)
	g++ $(DBG_CFLAGS) -c $< -o $@

partialcleandbg :
	rm -f $(TARGET_PATH)/$(TARGET_NAME_DBG)

partialcleanopt :
	rm -f $(TARGET_PATH)/$(TARGET_NAME_OPT)

clean : partialcleandbg partialcleanopt
	rm -f $(OBJECTS_OPT)
	rm -f $(OBJECTS_DBG)

Now, let's suppose that you want to run a test that requires you to allocate some memory that's too big to fit on the stack (so it has to go on the heap), and your test also relies on a function or API that almost always works, but on rare occasions might throw an exception. Your first (poor) attempt might look something like this:

void dotest(int i);
void unstableFunc(double* vec, int i);

int main()
{
    for(int i = 0; i < 100000000; i++)
        dotest(i);
}

void dotest(int i)
{
    double* vec = new double[100000];
    unstableFunc(vec, i);
    delete[] vec;
}

void unstableFunc(double* vec, int i)
{
    // this function does something useful,
    // but on certain (very rare) occasions,
    // it will throw an exception
}

In the morning when you return, you will discover that it failed after only a few thousand iterations. So you decide to catch exceptions like this:

#include <stdio.h>

void dotest(int i);
void unstableFunc(double* vec, int i);

int main()
{
    int i;
    for(i = 0; i < 100000000; i++)
    {
        try
        {
            dotest(i);
        }
        catch(...)
        {
            printf("iteration %d failed\n", i);
        }
    }
}

void dotest(int i)
{
    double* vec = new double[100000];
    unstableFunc(vec, i);
    delete[] vec; // never reached if unstableFunc throws
}

void unstableFunc(double* vec, int i)
{
    // this function does something useful,
    // but on certain (very rare) occasions,
    // it will throw an exception
}

The next day, you discover that it's still running. Your machine is unresponsive and it's only on iteration 200 thousand! What went wrong?

The problem is that every time an exception was thrown, you leaked memory. When your memory ran out, it started swapping to your hard drive. It thrashed your hard drive all night and since disk IO is much slower than memory access, it hardly made any progress. So how can you make sure that you don't leak memory while recovering from an exception? Well, you could avoid allocating memory, but that's often not an option. So if you do allocate memory, use a holder to make sure it gets cleaned up properly:

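#include <stdio.h>

void dotest(int i);
void unstableFunc(double* vec, int i);

int main()
{
    int i;
    for(i = 0; i < 100000000; i++)
    {
       try
       {
            dotest(i);
       }
       catch(...)
       {
            printf("got a failure\n");
       }
    }
}

void dotest(int i)
{
    double * vec = new double[100000];
    // The holder frees vec when it goes out of scope. (Strictly speaking,
    // memory from new[] needs a holder that calls delete[]. See the note
    // about arrays in the tips further down.)
    Holder<double*> h(vec);
    unstableFunc(vec, i);
}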

void unstableFunc(double* vec, int i)
{
    // this function does something useful,
    // but on certain (very rare) occasions,
    // it will throw an exception
}

Now, the variable "h" will clean up your memory when it goes out of scope, even if it goes out of scope due to an exception. So, what is a holder? It's just a class that uses its destructor to delete your pointer. Here's a Holder class that you can put in your toolbox if your favorite platform doesn't already have a built-in holder class:

template <class T>
class Holder
{
private:
    T m_p;

public:
    Holder(T p)
    {
        m_p = p;
    }

    ~Holder()
    {
        delete(m_p);
    }
};

Don't be scared because this class uses a template. Just imagine that "T" is automatically replaced with "double*" (because that's what was in the angle brackets), and it should all make sense. So now you can write robust code that can recover from exceptions and continue to operate efficiently. Here are some other tips to make sure you don't have memory leaks:

1- If you use inheritance, make sure your base-class destructors are virtual. Otherwise, deleting an object through a base-class pointer is undefined behavior, and in practice the derived-class destructor usually won't be called!
2- If you do something like this:

       SomeClass* array = new SomeClass[100];
be sure to clean up like this:
       delete[] array;
not like this:
       delete(array);
Note that the Holder class above uses the wrong delete for arrays. You might want to make another class called ArrayHolder that uses the other delete.

3- If you're not afraid of templates, just don't ever use arrays. Use vectors instead.
4- If possible, don't use new very often. Just keep your objects on the stack and let them allocate and free memory in their constructors/destructors.
5- Never use alloca in a loop. That's a quick way to overflow your stack.
6- Don't use multiple inheritance unless you really know what you're doing.
7- Run your code through Valgrind.
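Here's one way the ArrayHolder mentioned in tip 2 might look. This is only a sketch, with a few liberties taken: unlike Holder above, the template parameter here is the element type (so you'd write ArrayHolder<double>, not ArrayHolder<double*>), and copying is disabled so two holders can't delete the same array. The Tracked type and the two functions after it are hypothetical, added just to show that every element really is cleaned up when an exception passes through:

```cpp
#include <cassert>
#include <stdexcept>

// Like Holder, but for memory that came from new[], so it is
// released with delete[].
template <class T>
class ArrayHolder
{
private:
    T* m_p;

    // Copying is declared but never defined, so two holders can
    // never end up owning (and deleting) the same array.
    ArrayHolder(const ArrayHolder&);
    ArrayHolder& operator=(const ArrayHolder&);

public:
    explicit ArrayHolder(T* p) : m_p(p) {}
    ~ArrayHolder() { delete[] m_p; }
    T* get() { return m_p; }
};

// Hypothetical element type that counts how many instances are alive
struct Tracked
{
    static int s_live;
    Tracked() { s_live++; }
    ~Tracked() { s_live--; }
};
int Tracked::s_live = 0;

// Allocates an array, then fails before it can clean up by hand
void failingTest()
{
    ArrayHolder<Tracked> h(new Tracked[100]);
    throw std::runtime_error("simulated rare failure");
}

// Returns the number of Tracked objects still alive after the
// exception has been caught (0 means nothing leaked).
int liveAfterFailure()
{
    try
    {
        failingTest();
    }
    catch(const std::exception&)
    {
    }
    return Tracked::s_live;
}
```

If your compiler supports C++11, the standard library already covers this ground: std::unique_ptr<double[]> calls delete[] for you, and std::vector<double> (tip 3) avoids the raw array altogether.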

So, the next day you return and discover that the whole test completed after only a few hours. In great excitement, you rush to pass off your code with the T.A. But to your great dismay you find that your code behaves differently on the T.A.'s machine. What's going on?

Did you use "rand()" anywhere in your code? Don't use rand(). It's not a very good pseudo-random number generator. It's not guaranteed to be thread-safe or re-entrant. Its algorithm is left up to each implementation, so you may not get consistent results on different machines, even if you seed it with a constant value. And it's not as efficient as many better pseudo-random number generators. Just don't use it. Your favorite development platform probably provides a nicer pseudo-random number generator that has a known huge cycle and meets lots of rigorous testing standards. If you can't find a better one, here's a simple one that you can use:

class GRand
{
protected:
    unsigned int m_state[256];
    unsigned int m_x;
    unsigned int m_n;

public:
    GRand(unsigned int nSeed);
    ~GRand();

    // Sets the seed
    void SetSeed(unsigned int nSeed);

    // Returns a pseudo-random uint from a discrete uniform
    // distribution in the range 0 to 0xffffffff (inclusive).
    unsigned int GetUint();

    // Returns a pseudo-random uint from a discrete uniform
    // distribution in the range 0 to range-1 (inclusive).
    // (It's better to use this method than to just do
    // "GetUint() % range" because if you do that, and
    // "0x100000000 % range" is not equal to 0, then you
    // will not get a completely uniform distribution.)
    unsigned int GetUint(unsigned int range);

    // Returns a pseudo-random double from 0 (inclusive)
    // to 1 (exclusive). This draws two random uints,
    // and discards half of one of them to obtain 48
    // random bits, which are used for the mantissa of
    // the double.
    double GetUniform();

    // Returns a random value from a standard normal distribution. (To
    // convert it to a random value from an arbitrary normal distribution,
    // just multiply the value this returns by the standard deviation,
    // then add the mean.)
    double GetNormal();
};

GRand::GRand(unsigned int nSeed)
{
    SetSeed(nSeed);
}

GRand::~GRand()
{
}

void GRand::SetSeed(unsigned int nSeed)
{
    m_n = nSeed ^ 0xcfd41b91; // 3 to the 20th power
    unsigned int i;
    for(i = 0; i < 256; i++)
    {
        m_state[i] = nSeed ^ m_n;
        m_n *= 3;
    }
    m_x = 0;
}

unsigned int GRand::GetUint()
{
    unsigned int d, a, b, c;
    d = m_n;
    a = d & 255;
    d = d >> 8;
    b = d & 255;
    d = d >> 8;
    c = d & 255;
    d = d >> 8;
    m_n = (m_n + m_state[a]) ^ m_state[m_x];
    m_state[a] += m_state[b];
    m_state[a] ^= m_state[c];
    m_state[m_x] += m_state[c];
    m_state[m_x] ^= m_state[d];
    m_x++;
    m_x &= 255;
    return m_n;
}

unsigned int GRand::GetUint(unsigned int range)
{
    // Use rejection to find a random value in a range that is a multiple of "range"
    unsigned int n = (0xffffffff % range) + 1;
    unsigned int x;
    do
    {
        x = GetUint();
    } while((x + n) < n);

    // Use modulus to return the final value
    return x % range;
}

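double GRand::GetUniform()
{
    // Use 48 random bits for the mantissa. The two draws are made in
    // separate statements so that they happen in the same order on
    // every compiler (within a single expression the order is unspecified).
    unsigned int hi = GetUint() & 0xffff;
    unsigned int lo = GetUint();
    return (4294967296.0 * hi + lo) / 281474976710656.0;
}

double GRand::GetNormal()
{
    // sqrt and log come from <math.h>
    double x, y, mag;
    do
    {
        x = GetUniform() * 2 - 1;
        y = GetUniform() * 2 - 1;
        mag = x * x + y * y;
    } while(mag >= 1.0 || mag == 0);
    return y * sqrt(-2.0 * log(mag) / mag); // the polar form of the Box-Muller transform
}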
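If your compiler supports C++11, the "nicer generator your platform probably provides" may well be std::mt19937 from <random>. The standard pins down its output exactly (the 10000th value from a default-constructed std::mt19937 must be 4123659995), so the raw numbers match from machine to machine. The distribution classes such as std::uniform_int_distribution carry no such guarantee, so for reproducible results one option is to do the range reduction yourself. Here's a sketch that uses the same rejection idea as GetUint(range) above. The function names uniformBelow and nthDefaultOutput are made up for this example:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Returns a value from a uniform distribution over 0 to range-1
// (inclusive). "range" must not be zero.
std::uint32_t uniformBelow(std::mt19937& gen, std::uint32_t range)
{
    assert(range != 0);

    // r is 0x100000000 % range, computed without 64-bit arithmetic
    std::uint32_t r = (0xffffffffu % range + 1u) % range;

    // Values above "limit" belong to the incomplete final block of
    // size r, so they are rejected and redrawn.
    std::uint32_t limit = 0xffffffffu - r;
    std::uint32_t x;
    do
    {
        x = static_cast<std::uint32_t>(gen());
    } while(x > limit);

    // Use modulus to return the final value
    return x % range;
}

// Returns the n-th output (counting from 1) of a default-seeded generator
std::uint32_t nthDefaultOutput(unsigned int n)
{
    std::mt19937 gen;
    gen.discard(n - 1);
    return static_cast<std::uint32_t>(gen());
}
```

Seeding two generators with the same constant, for example std::mt19937 a(1234) and std::mt19937 b(1234), then gives the same sequence from uniformBelow on both, which is what you want when your results have to match on someone else's machine.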

Now that your code is rock-solid, wouldn't it be nice if you could launch it as a daemon? This way, if your roommate closes your console window and logs you out, your program will keep running until it's done. And if you launch it on a remote machine over an SSH connection, you don't have to keep your connection open until it's done--you can just connect later to get the results. So here's some code that will fork off a daemon. (Unfortunately this code doesn't work on Windows because Windows isn't fully POSIX compliant.)

// Requires <unistd.h>, <sys/stat.h>, and <stdlib.h>
typedef void (*DaemonMainFunc)(void* pArg);

int LaunchDaemon(DaemonMainFunc pDaemonMain, void* pArg)
{
    // Fork the first time
    int firstPid = fork();
    if(firstPid < 0)
        throw "Error forking (the first time) in LaunchDaemon";
    if(firstPid)
        return firstPid;

    // Fork the second time
    int secondPid = fork();
    if(secondPid < 0)
        throw "Error forking (the second time) in LaunchDaemon";
    if(secondPid)
        _exit(0); // The intermediate process is finished. If it returned
                  // here instead, it would run the caller's code a second time.

    // Start a new session and become my own process group leader
    // (so the process isn't terminated when the group leader is killed)
    setsid();

    // Clear the file creation mask, so that files the daemon creates get
    // the permissions it asks for instead of inheriting the launcher's mask
    umask(0);

    // Get off any mounted drives so that they can be unmounted without
    // killing the daemon
    chdir("/");

    // Launch the daemon
    pDaemonMain(pArg);

    exit(0);
}

And here's some code to show you how to use it:

void daemonMain(void* pArg)
{
    runTests();
    saveResultsToFile();
}

int main()
{
#ifdef _DEBUG
    // don't fork off a daemon in debug mode. That would just complicate debugging
    daemonMain(NULL);
#else
    LaunchDaemon(daemonMain, NULL);
#endif
}

The End