Quick review of how to write stable code in C++
(Just some good-practice tips to save you from wasting time wrestling
with the language.)
Suppose you have an important test that you must run 100 million times.
You plan to let it run all night, and in the morning you plan to
collect the results. Obviously you should be really certain that it
works before you leave it running, or else you'll have to do it again.
Here are some things you should do to make sure it works well:
1- Use the "-g" flag to tell your compiler to include debug symbols in
your binaries. If you don't do this, your debugger is nearly worthless.
2- Use the "-Wall" flag to tell g++ to warn you about things that look
like bugs (and fix all the warnings it finds). You should ALWAYS have
this flag on.
3- Add the "-D_DEBUG" flag so that you can write debug-only code by
putting it inside guards like these:
#ifdef _DEBUG
// do some debug checking here
#endif
4- Don't use return codes. Studies show that the vast majority of bugs
are found in control logic ("if" statements). Using exceptions
(properly) will significantly reduce the amount of necessary control
logic in your code.
5- Use asserts in your code. Lots of asserts can save you lots of
debugging. You don't need to feel bad about checking for every possible
suspicious condition with asserts because the asserts will disappear
when you build in optimized mode. Use asserts to check that your
indexes are within range. Use them to do sanity checks (check things
that you already know are true, just to be certain that the code is in
the state you think it is). Use them to make sure your algorithm did
what you think it should have done, and so on.
6- It's a good idea to run your app inside Valgrind for a while. It
will find memory leaks, buffer overruns, and other bugs automatically.
It's easy to use. Just run "valgrind yourapp". (Sorry, I don't think
there's a Windows version of Valgrind yet.)
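Here's a minimal sketch of tips 4 and 5 working together. The names
makeSamples, meanOfRange, and meanOfFirstN are made up for this
illustration. It uses the standard assert from <cassert>, which is
switched off by defining NDEBUG (for example with "-DNDEBUG"), so treat
it as one possible arrangement rather than the only way to do it:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports bad input with an exception, not a return code
std::vector<double> makeSamples(int count)
{
    if(count <= 0)
        throw std::invalid_argument("count must be positive");
    std::vector<double> samples(count);
    for(int i = 0; i < count; i++)
        samples[i] = i;
    return samples;
}

// Hypothetical helper: mean of values[first] through values[last - 1]
double meanOfRange(const std::vector<double>& values, std::size_t first, std::size_t last)
{
    assert(first < last);          // sanity check: the range is not empty
    assert(last <= values.size()); // index check: the range is inside the vector
    double sum = 0.0;
    for(std::size_t i = first; i < last; i++)
        sum += values[i];
    double mean = sum / (last - first);
    assert(mean == mean);          // sanity check: the result is not NaN
    return mean;
}

// No "if" statements are needed here to pass errors along. If makeSamples
// throws, the exception simply travels up to whoever catches it.
double meanOfFirstN(int count)
{
    std::vector<double> samples = makeSamples(count);
    return meanOfRange(samples, 0, samples.size());
}
```

The caller decides where to catch. For an overnight run, a single
try/catch around each iteration is usually enough.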
If your favorite platform doesn't already have its own assert
capabilities, here's some code for doing asserts that you can put in
your toolbox. This code is known to work on Linux, Mac, and Windows. It
will break into the debugger when something bad happens so you can look
at the stack and the variables, but you can continue running if you
wish to ignore the problem:
#ifndef GAssert
#ifdef _DEBUG
#ifdef WIN32
#define GAssert(x,y) \
    { \
        if(!(x)) \
        { \
            fprintf(stderr, "Debug Assert Failed: %s\n", y); \
            __asm int 3 \
        } \
    }
#else // WIN32
#define GAssert(x,y) \
    { \
        if(!(x)) \
        { \
            fprintf(stderr, "Debug Assert Failed: %s\n", y); \
            kill(getpid(), SIGINT); \
        } \
    }
#endif // !WIN32
#else // _DEBUG
#define GAssert(x,y) ((void)0)
#endif // else _DEBUG
#endif // GAssert
Here's an example of how to use GAssert:
double divide(double nom, double denom)
{
    GAssert(denom != 0, "denom is expected to be non-zero");
    return nom / denom;
}
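The same _DEBUG switch works for any debug-only code (tip 3 above), not
just asserts. Here's a small sketch of that idea. The function names
buildMode and containsSorted are invented for this example, and it
assumes a C++11 compiler (for std::is_sorted):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Reports which mode this file was compiled in. Logging this at the start
// of a long run makes it obvious if you launched the wrong binary.
std::string buildMode()
{
#ifdef _DEBUG
    return "debug";
#else
    return "optimized";
#endif
}

// Hypothetical function with an expensive debug-only check.
// The caller promises that sortedValues is sorted in ascending order.
bool containsSorted(const std::vector<int>& sortedValues, int key)
{
#ifdef _DEBUG
    // This check is O(n), which would defeat the purpose of a binary
    // search, so it only exists in builds made with -D_DEBUG
    assert(std::is_sorted(sortedValues.begin(), sortedValues.end()));
#endif
    return std::binary_search(sortedValues.begin(), sortedValues.end(), key);
}
```

In the optimized build the check costs nothing, because the preprocessor
removes it before the compiler ever sees it.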
Of course when you actually run your test, you don't want debug symbols
or debug code slowing down your experiments. In optimized mode:
1- Don't use the "-g" flag.
2- Do still use the "-Wall" flag. Always use this flag.
3- Don't use the "-D_DEBUG" flag. This way, your asserts and debug code
won't slow you down.
4- Use the "-O3" flag. This tells g++ to optimize for speed. (Note:
this is a "capital-oh three", not a "zero three". The "O" stands for
"optimize".)
So make sure you have an easy way to specify whether you are building
in debug mode or optimized mode. If you use "make", have a rule for
making debug mode and a rule for making optimized mode. If you don't
have a decent makefile already, here's one you can use. To use this
makefile, just list your .cpp and .h files in the "Source" section, and
it will do the rest:
################
# Paths and Flags
################
TARGET_PATH = ../../bin
SOURCE_PATH = .
TARGET_NAME_OPT = hello
TARGET_NAME_DBG = hellodbg
OBJ_PATH = ../../obj
CFLAGS = -Wall
DBG_CFLAGS = $(CFLAGS) -g -D_DEBUG
OPT_CFLAGS = $(CFLAGS) -O3
DBG_LFLAGS =
OPT_LFLAGS =

################
# Source
################
CPP_FILES = $(SOURCE_PATH)/main.cpp \
            $(SOURCE_PATH)/somefile.cpp \
            $(SOURCE_PATH)/yetanotherfile.cpp

HEADER_FILES = $(SOURCE_PATH)/somefile.h \
               $(SOURCE_PATH)/yetanotherfile.h

################
# Lists
################
TEMP_LIST_1 = $(CPP_FILES:$(SOURCE_PATH)/%=$(OBJ_PATH)/%)
OBJECTS_OPT = $(TEMP_LIST_1:%.cpp=%.o)
OBJECTS_DBG = $(TEMP_LIST_1:%.cpp=%.dbg.o)

################
# Rules (every recipe line must begin with a tab character)
################
dbg : skipsomelines $(TARGET_PATH)/$(TARGET_NAME_DBG)

opt : skipsomelines $(TARGET_PATH)/$(TARGET_NAME_OPT)

skipsomelines :
	echo
	echo
	echo
	echo
	echo
	echo

usage:
	#
	# Usage:
	# make usage (to see this info)
	# make clean (to delete all the .o files)
	# make dbg (to build a debug version)
	# make opt (to build an optimized version)
	#

$(TARGET_PATH)/$(TARGET_NAME_OPT) : partialcleanopt $(OBJECTS_OPT)
	g++ -O3 -o $(TARGET_PATH)/$(TARGET_NAME_OPT) $(OBJECTS_OPT) $(OPT_LFLAGS) -lpthread

$(TARGET_PATH)/$(TARGET_NAME_DBG) : partialcleandbg $(OBJECTS_DBG)
	g++ -g -D_DEBUG -o $(TARGET_PATH)/$(TARGET_NAME_DBG) $(OBJECTS_DBG) $(DBG_LFLAGS)

$(OBJECTS_OPT) : $(OBJ_PATH)/%.o : $(SOURCE_PATH)/%.cpp $(HEADER_FILES)
	g++ $(OPT_CFLAGS) -c $< -o $@

$(OBJECTS_DBG) : $(OBJ_PATH)/%.dbg.o : $(SOURCE_PATH)/%.cpp $(HEADER_FILES)
	g++ $(DBG_CFLAGS) -c $< -o $@

partialcleandbg :
	rm -f $(TARGET_PATH)/$(TARGET_NAME_DBG)

partialcleanopt :
	rm -f $(TARGET_PATH)/$(TARGET_NAME_OPT)

clean : partialcleandbg partialcleanopt
	rm -f $(OBJECTS_OPT)
	rm -f $(OBJECTS_DBG)
Now, let's suppose that you want to run a test that requires you to
allocate some memory that's too big to fit on the stack (so it has to
come from the heap), and your test also relies on a function or API
that almost always works, but on rare occasions might throw an
exception. Your first (poor) attempt might look something like this:
int main()
{
    for(int i = 0; i < 100000000; i++)
        dotest(i);
}
void dotest(int i)
{
    double* vec = new double[100000];
    unstableFunc(vec, i);
    delete[] vec;
}
void unstableFunc(double* vec, int i)
{
    // this function does something useful,
    // but on certain (very rare) occasions,
    // it will throw an exception
}
In the morning when you return, you will discover that it failed after
only a few thousand iterations. So you decide to catch exceptions
like this:
int main()
{
    int i;
    for(i = 0; i < 100000000; i++)
    {
        try
        {
            dotest(i);
        }
        catch(...)
        {
            printf("iteration %d failed\n", i);
        }
    }
}
void dotest(int i)
{
    double* vec = new double[100000];
    unstableFunc(vec, i);
    delete[] vec;
}
void unstableFunc(double* vec, int i)
{
    // this function does something useful,
    // but on certain (very rare) occasions,
    // it will throw an exception
}
The next day, you discover that it's still running. Your machine is
unresponsive and it's only on iteration 200 thousand! What went wrong?
The problem is that every time an exception was thrown, you leaked
memory, because the exception skipped the "delete[]" at the end of
dotest. When your memory ran out, it started swapping to your hard
drive. It thrashed your hard drive all night and since disk IO is much
slower than memory access, it hardly made any progress. So how can you
make sure that you don't leak memory while recovering from an
exception? Well, you could avoid allocating memory, but that's often
not an option. So if you do allocate memory, use a holder to make sure
it gets cleaned up properly:
int main()
{
    int i;
    for(i = 0; i < 100000000; i++)
    {
        try
        {
            dotest(i);
        }
        catch(...)
        {
            printf("got a failure\n");
        }
    }
}
void dotest(int i)
{
    double* vec = new double[100000];
    Holder<double*> h(vec);
    unstableFunc(vec, i);
}
void unstableFunc(double* vec, int i)
{
    // this function does something useful,
    // but on certain (very rare) occasions,
    // it will throw an exception
}
Now, the variable "h" will clean up your memory when it goes out of
scope, even if it goes out of scope due to an exception. So, what is a
holder? It's just a class that uses its destructor to delete your
pointer. Here's a Holder class that you can put in your toolbox if your
favorite platform doesn't already have a built-in holder class:
template <class T>
class Holder
{
private:
    T m_p;
public:
    Holder(T p)
    {
        m_p = p;
    }
    ~Holder()
    {
        delete(m_p);
    }
};
Don't be scared because this class uses a template. Just imagine that
"T" is automatically replaced with "double*" (because that's what was
in the angle brackets), and it should all make sense. So now you can
write robust code that can recover from exceptions and continue to
operate efficiently. Here are some other tips to make sure you don't
have memory leaks:
1- If you use inheritance, make sure your base-class destructors are
virtual. Otherwise, deleting an object through a base-class pointer
won't call the derived-class destructor (formally, the behavior is
undefined)!
2- If you do something like this:
SomeClass* array = new SomeClass[100];
be sure to clean up like this:
delete[] array;
not like this:
delete(array);
Note that the Holder class above uses the wrong delete for arrays. You
might want to make another class called ArrayHolder that uses the other
delete.
3- If you're not afraid of templates, just don't ever use arrays. Use
vectors instead.
4- If possible, don't use new very often. Just keep your objects on the
stack and let them allocate and destruct memory in their
constructors/destructors.
5- Never use alloca in a loop. That's a quick way to overflow your
stack.
6- Don't use multiple inheritance unless you really know what you're
doing.
7- Run your code through Valgrind.
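Here's a rough sketch of tips 2 and 3. ArrayHolder is the class
suggested in tip 2. Unlike Holder above, this version takes the element
type (not the pointer type) in the angle brackets, which is just one
way to write it. Tracked and alwaysThrows are stand-ins invented for
this example so that the cleanup can be observed:

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Array counterpart of Holder: frees its pointer with delete[]
template <class T>
class ArrayHolder
{
private:
    T* m_p;
    // Copying is disabled (declared but never defined) so that two
    // holders can never end up deleting the same array
    ArrayHolder(const ArrayHolder&);
    ArrayHolder& operator=(const ArrayHolder&);
public:
    explicit ArrayHolder(T* p) : m_p(p) {}
    ~ArrayHolder() { delete[] m_p; }
    T* get() const { return m_p; }
};

// Hypothetical element type that counts how many instances are alive
struct Tracked
{
    static int s_live;
    Tracked() { s_live++; }
    Tracked(const Tracked&) { s_live++; }
    ~Tracked() { s_live--; }
};
int Tracked::s_live = 0;

// Stand-in for unstableFunc that fails every time
void alwaysThrows()
{
    throw std::runtime_error("simulated failure");
}

void dotestWithHolder()
{
    ArrayHolder<Tracked> h(new Tracked[100]);
    alwaysThrows(); // h still runs delete[] while the stack unwinds
}

void dotestWithVector()
{
    std::vector<Tracked> vec(100); // tip 3: the vector owns its own storage
    alwaysThrows();                // vec's destructor frees all 100 elements
}
```

If you call either function inside a try/catch, Tracked::s_live is back
to zero afterwards, which means nothing leaked.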
So, the next day you return and discover that the whole test completed
after only a few hours. In great excitement, you rush to pass off your
code with the T.A. But to your great dismay you find that your code
behaves differently on the T.A.'s machine. What's going on?
Did you use "rand()" anywhere in your code? Don't use rand(). It's not
a very good pseudo-random number generator. It's not guaranteed to be
thread-safe or re-entrant. It's implemented differently on different
platforms, so you may not get consistent results on different machines,
even if you seed it with a constant value. And it's not as efficient as
many better pseudo-random number generators. Just don't use it. Your
favorite development platform probably provides a nicer pseudo-random
number generator that has a known huge cycle and meets lots of rigorous
testing standards. If you can't find a better one, here's a simple one
that you can use:
class GRand
{
protected:
    unsigned int m_state[256];
    unsigned int m_x;
    unsigned int m_n;
public:
    GRand(unsigned int nSeed);
    ~GRand();
    // Sets the seed
    void SetSeed(unsigned int nSeed);
    // Returns a pseudo-random uint from a discrete uniform
    // distribution in the range 0 to 0xffffffff (inclusive).
    unsigned int GetUint();
    // Returns a pseudo-random uint from a discrete uniform
    // distribution in the range 0 to range-1 (inclusive).
    // (It's better to use this method than to just do
    // "GetUint() % range" because if you do that, and
    // "0x100000000 % range" is not equal to 0, then you
    // will not get a completely uniform distribution.)
    unsigned int GetUint(unsigned int range);
    // Returns a pseudo-random double from 0 (inclusive)
    // to 1 (exclusive). This draws two random uints,
    // and discards half of one of them to obtain 48
    // random bits, which are used for the mantissa of
    // the double.
    double GetUniform();
    // Returns a random value from a standard normal distribution. (To
    // convert it to a random value from an arbitrary normal distribution,
    // just multiply the value this returns by the deviation, then add
    // the mean.)
    double GetNormal();
};
GRand::GRand(unsigned int nSeed)
{
    SetSeed(nSeed);
}
GRand::~GRand()
{
}
void GRand::SetSeed(unsigned int nSeed)
{
    m_n = nSeed ^ 0xcfd41b91; // 3 to the 20th power
    unsigned int i;
    for(i = 0; i < 256; i++)
    {
        m_state[i] = nSeed ^ m_n;
        m_n *= 3;
    }
    m_x = 0;
}
unsigned int GRand::GetUint()
{
    unsigned int d, a, b, c;
    d = m_n;
    a = d & 255;
    d = d >> 8;
    b = d & 255;
    d = d >> 8;
    c = d & 255;
    d = d >> 8;
    m_n = (m_n + m_state[a]) ^ m_state[m_x];
    m_state[a] += m_state[b];
    m_state[a] ^= m_state[c];
    m_state[m_x] += m_state[c];
    m_state[m_x] ^= m_state[d];
    m_x++;
    m_x &= 255;
    return m_n;
}
unsigned int GRand::GetUint(unsigned int range)
{
    // Use rejection to find a random value in a range that is a multiple of "range"
    unsigned int n = (0xffffffff % range) + 1;
    unsigned int x;
    do
    {
        x = GetUint();
    } while((x + n) < n);
    // Use modulus to return the final value
    return x % range;
}
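double GRand::GetUniform()
{
    // use 48 random bits for the mantissa. The two draws are made in
    // separate statements so their order is the same with every compiler.
    unsigned int high = GetUint() & 0xffff;
    unsigned int low = GetUint();
    return (4294967296.0 * high + low) / 281474976710656.0;
}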
double GRand::GetNormal()
{
    double x, y, mag;
    do
    {
        x = GetUniform() * 2 - 1;
        y = GetUniform() * 2 - 1;
        mag = x * x + y * y;
    } while(mag >= 1.0 || mag == 0);
    return y * sqrt(-2.0 * log(mag) / mag); // the Box-Muller transform (polar form)
}
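If your compiler supports C++11, the <random> header is one example of
a "nicer" generator that ships with the platform. The sketch below
assumes std::mt19937 is available. The helper name uniformBelow is
invented for this example, and it applies the same rejection idea as
GRand::GetUint(range). One caveat I'm fairly confident of: the raw
output of std::mt19937 for a given seed is fixed by the standard, but
the distribution classes (such as std::uniform_int_distribution) may
give different values with different standard libraries, so this
sketch avoids them:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Returns a value from 0 to range-1 (inclusive) without modulo bias.
// 'range' must not be zero.
std::uint32_t uniformBelow(std::mt19937& rng, std::uint32_t range)
{
    assert(range != 0);
    // How many raw values sit in the incomplete top bucket: 2^32 mod range
    std::uint32_t excess = (0xffffffffu % range + 1u) % range;
    // Largest raw value that can be accepted
    std::uint32_t limit = 0xffffffffu - excess;
    std::uint32_t x;
    do
    {
        x = static_cast<std::uint32_t>(rng());
    } while(x > limit); // reject values from the incomplete bucket
    return x % range;
}
```

Seed the generator with a constant, such as "std::mt19937 rng(42);",
and every machine should walk through the same sequence.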
Now that your code is rock-solid, wouldn't it be nice if you could
launch it as a daemon? This way, if your roommate closes your console
window and logs you out, your program will keep running until it's
done. And if you launch it on a remote machine over an SSH connection,
you don't have to keep your connection open until it's done--you can
just connect later to get the results. So here's some code that will
fork off a daemon. (Unfortunately this code doesn't work on Windows
because Windows isn't fully POSIX compliant.)
// Requires <unistd.h> (fork, setsid, chdir, _exit), <sys/stat.h> (umask),
// and <stdlib.h> (exit)
typedef void (*DaemonMainFunc)(void* pArg);
int LaunchDaemon(DaemonMainFunc pDaemonMain, void* pArg)
{
    // Fork the first time
    int firstPid = fork();
    if(firstPid < 0)
        throw "Error forking (the first time) in LaunchDaemon";
    if(firstPid)
        return firstPid;
    // Fork the second time
    int secondPid = fork();
    if(secondPid < 0)
        throw "Error forking (the second time) in LaunchDaemon";
    if(secondPid)
        _exit(0); // the intermediate process must not return into the caller's code
    // Start a new session and become its leader
    // (so the process isn't terminated when the old group leader is killed)
    setsid();
    // Clear the file creation mask so that the permissions the daemon
    // asks for on new files aren't restricted by the inherited mask
    umask(0);
    // Get off any mounted drives so that they can be unmounted without
    // killing the daemon
    chdir("/");
    // Launch the daemon
    pDaemonMain(pArg);
    exit(0);
}
And here's some code to show you how to use it:
void daemonMain(void* pArg)
{
    runTests();
    saveResultsToFile();
}
int main()
{
#ifdef _DEBUG
    // don't fork off a daemon in debug mode. That would just complicate debugging
    daemonMain(NULL);
#else
    LaunchDaemon(daemonMain, NULL);
#endif
}
The End