Reference vs. pointer parameters in C++

Mar 2, 2012   //   by Ray Mitchell

Reference and Pointer Parameters

A question that often arises when working with C++ is whether it’s better to use references or pointers for function parameters.  Both types of parameters provide the ability for a function to indirectly access an object in the calling environment.  This indirect access provides several benefits:

  1. The object referred/pointed to by the argument does not need to be copied in order for the function to have access to that object’s value
  2. The function can directly modify the object referred/pointed to by the argument
  3. The function can take parameters of types that do not allow copying
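
The third benefit is the least obvious, so here is a minimal sketch of it.  The Unique class and the two functions are hypothetical names invented for illustration.  Copying is disabled by declaring the copy operations private and leaving them undefined, which works in any version of C++:

```cpp
#include <cassert>

// Hypothetical type for illustration: copying is disabled, so an object of
// this type cannot be passed by value.
class Unique
{
public:
    Unique() : value(0) {}
    int value;

private:
    Unique(const Unique &);             // Declared but not defined: no copying
    Unique &operator=(const Unique &);  // Declared but not defined: no assignment
};

// Reads the caller's object through a reference to const; no copy is made.
int readByReference(const Unique &u)
{
    return u.value;
}

// Modifies the caller's object through a pointer; no copy is made.
void setByPointer(Unique *pU, int newValue)
{
    assert(pU != 0);    // A pointer parameter may be null, so check first
    pU->value = newValue;
}
```

A call such as setByPointer(&u, 42) followed by readByReference(u) works on the caller's own object.  A function taking a Unique by value could be declared, but any attempt to pass it a Unique object would fail to compile because the copy constructor is inaccessible.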

Consider the following C++ code:

class Big
{
public:
    char array[1000000];     // 1,000,000 bytes
};

void foo(Big &big, Big *pBig)
{
    big.array[0] = 'a';      // Sets 0th element in big's array to the letter 'a'
    pBig->array[0] = 'a';    // Does the same thing as the previous line
}

int main()
{
    Big *pBig = new Big;
    foo(*pBig, pBig);        // The Big object on the heap is not copied
    delete pBig;
}

When this code runs, the first line of main allocates a Big object (at least 1,000,000 bytes) on the heap.  main then calls foo, passing the Big object as the first argument and the address of the Big object as the second argument.  foo’s first parameter is initialized to refer to the Big object and foo’s second parameter is initialized to hold the address of the Big object.  The Big object is not copied.  foo then modifies the 0th element in the Big object’s array member twice, first through the reference parameter, then through the pointer parameter.  This demonstrates that, for this purpose, reference and pointer parameters provide the same capabilities and differ only in the syntax used.

How then should we decide whether to use a pointer or a reference?  Is this another case of there being no reason to prefer one vs. the other as with structs and classes (http://www.cplusplus.com/forum/beginner/5980/)?  Not quite.  There is more to consider before deciding which option is best.

References Must Refer to an Object

A reference must always refer to an object.  It can never refer to no object in the way that a pointer can contain the null address (0).  By taking advantage of this, we can require that a valid object always be passed as the argument to a function taking a reference parameter.  This eliminates the need for us to verify that the argument passed to a reference parameter is non-null:

void bar(Big *pBig)
{
    if (pBig == 0)
    {
        // Handle error condition - caller passed in null where non-null pointer expected
    }

    // Rest of function...
}

void bar(Big &big)
{
    // No need to verify non-null argument passed in
}

This is an important reason to consider using reference parameters over pointers.  But does this mean we should never use pointer parameters?

Does a Function Call Modify its Arguments?

When maintaining code, it is useful to be able to identify where changes to the program state take place.

Consider this code:

#include "MyLibrary.h"

int main()
{
    int val = 7;
    int result = zyzzyx(val);
}

Does the call to the zyzzyx function modify val?  We can’t be sure.  The parameter to zyzzyx could be passed by value, in which case val is definitely not modified, or it could be a reference, in which case val could be modified.  There’s no way for us to know without leaving this section of code to look at the prototype for zyzzyx (in this case located in another file).  These are the three ways the parameter to zyzzyx could be defined:

int zyzzyx(int);          // By value
int zyzzyx(int &);        // By reference
int zyzzyx(const int &);  // By constant reference

In the first case, zyzzyx does not modify the argument because the argument will be copied into the parameter (this is call by value).  In the second case, we have to assume zyzzyx will modify the argument because the parameter is a reference to a non-const object (I say “we have to assume” because if zyzzyx does not actually modify the parameter then the author should have made the parameter a reference to a const int).  In the third case, zyzzyx does not modify the argument because even though the parameter is a reference the compiler will not allow changes to the object being referred to.
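
The three cases can be sketched as follows.  The function names and bodies here are hypothetical stand-ins for zyzzyx.  They have distinct names only because overloads taking int and int & would make a call like zyzzyx(val) ambiguous:

```cpp
#include <cassert>

// Hypothetical stand-in for: int zyzzyx(int);
int byValue(int n)
{
    n = n + 1;       // Changes only the function's local copy
    return n;
}

// Hypothetical stand-in for: int zyzzyx(int &);
int byReference(int &n)
{
    n = n + 1;       // Changes the caller's object
    return n;
}

// Hypothetical stand-in for: int zyzzyx(const int &);
int byConstReference(const int &n)
{
    // n = n + 1;    // Would not compile: n refers to a const int
    return n + 1;
}

// Exercises all three; each assert documents the expected state of val.
void demonstrate()
{
    int val = 7;

    assert(byValue(val) == 8);
    assert(val == 7);               // Unchanged: the argument was copied

    assert(byConstReference(val) == 8);
    assert(val == 7);               // Unchanged: the reference is to const

    assert(byReference(val) == 8);
    assert(val == 8);               // Changed through the non-const reference
}
```

All three calls look identical at the call site, yet only the last one changes val.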

So, the unfortunate situation is that there is no way we can know whether a call to zyzzyx(val) modifies val without looking at the declaration of zyzzyx.  The same situation exists for pointer parameters:

#include "MyLibrary.h"

int main()
{
    int val = 7;

    // Does this call modify val? No way we can know without seeing zoid's prototype
    int result = zoid(&val);
}
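
As a rough illustration, here are two hypothetical ways zoid might be defined.  The names and bodies are invented.  Both accept the same &val argument, but only one of them changes val:

```cpp
#include <cassert>

// Hypothetical variant taking a pointer to const: the pointed-to object
// cannot be modified through p.
int zoidReadOnly(const int *p)
{
    assert(p != 0);  // A pointer parameter may be null, so check first
    // *p = 0;       // Would not compile: p points to a const int
    return *p * 2;
}

// Hypothetical variant taking a pointer to non-const: the function is free
// to modify the caller's object.
int zoidModifying(int *p)
{
    assert(p != 0);  // A pointer parameter may be null, so check first
    *p = *p * 2;     // Changes the caller's object
    return *p;
}
```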

This means that regardless of whether we use a reference or pointer parameter, there is no way to tell by just looking at the point of a call whether an argument will be modified by the function call.  Not being able to easily tell whether a function call changes the state of the running program means that we must treat every function call as possibly resulting in side-effects.  Code with side-effects is harder to understand and maintain because we must keep a mental model of the changing state as each line executes.*  It would be nice if there were a way to know whether a function call has side-effects without having to look away from the point of the call.

*To read more about side-effects see http://en.wikipedia.org/wiki/Side_effect_%28computer_science%29.  Functional programming aims to minimize side-effects:  http://en.wikipedia.org/wiki/Functional_programming#Pure_functions.

Syntax Differences to the Rescue

We can take advantage of the fact that references and pointers use different syntaxes to provide a visual cue at the point of each function call indicating whether each argument is or is not changed by the call.  We will adopt the convention of only defining reference parameters when the object referred to will not be modified, and only defining pointer parameters when the object pointed to will be modified.  Our reference parameters will always be defined to refer to const objects and our pointer parameters will always be defined to point to non-const objects.

As an example, here is a header file declaring the functions:

// File: MyLibrary.h
int baz(const int &);  // Does not modify referred-to object
int bat(int *);        // Modifies pointed-to object

And here is a file where the functions are called:

// File: main.cpp
#include "MyLibrary.h"

int main()
{
    int val = 7;
    int result;

    // We know baz doesn't modify val even without looking in MyLibrary.h because the
    // argument is passed using value syntax. Our convention tells us that if the parameter
    // is a reference then it is a reference to const, which means the argument will not be
    // modified. If the parameter is by value then the argument will be copied, which also
    // means the argument will not be modified. Either way, when we see what appears to
    // be call-by-value syntax the argument won't be changed.
    result = baz(val);

    // We know bat modifies val without looking in MyLibrary.h because the address of the
    // argument is passed. Whenever we see an address being passed we know the function
    // will change the object at that address. The address-of operator (&) is our visual
    // cue that the argument will be changed.
    result = bat(&val);
}

Now, if we are consistent about using this convention we can be sure that an argument passed to a function using the “call-by-value” syntax will not be changed.  We can also be sure that when an address is passed to a function the object at that address will be changed.
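
For completeness, here is one way the two library functions might be defined under this convention.  The bodies are hypothetical, since the article only specifies the declarations.  The null check in bat is there because a pointer parameter, unlike a reference, can be null:

```cpp
#include <cassert>

// Hypothetical definition: reads the referred-to object and never changes it.
int baz(const int &n)
{
    return n * 2;       // n cannot be assigned to through a reference to const
}

// Hypothetical definition: modifies the pointed-to object, as the convention
// promises for every pointer parameter.
int bat(int *pN)
{
    assert(pN != 0);    // Pointer parameters can be null, so check before use
    *pN += 1;           // Changes the caller's object
    return *pN;
}
```

With these definitions, baz(val) leaves val at 7, while bat(&val) changes it to 8.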

Conclusion

Deciding whether to use a reference or a pointer parameter is not as simple a choice as it first may appear.  With the convention described in this article, I’ve found it easy to write consistent code that clearly identifies which function calls are changing program state.  I’m sure other conventions exist.  The key is to be consistent and have a clear understanding of why you are following the convention you use.
