Turn your Raspberry Pi into a radio controller for RC vehicles

Ever since I tried turning the Raspberry Pi into an FM transmitter, I had wondered if it would be possible to make it drive a radio-controlled car. It turns out that it is! With no modifications to your Pi, you can be driving around a toy-grade RC car iFe, and I’ve read about a few other ones online that should work as well. Any RC toy that works at a frequency in the range of 1-250 MHz should be controllable by the Pi once you figure out the command signals.

I’m going to talk a little bit about how RC cars work, how the code and hardware works on the Pi, and about RC controls in general. If you just want to start driving around, you can jump to the end.

Raspberry Pi FM radio

The first version of the Raspberry Pi FM transmitter was written by Oliver Mattos and Oskar Weigl. It works by using the hardware that is normally meant to produce spread spectrum signal clock signals on the GPIO pins to generate a radio frequency. The GPIO pin itself acts as an antenna, which you can extend by attaching a length of wire such as a jumper cable. By reading samples from an audio file and writing them out to the GPIO pin at a certain rate, it is possible to tune in a radio and hear the audio.

The original version of this code had a small problem though. FM requires a continuous stream of radio signals, but the Raspberry Pi normally runs Linux, which is not a real-time operating system. This means that you can start, stop, and run different programs at the same time, but because the Pi only has a single CPU core, it has to simulate multiple programs running at the same time by constantly switching between them. Whenever there was a context switch away from the program that was writing to the GPIO pins, the radio signal would skip and pop. Richard Hirst came up with a great solution to this problem: use the DMA controller to continually write data to the GPIO pins. Because DMA bypasses the CPU altogether, it is not affected by context switches, and it has the side effect of reducing the total CPU load because the controlling program now only has to periodically refill a circular buffer. Unfortunately, it also means that it’s not possible to stop broadcasting for short periods of time (at least not that I’ve been able to figure out), which will be problematic later when we try to control an RC car.

Frequency Modulation, Amplitude Modulation, and Pulse Modulation

There are two common ways of encoding audio over a radio signal: frequency modulation (FM) and amplitude modulation (AM). FM works by continually varying the frequency of the signal in small amounts while keeping the amplitude the same, while AM keeps the frequency the same while varying the amplitude. The difference is best described with a picture:

am-fm
Animated diagram representing the difference between radio waves modulated by amplitude and by frequency. By Berserkus, from Wikipedia. Used under CC BY-SA 2.5.

The Raspberry Pi FM program uses the fractional divider to control and change the frequency of the generated signal. This allows it to generate an FM audio signal. Because the Pi is just writing values to the GPIO pin, I don’t think that it’s possible to control the amplitude of the signal, so I don’t think that it’s possible to have the Pi generate AM audio signals.

There are a few different control schemes that remote control toys use. Most toy-grade RC cars (ones with forward/back + left/right control) use pulse modulation. The radio controller broadcasts a signal at a constant frequency for a certain amount of time, and then stops broadcasting for another amount of time. It then repeats this cycle with varying lengths of broadcast and pause time in a particular pattern. The RC car recognizes this pattern as a command, such as “forward” or “reverse right” and starts moving.

The specific pattern typically consists of some base time length, such as 400 microseconds, and all pulses and pauses are multiples of that length. A control signal starts with a synchronization pattern consisting of a broadcast for some length of time and a pause, which is repeated a certain number of times. Then a number of bursts and pauses are repeated for the command. For example, the New Bright 1:24 truck sends synchronization signals of 1200 microsecond bursts with 400 microsecond pauses repeated 4 times, followed by a number of 400 microsecond bursts and 400 microsecond pauses. If that signal is repeated 40 times, the vehicle drives forward, while 28 bursts tell it to drive reverse and right.

Remember earlier when we had to use DMA to ensure a constant signal? Because of this, I haven’t been able to figure out how to make the Pi stop broadcasting for a consistent amount of time, which is needed for the signal pauses. However, think back to how FM radio signals work. Because the Pi send FM, it can very quickly change the frequency that it is broadcasting on. To simulate a broadcast pause, it instead broadcasts at a different frequency for a length of time. Because the RC car is only listening for signals at one particular frequency, as far as it is concerned, the Pi has stopped broadcasting altogether.

A step up from toy-grade cars usually have proportional steering and throttle control, so you can more precisely control the vehicle. Most high-end hobby grade devices use pulse-position modulation. Andrew Hazelden has an excellent explanation of how PPM works. I wanted to get a car with proportional steering and throttle control, but most RC cars that I looked at operate in gigahertz range which is too high for the Pi, so I instead bought a RadioShack Dune Warrior which runs in the 27 MHz range.

Finding the signal

With toy-grade cars, I’ve found that the easiest way to find the signals that control them is to just use brute force. They mostly use very similar signal patterns, so you can simply iterate through different control pattens until the car responds. For more complex patterns, you’ll probably need to use an oscilloscope to analyze the patterns.

Brute force

For toy-grade RC cars, you can just turn on the car and iterate through different control patterns until you find one that causes the car to move. Every car that I’ve looked at uses one synchronization pattern for all of the controls that it can respond to, so once you find one signal, it’s just a matter of guessing how many signal repeats control the remaining actions.

Iterating through all possible patterns can take a few hours, and I didn’t want to wait around for the car to move, so I instead pointed a webcam at the car and then waited until the image changed. Once the car moves, the computer will see that the image has changed and will save the image and the pattern.

I tried a few different techniques to monitor the webcam to decide if the car has moved. First, I tried just computing the percent difference of the color values of the pixels between a base photo and the most recent photo. This generated a lot of false positives due to changing lighting conditions, such as when a cloud passes overhead. If you run this program, I recommend placing the car in a closet or some other place where you can keep the lighting consistent.

Next, I converted the image to greyscale and then normalized the colors so that the darkest color was black and the lightest color was white. This reduced the noise between pictures a lot and helped with my varying lighting conditions, but I was still worried about false positives. I tried one more thing: reducing the image depth to 1 bit, so it was only black and white.

1-bit-posterized-rc-car

This reduced the noise to a really low level, and because the wall behind the car was white, once it moved, the difference was very clear. This reduced my false positives to 0, so I stopped here. I had considered running edge detection on the base image and then ignoring differences along the detected edges because that’s where most of the difference between photos appeared but I ended up not needing to.

Oscilloscope

For more complex vehicles, it’s easiest to hook an oscilloscope up to the radio controller and then observing the changes in the command signal as you move the controller. This is what we ended up doing with the Dune Warrior.

dune-warrior-oscilloscope
Dune Warrior, straight and full throttle

We hooked up the oscilloscope and watched how the signal changed as we adjusted the throttle and the steering. If you want to do this yourself, I strongly suggest recording the signal to video so that it can be reviewed later. The signal starts with a 500 microsecond synchronization burst and pause, followed by 22 more bursts. Each burst is either 127 microseconds or 200 microseconds, with a pause of the same length. The burst lengths are used to encode a binary pattern, where the shorter bursts are a 0 and the longer bursts are a 1. 6 bits are for steering, 5 are used for the throttle, and 1 is used as an even parity bit; the remainder never appear to change and I’m not sure what their function is. What’s strange is that the bits for steering are not contiguous; bits 7-8 form the 2 most significant bits, while 1-4 form the rest. 32 is centered for steering, while 0 is sharp left and 63 is sharp right. Similarly, 16 is idle for throttle while 0 is full reverse and 31 is full forward.

Running your own car

If you have a cheap toy-grade RC car (i.e. one that only has controls for forward/backward and left/right) then you should be able to find the command signals and control the car pretty easily. Clone the repository from my GitHub on your Raspberry Pi and compile pi_pcm by running scons. Run pi_pcm as root. By default, the program listens for JSON commands on port 5432.

Turn on the car and put it in a location where you can control the lighting (such as a closet). Point a computer that has a webcam at the car (it can be the same computer or a separate one), run python watch.py -f [frequency] -s [Pi IP address] and enter the frequency of your RC car (this should normally be 27 or 49 MHz). You’ll need to have gstreamer installed to capture the webcam stream. Most cars operate on one of a several different channels in the 27 or 49 MHz range; if you don’t know the exact channel, then just enter 27 or 49 and the program will iterate through each channel for you.

Once the program sees the car move, it will save a picture and rename it according to the command it just sent. The filename format is frequency-microseconds-synchronizationBurstLength-synchronizationBurstCount-signalCount. From there, you can run python control.py -f [frequency] -s [Pi IP address]. It will prompt you for the command information and ask you for different signal bursts; try different ones (1 – 100 or so) and observe how the RC car reacts. Once you have all of the signals, save the information in a JSON file (see control-specs/pro-dirt.json for an example). Now you should be able to run python interactive_control.py -s [Pi IP address] [JSON control file] and drive the car using the arrow keys.

More thoughts

There are a few other projects that you cna build from here. For one, I’m planning on entering the SparkFun Autonomous Vehicle Competition with my Dune Warrior. You can make an autonomous vehicle yourself by buying a USB battery pack for the Pi and taping both to the top of your car and writing a simple control program. I’m also planning on buying a Raspberry Pi camera, slapping on a WiFi USB dongle, and implementing Twitch drives your car in my apartment.

Posted in Uncategorized | Tagged , | 23 Comments

Using C++11 variadic templates to create a type-safe and injection-safe database interface: part 2

Introduction

In part 1, we defined a variadic template function to set up and run prepared SQL statements that are impervious to injection attacks. In this post, we’ll define the second half of our library that will automatically convert and store results from the query in a typesafe matter.

Previously, we defined a variadic templated function that bound all of the input parameters for use in prepared statements. Here, we’ll use two separate functions: one to bind the output parameters from the prepared statement so that we have somewhere to store the results, and one to convert those output parameters to a tuple.

Code

Binding the output

We’ll first define a variadic templated function to bind output parameters to the prepared statement. When MySQL runs the statement, it will store the results in these parameters. The MySQL needs to know the types of each of the parameters when they are bound, so we’ll define partial template specializations for each data type.

We want to be able to partially specialize our templates so that we can have specializations for unique_ptr and shared_ptr types. C++ doesn’t allow you to partially specialize functions (that’s what overloading is for), so we’ll define a templated class with a single static method in order to use partial template specialization.

We’ll start with the least specialized template. The function will take a MySQL bind structure, a buffer in which to save the results from the query, and structure that MySQL will set if the result of the query is null. For the least specialized version, we’ll tell MySQL to try to convert the result to a string, and then rely on boost::lexical_cast to convert the string back to a relevant type later.

template <typename T>
class OutputBinderParameterSetter {
  public:
    static void setParameter(
      MYSQL_BIND* const bind,
      std::vector<int8_t>* const buffer,
      my_bool* const isNullFlag
    ) {
      bind->buffer_type = MYSQL_TYPE_STRING;
      if (0 == buffer->size()) {
        // 20 is an arbitrary default. If the buffer is too short, we
        // will resize it later and refetch the results from MySQL
        buffer->resize(20);
      }
      bind->buffer = buffer->data(); // data() is new in C++11
      bind->is_null = isNullFlag;
      bind->buffer_length = buffer->size();
    }
};

The full specializations for the normal C++ types will come next. Unlike before, we know the exact required buffer size by using the sizeof operator on the type, so we won’t need to tell MySQL the buffer size. We will need to set an is_unsigned flag for integral types.

template <>
class OutputBinderParameterSetter<uint32_t> {
  public:
    static void setParameter(
      MYSQL_BIND* const bind,
      std::vector<int8_t>* const buffer,
      my_bool* const isNullFlag
    ) {
      bind->buffer_type = MYSQL_TYPE_LONG;
      buffer->resize(sizeof uint32_t);
      bind->buffer = buffer->data(); // data() is new in C++11
      buffer->is_null = isNullFlag;
      bind->is_unsigned = 1;
    }
};

This pattern will be repeated for the other integral types. The partial specialization for float and double are similar, but you don’t need to set the is_unsigned flag.

We also need to define partial specializations for shared_ptr and unique_ptr types. There’s nothing special to do when we’re just binding the output parameters, so we’ll just forward to the full specializations.

template <typename T>
class OutputBinderParameterSetter<shared_ptr<T>> {
  public:
    static void setParameter(
      MYSQL_BIND* const bind,
      std::vector<int8_t>* const buffer,
      my_bool* const isNullFlag
    ) {
      OutputBinderParameterSetter<T>::setParameter(bind, buffer, isNullFlag);
    }
};

Our final specializations are for pointer types. Because C++11 added shared_ptr and unique_ptr, there’s little good reason to handle memory manually. To discourage uses from using raw pointers, we’ll define a partial template specialization that will fail to compile and give an informative message. We’ll use another C++11 feature, static_assert, to accomplish this. We don’t want the compiler to produce an error message unless the user is actually trying to instantiate the specialization, so we’ll use the type as part of the assert.

template <typename T>
class OutputBinderParameterSetter<T*>> {
  public:
    static void setParameter(
      MYSQL_BIND* const,
      std::vector<int8_t>* const,
      my_bool* const
    ) {
      static_assert(
        // C++ guarantees that the sizeof any type >= 0,
        // so this will always give a compile time error
        sizeof(T) < 0,
        "Raw pointers are not supported; use std::shared_ptr"
          " or std::unique_ptr instead");
    }
};

Converting the output

Next we’ll define a variadic templated function to convert the results from the executed query into the types from the tuple that the user has passed in. Like before, we want to be able to partially specialize our templates.

We’ll start with the generic version that we will fall back to if no other specializations are a better match. In this case, we’ll fall back to boost::lexical_cast to convert the buffer to the correct type, and the type’s operator=(T&&). It’s worth noting that we handle null values with our shared_ptr and unique_ptr specializations, so if we do encounter a null value here, then that’s a hard error.

template <typename T>
class OutputBinderResultSetter {
  void setResult(
    T* const value,
    const MYSQL_BIND& bind
  ) {
    if (*bind.is_null) {
      throw MySqlException("Null value handled with non-pointer type");
    }
    *value = boost::lexical_cast<T>(static_cast<char*>(bind.buffer));
  }
};

With that out of the way, we can start defining the full specializations for the normal types. Similar to the generic version, we’ll throw an exception if the result is unexpectedly null.

template <>
class OutputBinderResultSetter<uint32_t> {
  public:
    static void setResult(
      uint32_t* const value,
      const MYSQL_BIND& bind
    ) {
      if (*bind.is_null) {
        throw MySqlException("Null value handled with non-pointer type");
      }
      *value = *static_cast<const uint32_t*>(bind.buffer);
    }
};

We also need to define partial specializations for shared_ptr and unique_ptr types. In the other specializations, encountering null values is an error, but here, we can just set the pointer type to NULL and delete any object that may have been owned by the pointer. Otherwise, we just fall through to the full specialization.

template <typename T>
class OutputBinderResultSetter<std::shared_ptr<T>> {
  public:
    static void setResult(
      std::shared_ptr<T>* const value,
      const MYSQL_BIND& bind
    ) {
      if (*bind.is_null) {
        value->reset();
      }
      T* newObject = new T;
      OutputBinderResultSetter<T>::setResult(newObject, bind);
      *value = std::shared_ptr<T>(newObject);
    }
};

Our final specializations are for pointer types. Like before, we want to prevent users from using raw pointers, so we’ll use static_assert to raise a compile-time error.

template <typename T>
class OutputBinderResultSetter<T*> {
  public:
    static void setResult(
      T** const,
      const MYSQL_BIND&
    ) {
      static_assert(
        sizeof(T) < 0,
        "Raw pointers are not supported; use std::shared_ptr"
          " or std::unique_ptr instead");
    }
};

Putting it all together

Now that we have the specializations for binding and converting the parameters, we can write the function that runs a statement and saves the results into a vector of tuples.For the sake of brevity, I’m going to remove out a lot of error checking and just leave comments in their place.

We’ll define a variadic templated function that takes a MySQL statement that’s already been prepared (i.e. has already had the input parameters bound to it), and a vector of tuples. At this point, we have 2 options: we could directly append new rows to the vector that the user provides, or we could replace whatever was in the vector with the results from the successfully run query. Doing the first is convenient because users can run multiple statements in a row and append the results to the same vector, while the second is nice because we can support the strong guarantee. With the first, we could save several rows before encountering a data error, leaving the data set inconsistent. Here, we’ll take the second choice.

template <typename T>
void bindParameters(
  std::vector const mysqlBindParameters,
  std::vector<std::vector>* const buffers,
  std::vector* const nullFlags,
  int_<I>,
  const Tuple* const unused
) {
  OutputBinderParameterSetter<
    typename std::tuple_element<I>::type
  >::setParameter(
    &mysqlBindParameters->at(I),
    &buffers->at(I),
    &nullFlags->at(I));

  bindParameters(
    mysqlBindParameters,
    buffers,
    nullFlags,
    int_<I>(),
    unused);
}
template <>
void bindParameters(
    std::vector* const,
    std::vector<std::vector>* const,
    std::vector* const,
    int_,
    const Tuple* const
) {
}

template 
void setResults(
  MYSQL_STMT* const statement,
  std::vector<std::tuple>* const results
) {
  // Allocate space for the bind parameters, buffers,
  // buffer lengths, and null flags
  const auto fieldCount =  mysql_stmt_field_count(statement);
  std::vector parameters(fieldCount);
  std::vector<std::vector> buffers(fieldCount);
  std::vector lengths(fieldCount);
  std::vector nullFlags(fieldCount);

  const std::tuple* unused = nullptr;
  // Bind all of the output parameters
  bindParameters(
    &parameters,
    &buffers,
    &nullFlags,
    int_(),
    unused);

  for (size_t i = 0; i < fieldCount; ++i) {
    parameters.at(i).length = &lengths.at(i);
  }

  mysql_stmt_bind_result(statement, parameters->data());
  mysql_stmt_execute(statement);
  std::vector<std::tuple> tempResults;
  int fetchStatus = mysql_stmt_fetch(statement);
  // Keep fetching results until there are no more
  while (0 == fetchStatus) {
    std::tuple rowTuple;
    setResultTuple(
      &rowTuple,
      parameters,
      int_(sizeof...(Args) - 1));

    tempResults.push_back(std::move(rowTuple));
    fetchStatus = mysql_stmt_fetch(statement);
  }
  *results = std::move(tempResults);
}

Now we can define the runQuery function in the MySQL class. This function is a little more complicated than our previous runCommand function, because we’re going to define a template with two variadic arguments: one for the input bind parameters, and one for the type of the output tuple (i.e. the types of the columns from the query).

template <typename... InputArgs, typename... OutputArgs>
void runQuery(
  std::vector<std::tuple<OutputArgs...>>* const results,
  const char* const query,
  const InputArgs&... args
) const {
  MYSQL_STMT* const statement = mysql_stmt_init(connection_);

  const size_t length = ::strlen(query);
  mysql_stmt_prepare(statement, query, length);
 
  // SELECTs should always return something, so if this
  // is 0, the user tried to run a command, e.g. UPDATE
  if (0 != mysql_stmt_field_count(statement)) {
    throw MySqlException("Tried to run command statement with runQuery");
  }
 
  const size_t parameterCount = mysql_stmt_param_count(statement);
  if (sizeof...(InputArgs) != parameterCount) {
    std::string errorMessage("Incorrect number of parameters; query required ");
    errorMessage += boost::lexical_cast<std::string>(parameterCount);
    errorMessage += " but ";
    errorMessage += boost::lexical_cast<std::string>(sizeof...(args));
    errorMessage += " parameters were provided.";
    throw MySqlException(errorMessage);
  }

  std::vector<MYSQL_BIND> inputBindParameters(parameterCount);
  InputBinder<0, Args...> binder;
  binder.bind(&inputBindParameters, args...);

  setResults<OutputArgs...>(statement, results);
}

And that’s it!

Next time, we’ll look at how we can use another C++11 feature, user defined literals, to check at compile time that the number of expected arguments in our prepared statements matches the number of provided arguments.

Posted in Uncategorized | 1 Comment

Using C++11 variadic templates to create a type-safe and injection-safe database interface: part 1

Introduction

Prior to C++11, there was no general way in C++ to create type-safe functions that would take a variable number of arguments. C++ had inherited ..., the ellipsis operator, from C that would allow a function to take a variable number of arguments that would then be stored on the stack. These arguments would not be type checked at compile time, and if you weren’t careful, you would encounter runtime errors.

C++11 introduced variadic templates, which are templates that take a variable number of arguments. Similarly to how template metaprogramming works, you specify a template recursively and the compiler will unroll the template definitions at compile time, affording you type safety and the performance of linear code without sacrificing the flexibility of variable numbers of arguments.

In this post, we’ll use variadic templates to create a database interface that is both safe from SQL injection attacks and that automatically type checks and converts results from SELECT statements. In the end, you’ll be able to write code similar to this:

MySql connection{"localhost", "user", "password"};
vector<tuple<string, string, int>> users;
connection.runQuery(
  &users,
  "SELECT first_name, last_name, age FROM users WHERE username LIKE ? AND age > ?",
  username,
  age);

which will run the query and store the results into the provided vector.

Variadic templates

Syntax

A template that takes a variable number of arguments is specified using ..., the ellipsis operator. As with templates in C++98, you can create templated versions of classes or functions. For this example, we will be using the ellipsis to create parameter packs, which hold the variadic parameters. We define a template with two variable types: one from the beginning of the parameter pack and a second variadic type that represents the rest of the parameter pack. The compiler will unroll the parameter pack one parameter at a time and call the function for each parameter. Finally, we will provide an overload that takes no parameters to close off the the unrolling.

As an example, we could redefine the C function printf as follows:

template <typename Head, typename... Tail> // Tail is the parameter pack
int printf(const char* format, Head value, Tail... args);

int printf(const char* format);

A call to printf("%d %s %e", 1, "test", 1.5) will cause the compiler to generate these template instances:

printf<int, Tail...>("%d %s %e", 1, "test", 1.5)  // with Tail... = [const char*, double]
printf<const char*, Tail...>("%d %s %e", "test", 1.5) // with Tail = [double]
printf<double, Tail...>("%d %s %e", 1.5) // with Tail = []
printf("%d %s %e")

and each template instance will have its code expanded inline.

SQL injection

SQL injection is a technique used to attack databases. It works by causing the application to interpret user supplied input as SQL commands that alter the original query to something other than what the developer intended. There are several ways to mitigate this attack; we’re going to use parameterized queries here. Parameterized queries prevent SQL injection by clearly separating user-supplied data from the query that is supposed to use that data. This separation prevents the database engine from accidentally interpreting user data as SQL commands.

The MySQL prepared statement C API requires the developer to specify a full query with placeholders meant for user-supplied data, and then bind parameters to the query by specifying each parameter’s type and value. In C, this would be done as follows:

MYSQL_BIND parameter;
memset(&parameter, 0, sizeof(parameter));
parameter.buffer_type = MYSQL_TYPE_LONG;
parameter.buffer = &age;
parameter.is_null = &nullFlag;
parameter.is_unsigned = 0;
parameter.error = &errorFlag;
mysql_stmt_bind_param(&statement, &parameter);

Setting up parameterized queries requires a lot of boilerplate code and is prone to typos. For example, using the wrong type code on line 3 can cause runtime errors. However, by using variadic templates and parameter packs, we can define functions that will automatically prepare these parameters for us at compile time based on the types of the arguments that we provide.

Binding parameters

To start with, we’ll define a variadic templated class that we’ll use to bind the input parameters to the query. C++11 doesn’t allow for partial template specialization of variadic templated functions, so we’ll need to encapsulate a function inside of a class. We’ll start by defining the base case for the variadic template expansion.

template <size_t N, typename... Args>
struct InputBinder {
  static void bind(std::vector<MYSQL_BIND*>* const) {}
};

This function will take a vector of MYSQL_BIND parameters that we will set up one by one. We also include a size constant in the template arguments so that we know which parameter to bind to. This variable will be incremented when we make recursive instantiations of this class.

To expand the variadic template, we’ll define a function to unpack a single parameter from the parameter pack at a time. We’ll eventually define partial template specializations for different types of parameters so that we can set up the special behavior required for binding the parameters, but to start, we’ll define the generic function. This function will be instantiated if there’s no other specialization that matches a type, and we want to force the programmer to explicitly define specializations for each type that they use, so we’ll cause the compiler to generate an error if this function is ever instantiated.

template <size_t N, typename Head, typename... Tail>
struct InputBinder<N, Head, Tail...>
  static void bind(
    std::vector<MYSQL_BIND>* const,
    const Head&,
    const Tail&...
  ) {
    static_assert(
      sizeof...(Tail) < 0,
      "All types need to have partial template specialized instances"
      " defined for them");
  }
};

Here, we’re using another C++11 feature, static_assert, to provide a meaningful error message if this function is ever instantiated. C++ guarantees that the sizeof... any parameter pack is at least 0, so this assertion will always fail.

The simplest types to bind parameters to are integral types, so we’ll start with those. We’ll take a MYSQL_BIND from the provided vector and set the various attributes required for binding the parameters. When we’re done, we’ll recursively instantiate another variadic template class to bind the remaining parameters.

template <size_t N, typename... Tail>
struct InputBinder<N, Tail...>
  static void bind(
    std::vector<MYSQL_BIND>* const bindParameters,
    const int32_t& value,
    const Tail&...
  ) {
    // Set up the bind parameters
    MYSQL_BIND& bindParameter = bindParameters->at(N),
    bindParameter.buffer_type = MYSQL_TYPE_LONG;
    bindParameter.buffer = const_cast<void*>(
      static_cast<const void*>(&value));
    bindParameter.is_unsigned = 0;
    bindParameter.is_null = 0;

    // Recursively instantiate another template
    // to bind the rest of the parameters
    InputBinder<N + 1, Tail...> binder;
    binder.bind(bindParameters, tail...);
  }
};

The remaining integral types will have similar specializations but with changes to the buffer_type and is_unsigned attributes. MySQL ignores the is_unsigned attribute for floating type points, so we can drop that line when we define those specializations. The specializations for string types require an additional buffer_length parameter to be set:

template <size_t N, typename... Tail>
struct InputBinder<N, Tail...>
  static void bind(
    std::vector<MYSQL_BIND>* const bindParameters,
    const std::string& value,
    const Tail&...
  ) {
    // Set up the bind parameters
    MYSQL_BIND& bindParameter = bindParameters->at(N),
    bindParameter.buffer_type = MYSQL_TYPE_STRING;
    bindParameter.buffer = const_cast<void*>(
      static_cast<const void*>(value.c_str()));
    bindParameter.buffer_length = value.length();
    bindParameter.is_null = 0;

    // Recursively instantiate another template
    // to bind the rest of the parameters
    InputBinder<N + 1, Tail...> binder;
    binder.bind(bindParameters, tail...);
  }
};

Running the code

Now that we have our variadic template defined, we just need to call it from the function that runs commands. We’ll first tell MySQL to prepare a query, make sure the number of provided parameters match the number of parameters in the query, bind the parameters using our template, and then run the query.

template<typename... Args>
my_ulonglong runCommand(
  const char* const command,
  const Args&... args
) {
  MYSQL_STMT* const statement = mysql_stmt_init(connection_);

  const size_t length = ::strlen(command);
  mysql_stmt_prepare(statement, command, length);

  // Commands (e.g. INSERTs or DELETEs) should always have this set to 0
  if (0 != mysql_stmt_field_count(statement)) {
    throw MySqlException("Tried to run SELECT statement with runCommand");
  }

  const size_t parameterCount = mysql_stmt_param_count(statement);
  if (sizeof...(args) != parameterCount) {
    std::string errorMessage{"Incorrect number of parameters; command required "};
    errorMessage += boost::lexical_cast<std::string>(parameterCount);
    errorMessage += " but ";
    errorMessage += boost::lexical_cast<std::string>(sizeof...(args));
    errorMessage += " parameters were provided.";
    throw MySqlException(errorMessage);
  }

  std::vector<MYSQL_BIND> bindParameters;
  bindParameters.resize(parameterCount);
  InputBinder<0, Args...> binder;
  binder.bind(&bindParameters, args...);
  mysql_stmt_execute(statement);

  // If the user ran a SELECT statement or something else, at least warn them
  const my_ulonglong affectedRows = mysql_stmt_affected_rows(statement);
  if (((my_ulonglong)(-1)) == affectedRows) {
    throw MySqlException{"Expected command but ran SELECT statement"};
  }

  // Cleanup
  mysql_stmt_free_result(statement);
  mysql_stmt_close(statement);

  return affectedRows;
}

When we instantiate the InputBinder template, we start if with a size of 0 so that it will start binding parameters with the first element from the provided vector.

With that, we can start running commands using code like this:

MySql connection{"localhost", "user", "password"};

connection.runCommand(
  "INSERT INTO user (name, age) VALUES (?, ?), (?, ?)",
  names[0],
  ages[0],
  names[1],
  ages[1]);

In the next post, I’ll cover how to use variadic templates to parse the results from MySQL and store them into a vector. All of the code in this tutorial is available at from my GitHub account.

Posted in Uncategorized | Tagged , , | 1 Comment

Writing a simple shell using Flex and Lemon: part 2

Introduction

In part 1, we created a simple parser and driver for a basic shell that supports piping and command substitution. In order to test our parser, the driver in our last part only simulated reading tokens. In this part, we’re going to be using Flex to generate a real scanner that’s capable of reading tokens and passing them to the parser.

One other reason that I prefer Lemon over Bison is that Lemon is automatically reentrant and thread safe. Flex is not normally thread safe, but we’ll do some extra work to make it that way so that we can use it with multiple threads if we need to. In the process, we’ll also write less tightly coupled code.

Flex

Flex is a tool that lets us write a scanner using a higher level descriptive input, similar to how we used our Lemon grammar file to produce a parser. Flex is nice in that it allows us to use regular expressions to match tokens, and can automatically ignore white space.

Our scanner needs are pretty simple: our parser only has four tokens, PIPE, two tokens for starting and ending command substitutions, FILENAME which will match Unix files, and ARGUMENT, which could be a program name, a file name, a flag, or some other argument. The first three are relatively easy, but the last will require a little more work because we’re going to allow quoting of parameters.

Flex syntax

A Flex specification file consists of three sections. The first section contains configuration options, the second contains the rules for matching tokens, and the third allows you to enter code that will be copied to the end of the output file. Each section is separated by the token %%. We will only use the first two sections here.

It’s worth mentioning that Flex will automatically match the longest rule that it can find. If you have a regular expression that can match anything, make sure that Flex only matches a single character; otherwise, it will prefer that rule over other ones, even though the others may be a better fit.

Configurations section

Start editing a file named shellscanner.l. The first thing that we will want to do is to include the generated tokens definition file from Lemon, so add this at the top:

%{
    #include "shellparser.h"
%}

Next, we can put various options to control Flex’s behavior. The first option we’re interested in is one that makes Flex reentrant, so add this next:

%option reentrant

Flex will also call a function named yywrap after it has finished scanning so that you can, for example, scan multiple files in a chain. We don’t care about this, so add

%option noyywrap

to prevent Flex from trying to call it.

Now we will define various states that the scanner can move into. In the normal state, the scanner just reads tokens and returns their values, but there are times where we want to change into a special state and have differently defined actions. For example, in C, identifiers consist of a string of alphanumeric characters, so Flex will return an IDENTIFIER value whenever it reads one. However, if we are parsing a comment, then we want Flex to ignore these strings, so we define a separate state for parsing while we are in a comment.

In our example, we’ll have one state for quoted parameters. Other shells behave differently when presented with single-quoted versus double-quoted parameters (e.g. double-quoted parameters honor escaping, but single-quoted parameters don’t) but we’ll ignore those differences here. Define the two states by entering:

%x SINGLE_QUOTED
%x DOUBLE_QUOTED

That should be it for this section, so enter in

%
%

and we’ll get on to the next section.

Rules section

Rules in Flex start with an optional state identifier in angle brackets (if not in the default state), a regular expression to match in the input, followed by some action to take in curly braces. The normal action that you will take is to return the value that represents the token that you matched, although sometimes you may not want to return anything at all, such as when you are parsing whitespace.

Our first rule is our easiest. If we see the pipe character, we return the PIPE token.

"|"	{return PIPE;}

We’ll do similar things for the command substitution rules.

"$("    {return COMMAND_SUBSTITUTION_START;}
")"     {return COMMAND_SUBSTITUTION_END;}

Our next rule will ignore any whitespace that shows up in our command line.

[ \t\n\r]	{}

The next rule will match Unix filenames. Filenames in Unix can use just about any character, but we’re going to limit it to alphanumeric characters and underscores, periods, and dashes. Our action in this case will be to just return the FILENAME token defined in the Lemon-generated header file.

[a-zA-Z0-9_\.\-]	{return FILENAME;}

Note that we had to escape the period and the dash, as both have special meanings in regular expressions.

Our next rules will handle quoted arguments. These are more complicated because we have to use states. To start, we’ll specify how to enter a state, then what actions to take while we’re in that state, and how to get out of it, and what to return when we’re done.

We’ll start with single quoted arguments first. To start, we enter the QUOTED state any time we encounter a single quote character. Flex has a special action named BEGIN that will switch to other states that we have defined in the first section, or back to the DEFAULT state.

"'"	{BEGIN(SINGLE_QUOTED);}

Once we’re in the SINGLE_QUOTED state, we want to ignore keep reading characters until we hit the closing quote, at which point we are can go back to the normal INITIAL scanning state.

<SINGLE_QUOTED>[^']+	{}
<SINGLE_QUOTED>"'"	{BEGIN(INITIAL); return ARGUMENT;}

One final note is that if we start quoting an argument but never find the closing quote, we need to signal an error. Flex has a special rule named &lt&ltEOF>>, so if we encounter it, we’ll just return -1 because -1 is an invalid state.

<SINGLE_QUOTED><<EOF>>	{return -1};

Our double quoted arguments will be similar, except that we’ll start and end with double quotes instead.

["]	{BEGIN(DOUBLE_QUOTED);}
<DOUBLE_QUOTED>[^"]	{}
<DOUBLE_QUOTED>["]	{BEGIN(INITIAL); return ARGUMENT;}
<DOUBLE_QUOTED><<EOF>>	{return -1};

The only thing we have left to match is other arguments, such as --color=auto. For this case, we just want to match any string consisting of not whitespace and quotes.

[^ \t\r\n|'"]+	{return ARGUMENT;}

Finally, we can close this section and finish this file by entering

%
%

Your full file should look like this (you can also grab the source from my GitHub:

%{
    #include "shellparser.hpp"
%}

%option reentrant
%option noyywrap

%x SINGLE_QUOTED
%x DOUBLE_QUOTED

%%

"|"    {return PIPE;}
"$("    {return COMMAND_SUBSTITUTION_START;}
")"     {return COMMAND_SUBSTITUTION_END;}

[ \t\r\n]    {}

[a-zA-Z0-9\-\._]+    {return FILENAME;}

[']                     {BEGIN(SINGLE_QUOTED);}
<SINGLE_QUOTED>[^']+    {}
<SINGLE_QUOTED>[']      {BEGIN(INITIAL); return ARGUMENT;}
<SINGLE_QUOTED><<EOF>>  {return -1;}

["]                     {BEGIN(DOUBLE_QUOTED);}
<DOUBLE_QUOTED>[^"]+    {}
<DOUBLE_QUOTED>["]      {BEGIN(INITIAL); return ARGUMENT;}
<DOUBLE_QUOTED><<EOF>>  {return -1;}

[^ \t\r\n|'"]+    {return ARGUMENT;}

%%

One final note. If Flex can’t match some input against a rule, it will by default just print that input. A lot of tutorials therefore recommend putting a rule at the end that will match any single character and then returning an error if that rule is matched. I haven’t done that here because we should be matching everything with our own rules. The good news is that Flex is smart enough to recognize if a rule can’t be matched and will print a warning, so you can test that everything is being matched by adding this rule at the end:

.    {return -1;}

Updating the Lemon parser

In part 1, if Lemon ever encountered a syntax error, it would just print a message to stderr. That worked fine when we were just testing the parser, but now that we’re integrating a scanner, we’ll need a better way of reporting errors.

Lemon allows you define and send an additional parameter to the parse function that will then be accessible in the parser’s rules. If you were implementing a compiler or something similar, you could use this parameter to send a struct that counts how many tokens you had seen or keep track of other information as the parser is running. For our purposes, we’ll just send a boolean by pointer that the parser will set to false if it encounters an error.

First, define the extra parameter by adding:

%extra_argument {bool* valid}

then change the %syntax_error section to read:

%syntax_error {
    *valid = false;
}

Updating the driver

We’ll use a driver that’s very similar to what we did last time, but now we’ll manually call the scanner. We also have to initialize the scanner ourselves with the input that we read from the user.

First, we’ll write a separate function that we will call from main to parse a single command line. It will take a string that is the command line to be parsed. In this function, we’ll have to initialize the Flex scanner and keep reading tokens and calling the parser until we either hit the end of file or encounter an error. Then, our main function just needs to read one line at a time and call the parse function with that line. You can also grab this code from my GitHub. If C is more your style, you can check out David E. Wheeler’s port too.

#include <cstdlib>
#include <iostream>
#include <string>
#include "shellparser.hpp"
#include "shellscanner.yy.hpp"
using namespace std;

void* ParseAlloc(void* (*allocProc)(size_t));
void Parse(void* parser, int token, const char*, bool* valid);
void ParseFree(void* parser, void(*freeProc)(void*));

void parse(const string& commandLine) {
    // Set up the scanner
    yyscan_t scanner;
    yylex_init(&scanner);
    YY_BUFFER_STATE bufferState = yy_scan_string(commandLine.c_str(), scanner);

    // Set up the parser
    void* shellParser = ParseAlloc(malloc);

    bool validParse;
    int lexCode;
    do {
        lexCode = yylex(scanner);
        Parse(shellParser, lexCode, NULL, &validParse);
    }
    while (lexCode > 0 && validParse);

    if (-1 == lexCode) {
        cerr << "The scanner encountered an error.\n";
    }

    if (!validParse) {
        cerr << "The parser encountered an error.\n";
    }

    // Cleanup the scanner and parser
    yy_delete_buffer(bufferState, scanner);
    yylex_destroy(scanner);
    ParseFree(shellParser, free);
}

int main() {
    string commandLine;
    while (getline(cin, commandLine)) {
        parse(commandLine);
    }
    return 0;
}

Updating the Makefile

Now that we’ve added some new files, we’ll have to udpdate the dependencies and add some new rules to make the Flex source files by running flex against the scanner specification file. We’ll use the same .PHONY specification trick as we did with the parser to create a fake target and list it as a dependency.

Also, because our folder is getting somewhat cluttered with files generated from Lemon and Flex and with object files, we’ll create another target named clean that will remove all of these genereated files. We can clean up our directory by just running

make clean

Just like with our Flex and Lemon targets, clean isn’t a real target, so we’ll use the same .PHONY trick. You can also grab this code from my GitHub.

shell: main.o shellparser.o shellscanner.yy.o
	$(CXX) main.o shellparser.o shellscanner.yy.o -o shell

main.o: main.cpp shellscanner.yy.hpp shellparser.hpp

shellparser.cpp: lemonfiles

shellparser.hpp: lemonfiles

.PHONY: lemonfiles
lemonfiles: shellparser.y
	lemon shellparser.y -s
	mv shellparser.c shellparser.cpp
	mv shellparser.h shellparser.hpp

shellscanner.o: shellscanner.yy.hpp shellscanner.yy.cpp

shellscanner.yy.cpp: flexfiles

shellscanner.yy.hpp: flexfiles

.PHONY: flexfiles
flexfiles: shellscanner.l
	flex --outfile=shellscanner.yy.cpp --header-file=shellscanner.yy.hpp shellscanner.l

.PHONY: clean
clean:
	rm -f *.o
	rm -f shellscanner.yy.cpp shellscanner.yy.hpp
	rm -f shellparser.cpp shellparser.hpp shellparser.out
	rm -f shell

Testing

Now that we have everything in place, we should be ready to compile and test our program. Compile it by typing

make

and run it by typing

./shell

You can now start entering command line strings. If nothing gets printed, then that means the command was successfully parsed; an error message means that something went wrong. Test it by typing some example lines, such as:

cat main.cpp "hello world.txt" | wc -l
ls
cat $(find . | grep 'cpp$') | grep -v -P '^\s*$' | wc -l
echo "foo" 'bar' |
| ./shell
find . | grep test | grep tos | grep -v -i photos | wc
find . | |

Some of these should parse perfectly.

Posted in Uncategorized | Tagged , | 6 Comments

Writing a basic shell using Flex and Lemon: Part 1

Introduction

I’ve recently decided to switch one of my projects from using the Bison parser generator to Lemon. Bison is the go to standard for parser generators, but after looking into Lemon, I really think that Lemon is safer, easier to use, and a better choice for writing a parser.

Before I start using new technologies, I like to produce a non-trivial sample program so that I can better understand the ins and outs of it before I jump in with a full-sized project. One good-sized problem for this domain is writing a simple shell; the grammar is not terribly complex, but it’s a trivial problem either. It should give an idea of the kinds of problems to expect when expanding to a larger project.

In this tutorial, we will be writing a very simple shell using Flex and Lemon. With this shell, you will be able to run commands with arguments, use command substitution, and pipe output from one program into another. A simple case would just be the command

ls

while a more complex case would be

cat $(find . | grep -i access) | sed -r 's/http[s]?:\/\///g' | cut -d '/' -f 1 | sort | uniq -c

This tutorial won’t produce a shell that can run programs in the background, redirect output to a file, support up/down-arrow history, support command line editing, allow argument expansion, or any other advanced features (although you could add those in). It won’t be powerful enough to replace your main shell any time soon, but it should give you an idea of how Lemon and Flex work.

Lemon

Why not Bison?

As stated earlier, we will be using Lemon to generate a parser. Lemon has several features that I feel make it a better choice than Bison. First, with Bison, the parser directly calls the scanner, which creates a tight coupling between the two. It’s difficult to debug your parser until your scanner is already working. With Lemon, you call the scanner yourself and feed the results into the scanner. This lets you unit test both pieces more easily. Second, Bison supports numerical arguments for its rules, which makes reading the code more difficult and error-prone. For example, a Bison rule for a C-style function would look like:

function:
  returnType LEFT_PARENTHESIS parameterList RIGHT_PARENTHESIS methodBody
  {
     $$ = new FunctionNode($1, $3, $5);
  }

where $$ is the return value for this rule and the $1 $3 $5 tokens are the return values for the returnType parameterList methodBody rules respectively. This is difficult to read because understanding the meaning of the action requires looking at the rule. Quick, without looking back at the rule, what’s the $5 in the action represent?

Contrast this with Lemon’s syntax:

function(FUNCTION) ::= returnType(TYPE) LEFT_PARENTHESIS parameterList(PARAMS) RIGHT_PARENTHESIS methodBody(BODY) .
{
  FUNCTION = new FunctionNode(TYPE, PARAMS, BODY);
}

With Lemon, we provide a name for each of the tokens in parentheses that we want to use in the action. Then, in the action, we refer to the name and don’t have to refer back to the rule to understand which token we are referring to. We can also modify the rule by inserting more tokens without having to update the $1 $3 $5 arguments in the action to match the new positions of the tokens. Lemon will also warn you if you give a name to a token but don’t use it in the action. The dot at the end rule indicates the end of the rule.

Getting Started

Let’s start writing a basic Lemon parser for our shell. Create a file called shellparser.y; this will be our Lemon grammar file.

Let’s start by defining our grammar rules. By convention, terminal symbols (that is, grammar rules that can’t be broken down any more) are specified in all capital letters, and other tokens start with a lower-case letter. For now, we won’t include any actions – we’ll add those after we test our grammar.

Our first rule should allow a bunch of possibly piped commands. Let’s call this commandList.

start ::= commandList .
{
}
commandList ::= command PIPE commandList .
{
}
commandList ::= command .
{
}

Here, I’ve specified that a commandList either consists of a single command followed by a pipe and another commandList, or just a single command. This way, a long list of commands will continue expanding until it hits the last command.

Our next rule is for a single command, which consists of a program’s name and a (possibly empty) series of arguments. Note that because other filenames can be used as arguments, we will have to define a rule for arguments to use either filenames or regular arguments.

command ::= FILENAME argumentList .
{
}
command ::= FILENAME .
{
}
command ::= COMMAND_SUBSTITUTION_START commandList COMAMND_SUBSTITUTION_END .
{
}

argumentList ::= argument argumentList .
{
}
argumentList ::= argument .
{
}
argument ::= ARGUMENT .
{
}
argument ::= FILENAME .
{
}
argument ::= COMMAND_SUBSTITUTION_START commandList COMAMND_SUBSTITUTION_END .
{
}

That should be it for our rules. However, because we are also planning on running these commands at some point, we’ll want to store what our ARGUMENTs were. To do so, we need to declare the types of these terminals. For now, we’ll define them as C-style strings. Add this above the rules:

%token_type {const char*}

If Lemon encounters an error while parsing, it will run the code defined in the %syntax_error section. Let’s define that near the top of our file:

%syntax_error
{
    std::cerr << "Error parsing command\n";
}

To get that to work properly, we will also have to include the iostream header file. Lemon also uses assertions, so we will have to include the header file for those. To make Lemon do that, add this to your file:

%include
{
    #include <cassert>
    #include <iostream>
}

Your full Lemon grammar should look like this (this code is also available on my GitHub):

%include
{
    #include <cassert>
    #include <iostream>
}

%syntax_error
{
    std::cerr << "Error parsing command\n";
}

%token_type {const char*}

start ::= commandList .
commandList ::= command PIPE commandList .
{
}
commandList ::= command .
{
}

command ::= FILENAME argumentList .
{
}
command ::= FILENAME .
{
}
command ::= COMAND_SUBSTITUTION_START commandList COMMAND_SUBSTITUTION_END .
{
}

argumentList ::= argument argumentList .
{
}
argumentList ::= argument .
{
}
argument ::= ARGUMENT .
{
}
argument ::= FILENAME .
{
}
argumentList ::= COMAND_SUBSTITUTION_START commandList COMMAND_SUBSTITUTION_END .
{
}

Testing

Driver

To test our grammar file, let’s write a simple driver file and a Makefile to compile everything. Start by writing a file named Makefile first. Add the following code to it:

shell: main.o shellparser.o
	$(CXX) main.o shellparser.o -o shell

main.o: main.cpp shellparser.hpp

shellparser.cpp: lemonfiles

shellparser.hpp: lemonfiles

.PHONY: lemonfiles
lemonfiles: shellparser.y
	lemon shellparser.y -s
	mv shellparser.c shellparser.cpp
	mv shellparser.h shellparser.hpp

make can figure out some simple rules on its own; for example, shellparser.o only relies on shellparser.cpp, so make will automatically compile it. Some files, such as main.o, need to have their dependencies manually listed so that make will recompile them if their include files change. In this case, make is able to figure out a default action for creating main.o.

The most interesting part is with the .PHONY line. Normally, make expects that a target (the bit before the colon) will be made upon running the actions that follow the target. However, if we simply listed shellparser.cpp and shellparser.hpp as targets with actions that ran Lemon, make would attempt to run Lemon twice, which can cause weird problems. By using .PHONY, we can tell make that the target lemonfiles won’t actually be created by the actions, but that it can still be used as a dependency target.

Now, let’s write our test main file. Lemon will produce a header file that defines the values of the various tokens that it recognizes. Unfortunately, this header does not define the interface to the parser, so we will have to forward declare the interface ourselves. To start the parser, we call the ParseAlloc function with a function to allocate memory with. After that, we call the Parse function with values of tokens that we have scanned. The end of the tokens is indicated with a token value of 0. Finally, we clean up the parser by calling ParseFree.

Edit the file main.cpp:

#include <cstdlib>
#include <iostream>
#include "shellparser.hpp"
using namespace std;

void* ParseAlloc(void* (*allocProc)(size_t));
void Parse(void*, int, const char*);
void ParseFree(void*, void(*freeProc)(void*));

int main()
{
    void* shellParser = ParseAlloc(malloc);
    // Simulate a command line such as "cat main.cpp | wc"
    Parse(shellParser, FILENAME, "cat");
    Parse(shellParser, ARGUMENT, "main.cpp");
    Parse(shellParser, PIPE, 0);
    Parse(shellParser, FILENAME, "wc");
    Parse(shellParser, 0, 0); // Signal end of tokens
    ParseFree(shellParser, free);
    return 0;
}

Compiling

Let’s test our program by compiling it. Because we specified a valid list of commands in our driver, it should run without calling the %syntax_error directive we defined in our grammar file. Compile everything by typing:

make

and run the driver by typing:

./shell

If no message is printed, then everything worked!

Breaking

Now let’s check to see if the parser fails when given invalid tokens. Our shell should refuse to parse the command line:

cat main.cpp | | wc

because there are two pipes. Edit the main.cpp file and make two copies of the PIPE line to simulate this command line. Recompile and run your program, and it should print an error.

Posted in Uncategorized | Tagged , | 6 Comments

Rounding function for bc

bc doesn’t include a way to round floating point values to the nearest integer, but you can write one yourself. Just type this in:

define round(n)
{
    oldscale = scale
    if (n > 0)
        value = n + 0.5
    if (n < 0)
        value = n - 0.5
    scale = 0
    rounded = value / 1
    scale = oldscale
    return rounded
}

If you don’t feel like typing this in every time, you can add it to a file that will be loaded when you start bc. Type it into a file, such as $HOME/.bcrc and then add the following to your .bashrc file:

export BC_ENV_ARGS="$HOME/.bcrc"

Posted in Uncategorized | Tagged , | 2 Comments

C++ Assignment Operator or: How I Learned to Stop Worrying and not Check for Self-Assignment

If you’re already here, I’m going to assume you know a lot of the intricacies of writing an assignment operator in C++. Getting all the ins and outs of writing assignment operators is tricky – if you want a good discussion of all the bits and pieces, I’d recommend taking a look at The Anatomy of the Assignment Operator. But that’s not why I’m here, so at the risk of being blasphemous, I’ll just get my main point out of the way.

If you are writing
if (this != &rhs)
in your assignment operator, you are probably doing it wrong, or at the very least, not as right as it should be.

That’s it. Take a deep breath, sit back, relax and just let it settle. I’ve been told to do it this way for years, and you probably have too, so it might be a bit of a shock. But please, please stop telling people to write it this way. It’s important to understand why people do it, but it’s just as important to understand why it’s wrong.

Self-assignment

Let’s look at why people recommend writing this check. Imagine we have the following class:

class Foo
{
public:

    ...

private:
    Bar* myBar;
};

A first attempt at an assignment operator might look like the following:

Foo& operator=(const Foo& rhs)
{
    delete myBar;
    myBar = new Bar(*rhs.myBar);
    return *this;
}

The first thing that we do is delete the old copy of myBar so that no memory is leaked. Then, we make a copy of the right hand side’s myBar. This is where our problems start. What would happen in the above assignment operator if I tried to assign an object to itself? For example, if I wrote the following:

Foo f;
f = f;

In this case, the parameter rhs would be the same as the current object. So when we try to run the line 3 in the assignment operator:

delete myBar;

the rhs’s myBar is also being deleted. So what happens when we try to run the next line?

myBar = new Bar(*rhs.myBar);

We’re trying to access and copy memory that has already been deleted. At this point, any number of things could happen. Some of the time, that memory will still belong to our process and will be unchanged, and our program will happily continue. But once in a while, the program will mysteriously crash. Trying to debug this issue when it works 99 out of 100 times is difficult and irritating.

So what can we do instead? The commonly suggested solution is to check for self-assignment and skip the allocation and deletion in that case. This does solve the problem, but let’s take a step back and look at our code again. The problem with self-assignment in the above constructor is that by the time we are ready to copy the Bar object, we’ve already deleted it. But what if we did the copy first, swapped that copy with our member variable, and then deleted what used to be our member variable?

Here’s a new code snippet that does that instead:

Foo& operator=(const Foo& rhs)
{
    Bar* temp = new Bar(*rhs.myBar);
    swap(temp, myBar);
    delete myBar;
    return *this;
}

What happens now if we accidentally do self-assignment? Well, line 3 makes a copy of the member variable. Line 4 swaps the copy with our member variable. Finally, line 5 deletes what used to be our member copy. No memory is leaked, no deleted memory is accessed, and at worst, some cycles are wasted making a copy of some memory and deleting it. We fixed the problem without ever needing a check for self-assignment!

But just avoiding a special case doesn’t necessarily make this version better, does it? Well…

Exception guarantees

In C++, there are several guidelines you should try to follow whenever you’re writing a function. These are known as exception guarantees. In some cases, it is impossible to implement the stronger ones, but you should try to implement them if you can.

The first guarantee is the basic guarantee which states that your code should not leak memory if any exceptions are thrown. In both our copy constructors, the only line that could throw an exception is the memory allocation. If it does, then there is no memory cleanup required because we’re only trying to allocate a single object. If our class did have more objects, we would need to allocate them sequentially, catch any exceptions that might be thrown while making copies of them and delete the copies that we had made so far before rethrowing the exception.

The second guarantee is the strong guarantee which states that your code should leave the object in a consistent state if any exceptions are thrown. Let’s imagine that I was using the Foo class in the following context:

do
{
    try
    {
        Foo f1, f2;

        ...

        f1 = f2;

        ...

    }
    catch (bad_alloc& e)
    {
        // Clean up some memory and keep running
    }
} while (!done);

Here, we have a long running process and that may run out of memory at some point, but we have a way of cleaning up some memory if an error ever does occur, so we’d like to continue running the process until it successfully completes.

Now imagine what happens with our first assignment operator. We first delete the member variable myBar, then try to make a copy of the right hand side’s myBar. If this throws an exception, then our Foo will be left with myBar pointing at some invalid memory. When our program tries to pick up where it left off, it will try to access that memory and will probably crash.

What about with our second assignment operator? We first try making a copy of the right hand side’s myBar. If it throws an exception, our Foo is left unchanged. The program can run the loop again and nothing crashes.

By doing any allocations before deleting objects, we’ve managed to avoid the problems of self-assignment without writing if (this != &rhs) . In addition, this code implements the strong guarantee and is therefore safer. If your code relies on a check for self-assignment for correctness, then it probably doesn’t implement the strong guarantee and is less safe than it should be.

But what about speed?

One question I’ve received when explaining this concept is that the second version is much slower when doing self-assignment than the original version. That is true – comparing two memory addresses is much less costly than allocating and deleting memory. However, in practice, self-assignment is generally rare. Optimizing one specific rare case at the expense of the much more common general cases will likely not gain you much. Putting in a special case to guarantee that your program does not crash 1 out of 1000 times is important; putting in a special case to optimize 1 out of 1000 cases is not. If you are concerned about speed, profile your code first and determine how often self-assignment really occurs.

Alternatives

It turns out that there is an even easier way to write an assignment operator as long as certain conditions are met. The strongest exception guarantee is the no-throw guarantee which states that no exceptions will be thrown. As long as we’re not allocating any memory or other objects, this should be possible. In Foo, the only member variable we have is a pointer. Because moving pointers will not throw an exception, we can write a member swap() method as part of the Foo class that swaps the contents of two Foo objects that has the no-throw guarantee. Then, we can write the whole copy constructor as follows:

Foo& operator=(Foo rhs)
{
    swap(rhs); // Calls Foo::swap
    return *this;
}

Rather than sending the right hand side object by reference, we send it by value; this causes the compiler to make a temporary copy of the the right hand side object using Foo’s copy constructor. This temporary object will be sent to the method, and the copy will be swapped with the local object. Then, upon exiting the method, the temporary object will go out of scope and be automatically destroyed. And so, without doing any extra work, we automatically get the basic and strong guarantees.

Posted in Uncategorized | Tagged , | 5 Comments