Using C++11 variadic templates to create a type-safe and injection-safe database interface: part 1

Introduction

Prior to C++11, there was no general way in C++ to create type-safe functions that would take a variable number of arguments. C++ had inherited ..., the ellipsis operator, from C that would allow a function to take a variable number of arguments that would then be stored on the stack. These arguments would not be type checked at compile time, and if you weren’t careful, you would encounter runtime errors.

C++11 introduced variadic templates, which are templates that take a variable number of arguments. Similarly to how template metaprogramming works, you specify a template recursively and the compiler will unroll the template definitions at compile time, affording you type safety and the performance of linear code without sacrificing the flexibility of variable numbers of arguments.

In this post, we’ll use variadic templates to create a database interface that is both safe from SQL injection attacks and that automatically type checks and converts results from SELECT statements. In the end, you’ll be able to write code similar to this:

MySql connection{"localhost", "user", "password"};
vector<tuple<string, string, int>> users;
connection.runQuery(
  &users,
  "SELECT first_name, last_name, age FROM users WHERE username LIKE ? AND age > ?",
  username,
  age);

which will run the query and store the results into the provided vector.

Variadic templates

Syntax

A template that takes a variable number of arguments is specified using ..., the ellipsis operator. As with templates in C++98, you can create templated versions of classes or functions. For this example, we will be using the ellipsis to create parameter packs, which hold the variadic parameters. We define a template with two variable types: one from the beginning of the parameter pack and a second variadic type that represents the rest of the parameter pack. The compiler will unroll the parameter pack one parameter at a time and call the function for each parameter. Finally, we will provide an overload that takes no parameters to close off the the unrolling.

As an example, we could redefine the C function printf as follows:

template <typename Head, typename... Tail> // Tail is the parameter pack
int printf(const char* format, Head value, Tail... args);

int printf(const char* format);

A call to printf("%d %s %e", 1, "test", 1.5) will cause the compiler to generate these template instances:

printf<int, Tail...>("%d %s %e", 1, "test", 1.5)  // with Tail... = [const char*, double]
printf<const char*, Tail...>("%d %s %e", "test", 1.5) // with Tail = [double]
printf<double, Tail...>("%d %s %e", 1.5) // with Tail = []
printf("%d %s %e")

and each template instance will have its code expanded inline.

SQL injection

SQL injection is a technique used to attack databases. It works by causing the application to interpret user supplied input as SQL commands that alter the original query to something other than what the developer intended. There are several ways to mitigate this attack; we’re going to use parameterized queries here. Parameterized queries prevent SQL injection by clearly separating user-supplied data from the query that is supposed to use that data. This separation prevents the database engine from accidentally interpreting user data as SQL commands.

The MySQL prepared statement C API requires the developer to specify a full query with placeholders meant for user-supplied data, and then bind parameters to the query by specifying each parameter’s type and value. In C, this would be done as follows:

MYSQL_BIND parameter;
memset(&parameter, 0, sizeof(parameter));
parameter.buffer_type = MYSQL_TYPE_LONG;
parameter.buffer = &age;
parameter.is_null = &nullFlag;
parameter.is_unsigned = 0;
parameter.error = &errorFlag;
mysql_stmt_bind_param(&statement, &parameter);

Setting up parameterized queries requires a lot of boilerplate code and is prone to typos. For example, using the wrong type code on line 3 can cause runtime errors. However, by using variadic templates and parameter packs, we can define functions that will automatically prepare these parameters for us at compile time based on the types of the arguments that we provide.

Binding parameters

To start with, we’ll define a variadic templated class that we’ll use to bind the input parameters to the query. C++11 doesn’t allow for partial template specialization of variadic templated functions, so we’ll need to encapsulate a function inside of a class. We’ll start by defining the base case for the variadic template expansion.

template <size_t N, typename... Args>
struct InputBinder {
  static void bind(std::vector<MYSQL_BIND*>* const) {}
};

This function will take a vector of MYSQL_BIND parameters that we will set up one by one. We also include a size constant in the template arguments so that we know which parameter to bind to. This variable will be incremented when we make recursive instantiations of this class.

To expand the variadic template, we’ll define a function to unpack a single parameter from the parameter pack at a time. We’ll eventually define partial template specializations for different types of parameters so that we can set up the special behavior required for binding the parameters, but to start, we’ll define the generic function. This function will be instantiated if there’s no other specialization that matches a type, and we want to force the programmer to explicitly define specializations for each type that they use, so we’ll cause the compiler to generate an error if this function is ever instantiated.

template <size_t N, typename Head, typename... Tail>
struct InputBinder<N, Head, Tail...>
  static void bind(
    std::vector<MYSQL_BIND>* const,
    const Head&,
    const Tail&...
  ) {
    static_assert(
      sizeof...(Tail) < 0,
      "All types need to have partial template specialized instances"
      " defined for them");
  }
};

Here, we’re using another C++11 feature, static_assert, to provide a meaningful error message if this function is ever instantiated. C++ guarantees that the sizeof... any parameter pack is at least 0, so this assertion will always fail.

The simplest types to bind parameters to are integral types, so we’ll start with those. We’ll take a MYSQL_BIND from the provided vector and set the various attributes required for binding the parameters. When we’re done, we’ll recursively instantiate another variadic template class to bind the remaining parameters.

template <size_t N, typename... Tail>
struct InputBinder<N, Tail...>
  static void bind(
    std::vector<MYSQL_BIND>* const bindParameters,
    const int32_t& value,
    const Tail&...
  ) {
    // Set up the bind parameters
    MYSQL_BIND& bindParameter = bindParameters->at(N),
    bindParameter.buffer_type = MYSQL_TYPE_LONG;
    bindParameter.buffer = const_cast<void*>(
      static_cast<const void*>(&value));
    bindParameter.is_unsigned = 0;
    bindParameter.is_null = 0;

    // Recursively instantiate another template
    // to bind the rest of the parameters
    InputBinder<N + 1, Tail...> binder;
    binder.bind(bindParameters, tail...);
  }
};

The remaining integral types will have similar specializations but with changes to the buffer_type and is_unsigned attributes. MySQL ignores the is_unsigned attribute for floating type points, so we can drop that line when we define those specializations. The specializations for string types require an additional buffer_length parameter to be set:

template <size_t N, typename... Tail>
struct InputBinder<N, Tail...>
  static void bind(
    std::vector<MYSQL_BIND>* const bindParameters,
    const std::string& value,
    const Tail&...
  ) {
    // Set up the bind parameters
    MYSQL_BIND& bindParameter = bindParameters->at(N),
    bindParameter.buffer_type = MYSQL_TYPE_STRING;
    bindParameter.buffer = const_cast<void*>(
      static_cast<const void*>(value.c_str()));
    bindParameter.buffer_length = value.length();
    bindParameter.is_null = 0;

    // Recursively instantiate another template
    // to bind the rest of the parameters
    InputBinder<N + 1, Tail...> binder;
    binder.bind(bindParameters, tail...);
  }
};

Running the code

Now that we have our variadic template defined, we just need to call it from the function that runs commands. We’ll first tell MySQL to prepare a query, make sure the number of provided parameters match the number of parameters in the query, bind the parameters using our template, and then run the query.

template<typename... Args>
my_ulonglong runCommand(
  const char* const command,
  const Args&... args
) {
  MYSQL_STMT* const statement = mysql_stmt_init(connection_);

  const size_t length = ::strlen(command);
  mysql_stmt_prepare(statement, command, length);

  // Commands (e.g. INSERTs or DELETEs) should always have this set to 0
  if (0 != mysql_stmt_field_count(statement)) {
    throw MySqlException("Tried to run SELECT statement with runCommand");
  }

  const size_t parameterCount = mysql_stmt_param_count(statement);
  if (sizeof...(args) != parameterCount) {
    std::string errorMessage{"Incorrect number of parameters; command required "};
    errorMessage += boost::lexical_cast<std::string>(parameterCount);
    errorMessage += " but ";
    errorMessage += boost::lexical_cast<std::string>(sizeof...(args));
    errorMessage += " parameters were provided.";
    throw MySqlException(errorMessage);
  }

  std::vector<MYSQL_BIND> bindParameters;
  bindParameters.resize(parameterCount);
  InputBinder<0, Args...> binder;
  binder.bind(&bindParameters, args...);
  mysql_stmt_execute(statement);

  // If the user ran a SELECT statement or something else, at least warn them
  const my_ulonglong affectedRows = mysql_stmt_affected_rows(statement);
  if (((my_ulonglong)(-1)) == affectedRows) {
    throw MySqlException{"Expected command but ran SELECT statement"};
  }

  // Cleanup
  mysql_stmt_free_result(statement);
  mysql_stmt_close(statement);

  return affectedRows;
}

When we instantiate the InputBinder template, we start if with a size of 0 so that it will start binding parameters with the first element from the provided vector.

With that, we can start running commands using code like this:

MySql connection{"localhost", "user", "password"};

connection.runCommand(
  "INSERT INTO user (name, age) VALUES (?, ?), (?, ?)",
  names[0],
  ages[0],
  names[1],
  ages[1]);

In the next post, I’ll cover how to use variadic templates to parse the results from MySQL and store them into a vector. All of the code in this tutorial is available at from my GitHub account.

Advertisements
This entry was posted in Uncategorized and tagged , , . Bookmark the permalink.

One Response to Using C++11 variadic templates to create a type-safe and injection-safe database interface: part 1

  1. Pingback: Using C++11 variadic templates to create a type-safe and injection-safe database interface: part 2 | Brandon's Thoughts

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s