Writing Stable APIs with pImpl and Fast pImpl in C++

Ryonald Teofilo
7 min read · Apr 27, 2024

Source: timetrabble.com

If, like me, you struggled with acne growing up, you may have mistaken the pImpl idiom for some clever wordplay on the facial spots that plagued the teenage years for a lot of us. It has nothing to do with that, however: pImpl (pointer-to-implementation) is a C++ technique that can benefit us when writing code, especially when writing APIs (Application Programming Interfaces).

Let’s take a look at the snippet below to understand this.

// MyManager.h
class MyManager
{
public:
MyManager();
~MyManager();

void DoA();
void DoB();

private:
void DoRepetitiveThingBeforeA();
void DoBThings();

int mA{0};
int mB{0};
};

Here we are given two public methods, DoA() and DoB(), to interact with the class. However, C++ also requires us to declare the private members in the class definition, which in most cases the consumer of this class does not care or need to know about.

Besides exposing unnecessary information, these implementation details may also be something we actively want to hide, for instance if we ship our code as a library whose clients do not have access to the source code. This is where the pImpl idiom comes in handy.

// MyManager.h
class MyManager
{
public:
MyManager();
~MyManager();

void DoA();
void DoB();

private:
// Pointer-to-implementation
struct Impl;
Impl* pImpl;
};

// MyManager.cpp
#include <iostream>
#include "MyManager.h"

struct MyManager::Impl
{
void DoRepetitiveThingBeforeA()
{
std::cout << "Doing boring A things! " << mA << std::endl;
}
void DoBThings()
{
std::cout << "Doing B things! " << mB << std::endl;
}

int mA{0};
int mB{0};
};

MyManager::MyManager() : pImpl(new Impl) {}
MyManager::~MyManager() { delete pImpl; }

void MyManager::DoA()
{
pImpl->DoRepetitiveThingBeforeA();
/* more code */
}

void MyManager::DoB()
{
pImpl->DoBThings();
/* more code */
}

As the interface header shows, we’re now using an opaque pointer to the implementation to achieve total encapsulation: the header no longer mentions any private data members or helper functions, which hides the details we don’t need or want to show.

Another huge benefit of the pImpl idiom is stability. Separating the interface from the implementation minimises compile-time dependencies, which reduces build times during development, and it improves ABI (Application Binary Interface) stability. Here’s what we mean by that:

Because all the implementation details live in the .cpp file, changing them does not force us to re-compile code that depends on the interface, i.e. code that includes the header file. Of course, this only holds until you change the interface itself, which should now only happen when you add, remove or change the members you expose to clients.

This ties directly into ABI stability, which includes the size and data layout of your class. For example, if I later decide I want an extra member variable int mC, declaring it as a private member of MyManager in the header changes the class’s size and layout, so client code compiled against the old header no longer matches the library. Adding it to Impl instead leaves MyManager untouched.

// MyManager.h
class MyManager
{
public:
MyManager();
~MyManager();

void DoA();
void DoB();

private:
struct Impl;
Impl* pImpl;

// New member variable, this changes data layout!
// int mC;
};

// MyManager.cpp
#include <iostream>
#include "MyManager.h"

struct MyManager::Impl
{
void DoRepetitiveThingBeforeA()
{
std::cout << "Doing boring A things! " << mA << std::endl;
}
void DoBThings()
{
std::cout << "Doing B things! " << mB << std::endl;
}

int mA{0};
int mB{0};

// Add here instead!
int mC{0};
};

With pImpl, however, the only data member of MyManager is a pointer (8 bytes on a typical 64-bit system), so its size and layout stay the same however the implementation changes, which makes the interface more stable. This can matter a great deal depending on your circumstances: when you’re writing code shipped as a library, where you want to minimise disruption on each update, or when the code is long-lived, something we often don’t think about as developers.
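To make the layout argument concrete, here is a minimal single-file sketch. The class name StableManager and its Sum() method are hypothetical and only for illustration; the two halves would normally live in a header and a .cpp file. It assumes a mainstream platform where all object pointers have the same size.

```cpp
// --- What a client sees (normally the header) ---
class StableManager  // hypothetical name, not part of the original example
{
public:
    StableManager();
    ~StableManager();
    StableManager(const StableManager&) = delete;
    StableManager& operator=(const StableManager&) = delete;

    int Sum() const;

private:
    struct Impl;
    Impl* pImpl;
};

// The client-visible layout is a single pointer, whatever Impl contains.
static_assert(sizeof(StableManager) == sizeof(void*));

// --- What lives in the .cpp file ---
struct StableManager::Impl
{
    int mA{1};
    int mB{2};
    int mC{3};  // added later; sizeof(StableManager) is unaffected
};

StableManager::StableManager() : pImpl(new Impl) {}
StableManager::~StableManager() { delete pImpl; }

int StableManager::Sum() const { return pImpl->mA + pImpl->mB + pImpl->mC; }
```

Note that the static_assert sits before Impl is even defined: the size of StableManager is fixed by the header alone, which is exactly what clients compile against.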

pImpl but smarter — RAII wrappers

In modern C++, it doesn’t feel quite right to manage memory through raw owning pointers. So let’s make a quick change to our code and use std::unique_ptr to manage the lifetime of our implementation object.

// MyManager.h
#include <memory>

class MyManager
{
public:
MyManager();
~MyManager();

void DoA();
void DoB();

private:
struct Impl;
std::unique_ptr<Impl> pImpl;
};

// MyManager.cpp
#include <iostream>
#include "MyManager.h"

struct MyManager::Impl
{
void DoRepetitiveThingBeforeA()
{
std::cout << "Doing boring A things! " << mA << std::endl;
}
void DoBThings()
{
std::cout << "Doing B things! " << mB << std::endl;
}

int mA{0};
int mB{0};
};

MyManager::MyManager() : pImpl(std::make_unique<Impl>()) {}
MyManager::~MyManager() = default;

void MyManager::DoA()
{
pImpl->DoRepetitiveThingBeforeA();
/* more code */
}

void MyManager::DoB()
{
pImpl->DoBThings();
/* more code */
}

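Now we no longer need to manage the lifetime of the implementation object by hand, so the destructor can be defaulted. Note that it must still be declared in the header and defined (even as = default) in the .cpp file, where Impl is a complete type, because std::unique_ptr needs the complete type in order to destroy it. std::unique_ptr also makes MyManager non-copyable: its own copy constructor and copy assignment operator are deleted, so the compiler defines MyManager’s implicit copy operations as deleted too. With the raw pointer, a copy would have shared one Impl between two objects and deleted it twice.

Drawbacks of pImpl and a workaround: Fast pImpl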
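One consequence worth spelling out: because the destructor is user-declared, the compiler does not generate move operations either. If you want the class to be movable, one option is to declare the move operations in the header and default them in the .cpp file, next to the destructor. The sketch below shows this in a single file; MovableManager and Value() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// --- Header side ---
class MovableManager  // hypothetical name, for illustration only
{
public:
    MovableManager();
    ~MovableManager();  // defined below, where Impl is complete
    MovableManager(MovableManager&&) noexcept;             // declared only
    MovableManager& operator=(MovableManager&&) noexcept;  // declared only

    int Value() const;

private:
    struct Impl;
    std::unique_ptr<Impl> pImpl;
};

// --- Source side ---
struct MovableManager::Impl
{
    int mA{42};
};

MovableManager::MovableManager() : pImpl(std::make_unique<Impl>()) {}
MovableManager::~MovableManager() = default;
MovableManager::MovableManager(MovableManager&&) noexcept = default;
MovableManager& MovableManager::operator=(MovableManager&&) noexcept = default;

int MovableManager::Value() const
{
    // A moved-from object holds a null pImpl and must not be used.
    assert(pImpl && "MovableManager used after being moved from");
    return pImpl->mA;
}

// Copying stays disabled; moving is now available.
static_assert(!std::is_copy_constructible_v<MovableManager>);
static_assert(!std::is_copy_assignable_v<MovableManager>);
static_assert(std::is_nothrow_move_constructible_v<MovableManager>);
```

The trade-off is that a moved-from object is left with a null pImpl, so every member function either has to tolerate that state or, as in this sketch, treat it as a programming error.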

The keen-eyed amongst us may have noticed that this separation of interface and implementation has introduced a layer of indirection in our code which can adversely affect performance.

This varies depending on the hardware we’re running on, but the most common factors are the overhead from heap allocation and non-local memory accesses due to how caching works in modern computers — see my stories on memory alignment and cache-line alignment.

To work around this, the Fast pImpl idiom reduces the cost of both factors by constructing the implementation object inside the interface object. Let’s take a look at what this means.

// MyManager.h
#include <cstddef>

class MyManager
{
public:
MyManager();
~MyManager();
MyManager(const MyManager& other) = delete;
MyManager& operator=(const MyManager& other) = delete;

void DoA();
void DoB();

private:
alignas(std::max_align_t) std::byte storage[8];
struct Impl;
Impl* pImpl;
};

// MyManager.cpp
#include <iostream>
#include <memory>
#include <new>
#include "MyManager.h"

struct MyManager::Impl
{
void DoRepetitiveThingBeforeA()
{
std::cout << "Doing boring A things! " << mA << std::endl;
}
void DoBThings()
{
std::cout << "Doing B things! " << mB << std::endl;
}

int mA{0};
int mB{0};
};

MyManager::MyManager()
{
// Compile-time checks that storage is large enough and suitably aligned for Impl
static_assert(sizeof(Impl) <= sizeof(storage));
static_assert(alignof(Impl) <= alignof(std::max_align_t));
pImpl = new (&storage) Impl();
}

MyManager::~MyManager()
{
std::destroy_at(pImpl);
}

void MyManager::DoA()
{
pImpl->DoRepetitiveThingBeforeA();
/* more code */
}

void MyManager::DoB()
{
pImpl->DoBThings();
/* more code */
}

With this idiom, the memory for the implementation is reserved ahead of time inside the interface object itself, wherever that object lives (on the stack, for example). This avoids the cost of dynamic allocation and gives us memory locality, since the implementation sits right next to the rest of the interface object.
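The locality claim can be checked directly. The following is a rough sketch rather than production code: InlineManager and its ImplIsInline() method are hypothetical, and the address comparison relies on the flat address space of mainstream platforms.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

class InlineManager  // hypothetical name, for illustration only
{
public:
    InlineManager();
    ~InlineManager();
    InlineManager(const InlineManager&) = delete;
    InlineManager& operator=(const InlineManager&) = delete;

    // True if the implementation object lies within this object's own bytes.
    bool ImplIsInline() const;

private:
    static constexpr std::size_t kSize = 8;  // assumed storage budget
    static constexpr std::size_t kAlign = alignof(std::max_align_t);

    struct Impl;
    alignas(kAlign) std::byte storage[kSize];
    Impl* pImpl;
};

struct InlineManager::Impl
{
    int mA{0};
    int mB{0};
};

InlineManager::InlineManager()
{
    static_assert(sizeof(Impl) <= sizeof(storage), "storage is too small for Impl");
    static_assert(alignof(Impl) <= kAlign, "storage is under-aligned for Impl");
    pImpl = new (&storage) Impl();
}

InlineManager::~InlineManager() { std::destroy_at(pImpl); }

bool InlineManager::ImplIsInline() const
{
    const auto begin = reinterpret_cast<std::uintptr_t>(this);
    const auto addr = reinterpret_cast<std::uintptr_t>(pImpl);
    return addr >= begin && addr + sizeof(Impl) <= begin + sizeof(InlineManager);
}
```

ImplIsInline() returns true whether the InlineManager is a local variable or is itself created with std::make_unique: the implementation always travels with the interface object, and constructing it never triggers a separate allocation.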

The even-keener eyes amongst us will also notice that we removed the smart pointer. Because we no longer allocate memory dynamically (no operator new or malloc()), there is nothing to deallocate (no operator delete or free()); we simply have to destroy the object at that address by calling its destructor, which is what std::destroy_at does. Be aware, though, that without std::unique_ptr the compiler would happily generate copy operations that copy the raw pImpl pointer, leaving the copy pointing into the original object’s storage. We therefore explicitly delete the copy constructor and copy assignment operator, which also prevents their move counterparts from being implicitly declared.

The pImpl member is now just a proxy for the address of storage. We can therefore replace it with a helper member function that computes the pointer on demand, which can save a pointer’s worth of memory per object (whether sizeof(MyManager) actually shrinks depends on alignment padding).
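Deleting the copy operations is the simplest safe choice, but it is not the only one. If copies are needed, a possible alternative is to write the copy operations by hand so that each object constructs its own Impl in its own storage. The sketch below assumes Impl itself is copyable; CopyableManager, SetA() and GetA() are hypothetical names.

```cpp
#include <cstddef>
#include <memory>
#include <new>

class CopyableManager  // hypothetical name, for illustration only
{
public:
    CopyableManager();
    ~CopyableManager();
    CopyableManager(const CopyableManager& other);
    CopyableManager& operator=(const CopyableManager& other);

    void SetA(int value);
    int GetA() const;

private:
    struct Impl;
    alignas(std::max_align_t) std::byte storage[8];
    Impl* pImpl;  // always points into this object's own storage
};

struct CopyableManager::Impl
{
    int mA{0};
    int mB{0};
};

CopyableManager::CopyableManager()
{
    static_assert(sizeof(Impl) <= sizeof(storage));
    pImpl = new (&storage) Impl();
}

// Copy-construct a fresh Impl in our own storage instead of copying the pointer.
CopyableManager::CopyableManager(const CopyableManager& other)
{
    pImpl = new (&storage) Impl(*other.pImpl);
}

// Assign the Impl's contents; pImpl keeps pointing at our own storage.
CopyableManager& CopyableManager::operator=(const CopyableManager& other)
{
    *pImpl = *other.pImpl;
    return *this;
}

CopyableManager::~CopyableManager() { std::destroy_at(pImpl); }

void CopyableManager::SetA(int value) { pImpl->mA = value; }
int CopyableManager::GetA() const { return pImpl->mA; }
```

The key point is that the pointer member is never copied from the other object: only the Impl it points to is, so each CopyableManager destroys exactly one Impl, its own.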

// MyManager.h
#include <cstddef>

class MyManager
{
public:
MyManager();
~MyManager();
MyManager(const MyManager& other) = delete;
MyManager& operator=(const MyManager& other) = delete;

void DoA();
void DoB();

private:
struct Impl;
alignas(std::max_align_t) std::byte storage[8];

// Helper
Impl* GetpImpl();
};

// MyManager.cpp
#include <iostream>
#include <memory>
#include <new>
#include "MyManager.h"

struct MyManager::Impl
{
void DoRepetitiveThingBeforeA()
{
std::cout << "Doing boring A things! " << mA << std::endl;
}
void DoBThings()
{
std::cout << "Doing B things! " << mB << std::endl;
}

int mA{0};
int mB{0};
};

MyManager::Impl* MyManager::GetpImpl()
{
return std::launder(reinterpret_cast<MyManager::Impl*>(&storage));
}

MyManager::MyManager()
{
static_assert(sizeof(Impl) <= sizeof(storage));
new (&storage) Impl();
}

MyManager::~MyManager()
{
std::destroy_at(GetpImpl());
}

void MyManager::DoA()
{
GetpImpl()->DoRepetitiveThingBeforeA();
/* ... */
}

void MyManager::DoB()
{
GetpImpl()->DoBThings();
/* ... */
}

Note the use of std::launder in the helper function. According to cppreference, this is required because we’re:

“obtaining a pointer to an object created by placement new from a pointer to an object providing storage for that object.”

The standard allows a pointer, reference or name that referred to an original object to be used to access a new object created in its storage only under certain conditions, one of which is that both objects have the same type. Here the Impl object has a different type from the std::byte array that provides its storage, so a pointer derived from storage does not automatically point to the Impl object, and std::launder is needed to obtain one that does.

Which pImpl idiom should I use? Do I even need it?

Using pImpl certainly comes at a cost. The idiom does well at achieving total encapsulation, reducing dependencies and improving API stability, but this comes at the price of a layer of indirection.

As we’ve seen, we can minimise the negative impact of indirection using the Fast pImpl idiom, but that comes at the cost of increased code complexity and reduced portability.

Therefore, it’s important to take the time to consider your circumstances before deciding whether pImpl is worth the trade-off. And if so, consider whether it is worth implementing Fast pImpl. Of course, as with anything performance-related, always measure before making a decision!
