Why I Rewrote std::shared_ptr


Borislav Stanimirov / @stanimirovb

Hello, World



  #include <iostream>

  int main()
  {
      std::cout << "Hi, I'm Borislav!\n";
      std::cout << "These slides are here: https://is.gd/xsptr\n";
      return 0;
  }
        

Borislav Stanimirov


  • I mostly write C++
  • 2006-2018: game development
  • From 2019 on: medical software
  • Open source software
  • github.com/iboB


A blog post of what we do: ibob.bg/blog

This talk


  • What are shared pointers and weak pointers
  • Specifically std::shared_ptr and std::weak_ptr
  • Minor things that I don't like about them
  • Potential issues with them: shared pointer leaks
  • Why would anyone rewrite the standard ones. Pros and cons
  • "Exotic" shared pointers

NOT This talk


  • Use cases for shared pointers
  • General shared pointer criticism
  • Responding to general shared pointer criticism
  • A guide to rewriting std::shared_ptr

A Crash Course in Shared Pointers

Smart pointers


  • Free memory as needed
  • Smart pointers in C++:
    • std::unique_ptr
    • std::shared_ptr
    • but also std::vector
    • to an extent, every container
  • Now you have no excuse to write delete

std::shared_ptr

Created by Peter Dimov

Shared ownership through reference counting


  auto a = std::make_shared<person>("Alice"); // Allocate one person
                                              // Ref count of Alice is 1
  auto b = a; // a and b point to Alice. Ref count is 2
  a.reset(); // a doesn't point to Alice, but b still does. Ref count is 1
  b.reset(); // b doesn't point to Alice. Ref count is 0. Alice is deleted
        

Alice started her lifetime with a and ended it with b


  class person {
    std::string m_name;
  public:
    person(std::string_view name) : m_name(name) {}
    const std::string& name() const { return m_name; }
    void greet() { std::cout << "Hi, my name is " << m_name << '\n'; }
  };
        

There is no ref count here!

The Control Block

shared_ptr also points to a wrapper which holds the ref count


  template <typename T>
  class shared_ptr {
    T* m_ptr;
    control_block* m_cb;
    dec_ref() { if (m_cb) m_cb->dec_ref(); }
    inc_ref() { if (m_cb) m_cb->inc_ref(); }
  public:
    ~shared_ptr() {
      dec_ref();
    }
    shared_ptr(const shared_ptr& other)
      : m_ptr(other.m_ptr), m_cb(other.m_cb)
    {
      inc_ref();
    }
    // ... operator->, *, ...
  };

The Control Block


  struct control_block {
    int m_refs = 1;
    void inc_ref() { ++m_refs; }
    void dec_ref() {
      if (--m_refs == 0) {
        destroy_self();
      }
    }
    virtual void destroy_self() = 0;
  };

  template <typename T>
  struct control_block_t final : public control_block {
    T m_resource;
    void destroy_self() override { delete this; }
  }
        

Making a Shared Pointer



  template <typename T, typename... Args>
  shared_ptr<T> make_shared(Args&&... args) {
    auto cb = new control_block_t<T>(T{std::forward<Args>(args)...});
    shared_ptr<T> ret;
    ret->m_ptr = &cb->m_resource;
    ret->m_cb = cb;
    return ret;
  }
        

What We Skipped


  • shared_ptr from unique_ptr or a raw pointer
  • Allocators
  • Ref count atomicity

Done?

std::weak_ptr

A non-owning pointer. Weak count


  auto a = std::make_shared<person>("Alice"); // Ref count = 1. Weak count = 1
  std::weak_ptr<person> w = a; // Ref count = 1. Weak count = 2
  auto b = w.lock(); // a and b -> Alice. Ref count = 2. Weak count = 2
  a.reset(); // Ref count = 1. Weak count = 2
  b.reset(); // Ref count = 0 => Alice is destroyed => Weak count = 1. CB remains
  auto c = w.lock(); // c is empty. Ref count = 0. Weak count = 1
  w.reset(); // Weak count = 0. Control block is deleted
        

Alice started her lifetime with a and ended it with b

The control block started it with a and ended it with w

The Control Block With Weak Refs


  struct control_block {
    int m_refs = 1;
    int m_weak_refs = 1;
    void inc_ref() { ++m_refs; }
    void dec_ref() {
      if (--m_refs == 0) {
        destroy_resouce();
        dec_weak_ref();
      }
    }
    void inc_weak_ref() { ++m_weak_refs; }
    void dec_weak_ref() {
      if(--m_weak_refs == 0) {
        destroy_self();
      }
    }
    virtual void destroy_resource() = 0;
    virtual void destroy_self() = 0;
  };
        

The Control Block With Weak Refs



  template <typename T>
  struct control_block_t final : public control_block {
    std::optional<T> m_resource;
    void destroy_resource() override { m_resource.reset(); }
    void destroy_self() override { delete this; }
  }
        

weak_ptr


  template <typename T>
  class weak_ptr {
    T* m_ptr;
    control_block* m_cb;
    dec_weak_ref() { if (m_cb) m_cb->dec_weak_ref(); }
    inc_weak_ref() { if (m_cb) m_cb->inc_weak_ref(); }
  public:
    ~weak_ptr() {
      dec_weak_ref();
    }
    weak_ptr(const weak_ptr& other)
      : m_ptr(other.m_ptr), m_cb(other.m_cb)
    {
      inc_weak_ref();
    }
    ...
        

weak_ptr cont


    ...
    weak_ptr(const shared_ptr<T>& sptr)
      : m_ptr(sptr.m_ptr), m_cb(sptr.m_cb)
    {
      inc_weak_ref();
    }
    shared_ptr<T> lock() const {
      if (!m_cb) return {}; // empty weak_ptr
      if (m_cb->m_refs == 0) return {}; // object destroyed

      shared_ptr<T> ret;
      ret->m_ptr = m_ptr;
      ret->m_cb = m_cb;
      m_cb->inc_ref(); // inc strong ref

      return ret;
    }
  };
        

...and that's how std::shared_ptr works

We are skipping shared_from_this

We are skipping atomic ops!

What if we leave it at that?

local_shared_ptr


  • Atomic ops are not free. Especially on ARM
  • Sometimes we only have one thread
    • That's usually niche
    • Single-threaded async programming
    • Single-threaded flavors
  • Boost.SmartPtr: boost::local_shared_ptr

Two raw pointers per shared and weak pointer?

Seems like a waste.

How about...


  template <typename T>
  class shared_ptr {
    /* no T* here */
    control_block<T>* m_cb;
    dec_ref() { if (m_cb) m_cb->dec_ref(); }
    inc_ref() { if (m_cb) m_cb->inc_ref(); }
  public:
    T* operator->() { return &m_cb->m_resource; }
    // ...
  };
        

... this could work. But...

Control-Block-Only Pointers


  • Performance:
    • &m_cb->m_resource
    • two indirections per deref: *, ->, get()
  • Functionality:
    • Type erasure becomes impossible
    • std::shared_ptr<shape> s = some_square;

Type Erasure

  std::shared_ptr<shape> c = some_square; // classic
  std::shared_ptr<void> o = some_ptr; // opaque
  std::shared_ptr<std::string> a(alice, &alice->name); // aliasing
  std::shared_ptr<void> empty;
  int x;
  std::shared_ptr<int> o_O(empty, &x); // abomination

Always use this*:

  template <typename U, typename T>
  std::shared_ptr<T> make_aliased(const std::shared_ptr<U>& owner, T* ptr) {
      if (owner.use_count() == 0) return {};
      return std::shared_ptr<T>(owner, ptr);
  }

*Peter Dimov doesn't agree

That's why we must have two raw pointers in shared_ptr

Or do we?...

Intrusive Shared Pointer

What if the the object does cary its ref count?

Let's merge the control block and the object:


  template <typename T>
  class intrusive_shared_ptr {
    /* no control block here */
    T* m_ptr; // this is also a control block
    dec_ref() { if (m_ptr) m_ptr->dec_ref(); }
    inc_ref() { if (m_ptr) m_ptr->inc_ref(); }
  public:
    ...
  };
        

Using Intrusive Shared Pointers

Inherit from a common parent


  class person : public ref_countable {
    // ...
  };
        

Intrusive Shared Pointer Facts


  • sizeof(shared_ptr<T>) == sizeof(void*)
  • Free shared_from_this
  • Free atomic load/store
  • No shared pointers of foreign types
  • Weak pointers are really hard (and almost never implemented)
  • Only hierarchical type erasure
    • i_shared_ptr<TObject>shared_ptr<void>
    • No aliasing

Free atomic load/store?

Atomic Load and Store


  std::shared_ptr<person> alice;

  // thread a
  std::atomic_store(&alice, std::make_shared<person>("Alice"));

  // thread b
  auto ptr = std::atomic_load(&alice);
        
  • kuzco: an immutable state library
  • Deprecated in C++20 😠
  • ...in favor of std::atomic<std::shared_ptr<T>>🤮
  • But... how does it even work?

Implementing Atomic Load and Store

  • Trivial with intrusive_shared_ptr
  • Impossible with two-word shared_ptr
    • DCAS, MCAS
    • All implementations rely on global locking primitives
    • 🤮
  • Deprecated in C++20 👍
  • ...in favor of std::atomic<std::shared_ptr<T>>😠
  • Atomic deref is evil!
  • itlib::atomic_shared_ptr_storage

Problems So Far


  • Unneeded atomic ops: boost::local_shared_ptr
  • Dangerous aliasing ctor: itlib::make_aliased
  • Two-word size: intrusive shared pointer
  • Evil atomic pointer: itlib::atomic_shared_ptr_storage


Everything is solved. Why rewrite?

Leaks

"Smart pointers free memory. Leaks are impossible!"

Circular References


  struct person {
    std::string name;
    std::vector<std::shared_ptr<person>> siblings;
  };
  // ...
  alice->siblings.push_back(bob);
  bob->siblings.push_back(alice);
  // ...
  people.erase(alice);
  people.erase(bob);
        

The most important use-case for weak pointers

Loose Shared Pointers

  • Managers
  • Structured loggers
  • Delayed garbage collection

Common way to hunt and test:


  struct record_instance {
    static std::unordered_set<record_instance*> instances;
    record_instance() { instances.insert(this); }
    ~record_instance() { instances.erase(this); }
  };
        

Loose Weak Pointers

Common way to hunt and test:

Shared Pointer Leaks


  • Hard to impossible to test
  • Hard to even notice
  • Shared pointers are rarely found in hot loops
  • Objects are usually small
  • Low creation velocity
  • Uncontrolled memory growth is slow
  • May take hours, or days, or weeks of runtime

Classic Leak Detectors


  • They hook themselves to malloc and free
  • Log allocations and deallocations with metadata
  • They can't tell what is a leak
  • It's up to us to inspect the metadata

Cool. That's adequate

Let's do it for std::shared_ptr

C++ standard committee:

Nope


  • Hooking to malloc and free is a documented extention point
  • std::shared_ptr is opaque
  • The control block is an implementation detail


This can only be achieved if we were to rewrite std::shared_ptr with a custom control block

So I did that

xmem


  • github.com/iboB/xmem
  • Reimplementation of std smart pointers
  • ... with a twist
  • The control block is a template argument

Demo

Control Block Polymorphism Facts


  • No third party libs with std::shared_ptr
  • Needs care to avoid slow perf
  • Catch leaks
  • API-compatible with std::shared_ptr
  • local_shared_ptr is an artifact
  • Potential runtime polymorphism

End

Questions?



Slides license: CC-BY 4.0   Creative Commons License