C++ folly Library Walkthrough (1): fbstring, a Drop-in Replacement for std::string
长乐娱乐新闻网 2025-10-01
The benchmark code is in FBStringBenchmark.cpp.
Main classes

In ::folly::fbstring str("abc"), fbstring is an alias for basic_fbstring<char>:

    typedef basic_fbstring<char> fbstring;

basic_fbstring implements every interface defined by std::string on top of the interface provided by fbstring_core. It has one private member, store_, whose default type is fbstring_core. The declaration of basic_fbstring is shown below; compared with std::basic_string it only adds one defaulted template parameter, Storage:

    template <
        typename E,
        class T = std::char_traits<E>,
        class A = std::allocator<E>,
        class Storage = fbstring_core<E>>
    class basic_fbstring;

fbstring_core is responsible for storing the string and for the storage-related operations such as init, copy, reserve and shrink.

String storage data structures

The three most important data structures are union{Char small_, MediumLarge ml_}, MediumLarge and RefCounted, all defined inside fbstring_core. Every string operation builds on these three.

    struct RefCounted {
      std::atomic<size_t> refCount_;
      Char data_[1];

      static RefCounted* create(size_t* size); // create a RefCounted
      static RefCounted* create(const Char* data, size_t* size); // ditto
      static void incrementRefs(Char* p); // add one reference
      static void decrementRefs(Char* p); // drop one reference

      // other function definitions
    };

    struct MediumLarge {
      Char* data_;
      size_t size_;
      size_t capacity_;

      size_t capacity() const {
        return kIsLittleEndian ? capacity_ & capacityExtractMask : capacity_ >> 2;
      }

      void setCapacity(size_t cap, Category cat) {
        capacity_ = kIsLittleEndian
            ? cap | (static_cast<size_t>(cat) << kCategoryShift)
            : (cap << 2) | static_cast<size_t>(cat);
      }
    };

    union {
      uint8_t bytes_[sizeof(MediumLarge)]; // For accessing the last byte.
      Char small_[sizeof(MediumLarge) / sizeof(Char)];
      MediumLarge ml_;
    };

small strings (SSO): the string is stored in Char small_ of the union, that is, in the object's own stack space.

medium strings (eager copy): MediumLarge ml_ of the union is used.
    Char* data_ : points to the string allocated on the heap.
    size_t size_ : the string length.
    size_t capacity_ : the string capacity.

large strings (COW): MediumLarge ml_ and RefCounted are used.
    RefCounted.refCount_ : the reference count of the shared string.
    RefCounted.data_[1] : a flexible array that stores the string.
    ml_.data_ points to RefCounted.data_; ml_.size_ and ml_.capacity_ keep the same meaning as above.

This leaves one question: where are size and capacity stored in the SSO case?
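capacity : the SSO case does not need a capacity field, because the string lives in stack space. Equivalently, in this case capacity = maxSmallSize.

size : one byte of small_ stores the size. What is actually stored is not size but maxSmallSize - s (maxSmallSize = 23, considering only the char type for now), because this lets SSO hold one more byte. The details follow later.

[Figures omitted: memory layout of small strings, medium strings and large strings]

How the string category is distinguished

Whether a string is small, medium or large is transparent to the outside, yet operations such as copy, shrink, reserve and assignment are handled differently for each of the three categories. So a few tricks are needed in the data structures above to tell the categories apart.

Because there are only three categories, 2 bits are enough. The relevant definitions are:

    typedef uint8_t category_type;

    enum class Category : category_type {
      isSmall = 0,
      isMedium = kIsLittleEndian ? 0x80 : 0x2, // 10000000 , 00000010
      isLarge = kIsLittleEndian ? 0x40 : 0x1,  // 01000000 , 00000001
    };

kIsLittleEndian reports the endianness of the current platform; the storage layout differs between big-endian and little-endian.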
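small strings

category and size are stored together in the last byte of small_ (size is at most 23, so it fits). Depending on endianness the value may be shifted, mainly to keep category() simple, as described later. The code is in setSmallSize:

    void setSmallSize(size_t s) {
      ......
      constexpr auto shift = kIsLittleEndian ? 0 : 2;
      small_[maxSmallSize] = char((maxSmallSize - s) << shift);
      ......
    }

medium strings

You may have noticed that the MediumLarge struct defines two methods, capacity() and setCapacity(size_t cap, Category cat). setCapacity sets capacity and category at the same time:

    constexpr static size_t kCategoryShift = (sizeof(size_t) - 1) * 8;

    void setCapacity(size_t cap, Category cat) {
      capacity_ = kIsLittleEndian
          ? cap | (static_cast<size_t>(cat) << kCategoryShift)
          : (cap << 2) | static_cast<size_t>(cat);
    }

On little-endian, category = isMedium = 0x80 is shifted left by (sizeof(size_t) - 1) * 8 bits, which moves it into the most significant byte, and is then ORed with capacity.
On big-endian, category = isMedium = 0x2 is simply ORed with cap << 2; the left shift by 2 leaves room for the category bits.

For example, on a 64-bit machine with capacity = 100, capacity_ holds 100 | (0x80 << 56) = 0x8000000000000064 on little-endian and (100 << 2) | 0x2 = 402 (0x192) on big-endian.

large strings

setCapacity of MediumLarge is used here as well. The algorithm is the same; only the category value differs.

On a 64-bit machine with capacity = 1000, capacity_ holds 1000 | (0x40 << 56) = 0x40000000000003E8 on little-endian and (1000 << 2) | 0x1 = 4001 (0xFA1) on big-endian.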
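The packing arithmetic for both byte orders can be checked with a small standalone sketch. The helper names (packLittle, packBig and so on) are hypothetical and are not part of folly. The sketch assumes a 64-bit word (std::uint64_t instead of size_t) and passes the raw category values 0x80/0x40 (little-endian) and 0x2/0x1 (big-endian) from the enum above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers mirroring MediumLarge::setCapacity()/capacity()
// for a fixed 64-bit word; these are not folly APIs.
constexpr std::uint64_t kShift = (sizeof(std::uint64_t) - 1) * 8; // 56

// Little-endian layout: the category sits in the most significant byte.
constexpr std::uint64_t packLittle(std::uint64_t cap, std::uint8_t cat) {
  return cap | (static_cast<std::uint64_t>(cat) << kShift);
}
constexpr std::uint64_t capacityLittle(std::uint64_t word) {
  return word & ~(static_cast<std::uint64_t>(0xC0) << kShift);
}
constexpr std::uint8_t categoryLittle(std::uint64_t word) {
  return static_cast<std::uint8_t>((word >> kShift) & 0xC0);
}

// Big-endian layout: the category sits in the two least significant bits.
constexpr std::uint64_t packBig(std::uint64_t cap, std::uint8_t cat) {
  return (cap << 2) | cat;
}
constexpr std::uint64_t capacityBig(std::uint64_t word) { return word >> 2; }
constexpr std::uint8_t categoryBig(std::uint64_t word) {
  return static_cast<std::uint8_t>(word & 0x3);
}

// The two worked examples from the text.
static_assert(packLittle(100, 0x80) == 0x8000000000000064ULL, "medium, LE");
static_assert(packLittle(1000, 0x40) == 0x40000000000003E8ULL, "large, LE");
static_assert(packBig(100, 0x2) == 402, "medium, BE");
static_assert(packBig(1000, 0x1) == 4001, "large, BE");

// Run-time round trip for an arbitrary capacity (large category).
inline void checkRoundTrip(std::uint64_t cap) {
  assert(capacityLittle(packLittle(cap, 0x40)) == cap);
  assert(capacityBig(packBig(cap, 0x1)) == cap);
}
```

In both layouts the category ends up in the last byte of the object in memory, which is what lets category() read a single byte regardless of endianness.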
category()

category() is one of the most important functions. It returns the category of the string:

    constexpr static uint8_t categoryExtractMask = kIsLittleEndian ? 0xC0 : 0x3; // 11000000 , 00000011
    constexpr static size_t lastChar = sizeof(MediumLarge) - 1;

    union {
      uint8_t bytes_[sizeof(MediumLarge)]; // For accessing the last byte.
      Char small_[sizeof(MediumLarge) / sizeof(Char)];
      MediumLarge ml_;
    };

    Category category() const {
      // works for both big-endian and little-endian
      return static_cast<Category>(bytes_[lastChar] & categoryExtractMask);
    }

bytes_ is defined in the union. As its comment says, it exists so that, together with lastChar, the last byte of the struct can be read conveniently.

Given how the three categories are stored, this one line of code is easy to follow.

[Figures omitted: last-byte layout on little-endian and on big-endian]

capacity()

capacity() returns the capacity of the string. Because capacity and category are stored together, the two are best read together.

Again there are three cases.

    size_t capacity() const {
      switch (category()) {
        case Category::isSmall:
          return maxSmallSize;
        case Category::isLarge:
          // For large-sized strings, a multi-referenced chunk has no
          // available capacity. This is because any attempt to append
          // data would trigger a new allocation.
          if (RefCounted::refs(ml_.data_) > 1) {
            return ml_.size_;
          }
          break;
        case Category::isMedium:
        default:
          break;
      }
      return ml_.capacity();
    }

small strings : return maxSmallSize directly, as analysed above.
medium strings : return ml_.capacity().
large strings : when the reference count is greater than 1, return size directly, because the capacity is meaningless at that point: any append would trigger a copy-on-write. Otherwise return ml_.capacity().

Now ml_.capacity():

    constexpr static uint8_t categoryExtractMask = kIsLittleEndian ? 0xC0 : 0x3;
    constexpr static size_t kCategoryShift = (sizeof(size_t) - 1) * 8;
    constexpr static size_t capacityExtractMask = kIsLittleEndian
        ? ~(size_t(categoryExtractMask) << kCategoryShift)
        : 0x0 /* unused */;

    size_t capacity() const {
      return kIsLittleEndian ? capacity_ & capacityExtractMask : capacity_ >> 2;
    }

categoryExtractMask and kCategoryShift appeared earlier: the former extracts the category, and the latter is the number of bits the category is shifted left on little-endian. capacityExtractMask strips the category bits so that only the capacity remains in capacity_.

With the storage layout of each category in mind this should be easy to follow, so it is not elaborated further.
size()

    size_t size() const {
      size_t ret = ml_.size_;
      if /* constexpr */ (kIsLittleEndian) {
        // We can save a couple instructions, because the category is
        // small iff the last char, as unsigned, is <= maxSmallSize.
        typedef typename std::make_unsigned<Char>::type UChar;
        auto maybeSmallSize = size_t(maxSmallSize) -
            size_t(static_cast<UChar>(small_[maxSmallSize]));
        // With this syntax, GCC and Clang generate a CMOV instead of a branch.
        ret = (static_cast<ssize_t>(maybeSmallSize) >= 0) ? maybeSmallSize : ret;
      } else {
        ret = (category() == Category::isSmall) ? smallSize() : ret;
      }
      return ret;
    }

On little-endian, the high byte of ml_ holds the category (0x80 or 0x40) for medium and large strings, whereas for small strings it holds the size. So, as the comment says, the code can test kIsLittleEndian && maybeSmall first, which is a little faster and avoids calling smallSize(). Nearly all platforms today are little-endian.

On big-endian, smallSize() is called if the string is small; otherwise ml_.size_ is returned.

    size_t smallSize() const {
      assert(category() == Category::isSmall);
      constexpr auto shift = kIsLittleEndian ? 0 : 2;
      auto smallShifted = static_cast<size_t>(small_[maxSmallSize]) >> shift;
      assert(static_cast<size_t>(maxSmallSize) >= smallShifted);
      return static_cast<size_t>(maxSmallSize) - smallShifted;
    }

This one is straightforward and needs no further comment.
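The little-endian shortcut in size() relies on unsigned wrap-around, which a standalone sketch may make clearer. The function name sizeLittleEndian is hypothetical, maxSmallSize is assumed to be 23, and std::ptrdiff_t stands in for ssize_t so that the sketch only needs standard headers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical standalone version of the little-endian branch of size().
// lastByte is the final byte of the 24-byte object; mlSize plays ml_.size_.
constexpr std::size_t kMaxSmallSize = 23;

inline std::size_t sizeLittleEndian(std::uint8_t lastByte, std::size_t mlSize) {
  // Small strings store kMaxSmallSize - size here, so lastByte <= 23.
  // Medium/large strings have 0x80 or 0x40 set, so lastByte >= 64 and the
  // unsigned subtraction below wraps around.
  const std::size_t maybeSmallSize = kMaxSmallSize - std::size_t(lastByte);
  // Reinterpreted as signed, the wrapped value is negative on the usual
  // two's-complement targets, which selects mlSize without reading category().
  const auto asSigned = static_cast<std::ptrdiff_t>(maybeSmallSize);
  return asSigned >= 0 ? maybeSmallSize : mlSize;
}
```

For instance, a last byte of 18 yields a small size of 5, while a last byte of 0x80 falls through to the stored ml_.size_.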
String initialization

The fbstring_core constructor calls one of three initialization functions, depending on the length of the string:

    fbstring_core(
        const Char* const data,
        const size_t size,
        bool disableSSO = FBSTRING_DISABLE_SSO) {
      if (!disableSSO && size <= maxSmallSize) {
        initSmall(data, size);
      } else if (size <= maxMediumSize) {
        initMedium(data, size);
      } else {
        initLarge(data, size);
      }
    }

initSmall

    template <class Char>
    inline void fbstring_core<Char>::initSmall(
        const Char* const data, const size_t size) {
    // If data is aligned, use fast word-wise copying. Otherwise,
    // use conservative memcpy.
    // The word-wise path reads bytes which are outside the range of
    // the string, and makes ASan unhappy, so we disable it when
    // compiling with ASan.
    #ifndef FOLLY_SANITIZE_ADDRESS
      if ((reinterpret_cast<size_t>(data) & (sizeof(size_t) - 1)) == 0) {
        const size_t byteSize = size * sizeof(Char);
        constexpr size_t wordWidth = sizeof(size_t);
        switch ((byteSize + wordWidth - 1) / wordWidth) { // Number of words.
          case 3:
            ml_.capacity_ = reinterpret_cast<const size_t*>(data)[2];
            FOLLY_FALLTHROUGH;
          case 2:
            ml_.size_ = reinterpret_cast<const size_t*>(data)[1];
            FOLLY_FALLTHROUGH;
          case 1:
            ml_.data_ = *reinterpret_cast<Char**>(const_cast<Char*>(data));
            FOLLY_FALLTHROUGH;
          case 0:
            break;
        }
      } else
    #endif
      {
        if (size != 0) {
          fbstring_detail::podCopy(data, data + size, small_);
        }
      }
      setSmallSize(size);
    }

First, if the address of the incoming string is memory-aligned, reinterpret_cast is used to perform a word-wise copy, which is more efficient.
Otherwise podCopy is called to perform a memcpy.
Finally, setSmallSize sets the size of the small string.

setSmallSize :

    void setSmallSize(size_t s) {
      // Warning: this should work with uninitialized strings too,
      // so don't assume anything about the previous value of
      // small_[maxSmallSize].
      assert(s <= maxSmallSize);
      constexpr auto shift = kIsLittleEndian ? 0 : 2;
      small_[maxSmallSize] = char((maxSmallSize - s) << shift);
      small_[s] = '\0';
      assert(category() == Category::isSmall && size() == s);
    }

As mentioned earlier, a small string does not store its real size but maxSmallSize - size. The reason is that this lets small strings hold one more byte. If size itself were stored, the last two bytes of small_ would have to be '\0' and size. By storing maxSmallSize - size instead, when size == maxSmallSize the last byte of small_ is 0, which is exactly the '\0' terminator.
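The "one more byte" trick can be tried out in isolation. The following is a minimal sketch, assuming char elements, a 24-byte buffer and the little-endian convention (no shift); SmallBuf and its members are hypothetical names and not the real fbstring_core.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kMaxSmall = 23; // assumed value of maxSmallSize for char

// Hypothetical 24-byte small buffer; not the real fbstring_core.
struct SmallBuf {
  char small_[kMaxSmall + 1];

  void assign(const char* data, std::size_t size) {
    assert(size <= kMaxSmall);
    std::memcpy(small_, data, size);
    // Store the remaining room rather than the size itself.
    small_[kMaxSmall] = static_cast<char>(kMaxSmall - size);
    // Terminator. When size == 23 this writes 0 over the byte above, which
    // already holds 0, so the length byte doubles as the terminator.
    small_[size] = '\0';
  }

  std::size_t size() const {
    return kMaxSmall - static_cast<unsigned char>(small_[kMaxSmall]);
  }

  const char* c_str() const { return small_; }
};
```

A 23-character string therefore fits entirely in the 24-byte object: 23 payload bytes plus one byte that is both "remaining room = 0" and the terminating '\0'.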
initMedium

    template <class Char>
    FOLLY_NOINLINE inline void fbstring_core<Char>::initMedium(
        const Char* const data, const size_t size) {
      // Medium strings are allocated normally. Don't forget to
      // allocate one extra Char for the terminating null.
      auto const allocSize = goodMallocSize((1 + size) * sizeof(Char));
      ml_.data_ = static_cast<Char*>(checkedMalloc(allocSize));
      if (FOLLY_LIKELY(size > 0)) {
        fbstring_detail::podCopy(data, data + size, ml_.data_);
      }
      ml_.size_ = size;
      ml_.setCapacity(allocSize / sizeof(Char) - 1, Category::isMedium);
      ml_.data_[size] = '\0';
    }

folly uses the canNallocx function to detect whether jemalloc is in use; if it is, jemalloc is used to improve allocation performance. I am not very familiar with jemalloc; plenty of material is available for those interested.

Every dynamic allocation calls goodMallocSize to obtain a jemalloc-friendly size.
checkedMalloc then actually allocates the memory that stores the string.
podCopy performs the memcpy, the same podCopy as in initSmall.
Finally size, capacity, category and the '\0' terminator are set.

initLarge

    template <class Char>
    FOLLY_NOINLINE inline void fbstring_core<Char>::initLarge(
        const Char* const data, const size_t size) {
      // Large strings are allocated differently
      size_t effectiveCapacity = size;
      auto const newRC = RefCounted::create(data, &effectiveCapacity);
      ml_.data_ = newRC->data_;
      ml_.size_ = size;
      ml_.setCapacity(effectiveCapacity, Category::isLarge);
      ml_.data_[size] = '\0';
    }

The biggest difference from medium strings is that RefCounted::create creates a RefCounted that is used to share the string:
    struct RefCounted {
      std::atomic<size_t> refCount_;
      Char data_[1];

      constexpr static size_t getDataOffset() {
        return offsetof(RefCounted, data_);
      }

      static RefCounted* create(size_t* size) {
        const size_t allocSize =
            goodMallocSize(getDataOffset() + (*size + 1) * sizeof(Char));
        auto result = static_cast<RefCounted*>(checkedMalloc(allocSize));
        result->refCount_.store(1, std::memory_order_release);
        *size = (allocSize - getDataOffset()) / sizeof(Char) - 1;
        return result;
      }

      static RefCounted* create(const Char* data, size_t* size) {
        const size_t effectiveSize = *size;
        auto result = create(size);
        if (FOLLY_LIKELY(effectiveSize > 0)) {
          fbstring_detail::podCopy(data, data + effectiveSize, result->data_);
        }
        return result;
      }
    };

Points to note:

ml_.data_ points to RefCounted.data_.
getDataOffset() uses offsetof to obtain the offset of data_ inside the RefCounted struct; Char data_[1] is a flexible array that stores the string.
Note the C++ memory model orderings used for the atomic operations on std::atomic<size_t> refCount_:
    store, sets the reference count to 1 : std::memory_order_release
    load, reads the current reference count of the shared string : std::memory_order_acquire
    add/sub, adds or drops one reference : std::memory_order_acq_rel

The C++ memory model is a large topic of its own. For background, see:
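    incubator-brpc
    Memory model synchronization modes
    An article that walks you through the C++11 memory model
    The C++11 memory model, part 2: the memory orderings supported by C++11

Special constructor: not copying the user's string

The three constructions above all copy the string passed in by the application into fbstring_core, whether by word-wise copy or by memcpy, and in the medium and large cases they need to allocate memory dynamically.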
fbstring provides a special constructor that lets fbstring_core take over memory the application has allocated itself.

This is a basic_fbstring constructor that calls the corresponding fbstring_core constructor. Note that AcquireMallocatedString is an enum class, which is more readable than an int or bool parameter.

    /**
     * Defines a special acquisition method for constructing fbstring
     * objects. AcquireMallocatedString means that the user passes a
     * pointer to a malloc-allocated string that the fbstring object will
     * take into custody.
     */
    enum class AcquireMallocatedString {};

    // Nonstandard constructor
    basic_fbstring(value_type *s, size_type n, size_type c,
                   AcquireMallocatedString a)
        : store_(s, n, c, a) {}

basic_fbstring calls the corresponding fbstring_core constructor:

    // Snatches a previously mallocated string. The parameter "size"
    // is the size of the string, and the parameter "allocatedSize"
    // is the size of the mallocated block. The string must be
    // \0-terminated, so allocatedSize >= size + 1 and data[size] == '\0'.
    //
    // So if you want a 2-character string, pass malloc(3) as "data",
    // pass 2 as "size", and pass 3 as "allocatedSize".
    fbstring_core(Char * const data,
                  const size_t size,
                  const size_t allocatedSize,
                  AcquireMallocatedString) {
      if (size > 0) {
        FBSTRING_ASSERT(allocatedSize >= size + 1);
        FBSTRING_ASSERT(data[size] == '\0');
        // Use the medium string storage
        ml_.data_ = data;
        ml_.size_ = size;
        // Don't forget about null terminator
        ml_.setCapacity(allocatedSize - 1, Category::isMedium);
      } else {
        // No need for the memory
        free(data);
        reset();
      }
    }

As the code shows, no string copy takes place; the constructor directly takes over the memory pointed to by the pointer passed in from upstream. As the comment says, the medium-string storage layout is used directly, regardless of length.
For example, the call in folly/io/IOBuf.cpp:

    // Ensure NUL terminated
    *writableTail() = 0;
    fbstring str(
        reinterpret_cast<char*>(writableData()),
        length(),
        capacity(),
        AcquireMallocatedString());

String copy

As with initialization, a different function is called depending on the string category:
    fbstring_core(const fbstring_core& rhs) {
      assert(&rhs != this);
      switch (rhs.category()) {
        case Category::isSmall:
          copySmall(rhs);
          break;
        case Category::isMedium:
          copyMedium(rhs);
          break;
        case Category::isLarge:
          copyLarge(rhs);
          break;
        default:
          folly::assume_unreachable();
      }
    }

copySmall

    template <class Char>
    inline void fbstring_core<Char>::copySmall(const fbstring_core& rhs) {
      // Just write the whole thing, don't look at details. In
      // particular we need to copy capacity anyway because we want
      // to set the size (don't forget that the last character,
      // which stores a short string's length, is shared with the
      // ml_.capacity field).
      ml_ = rhs.ml_;
    }

As the comment says, although a small string is stored in small_, assigning ml_ directly is enough, because both live in the same union.

copyMedium

    template <class Char>
    FOLLY_NOINLINE inline void fbstring_core<Char>::copyMedium(
        const fbstring_core& rhs) {
      // Medium strings are copied eagerly. Don't forget to allocate
      // one extra Char for the null terminator.
      auto const allocSize = goodMallocSize((1 + rhs.ml_.size_) * sizeof(Char));
      ml_.data_ = static_cast<Char*>(checkedMalloc(allocSize));
      // Also copies terminator.
      fbstring_detail::podCopy(
          rhs.ml_.data_, rhs.ml_.data_ + rhs.ml_.size_ + 1, ml_.data_);
      ml_.size_ = rhs.ml_.size_;
      ml_.setCapacity(allocSize / sizeof(Char) - 1, Category::isMedium);
    }

medium strings are copied eagerly, so this is a deep copy:

Allocate space for the string and copy it.
Assign size, capacity and category.

copyLarge

    template <class Char>
    FOLLY_NOINLINE inline void fbstring_core<Char>::copyLarge(
        const fbstring_core& rhs) {
      // Large strings are just refcounted
      ml_ = rhs.ml_;
      RefCounted::incrementRefs(ml_.data_);
    }

Copying a large string is simple, because it is copy-on-write:

Assign ml_ directly, which includes the pointer to the shared string.
Increment the reference count of the shared string.

incrementRefs and fromData, which it calls internally, are worth a look:

    static RefCounted* fromData(Char* p) {
      return static_cast<RefCounted*>(static_cast<void*>(
          static_cast<unsigned char*>(static_cast<void*>(p)) -
          getDataOffset()));
    }

    static void incrementRefs(Char* p) {
      fromData(p)->refCount_.fetch_add(1, std::memory_order_acq_rel);
    }

Because ml_.data_ points to the data_[1] member of a RefCounted, fromData is needed to find the address of the RefCounted that data_ belongs to. Here the operations inside fromData are split into steps:

    static RefCounted * fromData(Char * p) {
      // Convert the address of data_[1]
      void* voidDataAddr = static_cast<void*>(p);
      unsigned char* unsignedDataAddr = static_cast<unsigned char*>(voidDataAddr);

      // Subtract the offset of data_[1] within the struct; the result is
      // the address of the owning RefCounted
      unsigned char* unsignedRefAddr = unsignedDataAddr - getDataOffset();

      void* voidRefAddr = static_cast<void*>(unsignedRefAddr);
      RefCounted* refCountAddr = static_cast<RefCounted*>(voidRefAddr);

      return refCountAddr;
    }

What deserves attention is how pointers to different types are converted and used in arithmetic. The pattern here is: Char* -> void* -> unsigned char* -> subtract a size_t -> void* -> RefCounted*
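The header-recovery idiom can be exercised outside folly. Below is a minimal sketch under stated assumptions: Block is a hypothetical stand-in for RefCounted with a char payload, plain std::malloc replaces checkedMalloc and goodMallocSize, and release is an illustrative name for the decrement-and-free step.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical header-plus-payload block modelled on RefCounted.
struct Block {
  std::atomic<std::size_t> refs;
  char data[1]; // the payload continues past the end of the struct

  static Block* create(std::size_t payloadBytes) {
    // Header bytes up to data[], then the payload and its terminator.
    void* raw = std::malloc(offsetof(Block, data) + payloadBytes + 1);
    if (raw == nullptr) {
      throw std::bad_alloc();
    }
    Block* b = static_cast<Block*>(raw);
    new (&b->refs) std::atomic<std::size_t>(1); // start with one owner
    return b;
  }

  // Recover the header address from a pointer to the payload:
  // char* -> void* -> unsigned char* -> minus offset -> void* -> Block*.
  static Block* fromData(char* p) {
    unsigned char* bytes = static_cast<unsigned char*>(static_cast<void*>(p));
    return static_cast<Block*>(
        static_cast<void*>(bytes - offsetof(Block, data)));
  }

  // Drop one reference; free the whole block when the last owner leaves.
  static void release(char* p) {
    Block* b = fromData(p);
    if (b->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
      std::free(b);
    }
  }
};
```

The owner only keeps the payload pointer (as ml_.data_ does), and every reference-count operation first walks back to the header through fromData.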
Destruction

    ~fbstring_core() noexcept {
      if (category() == Category::isSmall) {
        return;
      }
      destroyMediumLarge();
    }

For the small category the destructor returns immediately, because only stack space is used.
Otherwise, for medium and large, destroyMediumLarge is called.

    FOLLY_MALLOC_NOINLINE void destroyMediumLarge() noexcept {
      auto const c = category();
      FBSTRING_ASSERT(c != Category::isSmall);
      if (c == Category::isMedium) {
        free(ml_.data_);
      } else {
        RefCounted::decrementRefs(ml_.data_);
      }
    }

medium : simply free the dynamically allocated string memory.
large : call decrementRefs, which operates on the shared string.

    static void decrementRefs(Char * p) {
      auto const dis = fromData(p);
      size_t oldcnt = dis->refCount_.fetch_sub(1, std::memory_order_acq_rel);
      FBSTRING_ASSERT(oldcnt > 0);
      if (oldcnt == 1) {
        free(dis);
      }
    }

The logic is clear: first decrement the reference count by 1; if there was only one reference to begin with, free the whole RefCounted.
COW

The most important point, and one that is unique to large strings, is COW. Any write operation on the string triggers COW, including the [] operation mentioned earlier, for example:

non-const at(size_n)
non-const operator[](size_type pos)
operator+
append
......

Take non-const operator[](size_type pos) as an example.

non-const operator[](size_type pos)

    reference operator[](size_type pos) {
      return *(begin() + pos);
    }

    iterator begin() {
      return store_.mutableData();
    }

Now look at mutableData() :

    Char* mutableData() {
      switch (category()) {
        case Category::isSmall:
          return small_;
        case Category::isMedium:
          return ml_.data_;
        case Category::isLarge:
          return mutableDataLarge();
      }
      fbstring_detail::assume_unreachable();
    }

    template <class Char>
    inline Char* fbstring_core<Char>::mutableDataLarge() {
      FBSTRING_ASSERT(category() == Category::isLarge);
      if (RefCounted::refs(ml_.data_) > 1) { // Ensure unique.
        unshare();
      }
      return ml_.data_;
    }

Again there are three cases. small and medium return the string address directly. large calls mutableDataLarge(), and as the code shows, if the reference count is greater than 1 an unshare operation is performed :

    void unshare(size_t minCapacity = 0);

    template <class Char>
    FOLLY_MALLOC_NOINLINE inline void fbstring_core<Char>::unshare(
        size_t minCapacity) {
      FBSTRING_ASSERT(category() == Category::isLarge);
      size_t effectiveCapacity = std::max(minCapacity, ml_.capacity());
      auto const newRC = RefCounted::create(&effectiveCapacity);
      // If this fails, someone placed the wrong capacity in an
      // fbstring.
      FBSTRING_ASSERT(effectiveCapacity >= ml_.capacity());
      // Also copies terminator.
      fbstring_detail::podCopy(ml_.data_, ml_.data_ + ml_.size_ + 1, newRC->data_);
      RefCounted::decrementRefs(ml_.data_);
      ml_.data_ = newRC->data_;
      ml_.setCapacity(effectiveCapacity, Category::isLarge);
      // size_ remains unchanged.
    }

The basic steps:

Create a new RefCounted.
Copy the string.
Drop one reference on the original shared string with decrementRefs, the function analysed in the destruction section above.
Set data, capacity and category of ml_.
Note that size is not set at this point, because it is not yet known what change the application will make to the string.

non-const versus const

You may have noticed that at and [] above were qualified as non-const. That is because the const-qualified versions of these two calls do not trigger COW. Taking [] again as the example:

    // C++11 21.4.5 element access:
    const_reference operator[](size_type pos) const {
      return *(begin() + pos);
    }

    const_iterator begin() const {
      return store_.data();
    }

    // In C++11 data() and c_str() are 100% equivalent.
    const Char* data() const { return c_str(); }

    const Char* c_str() const {
      const Char* ptr = ml_.data_;
      // With this syntax, GCC and Clang generate a CMOV instead of a branch.
      ptr = (category() == Category::isSmall) ? small_ : ptr;
      return ptr;
    }

The difference is visible: the non-const version of begin() calls mutableData(), whereas the const-qualified version calls data() -> c_str(), and c_str() returns the string address directly. (As an aside, fbstring's c_str() apparently used to be implemented with a lazy null terminator; see this article:)

手帕: A bug involving the compiler, virtual memory, the memory allocator and C/C++

So when [] or at is used on a string and no write is needed, prefer the const-qualified version.
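The difference between the two overloads can be reproduced with a toy class. This is only an illustration of the const versus non-const behaviour described above: CowString and sharesBufferWith are hypothetical names, std::shared_ptr stands in for folly's RefCounted, and use_count() is only a reliable sharing test in single-threaded code like this.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Toy copy-on-write string. std::shared_ptr replaces folly's RefCounted;
// this is an illustration, not how fbstring is implemented.
class CowString {
 public:
  explicit CowString(const std::string& s)
      : buf_(std::make_shared<std::string>(s)) {}

  // const access: returns a read-only reference and never copies.
  const char& operator[](std::size_t pos) const { return (*buf_)[pos]; }

  // non-const access: the caller may write through the reference, so the
  // buffer has to be made unique first, even if the caller only reads.
  char& operator[](std::size_t pos) {
    if (buf_.use_count() > 1) {
      buf_ = std::make_shared<std::string>(*buf_); // the "copy" in COW
    }
    return (*buf_)[pos];
  }

  bool sharesBufferWith(const CowString& other) const {
    return buf_ == other.buf_;
  }

 private:
  std::shared_ptr<std::string> buf_; // copies of CowString share this buffer
};
```

Reading b[0] through a const reference leaves the buffer shared, while the same read through a non-const object already pays for a full copy, which is the cost the benchmark in the next section measures.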
Let us verify this with the benchmark tool that ships with folly:

    #include "folly/Benchmark.h"
    #include "folly/FBString.h"
    #include "folly/container/Foreach.h"

    using namespace std;
    using namespace folly;

    BENCHMARK(nonConstFbstringAt, n) {
      ::folly::fbstring str(
          "fbstring is a drop-in replacement for std::string. The main benefit of fbstring is significantly increased "
          "performance on virtually all important primitives. This is achieved by using a three-tiered storage strategy "
          "and by cooperating with the memory allocator. In particular, fbstring is designed to detect use of jemalloc and "
          "cooperate with it to achieve significant improvements in speed and memory usage.");
      FOR_EACH_RANGE(i, 0, n) {
        char &s = str[2];
        doNotOptimizeAway(s);
      }
    }

    BENCHMARK_DRAW_LINE();

    BENCHMARK_RELATIVE(constFbstringAt, n) {
      const ::folly::fbstring str(
          "fbstring is a drop-in replacement for std::string. The main benefit of fbstring is significantly increased "
          "performance on virtually all important primitives. This is achieved by using a three-tiered storage strategy "
          "and by cooperating with the memory allocator. In particular, fbstring is designed to detect use of jemalloc and "
          "cooperate with it to achieve significant improvements in speed and memory usage.");
      FOR_EACH_RANGE(i, 0, n) {
        const char &s = str[2];
        doNotOptimizeAway(s);
      }
    }

    int main() { runBenchmarks(); }

The result: constFbstringAt runs at about 175% of the speed of nonConstFbstringAt (22.70 ns versus 39.85 ns per iteration), that is, roughly 75% faster.

    ============================================================================
    delve_folly/main.cc                             relative  time/iter  iters/s
    ============================================================================
    nonConstFbstringAt                                          39.85ns   25.10M
    ----------------------------------------------------------------------------
    constFbstringAt                                  175.57%    22.70ns   44.06M
    ============================================================================

Realloc

Operations such as reserve and operator+ may need to reallocate memory. They all end up calling smartRealloc in memory/Malloc.h:
    inline void* checkedRealloc(void* ptr, size_t size) {
      void* p = realloc(ptr, size);
      if (!p) {
        throw_exception<std::bad_alloc>();
      }
      return p;
    }

    /**
     * This function tries to reallocate a buffer of which only the first
     * currentSize bytes are used. The problem with using realloc is that
     * if currentSize is relatively small _and_ if realloc decides it
     * needs to move the memory chunk to a new buffer, then realloc ends
     * up copying data that is not used. It's generally not a win to try
     * to hook in to realloc() behavior to avoid copies - at least in
     * jemalloc, realloc() almost always ends up doing a copy, because
     * there is little fragmentation / slack space to take advantage of.
     */
    FOLLY_MALLOC_CHECKED_MALLOC FOLLY_NOINLINE inline void* smartRealloc(
        void* p,
        const size_t currentSize,
        const size_t currentCapacity,
        const size_t newCapacity) {
      assert(p);
      assert(currentSize <= currentCapacity && currentCapacity < newCapacity);

      auto const slack = currentCapacity - currentSize;
      if (slack * 2 > currentSize) {
        // Too much slack, malloc-copy-free cycle:
        auto const result = checkedMalloc(newCapacity);
        std::memcpy(result, p, currentSize);
        free(p);
        return result;
      }
      // If there's not too much slack, we realloc in hope of coalescing
      return checkedRealloc(p, newCapacity);
    }

The comment and the code show why the function is called smartRealloc :

If (currentCapacity - currentSize) * 2 > currentSize, that is, currentSize < 2/3 * currentCapacity, the currently allocated memory is poorly utilized. In that case, if realloc were used and realloc decided to move the block to new memory, it would also copy the unused bytes, so the cost would be higher than doing malloc(newCapacity) + memcpy + free(old_memory) directly.
Otherwise realloc is called directly.

Miscellaneous

__builtin_expect

Gives the compiler branch-prediction information. Its declaration is:
    long __builtin_expect (long exp, long c)

The value of the expression is the value of exp and has nothing to do with c. We expect the value of exp to be c. In the example below we expect x to be 0, which tells the compiler that foo() is only rarely called:

    if (__builtin_expect (x, 0))
      foo ();

Another example, checking whether a pointer is null:

    if (__builtin_expect (ptr != NULL, 1))
      foo (*ptr);

fbstring also uses __builtin_expect, for instance in the function that creates a RefCounted (FOLLY_LIKELY wraps __builtin_expect):

    #if __GNUC__
    #define FOLLY_DETAIL_BUILTIN_EXPECT(b, t) (__builtin_expect(b, t))
    #else
    #define FOLLY_DETAIL_BUILTIN_EXPECT(b, t) b
    #endif

    // Likeliness annotations
    //
    // Useful when the author has better knowledge than the compiler of whether
    // the branch condition is overwhelmingly likely to take a specific value.
    //
    // Useful when the author has better knowledge than the compiler of which code
    // paths are designed as the fast path and which are designed as the slow path,
    // and to force the compiler to optimize for the fast path, even when it is not
    // overwhelmingly likely.
    #define FOLLY_LIKELY(x) FOLLY_DETAIL_BUILTIN_EXPECT((x), 1)
    #define FOLLY_UNLIKELY(x) FOLLY_DETAIL_BUILTIN_EXPECT((x), 0)

    static RefCounted* create(const Char* data, size_t* size) {
      const size_t effectiveSize = *size;
      auto result = create(size);
      if (FOLLY_LIKELY(effectiveSize > 0)) { // __builtin_expect
        fbstring_detail::podCopy(data, data + effectiveSize, result->data_);
      }
      return result;
    }

At the assembly level, the code for the more likely path is placed immediately after the preceding instructions, which avoids fetching instructions that will not be executed. For reference, see:
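    likely() and unlikely()
    What is the advantage of GCC's __builtin_expect in if else statements?

CMOV instruction

CMOV is a conditional move. It is similar to the MOV instruction, except that it depends on the state of the RFLAGS register. If the condition is not satisfied, the instruction has no effect.

The advantage of CMOV is that it avoids branch prediction, and with it the impact of branch mispredictions on the CPU pipeline. For details see this document: amd-cmovcc.pdf

In some places fbstring hints the compiler to generate a CMOV instruction, for example:

    const Char* c_str() const {
      const Char* ptr = ml_.data_;
      // With this syntax, GCC and Clang generate a CMOV instead of a branch.
      ptr = (category() == Category::isSmall) ? small_ : ptr;
      return ptr;
    }

__builtin_unreachable && __assume(0)

If the program actually reaches __builtin_unreachable or __assume(0), the behavior is undefined. A typical use is to place __builtin_unreachable after a call to a function that never returns but is not declared __attribute__((noreturn)). The example given in 6.59 Other Built-in Functions Provided by GCC is: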
    void function_that_never_returns (void);

    int g (int c)
    {
      if (c)
        {
          return 1;
        }
      else
        {
          function_that_never_returns ();
          __builtin_unreachable ();
        }
    }

Without __builtin_unreachable ();, the compiler reports error: control reaches end of non-void function [-Werror=return-type]

folly wraps __builtin_unreachable and __assume(0) into assume_unreachable:

    [[noreturn]] FOLLY_ALWAYS_INLINE void assume_unreachable() {
      assume(false);
      // Do a bit more to get the compiler to understand
      // that this function really will never return.
    #if defined(__GNUC__)
      __builtin_unreachable();
    #elif defined(_MSC_VER)
      __assume(0);
    #else
      // Well, it's better than nothing.
      std::abort();
    #endif
    }

fbstring uses it in a few special places, for example in the switch on category. This is the mutableData() mentioned earlier:
    Char* mutableData() {
      switch (category()) {
        case Category::isSmall:
          return small_;
        case Category::isMedium:
          return ml_.data_;
        case Category::isLarge:
          return mutableDataLarge();
      }
      folly::assume_unreachable();
    }

jemalloc

See the official website and the API documentation. For the overall algorithm and principles, see this Facebook blog post : Scalable memory allocation using jemalloc

find algorithm

fbstring uses a modified Boyer-Moore algorithm. The documentation says it is 30 times faster than std::string's find for successful searches. The benchmark code is in FBStringBenchmark.cpp.

In my own tests it seemed to do somewhat better when searching long strings.

Detecting endianness

    // It's MSVC, so we just have to guess ... and allow an override
    #ifdef _MSC_VER
    # ifdef FOLLY_ENDIAN_BE
    static constexpr auto kIsLittleEndian = false;
    # else
    static constexpr auto kIsLittleEndian = true;
    # endif
    #else
    static constexpr auto kIsLittleEndian =
        __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
    #endif

__BYTE_ORDER__ is a predefined macro whose value is one of __ORDER_LITTLE_ENDIAN__, __ORDER_BIG_ENDIAN__ and __ORDER_PDP_ENDIAN__.

It is typically used like this:

    /* Test for a little-endian machine */
    #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__

C++20 introduced std::endian, which makes the check more convenient.
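For toolchains without these macros or without C++20, endianness can also be probed at run time. The sketch below is one possible approach rather than what folly does; isLittleEndianAtRuntime and kIsLittleEndianMacro are hypothetical names, and the macro-based constant is only defined where the compiler provides __BYTE_ORDER__.

```cpp
#include <cstdint>
#include <cstring>

// Run-time probe that needs no compiler-specific macros (hypothetical helper).
inline bool isLittleEndianAtRuntime() {
  const std::uint32_t one = 1;
  unsigned char firstByte = 0;
  std::memcpy(&firstByte, &one, 1); // read the lowest-addressed byte
  return firstByte == 1;            // little-endian stores the low byte first
}

// Compile-time check using the GCC/Clang predefined macros discussed above.
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
constexpr bool kIsLittleEndianMacro =
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#endif
```

With C++20 the same question can be answered at compile time by including <bit> and comparing std::endian::native with std::endian::little.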
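References:

Linux Multithreaded Server Programming: Using the muduo C++ Network Library, by 陈硕 (Chen Shuo)
A stroll through Facebook's open-source C++ library folly (1): the design of the string class

(End)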