Tagged Pointer

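Where the question came from

In a recent interview I was asked how Apple's tagged pointer optimization works, and I blanked. Apart from saying that it "improves access speed and saves memory", I had nothing: I knew none of the implementation details, and the conversation stalled awkwardly. I ran out of words simply because I did not know the material, so afterwards I set out to fill the gap.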

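The exploration

1. Baidu

To really understand tagged pointers I needed to know what they are, what they are for, how they are implemented, and what their strengths and weaknesses are. That is a lot of questions to drown in, but at least there is Baidu.
so…
[Screenshot: Baidu search results]
Seeing that many results, I got excited. But after reading a few of the articles I found the explanations partial and shallow, far from what I was after, which was a little disappointing. Still, a good student does not give up that easily, so I kept digging through the pile of excellent (read: junk) articles in search of the truth...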

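Persistence paid off.
It turned out that Tang Qiao had looked into this as well. So all that stands between me and the experts is a grasp of tagged pointers!!! (Quietly pleased, imagining myself on the road to becoming one.)
I opened his blog, but it did not seem to analyze tagged pointers in much depth either; it gives readers the general idea. That left me wondering: is surface-level knowledge, knowing the what without the why, really all I want? After a few motivational quotes flashed through my mind, the answer was clear: go and find what you are actually looking for.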


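2. Developer Documentation

I had the motivation, but where to start, and how do you even study tagged pointers? I thought of the official documentation, so I opened Xcode, went to Help -> Developer Documentation, and searched for Tagged Pointer.
WTF...
Cue internal screaming.
Thinking about it afterwards, I felt silly. Developer Documentation covers public APIs, and tagged pointers are not an API; they are an optimization Apple applies to classes such as NSString, NSNumber, and NSDate. An optimization. Hold on: optimizations live in the low-level implementation, so to explore them shouldn't I be reading the runtime? I really was too green. Why did it take me this long to think of the runtime source?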

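The runtime source

Now that I knew where to go, there was no more wandering.
Enough talk: first, download a copy of the runtime source from Apple's Open Source site.
I downloaded the latest version at the time, objc-750.
With the download finished, I had a feeling I was not far from mastering tagged pointers.
So, holding back the excitement, I opened the objc-750 source, and promptly got lost in an endless list of header files.
Which header holds the tagged pointer code? Surely I was not going to open them one by one. You might think I was lost for good. How naive.
I opened file search 😎
That still turned up several candidates, so it was back to Baidu.
After some digging I finally landed on objc-internal.h.
At last, on to the real topic.
Reading the source:
According to the comments in the file, the tagged pointer section starts at around line 209:
So let's go straight there.
It begins with a check for a 64-bit system: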

#if __LP64__
#define OBJC_HAVE_TAGGED_POINTERS 1 // defined only when building for a 64-bit (LP64) target
#endif

Everything that follows is guarded by the OBJC_HAVE_TAGGED_POINTERS macro:

#if OBJC_HAVE_TAGGED_POINTERS // so tagged pointers exist only on 64-bit targets, i.e. Apple uses them to optimize NSString and friends only there
...

Moving on:

#if __has_feature(objc_fixed_enum)  ||  __cplusplus >= 201103L
enum objc_tag_index_t : uint16_t
#else
typedef uint16_t objc_tag_index_t;
enum
#endif
{
// 60-bit payloads
OBJC_TAG_NSAtom = 0,
OBJC_TAG_1 = 1,
OBJC_TAG_NSString = 2,
OBJC_TAG_NSNumber = 3,
OBJC_TAG_NSIndexPath = 4,
OBJC_TAG_NSManagedObjectID = 5,
OBJC_TAG_NSDate = 6,

// 60-bit reserved
OBJC_TAG_RESERVED_7 = 7,

// 52-bit payloads
OBJC_TAG_Photos_1 = 8,
OBJC_TAG_Photos_2 = 9,
OBJC_TAG_Photos_3 = 10,
OBJC_TAG_Photos_4 = 11,
OBJC_TAG_XPC_1 = 12,
OBJC_TAG_XPC_2 = 13,
OBJC_TAG_XPC_3 = 14,
OBJC_TAG_XPC_4 = 15,

OBJC_TAG_First60BitPayload = 0,
OBJC_TAG_Last60BitPayload = 6,
OBJC_TAG_First52BitPayload = 8,
OBJC_TAG_Last52BitPayload = 263,

OBJC_TAG_RESERVED_264 = 264
};

#if / #else / #endif means a compile-time check.
Let's see what exactly is being checked:

#if __has_feature(objc_fixed_enum)  ||  __cplusplus >= 201103L
/*
__has_feature(objc_fixed_enum):
"__has_feature evaluates to 1 if the feature is both supported by Clang and standardized in the current language standard or 0 if not."
"Use __has_feature(objc_fixed_enum) to determine whether support for fixed underlying types is available in Objective-C."
In other words, this branch is taken when an enum may declare a fixed underlying type.
__cplusplus >= 201103L:
C++11 or a newer standard.
*/
enum objc_tag_index_t : uint16_t
/*
Declares the enum objc_tag_index_t with uint16_t as its underlying type.
In C++ this syntax is available from C++11 on.
*/
#else
typedef uint16_t objc_tag_index_t;
enum
#endif

Next come the enum values:

{
// 60-bit payloads
OBJC_TAG_NSAtom = 0,
OBJC_TAG_1 = 1,
OBJC_TAG_NSString = 2, //NSString
OBJC_TAG_NSNumber = 3, //NSNumber
OBJC_TAG_NSIndexPath = 4, //NSIndexPath
OBJC_TAG_NSManagedObjectID = 5,
OBJC_TAG_NSDate = 6,//NSDate

// 60-bit reserved
OBJC_TAG_RESERVED_7 = 7,

// 52-bit payloads
OBJC_TAG_Photos_1 = 8,
OBJC_TAG_Photos_2 = 9,
OBJC_TAG_Photos_3 = 10,
OBJC_TAG_Photos_4 = 11,
OBJC_TAG_XPC_1 = 12,
OBJC_TAG_XPC_2 = 13,
OBJC_TAG_XPC_3 = 14,
OBJC_TAG_XPC_4 = 15,

OBJC_TAG_First60BitPayload = 0,
OBJC_TAG_Last60BitPayload = 6,
OBJC_TAG_First52BitPayload = 8,
OBJC_TAG_Last52BitPayload = 263,

OBJC_TAG_RESERVED_264 = 264
};
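
The enum splits the tag space into a 60-bit-payload range and a 52-bit-payload range. Here is a minimal sketch in plain C of that classification. The helper name `payload_bits_for_tag` and the `TAG_*` constants are mine, not the runtime's; the numeric bounds are copied from the enum above.

```c
#include <stdint.h>

/* Range bounds copied from the objc_tag_index_t enum above. */
enum {
    TAG_LAST_60BIT  = 6,
    TAG_FIRST_52BIT = 8,
    TAG_LAST_52BIT  = 263
};

/* Hypothetical helper: payload width in bits for a tag index,
 * or 0 for reserved / out-of-range indexes such as 7 and 264. */
static int payload_bits_for_tag(uint16_t tag)
{
    if (tag <= TAG_LAST_60BIT) {
        return 60;              /* basic tags 0..6 */
    }
    if (tag >= TAG_FIRST_52BIT && tag <= TAG_LAST_52BIT) {
        return 52;              /* extended tags 8..263 */
    }
    return 0;
}
```

For example, `payload_bits_for_tag(3)` (OBJC_TAG_NSNumber) gives 60, while index 12 (OBJC_TAG_XPC_1) falls in the 52-bit range.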

Next come the functions that operate on tagged pointers:

// Returns true if tagged pointers are enabled.
// The other functions below must not be called if tagged pointers are disabled.
static inline bool
_objc_taggedPointersEnabled(void);

// Register a class for a tagged pointer tag.
// Aborts if the tag is invalid or already in use.
OBJC_EXPORT void
_objc_registerTaggedPointerClass(objc_tag_index_t tag, Class _Nonnull cls)
OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

// Returns the registered class for the given tag.
// Returns nil if the tag is valid but has no registered class.
// Aborts if the tag is invalid.
OBJC_EXPORT Class _Nullable
_objc_getClassForTag(objc_tag_index_t tag)
OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

// Create a tagged pointer object with the given tag and payload.
// Assumes the tag is valid.
// Assumes tagged pointers are enabled.
// The payload will be silently truncated to fit.
static inline void * _Nonnull
_objc_makeTaggedPointer(objc_tag_index_t tag, uintptr_t payload);

// Return true if ptr is a tagged pointer object.
// Does not check the validity of ptr's class.
static inline bool
_objc_isTaggedPointer(const void * _Nullable ptr);

// Extract the tag value from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// Does not check the validity of ptr's tag.
static inline objc_tag_index_t
_objc_getTaggedPointerTag(const void * _Nullable ptr);

// Extract the payload from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// The payload value is zero-extended.
static inline uintptr_t
_objc_getTaggedPointerValue(const void * _Nullable ptr);

// Extract the payload from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// The payload value is sign-extended.
static inline intptr_t
_objc_getTaggedPointerSignedValue(const void * _Nullable ptr);

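Each function comes with a comment, and the comments are reasonably clear. Let's go through the implementations one by one, guided by the comments and the source (declaration and implementation are pasted together below):
_objc_taggedPointersEnabled: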

// Returns true if tagged pointers are enabled.
// The other functions below must not be called if tagged pointers are disabled.
static inline bool
_objc_taggedPointersEnabled(void);

static inline bool
_objc_taggedPointersEnabled(void)
{
extern uintptr_t objc_debug_taggedpointer_mask;
// Refers to the global variable objc_debug_taggedpointer_mask, which is defined elsewhere.
return (objc_debug_taggedpointer_mask != 0);
// Returns false when objc_debug_taggedpointer_mask is 0, true otherwise.
}

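The implementation is trivial; the interesting part is objc_debug_taggedpointer_mask. What is it?
Since it is brought in with extern, it must be defined and initialized in some other file. So let's find it.
Its declaration turns up in objc-gdb.h: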

OBJC_EXPORT uintptr_t objc_debug_taggedpointer_mask
OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

// OBJC_EXPORT exports objc_debug_taggedpointer_mask.
// OBJC_EXPORT is defined as:

#if !defined(OBJC_EXPORT)
# define OBJC_EXPORT OBJC_EXTERN OBJC_VISIBLE
#endif

// OBJC_EXTERN is defined as:

#if !defined(OBJC_EXTERN)
# if defined(__cplusplus)
# define OBJC_EXTERN extern "C"
# else
# define OBJC_EXTERN extern
# endif
#endif

// OBJC_VISIBLE is defined as:
#if !defined(OBJC_VISIBLE)

# define OBJC_VISIBLE __attribute__((visibility("default")))

#endif

// So, when compiled as C++ (in plain C it is just extern), the declaration is equivalent to:
extern "C" __attribute__((visibility("default"))) uintptr_t objc_debug_taggedpointer_mask
OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

// OBJC_AVAILABLE can be tracked down the same way:
/* OBJC_AVAILABLE: shorthand for all-OS availability */

#if !defined(OBJC_AVAILABLE)
# define OBJC_AVAILABLE(x, i, t, w, b) \
__OSX_AVAILABLE(x) __IOS_AVAILABLE(i) __TVOS_AVAILABLE(t) \
__WATCHOS_AVAILABLE(w) __BRIDGEOS_AVAILABLE(b)
#endif
/* So OBJC_AVAILABLE expands to five macros, one per platform. Take __OSX_AVAILABLE as an example: */

/* for use marking APIs available info for Mac OSX */

#if defined(__has_attribute)
#if __has_attribute(availability)
#define __OSX_UNAVAILABLE __OS_AVAILABILITY(macosx,unavailable)
#define __OSX_AVAILABLE(_vers) __OS_AVAILABILITY(macosx,introduced=_vers)
#define __OSX_DEPRECATED(_start, _dep, _msg) __OSX_AVAILABLE(_start) __OS_AVAILABILITY_MSG(macosx,deprecated=_dep,_msg)
#endif
#endif

#define __OS_AVAILABILITY(_target, _availability) __attribute__((availability(_target,_availability)))

// Therefore:
#define __OSX_AVAILABLE(x) __attribute__((availability(macosx,introduced=x)))
// The other platforms follow the same pattern, so I won't expand them here. Their purpose is to record the OS version in which the symbol became available.
// The arguments of __attribute__((availability(...))) are:
/*

platform: macosx, ios, tvos, watchos;
introduced=version: when the declaration was introduced;
deprecated=version: when it was deprecated;
obsoleted=version: when it was obsoleted;
unavailable: marks the declaration as not available on this platform;
message=string: extra information, for example pointing to a replacement API.
*/
/* So
OBJC_EXPORT uintptr_t objc_debug_taggedpointer_mask
OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);
fully expands to:
*/
extern "C" __attribute__((visibility("default"))) uintptr_t objc_debug_taggedpointer_mask __attribute__((availability(macosx,introduced=10.9))) __attribute__((availability(ios,introduced=7.0))) __attribute__((availability(tvos,introduced=9.0))) __attribute__((availability(watchos,introduced=1.0))) __attribute__((availability(bridgeos,introduced=2.0)))

/*
After all those nested #defines, the declaration boils down to this:
the symbol objc_debug_taggedpointer_mask is exported with `default` visibility, and it was introduced in macOS 10.9, iOS 7.0, tvOS 9.0, watchOS 1.0, and bridgeOS 2.0.
*/

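And yet, after all that, we still don't know the value of objc_debug_taggedpointer_mask.
So the search continues.
After some more digging I finally found where it is assigned: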

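// The three excerpts below are shown in the order I looked them up, not in source order.
uintptr_t objc_debug_taggedpointer_mask = _OBJC_TAG_MASK;

#if OBJC_MSB_TAGGED_POINTERS
# define _OBJC_TAG_MASK (1UL<<63)
#else
# define _OBJC_TAG_MASK 1UL
#endif
// (excerpt: only the mask definition of each branch is shown)

#if (TARGET_OS_OSX || TARGET_OS_IOSMAC) && __x86_64__
// 64-bit Mac - tag bit is LSB
// The comment says it all: on 64-bit Mac the tag bit is the LSB (least significant bit).
# define OBJC_MSB_TAGGED_POINTERS 0
#else
// Everything else - tag bit is MSB
// Everywhere else the tag bit is the MSB (most significant bit).
# define OBJC_MSB_TAGGED_POINTERS 1
#endif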
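
To make the two layouts concrete, here is a small sketch in plain C. It is not runtime code: the names `TAG_MASK_MSB`, `TAG_MASK_LSB`, and `is_tagged` are made up for illustration, and the mask is passed in explicitly so that both layouts can be tried on any machine. A pointer counts as tagged when the mask bit is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* The two possible values of _OBJC_TAG_MASK quoted above. */
#define TAG_MASK_MSB (UINT64_C(1) << 63)  /* everything except 64-bit Mac */
#define TAG_MASK_LSB UINT64_C(1)          /* 64-bit Mac (x86_64) */

/* Hypothetical helper: true when the tag bit selected by `mask`
 * is set in the raw pointer bits. */
static bool is_tagged(uint64_t ptr_bits, uint64_t mask)
{
    return (ptr_bits & mask) == mask;
}
```

An ordinary heap address is aligned, so its lowest bit is 0, and user-space addresses do not set bit 63; that is why either bit can serve as the flag.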

All that detour, and this is all it is? So interesting!!!
Back to the very first function, _objc_taggedPointersEnabled:

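static inline bool
_objc_taggedPointersEnabled(void);

static inline bool
_objc_taggedPointersEnabled(void)
{
extern uintptr_t objc_debug_taggedpointer_mask;
return (objc_debug_taggedpointer_mask != 0);
/*
The mask starts out as _OBJC_TAG_MASK, which is nonzero in both layouts, and this
code is only compiled for 64-bit targets. So, unless the runtime has reset the mask
to 0 to disable tagged pointers, this is roughly equivalent to:
if(64bit system)
{
return true;
}
return false;
*/
}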

That settles this function. On to the next one.
_objc_registerTaggedPointerClass

OBJC_EXPORT void
_objc_registerTaggedPointerClass(objc_tag_index_t tag, Class _Nonnull cls)
OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);
/***********************************************************************
* _objc_registerTaggedPointerClass
* Set the class to use for the given tagged pointer index.
* Aborts if the tag is out of range, or if the tag is already
* used by some other class.
**********************************************************************/
void
_objc_registerTaggedPointerClass(objc_tag_index_t tag, Class cls)
{
if (objc_debug_taggedpointer_mask == 0) { // First check whether tagged pointers are enabled; a mask of 0 means they are disabled.
_objc_fatal("tagged pointers are disabled");
/*
This calls _objc_fatal.
Here is the implementation of _objc_fatal:
void _objc_fatal(const char *fmt, ...)
{
va_list args;
va_start(args, fmt);
_vcprintf(fmt, args);
va_end(args);
// Picks up the format arguments through va_list and prints the message.
_cprintf("\n");

abort();
// Terminates the process.
}
*/
// So this prints "tagged pointers are disabled" and aborts.
}

Class *slot = classSlotForTagIndex(tag);
if (!slot) {
_objc_fatal("tag index %u is invalid", (unsigned int)tag);
}

Class oldCls = *slot;

if (cls && oldCls && cls != oldCls) {
_objc_fatal("tag index %u used for two different classes "
"(was %p %s, now %p %s)", tag,
oldCls, oldCls->nameForLogging(),
cls, cls->nameForLogging());
}

*slot = cls;

// Store a placeholder class in the basic tag slot that is
// reserved for the extended tag space, if it isn't set already.
// Do this lazily when the first extended tag is registered so
// that old debuggers characterize bogus pointers correctly more often.
if (tag < OBJC_TAG_First60BitPayload || tag > OBJC_TAG_Last60BitPayload) {
Class *extSlot = classSlotForBasicTagIndex(OBJC_TAG_RESERVED_7);
if (*extSlot == nil) {
extern objc_class OBJC_CLASS_$___NSUnrecognizedTaggedPointer;
*extSlot = (Class)&OBJC_CLASS_$___NSUnrecognizedTaggedPointer;
}
}
}

Reading on, we run into another function we have not seen before:

Class * classSlotForTagIndex(tag);

Its implementation can be found in the runtime:

// Returns a pointer to the class's storage in the tagged class arrays, 
// or nil if the tag is out of range.
/*
In other words: it returns the address of the slot that holds the class for this tag in the tagged class arrays.
*/
static Class *
classSlotForTagIndex(objc_tag_index_t tag)
{
if (tag >= OBJC_TAG_First60BitPayload && tag <= OBJC_TAG_Last60BitPayload) {
// Checks whether the tag belongs to the range with 60-bit payloads.
/*
// 60-bit payloads
OBJC_TAG_NSAtom = 0,
OBJC_TAG_1 = 1,
OBJC_TAG_NSString = 2,
OBJC_TAG_NSNumber = 3,
OBJC_TAG_NSIndexPath = 4,
OBJC_TAG_NSManagedObjectID = 5,
OBJC_TAG_NSDate = 6,
*/
// The seven types above all use 60-bit payloads (60 bits are available for the data).
return classSlotForBasicTagIndex(tag); // calls another function; its implementation is analyzed below
}

if (tag >= OBJC_TAG_First52BitPayload && tag <= OBJC_TAG_Last52BitPayload) {
int index = tag - OBJC_TAG_First52BitPayload;
uintptr_t tagObfuscator = ((objc_debug_taggedpointer_obfuscator
>> _OBJC_TAG_EXT_INDEX_SHIFT)
& _OBJC_TAG_EXT_INDEX_MASK);
return &objc_tag_ext_classes[index ^ tagObfuscator];
}

return nil;
}

Now for classSlotForBasicTagIndex.

// Returns a pointer to the class's storage in the tagged class arrays.
// Assumes the tag is a valid basic tag. (The caller has already range-checked the tag.)
static Class *
classSlotForBasicTagIndex(objc_tag_index_t tag)
{
// Yet another unfamiliar variable, objc_debug_taggedpointer_obfuscator 😞... let's see what it is.
uintptr_t tagObfuscator = ((objc_debug_taggedpointer_obfuscator
>> _OBJC_TAG_INDEX_SHIFT)
& _OBJC_TAG_INDEX_MASK);
uintptr_t obfuscatedTag = tag ^ tagObfuscator;
// Array index in objc_tag_classes includes the tagged bit itself
#if SUPPORT_MSB_TAGGED_POINTERS
return &objc_tag_classes[0x8 | obfuscatedTag];
#else
return &objc_tag_classes[(obfuscatedTag << 1) | 1];
#endif
}

Analyzing objc_debug_taggedpointer_obfuscator

/***********************************************************************
* initializeTaggedPointerObfuscator
* Initialize objc_debug_taggedpointer_obfuscator with randomness.
*
* The tagged pointer obfuscator is intended to make it more difficult
* for an attacker to construct a particular object as a tagged pointer,
* in the presence of a buffer overflow or other write control over some
* memory. The obfuscator is XORed with the tagged pointers when setting
* or retrieving payload values. They are filled with randomness on first
* use.
**********************************************************************/
static void
initializeTaggedPointerObfuscator(void)
{
if (sdkIsOlderThan(10_14, 12_0, 12_0, 5_0, 3_0) ||
// Set the obfuscator to zero for apps linked against older SDKs,
// in case they're relying on the tagged pointer representation.
DisableTaggedPointerObfuscation) {
/*
// OPTION(var, env, help)
OPTION( DisableTaggedPointerObfuscation, OBJC_DISABLE_TAG_OBFUSCATION, "disable obfuscation of tagged pointers")
*/
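/*
Note: the call above passes five arguments, while the definition quoted here takes four, so this excerpt may come from a different revision of the header.
#define sdkIsOlderThan(x, i, t, w) (sdkVersion() < DYLD_OS_VERSION(x, i, t, w))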
#if TARGET_OS_OSX
# define DYLD_OS_VERSION(x, i, t, w) DYLD_MACOSX_VERSION_##x
# define sdkVersion() dyld_get_program_sdk_version()

#elif TARGET_OS_IOS
# define DYLD_OS_VERSION(x, i, t, w) DYLD_IOS_VERSION_##i
# define sdkVersion() dyld_get_program_sdk_version()

#elif TARGET_OS_TV
// dyld does not currently have distinct constants for tvOS
# define DYLD_OS_VERSION(x, i, t, w) DYLD_IOS_VERSION_##t
# define sdkVersion() dyld_get_program_sdk_version()

#elif TARGET_OS_WATCH
# define DYLD_OS_VERSION(x, i, t, w) DYLD_WATCHOS_VERSION_##w
// watchOS has its own API for compatibility reasons
# define sdkVersion() dyld_get_program_sdk_watch_os_version()

#else
# error unknown OS
#endif
*/
objc_debug_taggedpointer_obfuscator = 0;
} else {

// Pull random data into the variable, then shift away all non-payload bits.
arc4random_buf(&objc_debug_taggedpointer_obfuscator,
sizeof(objc_debug_taggedpointer_obfuscator));
//arc4random_buf() fills the region buf of length nbytes with random data.

objc_debug_taggedpointer_obfuscator &= ~_OBJC_TAG_MASK;
// Take a random number and clear its tagged pointer flag bit (the highest or the lowest bit, depending on the layout). That is:
// objc_debug_taggedpointer_obfuscator = objc_debug_taggedpointer_obfuscator & (0x7FFFFFFFFFFFFFFF or 0xFFFFFFFFFFFFFFFE)
}
}
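
The effect of clearing the tag bit can be checked with a small sketch in plain C. This is not the runtime's code: `make_obfuscator` and `xor_with_obfuscator` are hypothetical names, and the "random" input is passed in as a parameter so the behavior is reproducible. Because the obfuscator's tag bit is always 0, XOR-ing it into a tagged pointer never flips the flag bit, and because XOR is its own inverse, the same operation decodes the pointer again.

```c
#include <stdint.h>

#define TAG_MASK_MSB (UINT64_C(1) << 63)  /* _OBJC_TAG_MASK in the MSB layout */

/* Mirrors `obfuscator &= ~_OBJC_TAG_MASK`: whatever the random input,
 * the tag bit of the resulting obfuscator is 0. */
static uint64_t make_obfuscator(uint64_t random_bits, uint64_t tag_mask)
{
    return random_bits & ~tag_mask;
}

/* XOR is its own inverse, so one function both encodes and decodes. */
static uint64_t xor_with_obfuscator(uint64_t bits, uint64_t obfuscator)
{
    return bits ^ obfuscator;
}
```

With an obfuscator of 0 (the branch taken for apps linked against older SDKs), encoding and decoding leave the pointer bits unchanged.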

That covers objc_debug_taggedpointer_obfuscator, at least roughly. Back to classSlotForBasicTagIndex:

static Class *
classSlotForBasicTagIndex(objc_tag_index_t tag)
{
uintptr_t tagObfuscator = ((objc_debug_taggedpointer_obfuscator
>> _OBJC_TAG_INDEX_SHIFT)
& _OBJC_TAG_INDEX_MASK);
// This is easy to follow now: take the random value (tag bit already cleared) and shift it right by _OBJC_TAG_INDEX_SHIFT bits.

/*
#if OBJC_MSB_TAGGED_POINTERS
# define _OBJC_TAG_INDEX_SHIFT 60
#else
# define _OBJC_TAG_INDEX_SHIFT 1
#endif
There are always two cases; only one is examined here (_OBJC_TAG_INDEX_SHIFT 60).
Shifting right by 60 keeps the top four bits of the random value objc_debug_taggedpointer_obfuscator.

#define _OBJC_TAG_INDEX_MASK 0x7
The result is then ANDed with 0x7,
which leaves bits 60 to 62 (the low three of the top four bits).
*/
uintptr_t obfuscatedTag = tag ^ tagObfuscator;
// XOR tag with tagObfuscator.
// Array index in objc_tag_classes includes the tagged bit itself
#if SUPPORT_MSB_TAGGED_POINTERS
return &objc_tag_classes[0x8 | obfuscatedTag];
#else
return &objc_tag_classes[(obfuscatedTag << 1) | 1];
#endif
// Set the flag-bit position in the index (0x8, or the low bit 1), index into objc_tag_classes, and return the address of that slot.
}
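
The index arithmetic can be tried in isolation. The following plain C sketch reproduces only the computation of the array index, for both layouts; the function names are hypothetical and the obfuscator is passed in as a parameter instead of being read from the runtime global.

```c
#include <stdint.h>

#define TAG_INDEX_MASK UINT64_C(0x7)  /* _OBJC_TAG_INDEX_MASK */

/* MSB layout (_OBJC_TAG_INDEX_SHIFT == 60): index is 0x8 | obfuscatedTag,
 * so results fall in the range 8..15. */
static unsigned basic_slot_index_msb(unsigned tag, uint64_t obfuscator)
{
    uint64_t tag_obfuscator = (obfuscator >> 60) & TAG_INDEX_MASK;
    uint64_t obfuscated_tag = tag ^ tag_obfuscator;
    return (unsigned)(0x8 | obfuscated_tag);
}

/* LSB layout (_OBJC_TAG_INDEX_SHIFT == 1): index is (obfuscatedTag << 1) | 1,
 * so results are the odd numbers 1..15. */
static unsigned basic_slot_index_lsb(unsigned tag, uint64_t obfuscator)
{
    uint64_t tag_obfuscator = (obfuscator >> 1) & TAG_INDEX_MASK;
    uint64_t obfuscated_tag = tag ^ tag_obfuscator;
    return (unsigned)((obfuscated_tag << 1) | 1);
}
```

In both cases the index keeps the flag bit in the same position it has in the pointer, which is what the source comment "Array index in objc_tag_classes includes the tagged bit itself" refers to.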

Back to the function we started with, _objc_registerTaggedPointerClass:

void
_objc_registerTaggedPointerClass(objc_tag_index_t tag, Class cls)
{
if (objc_debug_taggedpointer_mask == 0) {
_objc_fatal("tagged pointers are disabled");
}

Class *slot = classSlotForTagIndex(tag);
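/*
As shown above, classSlotForTagIndex uses the tag to locate the corresponding slot in the class arrays (objc_tag_classes for basic tags) and returns its address, which is stored in slot.
*/
if (!slot) {
_objc_fatal("tag index %u is invalid", (unsigned int)tag);
}
// If no slot was found, abort and report that the tag index is invalid.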

Class oldCls = *slot;

if (cls && oldCls && cls != oldCls) {
_objc_fatal("tag index %u used for two different classes "
"(was %p %s, now %p %s)", tag,
oldCls, oldCls->nameForLogging(),
cls, cls->nameForLogging());
}
// This checks whether the tag index is already used by a different class.

*slot = cls;

// The part below still needs a closer look.
// Store a placeholder class in the basic tag slot that is
// reserved for the extended tag space, if it isn't set already.
// Do this lazily when the first extended tag is registered so
// that old debuggers characterize bogus pointers correctly more often.
if (tag < OBJC_TAG_First60BitPayload || tag > OBJC_TAG_Last60BitPayload) {
Class *extSlot = classSlotForBasicTagIndex(OBJC_TAG_RESERVED_7);
if (*extSlot == nil) {
extern objc_class OBJC_CLASS_$___NSUnrecognizedTaggedPointer;
*extSlot = (Class)&OBJC_CLASS_$___NSUnrecognizedTaggedPointer;
}
}
}

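That completes the walk through this function. In short, it:
1. Checks whether tagged pointers are disabled.
2. Uses the tag to find the matching slot in the class arrays (the lookup was analyzed above).
3. Aborts if no slot was found, or if the slot is already taken by a different class.
4. Stores cls in the slot.
5. For a tag outside the 60-bit range, makes sure the reserved basic slot 7 holds a placeholder class.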
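
The registration checks can be modeled in a few lines of plain C. This is a deliberately simplified sketch, not the runtime's implementation: classes are opaque pointers, the table is indexed by the tag directly (no obfuscation, no extended tags), the table size `SLOT_COUNT` is an arbitrary illustrative value, and errors are returned as codes where the runtime would call _objc_fatal. All names here are hypothetical.

```c
#include <stddef.h>

enum { SLOT_COUNT = 8 };  /* illustrative size only */

typedef enum {
    REG_OK = 0,
    REG_INVALID_TAG,  /* runtime: "tag index %u is invalid" */
    REG_CONFLICT      /* runtime: "tag index %u used for two different classes" */
} reg_result;

/* Simplified model of the slot lookup, conflict check, and store. */
static reg_result register_class(const void *table[SLOT_COUNT],
                                 unsigned tag, const void *cls)
{
    if (tag >= SLOT_COUNT) {
        return REG_INVALID_TAG;           /* no slot for this tag */
    }
    const void *old = table[tag];
    if (cls != NULL && old != NULL && cls != old) {
        return REG_CONFLICT;              /* slot owned by another class */
    }
    table[tag] = cls;                     /* register, or re-register the same class */
    return REG_OK;
}
```

Note that, as in the original condition `cls && oldCls && cls != oldCls`, registering the same class twice or passing NULL is not treated as a conflict.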

Next function: _objc_getClassForTag

// Returns the registered class for the given tag.
// Returns nil if the tag is valid but has no registered class.
// Aborts if the tag is invalid.
OBJC_EXPORT Class _Nullable
_objc_getClassForTag(objc_tag_index_t tag)
OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

/***********************************************************************
* _objc_getClassForTag
* Returns the class that is using the given tagged pointer tag.
* Returns nil if no class is using that tag or the tag is out of range.
**********************************************************************/
Class
_objc_getClassForTag(objc_tag_index_t tag)
{
Class *slot = classSlotForTagIndex(tag);
if (slot) return *slot;
else return nil;
}

Conveniently, classSlotForTagIndex was already covered while analyzing the previous function. For the details, see the analysis of classSlotForBasicTagIndex above.

Next function: _objc_makeTaggedPointer

// Create a tagged pointer object with the given tag and payload.
// Assumes the tag is valid.
// Assumes tagged pointers are enabled.
// The payload will be silently truncated to fit.
static inline void * _Nonnull
_objc_makeTaggedPointer(objc_tag_index_t tag, uintptr_t payload);

static inline void * _Nonnull
_objc_makeTaggedPointer(objc_tag_index_t tag, uintptr_t value)
{
// PAYLOAD_LSHIFT and PAYLOAD_RSHIFT are the payload extraction shifts.
// They are reversed here for payload insertion.

// assert(_objc_taggedPointersEnabled());
if (tag <= OBJC_TAG_Last60BitPayload) {
// Only this branch is analyzed here.
// assert(((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT) == value);
uintptr_t result =
(_OBJC_TAG_MASK |
((uintptr_t)tag << _OBJC_TAG_INDEX_SHIFT) |
((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT));
return _objc_encodeTaggedPointer(result);
} else {
// assert(tag >= OBJC_TAG_First52BitPayload);
// assert(tag <= OBJC_TAG_Last52BitPayload);
// assert(((value << _OBJC_TAG_EXT_PAYLOAD_RSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_LSHIFT) == value);
uintptr_t result =
(_OBJC_TAG_EXT_MASK |
((uintptr_t)(tag - OBJC_TAG_First52BitPayload) << _OBJC_TAG_EXT_INDEX_SHIFT) |
((value << _OBJC_TAG_EXT_PAYLOAD_RSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_LSHIFT));
return _objc_encodeTaggedPointer(result);
}
}

First, the value of result:

uintptr_t result = (_OBJC_TAG_MASK |  ((uintptr_t)tag << _OBJC_TAG_INDEX_SHIFT) | ((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT));

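/*
Again, only the _OBJC_TAG_INDEX_SHIFT = 60 case is analyzed.
((uintptr_t)tag << _OBJC_TAG_INDEX_SHIFT): moves the low 4 bits of tag into the top 4 bits.
((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT): clears the top 4 bits of value.
OR-ing the two puts the low 4 bits of tag where the top 4 bits of value used to be.
Scaled down to 32 bits for readability (shift by 28 instead of 60), that looks like:
tag:0x89abcdef
value:0x76543210
tag|value:0xf6543210

result is that combination with the tagged pointer flag bit (bit 63) set to 1.
An example, again at 32-bit width (flag bit 31 instead of 63):
tag=0x76543210
value=0xf9abcdef
result=(1<<31) | ((tag<<28) | ((value<<4)>>4)) = 0x89abcdef
At the real 64-bit width, tag=3 (OBJC_TAG_NSNumber) and value=0x123 give:
result=(1<<63) | (3<<60) | 0x123 = 0xB000000000000123
*/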
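
The same composition, written as a tiny plain C function so that it can be checked on any machine. This sketch covers only the basic-tag branch in the MSB layout and leaves out the final obfuscation step; the function name `make_tagged_msb` is hypothetical, and the two payload shifts of 4 are taken from the analysis above (clearing the top four bits of value).

```c
#include <stdint.h>

/* Hypothetical helper: builds the un-obfuscated bits of a basic-tag
 * tagged pointer in the MSB layout. */
static uint64_t make_tagged_msb(unsigned tag, uint64_t value)
{
    const uint64_t flag    = UINT64_C(1) << 63;    /* _OBJC_TAG_MASK */
    const uint64_t index   = (uint64_t)tag << 60;  /* _OBJC_TAG_INDEX_SHIFT */
    const uint64_t payload = (value << 4) >> 4;    /* drop the top 4 bits */
    return flag | index | payload;
}
```

For instance, `make_tagged_msb(3, 0x123)` yields 0xB000000000000123, and a value whose top four bits are set is silently truncated to fit, as the source comment on _objc_makeTaggedPointer warns.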

Now for _objc_encodeTaggedPointer:

static inline void * _Nonnull
_objc_encodeTaggedPointer(uintptr_t ptr)
{
return (void *)(objc_debug_taggedpointer_obfuscator ^ ptr);
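/*
objc_debug_taggedpointer_obfuscator was analyzed above; it is set up by initializeTaggedPointerObfuscator.
Its value is a 64-bit random number produced by arc4random_buf with the _OBJC_TAG_MASK bit cleared (the highest bit in the MSB layout, the lowest on 64-bit Mac), or 0 when obfuscation is turned off.
So the value returned is result ^ objc_debug_taggedpointer_obfuscator.
*/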
}

Next function: _objc_isTaggedPointer

// Return true if ptr is a tagged pointer object.
// Does not check the validity of ptr's class.
static inline bool
_objc_isTaggedPointer(const void * _Nullable ptr);

static inline bool
_objc_isTaggedPointer(const void * _Nullable ptr)
{
return ((uintptr_t)ptr & _OBJC_TAG_MASK) == _OBJC_TAG_MASK;
/*
This one is simple: given a pointer, check whether its _OBJC_TAG_MASK bit is 1.
*/
}

_objc_getTaggedPointerTag

// Extract the tag value from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// Does not check the validity of ptr's tag.
static inline objc_tag_index_t
_objc_getTaggedPointerTag(const void * _Nullable ptr);

static inline objc_tag_index_t
_objc_getTaggedPointerTag(const void * _Nullable ptr)
{
// assert(_objc_isTaggedPointer(ptr));
uintptr_t value = _objc_decodeTaggedPointer(ptr);
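/*
Decode the pointer with the random value objc_debug_taggedpointer_obfuscator.
*/
uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
// In the MSB layout this is: uintptr_t basicTag = (value >> 60) & 0x7; it shifts the top four bits of the decoded value down to the bottom and keeps only the low 3 of them, which drops the tagged pointer flag bit.

// If you only care about NSNumber and similar classes, there is no need to dig into the rest.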
uintptr_t extTag = (value >> _OBJC_TAG_EXT_INDEX_SHIFT) & _OBJC_TAG_EXT_INDEX_MASK;

if (basicTag == _OBJC_TAG_INDEX_MASK) {
return (objc_tag_index_t)(extTag + OBJC_TAG_First52BitPayload);
} else {
return (objc_tag_index_t)basicTag;
}
}
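
Here is a plain C sketch of the tag extraction on an already-decoded value in the MSB layout. The basic shift and mask come from the analysis above; the extended-tag shift (52) and mask (0xff) are not quoted anywhere in this article, so treat them as my assumption about objc-internal.h. The function name is hypothetical.

```c
#include <stdint.h>

#define TAG_INDEX_SHIFT      60     /* from the analysis above */
#define TAG_INDEX_MASK       0x7u   /* from the analysis above */
#define TAG_EXT_INDEX_SHIFT  52     /* assumption: not quoted in this article */
#define TAG_EXT_INDEX_MASK   0xffu  /* assumption: not quoted in this article */
#define TAG_FIRST_52BIT      8      /* OBJC_TAG_First52BitPayload */

/* Hypothetical helper: tag index of a decoded (de-obfuscated) pointer. */
static unsigned tag_of_decoded_msb(uint64_t value)
{
    unsigned basic = (unsigned)((value >> TAG_INDEX_SHIFT) & TAG_INDEX_MASK);
    unsigned ext   = (unsigned)((value >> TAG_EXT_INDEX_SHIFT) & TAG_EXT_INDEX_MASK);
    /* A basic tag of 7 (all three bits set) means "look at the extended tag". */
    return (basic == TAG_INDEX_MASK) ? ext + TAG_FIRST_52BIT : basic;
}
```

This also shows why index 7 is reserved in the enum: it is the escape value that switches to the 52-bit-payload encoding.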

_objc_getTaggedPointerValue

// Extract the payload from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// The payload value is zero-extended.
static inline uintptr_t
_objc_getTaggedPointerValue(const void * _Nullable ptr);

static inline uintptr_t
_objc_getTaggedPointerValue(const void * _Nullable ptr)
{
// assert(_objc_isTaggedPointer(ptr));
uintptr_t value = _objc_decodeTaggedPointer(ptr);
uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
if (basicTag == _OBJC_TAG_INDEX_MASK) {
return (value << _OBJC_TAG_EXT_PAYLOAD_LSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_RSHIFT;
} else {
return (value << _OBJC_TAG_PAYLOAD_LSHIFT) >> _OBJC_TAG_PAYLOAD_RSHIFT;
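// This is the branch that matters: value is what _objc_decodeTaggedPointer returned; in the MSB layout (both shifts are 4) its top four bits are cleared, and what remains is the payload.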
}
}

_objc_getTaggedPointerSignedValue

// Extract the payload from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// The payload value is sign-extended.
static inline intptr_t
_objc_getTaggedPointerSignedValue(const void * _Nullable ptr);

static inline intptr_t
_objc_getTaggedPointerSignedValue(const void * _Nullable ptr)
{
// assert(_objc_isTaggedPointer(ptr));
uintptr_t value = _objc_decodeTaggedPointer(ptr);
uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
if (basicTag == _OBJC_TAG_INDEX_MASK) {
return ((intptr_t)value << _OBJC_TAG_EXT_PAYLOAD_LSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_RSHIFT;
} else {
return ((intptr_t)value << _OBJC_TAG_PAYLOAD_LSHIFT) >> _OBJC_TAG_PAYLOAD_RSHIFT;
}
}
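/*
This function is almost the same as _objc_getTaggedPointerValue above. The one difference is that the shifts are performed on intptr_t, so the payload is sign-extended instead of zero-extended.
*/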
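
The difference between the two getters shows up as soon as the payload's top bit is set. The following plain C sketch contrasts them for the MSB layout (both shifts equal to 4, as in the analysis above); the function names are hypothetical. One caveat: converting to a signed type and right-shifting a negative value is implementation-defined in C, and the sketch assumes the arithmetic-shift behavior that clang and gcc provide.

```c
#include <stdint.h>

/* Zero-extended payload: the top 4 bits become 0. */
static uint64_t payload_unsigned_msb(uint64_t value)
{
    return (value << 4) >> 4;
}

/* Sign-extended payload: the left shift is done unsigned, then the
 * result is reinterpreted as signed so that the right shift copies
 * the payload's top bit (assumes an arithmetic right shift). */
static int64_t payload_signed_msb(uint64_t value)
{
    return (int64_t)(value << 4) >> 4;
}
```

A payload made of all ones reads back as 0x0FFFFFFFFFFFFFFF through the unsigned getter and as -1 through the signed one; for small positive payloads the two agree.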

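That wraps up the theory.
To summarize:
Tagged pointers are an optimization Apple applies on 64-bit systems to classes such as NSNumber, NSString, and NSDate, mainly to save memory. The idea is to use part of the pointer itself as metadata: one bit marks the pointer as a tagged pointer, and the next three bits store the index of the object's class in the objc_tag_classes array (together these are the 4 bits discussed above); the remaining bits carry the payload. A set of functions operates on this encoding, for example checking whether a pointer is a tagged pointer (_objc_isTaggedPointer), reading its data (_objc_getTaggedPointerValue), and reading its type (_objc_getTaggedPointerTag).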

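Practice (environment: macOS 10.14.5)

With the theory in hand, time to put it to work:

  • 1 Extract information from a known tagged pointer.
    (1) Check whether tagged pointers are enabled.
    According to the analysis above, this is what _objc_taggedPointersEnabled is for.
    But _objc_taggedPointersEnabled is declared static, so it cannot be pulled in with extern and called. The only option is to re-implement it.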
    #import <Foundation/Foundation.h>

    bool sl_objc_taggedPointersEnabled(void)
    {
    extern uintptr_t objc_debug_taggedpointer_mask;
    return (objc_debug_taggedpointer_mask != 0);
    }

    int main(int argc, const char * argv[]) {
    @autoreleasepool {

    bool b = sl_objc_taggedPointersEnabled();
    NSLog(@"Enable:%d",b);
    }
    return 0;
    }
    Output:
    2019-07-01 17:37:14.584230+0800 AAA[10865:711050] Enable:1
    So on macOS 10.14.5 tagged pointers are enabled.
    (2) Check whether a pointer is a tagged pointer.
    #   define _OBJC_TAG_MASK 1UL // should really be chosen per platform; this test only targets macOS 10.14.5 (64-bit Mac), so it is hard-coded
    bool sl_objc_isTaggedPointer(const void * _Nullable ptr)
    {
    return ((uintptr_t)ptr & _OBJC_TAG_MASK) == _OBJC_TAG_MASK;
    }
    int main(int argc, const char * argv[]) {
    @autoreleasepool {
    NSNumber *n = @123;
    NSString *s = @"123";
    NSString *s_m_c = [[s mutableCopy] copy];
    NSObject *o = [NSObject new];
    bool nb = sl_objc_isTaggedPointer((__bridge void *)n);
    bool sb = sl_objc_isTaggedPointer((__bridge void *)s);
    bool smcb = sl_objc_isTaggedPointer((__bridge void *)s_m_c);
    bool ob = sl_objc_isTaggedPointer((__bridge void *)o);
    NSLog(@"nb(0x%lx) is TaggedPointer %d",(uintptr_t)n,nb);
    NSLog(@"sb(0x%lx) is TaggedPointer %d",(uintptr_t)s,sb);
    NSLog(@"smcb(0x%lx) is TaggedPointer %d",(uintptr_t)s_m_c,smcb);
    NSLog(@"ob(0x%lx) is TaggedPointer %d",(uintptr_t)o,ob);
    }
    return 0;
    }
    Output:
    2019-07-01 17:58:36.116391+0800 AAA[10973:719191] nb(0x56b19f9c2784f641) is TaggedPointer 1
    2019-07-01 17:58:36.116530+0800 AAA[10973:719191] sb(0x100001058) is TaggedPointer 0
    2019-07-01 17:58:36.116538+0800 AAA[10973:719191] smcb(0x56b19f9c14b6bc53) is TaggedPointer 1
    2019-07-01 17:58:36.116544+0800 AAA[10973:719191] ob(0x10051e520) is TaggedPointer 0
    (3) Get the type of a tagged pointer (the tag enum value)
    #   define _OBJC_TAG_INDEX_SHIFT 1
    #define _OBJC_TAG_INDEX_MASK 0x7
    extern uintptr_t objc_debug_taggedpointer_obfuscator;
    uintptr_t sl_objc_getTaggedPointerTag(const void * _Nullable ptr)
    {
    uintptr_t value = objc_debug_taggedpointer_obfuscator ^ (uintptr_t)ptr;
    uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
    return basicTag;
    }
    int main(int argc, const char * argv[]) {
    @autoreleasepool {
    NSNumber *n = @123;
    NSString *s = @"123";
    NSString *s_m_c = [[s mutableCopy] copy];
    uintptr_t n_t = sl_objc_getTaggedPointerTag((__bridge void *)n);
    uintptr_t s_m_c_t = sl_objc_getTaggedPointerTag((__bridge void *)s_m_c);
    NSLog(@"n's type is %ld",n_t);
    NSLog(@"s_m_c's type is %ld",s_m_c_t);
    }
    return 0;
    }
    log:
    2019-07-04 09:41:34.017122+0800 AAA[11655:849484] n's type is 3
    2019-07-04 09:41:34.017318+0800 AAA[11655:849484] s_m_c's type is 2
    Compare the log with objc_tag_index_t:
    // 60-bit payloads
    OBJC_TAG_NSAtom = 0,
    OBJC_TAG_1 = 1,
    OBJC_TAG_NSString = 2,
    OBJC_TAG_NSNumber = 3,
    OBJC_TAG_NSIndexPath = 4,
    OBJC_TAG_NSManagedObjectID = 5,
    OBJC_TAG_NSDate = 6,
    (4) Get the class a tagged pointer belongs to:
    #   define _OBJC_TAG_INDEX_SHIFT 1
    #define _OBJC_TAG_INDEX_MASK 0x7
    extern uintptr_t objc_debug_taggedpointer_obfuscator;
    uintptr_t sl_objc_getTaggedPointerTag(const void * _Nullable ptr)
    {
    uintptr_t value = objc_debug_taggedpointer_obfuscator ^ (uintptr_t)ptr;
    uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
    return basicTag;
    }
    extern Class objc_debug_taggedpointer_classes[];
    Class *sl_classSlotForBasicTagIndex(uintptr_t tag)
    {
    uintptr_t tagObfuscator = ((objc_debug_taggedpointer_obfuscator
    >> _OBJC_TAG_INDEX_SHIFT)
    & _OBJC_TAG_INDEX_MASK);
    uintptr_t obfuscatedTag = tag ^ tagObfuscator;
    // Array index in objc_tag_classes includes the tagged bit itself
    return &objc_debug_taggedpointer_classes[(obfuscatedTag << 1) | 1];
    }
    int main(int argc, const char * argv[]) {
    @autoreleasepool {
    NSNumber *n = @123;
    NSString *s = @"123";
    NSString *s_m_c = [[s mutableCopy] copy];
    uintptr_t n_t = sl_objc_getTaggedPointerTag((__bridge void *)n);
    uintptr_t s_m_c_t = sl_objc_getTaggedPointerTag((__bridge void *)s_m_c);
    Class *n_t_c = sl_classSlotForBasicTagIndex(n_t);
    Class *s_m_c_t_c = sl_classSlotForBasicTagIndex(s_m_c_t);
    NSLog(@"n's Class is %@",*n_t_c);
    NSLog(@"s_m_c's Clss is %@",*s_m_c_t_c);
    }
    return 0;
    }
    log:
    2019-07-04 10:58:05.075581+0800 AAA[11913:876264] n's Class is __NSCFNumber
    2019-07-04 10:58:05.075754+0800 AAA[11913:876264] s_m_c's Clss is NSTaggedPointerString
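    After the examples above, maybe I should allow myself a smug but still modest smile: everything went smoothly and the theory checked out. But is that really the whole story? Here is an example that does not go as expected.
    (5) Get the value out of a tagged pointer: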
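#import <Foundation/Foundation.h>

#   define _OBJC_TAG_PAYLOAD_LSHIFT 0
# define _OBJC_TAG_PAYLOAD_RSHIFT 4
extern uintptr_t objc_debug_taggedpointer_obfuscator; // declared here so that this snippet compiles on its own
uintptr_t sl_objc_getTaggedPointerValue(const void * _Nullable ptr)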
{
// assert(_objc_isTaggedPointer(ptr));
uintptr_t value = objc_debug_taggedpointer_obfuscator ^ (uintptr_t)ptr;
// uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
return (value << _OBJC_TAG_PAYLOAD_LSHIFT) >> _OBJC_TAG_PAYLOAD_RSHIFT;
}
int main(int argc, const char * argv[]) {
@autoreleasepool {
NSNumber *n = @123;
uintptr_t n_v = sl_objc_getTaggedPointerValue((__bridge void *)n);
NSLog(@"hex(123) = 0x%x",123);
NSLog(@"n's value is 0x%lx",n_v);
}
return 0;
}

log:

2019-07-04 11:14:59.639165+0800 AAA[11959:881845] hex(123) = 0x7b
2019-07-04 11:14:59.639316+0800 AAA[11959:881845] n's value is 0x7b2

Something feels off, doesn't it?
Let's try different data:

NSNumber *n = [NSNumber numberWithLongLong:121111];
uintptr_t n_v = sl_objc_getTaggedPointerValue((__bridge void *)n);
NSLog(@"hex(121111) = 0x%x",121111);
NSLog(@"n's value is 0x%lx",n_v);

log:

2019-07-04 11:18:03.279558+0800 AAA[11967:882991] hex(121111) = 0x1d917
2019-07-04 11:18:03.279719+0800 AAA[11967:882991] n's value is 0x1d9173

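Every time, the computed value has one extra hex digit at the end...
After going through a lot of material, all I could find out is this:
the low 4 bits of the payload (that last hex digit) hold the NSNumber's type: for example, char is 0, short is 1, int is 2, and float is 4. That fits the logs above: @123 is an int and its payload ends in 2, while the value created with numberWithLongLong ends in 3.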
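
Under that reading, the payload printed in the logs can be split into the stored number and the type code. The plain C sketch below does exactly that; the struct and function names are hypothetical, and it only handles the simple case seen in the logs (non-negative integers), not floating-point or negative values.

```c
#include <stdint.h>

typedef struct {
    uint64_t number;     /* payload with the type digit removed */
    unsigned type_code;  /* low 4 bits of the payload */
} number_payload;

/* Hypothetical helper: split an NSNumber payload as printed in the logs. */
static number_payload split_number_payload(uint64_t payload)
{
    number_payload out;
    out.type_code = (unsigned)(payload & 0xF);
    out.number    = payload >> 4;
    return out;
}
```

Applied to the two logged payloads, 0x7b2 splits into 0x7b (123) with type code 2, and 0x1d9173 splits into 0x1d917 (121111) with type code 3.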