BlockCanary Source Code Study Notes

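What is BlockCanary?

BlockCanary is a performance-monitoring component written by the Chinese developer MarkZhai. It monitors main-thread operations transparently and outputs information that helps developers analyze and locate the cause of a stall, so the app can be optimized quickly.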

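The figure below is the official diagram explaining how it works:

[Figure: official BlockCanary principle diagram (image.png)]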

Overview

GitHub repository: blockcanary

Features

  • Non-invasive
  • Simple to use
  • Real-time monitoring
  • Detailed stack and memory information

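Android rendering mechanism

On a 60 Hz display, the Android system issues a VSYNC signal roughly every 16 ms (1000 ms / 60 ≈ 16.67 ms) to trigger UI rendering. If every frame is rendered in time, the app reaches the 60 fps needed for smooth visuals, which means most of the work for a frame has to finish within about 16 ms. If it takes longer, frames may be dropped.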
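As a quick check of the arithmetic above, here is a minimal sketch that computes the frame budget for a refresh rate and estimates how many VSYNC deadlines a main-thread task misses. The class `FrameBudget` and its methods are hypothetical names for illustration, and the model is deliberately simplified: it assumes the task starts exactly on a VSYNC boundary.

```java
/** Hypothetical helper illustrating the 16 ms frame budget. */
final class FrameBudget {

    /** Frame budget in milliseconds for a given refresh rate. */
    static double budgetMs(int refreshRateHz) {
        return 1000.0 / refreshRateHz;
    }

    /**
     * Estimated number of missed VSYNC deadlines for a task that occupies
     * the main thread for workMs, assuming it starts on a VSYNC boundary.
     */
    static int droppedFrames(double workMs, int refreshRateHz) {
        double budget = budgetMs(refreshRateHz);
        // Work that fits in one budget misses nothing; each additional
        // budget that is started misses one more deadline.
        return (int) Math.max(0, Math.ceil(workMs / budget) - 1);
    }
}
```

Under this model a 10 ms task at 60 Hz drops no frames, while a 40 ms task spans three frame intervals and therefore misses two deadlines.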

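This article focuses on how BlockCanary works. For the details of the rendering mechanism and how to optimize it, the following article is recommended:

Android Performance Optimization: Rendering Optimization (Android性能优化-渲染优化)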

How is BlockCanary used?

1. Add the library in Gradle

    debugImplementation 'com.github.markzhai:blockcanary-android:1.5.0'
    releaseImplementation 'com.github.markzhai:blockcanary-no-op:1.5.0'

2. Create a custom Application and initialize BlockCanary in onCreate

    public class ExampleApplication extends Application {

        @Override
        public void onCreate() {
            super.onCreate();
            BlockCanary.install(this, new BlockCanaryContext()).start();
        }
    }

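What is BlockCanary's core execution flow?

The core idea of BlockCanary is to implement a custom Printer and install it on the main Looper, that is, the Looper of the main thread that ActivityThread runs. The main Looper calls the Printer once before and once after it dispatches each message. BlockCanary measures the time between those two calls and checks whether it exceeds the configured threshold. If it does, the recorded stack and CPU information is surfaced to the user through a notification.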
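The timing trick described above can be reproduced on a plain JVM. The sketch below is not BlockCanary's code: `Printer` is a stand-in for `android.util.Printer`, and `TimingPrinter` and `MiniLooper` are hypothetical names. It only shows that two log calls around a dispatch are enough to measure how long the dispatch took.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for android.util.Printer so the sketch runs without Android. */
interface Printer {
    void println(String x);
}

/** Measures the time between the "before" and "after" log lines. */
final class TimingPrinter implements Printer {
    private final long thresholdMs;
    private boolean started;   // true between the two println calls
    private long startNanos;
    /** Durations (ms) of dispatches that exceeded the threshold. */
    final List<Long> blocksMs = new ArrayList<>();

    TimingPrinter(long thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    @Override
    public void println(String x) {
        if (!started) {
            // First call: the message is about to be dispatched.
            started = true;
            startNanos = System.nanoTime();
        } else {
            // Second call: the message has been handled.
            started = false;
            long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000;
            if (elapsedMs > thresholdMs) {
                blocksMs.add(elapsedMs);
            }
        }
    }
}

/** Hypothetical miniature of the dispatch step in Looper.loop(). */
final class MiniLooper {
    static void dispatch(Runnable message, Printer logging) {
        if (logging != null) logging.println(">>>>> Dispatching");
        message.run();
        if (logging != null) logging.println("<<<<< Finished");
    }
}
```

Dispatching an empty task records nothing, while a task that sleeps longer than the threshold adds one entry to `blocksMs`. The sketch relies on the two calls strictly alternating, the same assumption the boolean flag in LooperMonitor makes.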

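Key classes and their responsibilities

  • BlockCanary: Facade class. Provides initialization and starts or stops monitoring.
  • BlockCanaryContext: Configuration context. Lets you configure the id, current network information, block threshold, log path, and so on.
  • BlockCanaryInternals: The core dispatcher of BlockCanary. It holds monitor (the Printer installed on the main Looper), stackSampler (stack sampler), cpuSampler (CPU sampler), and mInterceptorChain (the registered interceptors), and it implements the onBlockEvent callback that dispatches to the interceptors.
  • LooperMonitor: Implements the Printer interface and is installed on the main Looper. Its println override measures the time between the calls made before and after a dispatch, and it starts and stops sampling in stackSampler and cpuSampler.
  • StackSampler: Collects thread stack information via mCurrentThread.getStackTrace(), which returns the StackTraceElement array of the sampled thread (the main thread), and stores each sample in a LinkedHashMap keyed by timestamp.
  • CpuSampler: Collects CPU information by reading /proc/stat and stores each sample in a LinkedHashMap keyed by timestamp.
  • DisplayService: Implements the BlockInterceptor interface. Its onBlock callback posts a notification.
  • DisplayActivity: The Activity that displays the recorded block information.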

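Code execution flow

BlockCanary's core flow consists of three steps.

1. init: initialization

2. monitor: measure the dispatch time on the main Looper and post a notification

3. dump: collect thread stack information and CPU information

Here is the overall flow chart first. It is best read alongside the source code.

[Figure: overall flow chart]

The following sections walk through the source code for each of the three steps.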

1. init

Following the usage in Application, we start with the install method:

    public static BlockCanary install(Context context, BlockCanaryContext blockCanaryContext) {
        // BlockCanaryContext.init stores the application context and the user-supplied configuration
        BlockCanaryContext.init(context, blockCanaryContext);
        // setEnabled turns DisplayActivity on or off according to the user's notification setting
        setEnabled(context, DisplayActivity.class, BlockCanaryContext.get().displayNotification());
        return get();
    }

Next, the get method is implemented as follows:

    // Creates the BlockCanary object as a lazily initialized singleton
    public static BlockCanary get() {
        if (sInstance == null) {
            synchronized (BlockCanary.class) {
                if (sInstance == null) {
                    sInstance = new BlockCanary();
                }
            }
        }
        return sInstance;
    }
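The excerpt above does not show how sInstance is declared. As a general Java note rather than a statement about BlockCanary's source: double-checked locking is only guaranteed to be safe under the Java memory model when the field is volatile. A minimal sketch, with the hypothetical class name `LazySingleton`:

```java
/** Hypothetical class showing double-checked locking with a volatile field. */
final class LazySingleton {
    // volatile prevents other threads from observing a partially
    // constructed instance through a reordered write.
    private static volatile LazySingleton sInstance;

    private LazySingleton() { }

    static LazySingleton get() {
        LazySingleton local = sInstance;       // single volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = sInstance;             // re-check under the lock
                if (local == null) {
                    local = new LazySingleton();
                    sInstance = local;
                }
            }
        }
        return local;
    }
}
```

Copying the field into a local variable is an optional refinement; it avoids a second volatile read when the instance already exists.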

Next, the constructor of BlockCanary is implemented as follows:

    private BlockCanary() {
        // Initialize the BlockCanaryInternals dispatcher
        BlockCanaryInternals.setContext(BlockCanaryContext.get());
        mBlockCanaryCore = BlockCanaryInternals.getInstance();
        // Add an interceptor to BlockCanaryInternals (chain of responsibility).
        // BlockCanaryContext's implementation of BlockInterceptor is empty.
        mBlockCanaryCore.addBlockInterceptor(BlockCanaryContext.get());
        if (!BlockCanaryContext.get().displayNotification()) {
            return;
        }
        // DisplayService is only added when notifications are enabled;
        // when a block occurs it posts a notification
        mBlockCanaryCore.addBlockInterceptor(new DisplayService());
    }

Next, the constructor of BlockCanaryInternals is implemented as follows:

    public BlockCanaryInternals() {
        // Initialize the stack sampler
        stackSampler = new StackSampler(
                Looper.getMainLooper().getThread(),
                sContext.provideDumpInterval());
        // Initialize the CPU sampler
        cpuSampler = new CpuSampler(sContext.provideDumpInterval());

        // Initialize LooperMonitor and implement the onBlockEvent callback,
        // which is invoked once the threshold has been exceeded
        setMonitor(new LooperMonitor(new LooperMonitor.BlockListener() {

            @Override
            public void onBlockEvent(long realTimeStart, long realTimeEnd,
                                     long threadTimeStart, long threadTimeEnd) {
                ArrayList<String> threadStackEntries = stackSampler
                        .getThreadStackEntries(realTimeStart, realTimeEnd);
                if (!threadStackEntries.isEmpty()) {
                    BlockInfo blockInfo = BlockInfo.newInstance()
                            .setMainThreadTimeCost(realTimeStart, realTimeEnd, threadTimeStart, threadTimeEnd)
                            .setCpuBusyFlag(cpuSampler.isCpuBusy(realTimeStart, realTimeEnd))
                            .setRecentCpuRate(cpuSampler.getCpuRateInfo())
                            .setThreadStackEntries(threadStackEntries)
                            .flushString();
                    LogWriter.save(blockInfo.toString());

                    if (mInterceptorChain.size() != 0) {
                        for (BlockInterceptor interceptor : mInterceptorChain) {
                            interceptor.onBlock(getContext().provideContext(), blockInfo);
                        }
                    }
                }
            }
        }, getContext().provideBlockThreshold(), getContext().stopWhenDebugging()));

        LogWriter.cleanObsolete();
    }

2. monitor

First, let us look at how the system's Looper.loop() method uses the Printer:

    for (;;) {
        Message msg = queue.next(); // might block
        if (msg == null) {
            // No message indicates that the message queue is quitting.
            return;
        }

        // Before dispatchMessage runs, the Printer's println method is called
        final Printer logging = me.mLogging;
        if (logging != null) {
            logging.println(">>>>> Dispatching to " + msg.target + " " +
                    msg.callback + ": " + msg.what);
        }

        final long traceTag = me.mTraceTag;
        long slowDispatchThresholdMs = me.mSlowDispatchThresholdMs;
        long slowDeliveryThresholdMs = me.mSlowDeliveryThresholdMs;
        if (thresholdOverride > 0) {
            slowDispatchThresholdMs = thresholdOverride;
            slowDeliveryThresholdMs = thresholdOverride;
        }
        final boolean logSlowDelivery = (slowDeliveryThresholdMs > 0) && (msg.when > 0);
        final boolean logSlowDispatch = (slowDispatchThresholdMs > 0);

        final boolean needStartTime = logSlowDelivery || logSlowDispatch;
        final boolean needEndTime = logSlowDispatch;

        if (traceTag != 0 && Trace.isTagEnabled(traceTag)) {
            Trace.traceBegin(traceTag, msg.target.getTraceName(msg));
        }

        final long dispatchStart = needStartTime ? SystemClock.uptimeMillis() : 0;
        final long dispatchEnd;
        try {
            msg.target.dispatchMessage(msg);
            dispatchEnd = needEndTime ? SystemClock.uptimeMillis() : 0;
        } finally {
            if (traceTag != 0) {
                Trace.traceEnd(traceTag);
            }
        }
        if (logSlowDelivery) {
            if (slowDeliveryDetected) {
                if ((dispatchStart - msg.when) <= 10) {
                    Slog.w(TAG, "Drained");
                    slowDeliveryDetected = false;
                }
            } else {
                if (showSlowLog(slowDeliveryThresholdMs, msg.when, dispatchStart, "delivery",
                        msg)) {
                    // Once we write a slow delivery log, suppress until the queue drains.
                    slowDeliveryDetected = true;
                }
            }
        }
        if (logSlowDispatch) {
            showSlowLog(slowDispatchThresholdMs, dispatchStart, dispatchEnd, "dispatch", msg);
        }
        // After dispatchMessage returns, the Printer's println method is called again
        if (logging != null) {
            logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
        }

        // Make sure that during the course of dispatching the
        // identity of the thread wasn't corrupted.
        final long newIdent = Binder.clearCallingIdentity();
        if (ident != newIdent) {
            Log.wtf(TAG, "Thread identity changed from 0x"
                    + Long.toHexString(ident) + " to 0x"
                    + Long.toHexString(newIdent) + " while dispatching to "
                    + msg.target.getClass().getName() + " "
                    + msg.callback + " what=" + msg.what);
        }

        msg.recycleUnchecked();
    }

Once install has finished initializing, start() is called next. It is implemented as follows:

    public void start() {
        if (!mMonitorStarted) {
            mMonitorStarted = true;
            // Install mBlockCanaryCore's monitor on the main Looper to start listening
            Looper.getMainLooper().setMessageLogging(mBlockCanaryCore.monitor);
        }
    }

The main Looper calls the Printer's println method before and after each dispatch, so we now look at how LooperMonitor implements println:

    @Override
    public void println(String x) {
        // Skip monitoring while a debugger is attached, if so configured
        if (mStopWhenDebugging && Debug.isDebuggerConnected()) {
            return;
        }
        if (!mPrintingStarted) { // println called before dispatchMessage
            // Record the start time
            mStartTimestamp = System.currentTimeMillis();
            mStartThreadTimestamp = SystemClock.currentThreadTimeMillis();
            mPrintingStarted = true;
            // Start sampling stack and CPU information
            startDump();
        } else { // println called after dispatchMessage
            // Record the end time
            final long endTime = System.currentTimeMillis();
            mPrintingStarted = false;
            // Check whether the elapsed time exceeds the threshold
            if (isBlock(endTime)) {
                notifyBlockEvent(endTime);
            }
            stopDump();
        }
    }

    // Checks whether the threshold has been exceeded
    private boolean isBlock(long endTime) {
        return endTime - mStartTimestamp > mBlockThresholdMillis;
    }

    // Invokes the listener on the log-writing thread
    private void notifyBlockEvent(final long endTime) {
        final long startTime = mStartTimestamp;
        final long startThreadTime = mStartThreadTimestamp;
        final long endThreadTime = SystemClock.currentThreadTimeMillis();
        HandlerThreadFactory.getWriteLogThreadHandler().post(new Runnable() {
            @Override
            public void run() {
                mBlockListener.onBlockEvent(startTime, endTime, startThreadTime, endThreadTime);
            }
        });
    }

When the elapsed time exceeds the threshold, onBlockEvent is called. It is implemented in the constructor of BlockCanaryInternals, as follows:

    setMonitor(new LooperMonitor(new LooperMonitor.BlockListener() {

        @Override
        public void onBlockEvent(long realTimeStart, long realTimeEnd,
                                 long threadTimeStart, long threadTimeEnd) {
            // Fetch the stack samples recorded between the start and end times from the map
            ArrayList<String> threadStackEntries = stackSampler
                    .getThreadStackEntries(realTimeStart, realTimeEnd);
            if (!threadStackEntries.isEmpty()) {
                // Build the BlockInfo object and fill in the related information
                BlockInfo blockInfo = BlockInfo.newInstance()
                        .setMainThreadTimeCost(realTimeStart, realTimeEnd, threadTimeStart, threadTimeEnd)
                        .setCpuBusyFlag(cpuSampler.isCpuBusy(realTimeStart, realTimeEnd))
                        .setRecentCpuRate(cpuSampler.getCpuRateInfo())
                        .setThreadStackEntries(threadStackEntries)
                        .flushString();
                // Write the record to the log
                LogWriter.save(blockInfo.toString());
                // Iterate over the interceptors and notify each one
                if (mInterceptorChain.size() != 0) {
                    for (BlockInterceptor interceptor : mInterceptorChain) {
                        interceptor.onBlock(getContext().provideContext(), blockInfo);
                    }
                }
            }
        }
    }, getContext().provideBlockThreshold(), getContext().stopWhenDebugging()));
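The callback has two notable behaviors: it reports nothing when no stack sample falls inside the blocked interval, and it hands the same report to every registered interceptor in order. The plain-Java sketch below isolates that control flow. `ReportInterceptor` and `BlockDispatcher` are hypothetical, simplified stand-ins for BlockInterceptor and BlockCanaryInternals, and the report is reduced to a string.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for BlockInterceptor: receives a plain-text report. */
interface ReportInterceptor {
    void onBlock(String report);
}

/** Hypothetical miniature of the dispatch logic inside onBlockEvent. */
final class BlockDispatcher {
    private final List<ReportInterceptor> chain = new ArrayList<>();

    void addInterceptor(ReportInterceptor interceptor) {
        chain.add(interceptor);
    }

    /** Returns true if a report was built and passed to the interceptors. */
    boolean onBlockEvent(long startMs, long endMs, List<String> stackEntries) {
        if (stackEntries.isEmpty()) {
            // No sample fell inside the window, so there is nothing to report.
            return false;
        }
        String report = "blocked " + (endMs - startMs) + " ms, "
                + stackEntries.size() + " stack sample(s)";
        for (ReportInterceptor interceptor : chain) {
            interceptor.onBlock(report);   // every interceptor sees the same report
        }
        return true;
    }
}
```

Because sampling only begins after a delay (see getSampleDelay() in the dump step), a stall that ends before the first sample is taken would reach this callback with an empty list and be dropped, which is presumably why the emptiness check exists.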

Finally, we look at DisplayService, the interceptor implementation that posts the notification:

    @Override
    public void onBlock(Context context, BlockInfo blockInfo) {
        Intent intent = new Intent(context, DisplayActivity.class);
        intent.putExtra("show_latest", blockInfo.timeStart);
        intent.setFlags(Intent.FLAG_ACTIVITY_NEW_TASK | Intent.FLAG_ACTIVITY_CLEAR_TOP);
        PendingIntent pendingIntent = PendingIntent.getActivity(context, 1, intent, FLAG_UPDATE_CURRENT);
        String contentTitle = context.getString(R.string.block_canary_class_has_blocked, blockInfo.timeStart);
        String contentText = context.getString(R.string.block_canary_notification_message);
        show(context, contentTitle, contentText, pendingIntent);
    }

3. dump

From the flow above we know that the println call before dispatchMessage triggers startDump, and the println call after dispatchMessage triggers stopDump.

    private void startDump() {
        if (null != BlockCanaryInternals.getInstance().stackSampler) {
            BlockCanaryInternals.getInstance().stackSampler.start();
        }

        if (null != BlockCanaryInternals.getInstance().cpuSampler) {
            BlockCanaryInternals.getInstance().cpuSampler.start();
        }
    }

    private void stopDump() {
        if (null != BlockCanaryInternals.getInstance().stackSampler) {
            BlockCanaryInternals.getInstance().stackSampler.stop();
        }

        if (null != BlockCanaryInternals.getInstance().cpuSampler) {
            BlockCanaryInternals.getInstance().cpuSampler.stop();
        }
    }

StackSampler and CpuSampler are covered separately below.

1. StackSampler

start() runs as follows:

    public void start() {
        if (mShouldSample.get()) {
            return;
        }
        mShouldSample.set(true);

        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
        // Schedule mRunnable with a delay on a HandlerThread
        HandlerThreadFactory.getTimerThreadHandler().postDelayed(mRunnable,
                BlockCanaryInternals.getInstance().getSampleDelay());
    }

    // mRunnable is defined in the base class AbstractSampler
    private Runnable mRunnable = new Runnable() {
        @Override
        public void run() {
            // Abstract method implemented by each sampler
            doSample();
            // Keep sampling while sampling is enabled
            if (mShouldSample.get()) {
                HandlerThreadFactory.getTimerThreadHandler()
                        .postDelayed(mRunnable, mSampleInterval);
            }
        }
    };

    // StackSampler's implementation of doSample()
    @Override
    protected void doSample() {
        StringBuilder stringBuilder = new StringBuilder();
        // Get the StackTraceElements via mCurrentThread.getStackTrace() and append them to the StringBuilder
        for (StackTraceElement stackTraceElement : mCurrentThread.getStackTrace()) {
            stringBuilder
                    .append(stackTraceElement.toString())
                    .append(BlockInfo.SEPARATOR);
        }

        synchronized (sStackMap) {
            // LRU-style eviction: cap the LinkedHashMap at mMaxEntryCount by removing the oldest entry
            if (sStackMap.size() == mMaxEntryCount && mMaxEntryCount > 0) {
                sStackMap.remove(sStackMap.keySet().iterator().next());
            }
            // Add the sample to the map
            sStackMap.put(System.currentTimeMillis(), stringBuilder.toString());
        }
    }
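To make the storage scheme concrete, here is a self-contained sketch of a bounded, timestamp-keyed sample store, including a time-window query like the one onBlockEvent relies on. `StackRing` and its methods are hypothetical names, and the strict bounds in `between` are an assumption, since the excerpts do not show getThreadStackEntries. With a default, insertion-ordered LinkedHashMap the first key is the oldest sample, so the eviction is effectively first-in, first-out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical bounded store of stack samples keyed by timestamp. */
final class StackRing {
    private final int maxEntries;
    // Default LinkedHashMap iterates in insertion order: first key = oldest sample.
    private final LinkedHashMap<Long, String> samples = new LinkedHashMap<>();

    StackRing(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    /** Captures the target thread's current stack under the given timestamp. */
    synchronized void sample(Thread target, long timestampMs) {
        StringBuilder sb = new StringBuilder();
        for (StackTraceElement element : target.getStackTrace()) {
            sb.append(element).append('\n');
        }
        put(timestampMs, sb.toString());
    }

    /** Stores one sample, evicting the oldest entry when the store is full. */
    synchronized void put(long timestampMs, String stack) {
        if (maxEntries > 0 && samples.size() == maxEntries) {
            samples.remove(samples.keySet().iterator().next());
        }
        samples.put(timestampMs, stack);
    }

    /** Returns the samples whose timestamp lies strictly inside (startMs, endMs). */
    synchronized List<String> between(long startMs, long endMs) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<Long, String> entry : samples.entrySet()) {
            if (entry.getKey() > startMs && entry.getKey() < endMs) {
                result.add(entry.getValue());
            }
        }
        return result;
    }

    synchronized int size() {
        return samples.size();
    }
}
```

In a real sampler `sample` would be driven by a timer on a background thread, as mRunnable does with postDelayed; the sketch takes the timestamp as a parameter so its behavior is deterministic.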

stop() runs as follows:

    public void stop() {
        if (!mShouldSample.get()) {
            return;
        }
        // Clear the control flag
        mShouldSample.set(false);
        // Cancel the pending handler callback
        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
    }

2. CpuSampler

The rest of the flow is identical to StackSampler, so here we focus on the implementation of doSample:

    // Obtains CPU information mainly by reading /proc/stat and /proc/<pid>/stat
    protected void doSample() {
        BufferedReader cpuReader = null;
        BufferedReader pidReader = null;

        try {
            cpuReader = new BufferedReader(new InputStreamReader(
                    new FileInputStream("/proc/stat")), BUFFER_SIZE);
            String cpuRate = cpuReader.readLine();
            if (cpuRate == null) {
                cpuRate = "";
            }

            if (mPid == 0) {
                mPid = android.os.Process.myPid();
            }
            pidReader = new BufferedReader(new InputStreamReader(
                    new FileInputStream("/proc/" + mPid + "/stat")), BUFFER_SIZE);
            String pidCpuRate = pidReader.readLine();
            if (pidCpuRate == null) {
                pidCpuRate = "";
            }

            parse(cpuRate, pidCpuRate);
        } catch (Throwable throwable) {
            Log.e(TAG, "doSample: ", throwable);
        } finally {
            try {
                if (cpuReader != null) {
                    cpuReader.close();
                }
                if (pidReader != null) {
                    pidReader.close();
                }
            } catch (IOException exception) {
                Log.e(TAG, "doSample: ", exception);
            }
        }
    }
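The parse method is not shown above, so the following is not BlockCanary's implementation. It is a minimal sketch of the usual way to turn two readings of the aggregate "cpu" line of /proc/stat into a usage percentage: the counters are cumulative, so usage over an interval is the non-idle share of the change in total time. `CpuStat` and its methods are hypothetical names, and the sketch sums only the first seven counters (user, nice, system, idle, iowait, irq, softirq).

```java
/** Hypothetical snapshot of the aggregate "cpu" line of /proc/stat. */
final class CpuStat {
    final long idle;
    final long total;

    CpuStat(long idle, long total) {
        this.idle = idle;
        this.total = total;
    }

    /** Parses a line such as "cpu  100 0 100 800 0 0 0 0 0 0". */
    static CpuStat parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 8 || !fields[0].equals("cpu")) {
            throw new IllegalArgumentException("not an aggregate cpu line: " + line);
        }
        long total = 0;
        // Fields 1..7: user, nice, system, idle, iowait, irq, softirq
        for (int i = 1; i <= 7; i++) {
            total += Long.parseLong(fields[i]);
        }
        long idle = Long.parseLong(fields[4]);
        return new CpuStat(idle, total);
    }

    /** CPU usage in percent between two snapshots; 0 if no time has passed. */
    static double usagePercent(CpuStat previous, CpuStat current) {
        long totalDelta = current.total - previous.total;
        if (totalDelta <= 0) {
            return 0.0;
        }
        long idleDelta = current.idle - previous.idle;
        return 100.0 * (totalDelta - idleDelta) / totalDelta;
    }
}
```

For example, if total time grows by 200 ticks between two snapshots while idle time grows by 100, the CPU was busy for half of the interval.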

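How does BlockCanary decide that a block has occurred?

BlockCanary implements a custom Printer and installs it on the main Looper of the main thread. The main Looper calls the Printer before and after it dispatches each message. BlockCanary measures the time between the two calls and checks whether it exceeds the configured threshold. If it does, the dispatch is judged to be a block.

How does BlockCanary obtain the thread's stack information?

It calls mCurrentThread.getStackTrace(), iterates over the returned StackTraceElements, concatenates them into a string with a StringBuilder, and stores that string as the value in a LinkedHashMap keyed by timestamp.

How does BlockCanary obtain CPU information?

It reads /proc/stat, which describes overall CPU activity, together with /proc/<pid>/stat for the current process, and uses them to compute CPU usage. The parsed information is assembled into a string with a StringBuilder and stored as the value in a LinkedHashMap keyed by timestamp.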

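Summary

Reflections

BlockCanary makes full use of the Looper mechanism. In the main Looper's loop method, the Printer's println is called before and after dispatchMessage, and Looper exposes setMessageLogging to install that Printer. By comparing the time between the two println calls with a threshold, BlockCanary determines whether a block has occurred.

References

Android Performance Optimization: Rendering Optimization (Android性能优化-渲染优化)

Analysis of the Android UI Block Detection Framework BlockCanary (Android UI卡顿监测框架BlockCanary原理分析)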