Since I didn't see anything here about perf, a relatively new tool for profiling the kernel and user applications on Linux, I decided to add this information.

First of all, this is a tutorial about Linux profiling with perf.

You can use perf if your Linux kernel is newer than 2.6.32, or oprofile if it is older. Neither tool requires you to instrument your program (as gprof does). However, to get correct call graphs in perf you need to build your program with -fno-omit-frame-pointer. For example: g++ -fno-omit-frame-pointer -O2 main.cpp.

You can see "live" analysis of your application with perf top:

sudo perf top -p `pidof a.out` -K

Or you can record performance data of a running application and analyze it afterwards:

1) To record performance data:

perf record -p `pidof a.out`

or to record for 10 seconds:

perf record -p `pidof a.out` sleep 10

or to record with call graphs (-g):

perf record -g -p `pidof a.out`

2) To analyze the recorded data:

perf report --stdio

perf report --stdio --sort=dso -g none

perf report --stdio -g none

perf report --stdio -g
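The record and report steps above can be combined into one helper. This is only a minimal sketch: the function name profile_running and its defaults are made up for illustration, and it assumes that pidof supports -s (return a single PID) and that you are allowed to attach perf to the target process.

```shell
#!/bin/sh
# Sketch: sample an already-running process for a fixed time, then print
# a flat report. profile_running is a hypothetical helper, not part of perf.
profile_running() {
    name=$1                  # process name, e.g. a.out
    secs=${2:-10}            # sampling duration in seconds (assumed default)
    out=${3:-./perf.data}    # file that perf record writes its samples to

    # -s asks pidof for a single PID, so perf -p gets exactly one argument.
    pid=$(pidof -s "$name") || { echo "no process named $name" >&2; return 1; }

    # -g records call chains; "sleep N" bounds the recording time.
    perf record -g -p "$pid" -o "$out" sleep "$secs" || return 1

    # -g none prints a flat profile without expanding the call graph.
    perf report --stdio -g none -i "$out"
}
```

For example, profile_running a.out 10 ./a.perf.data would sample a.out for ten seconds and then print the flat report from ./a.perf.data.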

Or you can record performance data for an application from start to finish by launching it under perf and waiting for it to exit, then analyze the data afterwards:

perf record ./a.out

This is an example of profiling a test program

The test program is in file main.cpp (I will put main.cpp at the bottom of the message):

I compile it in this way:

g++ -m64 -fno-omit-frame-pointer -g main.cpp -L.  -ltcmalloc_minimal -o my_test

I use libtcmalloc_minimal.so since it is compiled with -fno-omit-frame-pointer, while libc malloc seems to be compiled without this option. Then I run my test program:

./my_test 100000000

Then I record performance data of a running process:

perf record -g  -p `pidof my_test` -o ./my_test.perf.data sleep 30

Then I analyze load per module:

perf report --stdio -g none --sort comm,dso -i ./my_test.perf.data

# Overhead  Command                 Shared Object
# ........  .......  ............................
#
    70.06%  my_test  my_test

and so on ...
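On a real program the flat report can be long. As a rough sketch, the rows can be filtered by overhead with awk. The helper name hot_entries and the default threshold are assumptions for illustration. The filter relies only on the layout shown above: header lines start with '#' and data rows start with a percentage.

```shell
#!/bin/sh
# Sketch: keep only report rows whose overhead is at least a threshold.
# hot_entries is a hypothetical helper; it reads "perf report --stdio"
# output on stdin and expects flat (-g none) rows.
hot_entries() {
    min=${1:-1.0}            # minimum overhead in percent (assumed default)
    awk -v min="$min" '
        /^[ \t]*#/ { next }              # skip header/comment lines
        $1 ~ /^[0-9.]+%$/ {              # first field looks like "70.06%"
            pct = $1
            sub(/%$/, "", pct)           # drop the trailing percent sign
            if (pct + 0 >= min + 0) print
        }'
}
```

It could be used as: perf report --stdio -g none --sort comm,dso -i ./my_test.perf.data | hot_entries 5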

Then I analyze the call chains:

perf report --stdio -g graph -i ./my_test.perf.data | c++filt

0.16%  my_test  [kernel.kallsyms]             [k] _spin_lock

and so on ...

So at this point you know where your program spends time.

And this is main.cpp for the test:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// Increments the counter ten times; allocates on every fifth iteration.
time_t f1(time_t time_value)
{
    for (int j = 0; j < 10; ++j) {
        ++time_value;
        if (j % 5 == 0) {
            double *p = new double;
            delete p;
        }
    }
    return time_value;
}

// Pure counting loop followed by a call to f1.
time_t f2(time_t time_value)
{
    for (int j = 0; j < 40; ++j) {
        ++time_value;
    }
    time_value = f1(time_value);
    return time_value;
}

// Mixes allocations, counting loops and calls to f1 and f2.
time_t process_request(time_t time_value)
{
    for (int j = 0; j < 10; ++j) {
        int *p = new int;
        delete p;
        for (int m = 0; m < 10; ++m) {
            ++time_value;
        }
    }
    for (int i = 0; i < 10; ++i) {
        time_value = f1(time_value);
        time_value = f2(time_value);
    }
    return time_value;
}

int main(int argc, char* argv2[])
{
    int number_loops = argc > 1 ? atoi(argv2[1]) : 1;
    time_t time_value = time(0);
    printf("number loops %d\n", number_loops);
    // time_t is not guaranteed to be int, so cast explicitly for printf.
    printf("time_value: %ld\n", static_cast<long>(time_value));

    for (int i = 0; i < number_loops; ++i) {
        time_value = process_request(time_value);
    }
    printf("time_value: %ld\n", static_cast<long>(time_value));
    return 0;
}

Original source:

http://stackoverflow.com/questions/1777556/alternatives-to-gprof#comment3480484_1779343
