diff options
author | mfilitov <mfilitov@yandex-team.com> | 2024-08-23 13:23:47 +0300 |
---|---|---|
committer | mfilitov <mfilitov@yandex-team.com> | 2024-08-23 13:38:30 +0300 |
commit | 5741cf990d59e786aee8074fc01db7655490bcfd (patch) | |
tree | 7161f63ec7c981a5547c63001dcb6c3fe66d9dcf /contrib/tools | |
parent | 10ac4b53185c5638e7dae756765beb5f60019451 (diff) | |
download | ydb-5741cf990d59e786aee8074fc01db7655490bcfd.tar.gz |
Try adding flamegraph with simple copy
6d3bfb2f6ddb8ad122ad37485ce2d0ac8346e0ee
Diffstat (limited to 'contrib/tools')
36 files changed, 4784 insertions, 0 deletions
diff --git a/contrib/tools/flame-graph/README.md b/contrib/tools/flame-graph/README.md new file mode 100644 index 0000000000..1d8e5cd9ef --- /dev/null +++ b/contrib/tools/flame-graph/README.md @@ -0,0 +1,219 @@ +# Flame Graphs visualize profiled code + +Main Website: http://www.brendangregg.com/flamegraphs.html + +Example (click to zoom): +[![Example](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg)](https://jing.yandex-team.ru/files/yazevnul/contrib-tools-flame-graph-cpu-bash-flamegraph.svg) + +Other sites: +- The Flame Graph article in ACMQ and CACM: http://queue.acm.org/detail.cfm?id=2927301 http://cacm.acm.org/magazines/2016/6/202665-the-flame-graph/abstract +- CPU profiling using Linux perf\_events, DTrace, SystemTap, or ktap: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html +- CPU profiling using XCode Instruments: http://schani.wordpress.com/2012/11/16/flame-graphs-for-instruments/ +- CPU profiling using Xperf.exe: http://randomascii.wordpress.com/2013/03/26/summarizing-xperf-cpu-usage-with-flame-graphs/ +- Memory profiling: http://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html +- Other examples, updates, and news: http://www.brendangregg.com/flamegraphs.html#Updates + +Flame graphs can be created in three steps: + +1. Capture stacks +2. Fold stacks +3. flamegraph.pl + +1\. Capture stacks +================= +Stack samples can be captured using Linux perf\_events, FreeBSD pmcstat (hwpmc), DTrace, SystemTap, and many other profilers. See the stackcollapse-\* converters. + +### Linux perf\_events + +Using Linux perf\_events (aka "perf") to capture 60 seconds of 99 Hertz stack samples, both user- and kernel-level stacks, all processes: + +``` +# perf record -F 99 -a -g -- sleep 60 +# perf script > out.perf +``` + +Now only capturing PID 181: + +``` +# perf record -F 99 -p 181 -g -- sleep 60 +# perf script > out.perf +``` + +### DTrace + +Using DTrace to capture 60 seconds of kernel stacks at 997 Hertz: + +``` +# dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-60s { exit(0); }' -o out.kern_stacks +``` + +Using DTrace to capture 60 seconds of user-level stacks for PID 12345 at 97 Hertz: + +``` +# dtrace -x ustackframes=100 -n 'profile-97 /pid == 12345 && arg1/ { @[ustack()] = count(); } tick-60s { exit(0); }' -o out.user_stacks +``` + +60 seconds of user-level stacks, including time spent in-kernel, for PID 12345 at 97 Hertz: + +``` +# dtrace -x ustackframes=100 -n 'profile-97 /pid == 12345/ { @[ustack()] = count(); } tick-60s { exit(0); }' -o out.user_stacks +``` + +Switch `ustack()` for `jstack()` if the application has a ustack helper to include translated frames (eg, node.js frames; see: http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/). The rate for user-level stack collection is deliberately slower than kernel, which is especially important when using `jstack()` as it performs additional work to translate frames. + +2\. Fold stacks +============== +Use the stackcollapse programs to fold stack samples into single lines. The programs provided are: + +- `stackcollapse.pl`: for DTrace stacks +- `stackcollapse-perf.pl`: for Linux perf_events "perf script" output +- `stackcollapse-pmc.pl`: for FreeBSD pmcstat -G stacks +- `stackcollapse-stap.pl`: for SystemTap stacks +- `stackcollapse-instruments.pl`: for XCode Instruments +- `stackcollapse-vtune.pl`: for Intel VTune profiles +- `stackcollapse-ljp.awk`: for Lightweight Java Profiler +- `stackcollapse-jstack.pl`: for Java jstack(1) output +- `stackcollapse-gdb.pl`: for gdb(1) stacks +- `stackcollapse-go.pl`: for Golang pprof stacks +- `stackcollapse-vsprof.pl`: for Microsoft Visual Studio profiles + +Usage example: + +``` +For perf_events: +$ ./stackcollapse-perf.pl out.perf > out.folded + +For DTrace: +$ ./stackcollapse.pl out.kern_stacks > out.kern_folded +``` + +The output looks like this: + +``` +unix`_sys_sysenter_post_swapgs 1401 +unix`_sys_sysenter_post_swapgs;genunix`close 5 +unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf 85 +unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;c2audit`audit_closef 26 +unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;c2audit`audit_setf 5 +unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;genunix`audit_getstate 6 +unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;genunix`audit_unfalloc 2 +unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;genunix`closef 48 +[...] +``` + +3\. flamegraph.pl +================ +Use flamegraph.pl to render a SVG. + +``` +$ ./flamegraph.pl out.kern_folded > kernel.svg +``` + +An advantage of having the folded input file (and why this is separate to flamegraph.pl) is that you can use grep for functions of interest. Eg: + +``` +$ grep cpuid out.kern_folded | ./flamegraph.pl > cpuid.svg +``` + +Provided Examples +================= + +### Linux perf\_events + +An example output from Linux "perf script" is included, gzip'd, as example-perf-stacks.txt.gz. The resulting flame graph is example-perf.svg: + +[![Example](http://www.brendangregg.com/FlameGraphs/example-perf.svg)](https://jing.yandex-team.ru/files/yazevnul/contrib-tools-flame-graph-example-perf.svg) + +You can create this using: + +``` +$ gunzip -c example-perf-stacks.txt.gz | ./stackcollapse-perf.pl --all | ./flamegraph.pl --color=java --hash > example-perf.svg +``` + +This shows my typical workflow: I'll gzip profiles on the target, then copy them to my laptop for analysis. Since I have hundreds of profiles, I leave them gzip'd! + +Since this profile included Java, I used the flamegraph.pl --color=java palette. I've also used stackcollapse-perf.pl --all, which includes all annotations that help flamegraph.pl use separate colors for kernel and user level code. The resulting flame graph uses: green == Java, yellow == C++, red == user-mode native, orange == kernel. + +This profile was from an analysis of vert.x performance. The benchmark client, wrk, is also visible in the flame graph. + +### DTrace + +An example output from DTrace is also included, example-dtrace-stacks.txt, and the resulting flame graph, example-dtrace.svg: + +[![Example](http://www.brendangregg.com/FlameGraphs/example-dtrace.svg)](https://jing.yandex-team.ru/files/yazevnul/contrib-tools-flame-graph-example-dtrace.svg) + +You can generate this using: + +``` +$ ./stackcollapse.pl example-stacks.txt | ./flamegraph.pl > example.svg +``` + +This was from a particular performance investigation: the Flame Graph identified that CPU time was spent in the lofs module, and quantified that time. + + +Options +======= +See the USAGE message (--help) for options: + +USAGE: ./flamegraph.pl [options] infile > outfile.svg + + --title TEXT # change title text + --subtitle TEXT # second level title (optional) + --width NUM # width of image (default 1200) + --height NUM # height of each frame (default 16) + --minwidth NUM # omit smaller functions (default 0.1 pixels) + --fonttype FONT # font type (default "Verdana") + --fontsize NUM # font size (default 12) + --countname TEXT # count type label (default "samples") + --nametype TEXT # name type label (default "Function:") + --colors PALETTE # set color palette. choices are: hot (default), mem, + # io, wakeup, chain, java, js, perl, red, green, blue, + # aqua, yellow, purple, orange + --bgcolors COLOR # set background colors. gradient choices are yellow + # (default), blue, green, grey; flat colors use "#rrggbb" + --hash # colors are keyed by function name hash + --cp # use consistent palette (palette.map) + --reverse # generate stack-reversed flame graph + --inverted # icicle graph + --flamechart # produce a flame chart (sort by time, do not merge stacks) + --negate # switch differential hues (blue<->red) + --notes TEXT # add notes comment in SVG (for debugging) + --help # this message + + eg, + ./flamegraph.pl --title="Flame Graph: malloc()" trace.txt > graph.svg + +As suggested in the example, flame graphs can process traces of any event, +such as malloc()s, provided stack traces are gathered. + + +Consistent Palette +================== +If you use the `--cp` option, it will use the $colors selection and randomly +generate the palette like normal. Any future flamegraphs created using the `--cp` +option will use the same palette map. Any new symbols from future flamegraphs +will have their colors randomly generated using the $colors selection. + +If you don't like the palette, just delete the palette.map file. + +This allows your to change your colorscheme between flamegraphs to make the +differences REALLY stand out. + +Example: + +Say we have 2 captures, one with a problem, and one when it was working +(whatever "it" is): + +``` +cat working.folded | ./flamegraph.pl --cp > working.svg +# this generates a palette.map, as per the normal random generated look. + +cat broken.folded | ./flamegraph.pl --cp --colors mem > broken.svg +# this svg will use the same palette.map for the same events, but a very +# different colorscheme for any new events. +``` + +Take a look at the demo directory for an example: + +palette-example-working.svg +palette-example-broken.svg diff --git a/contrib/tools/flame-graph/REVISION b/contrib/tools/flame-graph/REVISION new file mode 100644 index 0000000000..9544dc8e63 --- /dev/null +++ b/contrib/tools/flame-graph/REVISION @@ -0,0 +1 @@ +https://github.com/brendangregg/FlameGraph/tree/a258e78f17abdf2ce21c2515cfe8306a44774e2a diff --git a/contrib/tools/flame-graph/aix-perf.pl b/contrib/tools/flame-graph/aix-perf.pl new file mode 100755 index 0000000000..1edd082ecf --- /dev/null +++ b/contrib/tools/flame-graph/aix-perf.pl @@ -0,0 +1,31 @@ +#!/usr/bin/perl + +use Getopt::Std; + +getopt('urt'); + +unless ($opt_r && $opt_t){ + print "Usage: $0 [ -u user] -r sample_count -t sleep_time\n"; + exit(0); +} + +my $i; +my @proc = ""; +for ($i = 0; $i < $opt_r ; $i++){ + if ($opt_u){ + $proc = `/usr/sysv/bin/ps -u $opt_u `; + $proc =~ s/^.*\n//; + $proc =~ s/\s*(\d+).*\n/\1 /g; + @proc = split(/\s+/,$proc); + } else { + opendir(my $dh, '/proc') || die "Cant't open /proc: $!"; + @proc = grep { /^[\d]+$/ } readdir($dh); + closedir ($dh); + } + + foreach my $pid (@proc){ + my $command = "/usr/bin/procstack $pid"; + print `$command 2>/dev/null`; + } + select(undef, undef, undef, $opt_t); +} diff --git a/contrib/tools/flame-graph/dev/README b/contrib/tools/flame-graph/dev/README new file mode 100644 index 0000000000..599732e3d7 --- /dev/null +++ b/contrib/tools/flame-graph/dev/README @@ -0,0 +1,13 @@ +EXPERIMENTAL: This includes some work in progress code, which may not work +properly. + +Hot Cold Graphs +=============== +These show both on-CPU time (in shades of red) and off-CPU time (blocked time; +in shades of blue) in a similar style to the Flame Graph. + +Thread Hot Cold Graphs +====================== +These are per-Thread Hot Cold Graphs, which helps separate logical functions +that are managed by their threads, and helps you estimate speed up for the +threads of interest. diff --git a/contrib/tools/flame-graph/dev/gatherhc-kern.d b/contrib/tools/flame-graph/dev/gatherhc-kern.d new file mode 100755 index 0000000000..1188205ee1 --- /dev/null +++ b/contrib/tools/flame-graph/dev/gatherhc-kern.d @@ -0,0 +1,32 @@ +#!/usr/sbin/dtrace -s + +#pragma D option stackframes=100 +#pragma D option defaultargs + +profile:::profile-999 +/arg0/ +{ + @[stack(), 1] = sum(1000); +} + +sched:::off-cpu +{ + self->start = timestamp; +} + +sched:::on-cpu +/(this->start = self->start)/ +{ + this->delta = (timestamp - this->start) / 1000; + @[stack(), 0] = sum(this->delta); + self->start = 0; +} + +profile:::tick-60s, +dtrace:::END +{ + normalize(@, 1000); + printa("%koncpu:%d ms:%@d\n", @); + trunc(@); + exit(0); +} diff --git a/contrib/tools/flame-graph/dev/gatherthc-kern.d b/contrib/tools/flame-graph/dev/gatherthc-kern.d new file mode 100755 index 0000000000..50ef8915fc --- /dev/null +++ b/contrib/tools/flame-graph/dev/gatherthc-kern.d @@ -0,0 +1,32 @@ +#!/usr/sbin/dtrace -s + +#pragma D option stackframes=100 +#pragma D option defaultargs + +profile:::profile-999 +/arg0/ +{ + @[stack(), (uint64_t)curthread, pid, tid, execname, 1] = sum(1000); +} + +sched:::off-cpu +{ + self->start = timestamp; +} + +sched:::on-cpu +/(this->start = self->start)/ +{ + this->delta = (timestamp - this->start) / 1000; + @[stack(), (uint64_t)curthread, pid, tid, execname, 0] = sum(this->delta); + self->start = 0; +} + +profile:::tick-60s, +dtrace:::END +{ + normalize(@, 1000); + printa("%kthread:%d pid:%d tid:%d name:%s oncpu:%d ms:%@d\n", @); + trunc(@); + exit(0); +} diff --git a/contrib/tools/flame-graph/dev/hcstackcollapse.pl b/contrib/tools/flame-graph/dev/hcstackcollapse.pl new file mode 100755 index 0000000000..9e562824e5 --- /dev/null +++ b/contrib/tools/flame-graph/dev/hcstackcollapse.pl @@ -0,0 +1,87 @@ +#!/usr/bin/perl -w +# +# hcstackcollapse.pl collapse hot/cold multiline stacks into single lines. +# +# EXPERIMENTAL: This is a work in progress, and may not work properly. +# +# Parses a multiline stack followed by oncpu status and ms on a separate line +# (see example below) and outputs a comma separated stack followed by a space +# and the number. If memory addresses (+0xd) are present, they are stripped, +# and resulting identical stacks are colased with their counts summed. +# +# USAGE: ./hcstackcollapse.pl infile > outfile +# +# Example input: +# +# mysqld`_Z10do_commandP3THD+0xd4 +# mysqld`handle_one_connection+0x1a6 +# libc.so.1`_thrp_setup+0x8d +# libc.so.1`_lwp_start +# oncpu:1 ms:2664 +# +# Example output: +# +# libc.so.1`_lwp_start,libc.so.1`_thrp_setup,mysqld`handle_one_connection,mysqld`_Z10do_commandP3THD oncpu:1 ms:2664 +# +# Input may contain many stacks, and can be generated using DTrace. The +# first few lines of input are skipped (see $headerlines). +# +# Copyright 2013 Joyent, Inc. All rights reserved. +# Copyright 2013 Brendan Gregg. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE +# or http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at usr/src/OPENSOLARIS.LICENSE. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END +# +# 14-Aug-2011 Brendan Gregg Created this. + +use strict; + +my %collapsed; +my $headerlines = 2; + +sub remember_stack { + my ($stack, $oncpu, $count) = @_; + $collapsed{"$stack $oncpu"} += $count; +} + +my $nr = 0; +my @stack; + +foreach (<>) { + next if $nr++ < $headerlines; + chomp; + + if (m/^oncpu:(\d+) ms:(\d+)$/) { + remember_stack(join(",", @stack), $1, $2) unless $2 == 0; + @stack = (); + next; + } + + next if (m/^\s*$/); + + my $frame = $_; + $frame =~ s/^\s*//; + $frame =~ s/\+.*$//; + $frame = "-" if $frame eq ""; + unshift @stack, $frame; +} + +foreach my $k (sort { $a cmp $b } keys %collapsed) { + print "$k $collapsed{$k}\n"; +} diff --git a/contrib/tools/flame-graph/dev/hotcoldgraph.pl b/contrib/tools/flame-graph/dev/hotcoldgraph.pl new file mode 100755 index 0000000000..1d53476913 --- /dev/null +++ b/contrib/tools/flame-graph/dev/hotcoldgraph.pl @@ -0,0 +1,267 @@ +#!/usr/bin/perl -w +# +# hotcoldgraph.pl flame/cold stack grapher. +# +# EXPERIMENTAL: This is a work in progress, and may not work properly. +# +# This takes on and off-cpu stack timings (see hcstackcollapse.pl) and +# renders a call graph, allowing latency in codepaths to be quickly identified. +# +# USAGE: ./hotcoldgraph.pl input.txt > graph.svg +# +# grep funcA input.txt | ./hotcoldgraph.pl > graph.svg +# +# The input is stack frames and sample counts formatted as single lines. Each +# frame in the stack is comma separated, with a space and count at the end of +# the line. These can be generated using DTrace with stackcollapse.pl. +# +# The output graph shows relative presense of functions in stack samples. The +# ordering on the x-axis has no meaning; since the data is samples, time order +# of events is not known. The order used sorts function names alphabeticly. +# +# HISTORY +# +# This was inspired by Neelakanth Nadgir's excellent function_call_graph.rb +# program, which visualized function entry and return trace events. As Neel +# wrote: "The output displayed is inspired by Roch's CallStackAnalyzer which +# was in turn inspired by the work on vftrace by Jan Boerhout". See: +# http://blogs.sun.com/realneel/entry/visualizing_callstacks_via_dtrace_and +# +# For the on-CPU graph only, see flamegraph.pl. +# +# Copyright 2013 Joyent, Inc. All rights reserved. +# Copyright 2013 Brendan Gregg. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE +# or http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at usr/src/OPENSOLARIS.LICENSE. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END +# +# 10-Sep-2011 Brendan Gregg Created this. + +use strict; + +# tunables +my $fonttype = "Verdana"; +my $imagewidth = 1200; # max width, pixels +my $frameheight = 16; # max height is dynamic +my $fontsize = 12; # base text size +my $minwidth = 0.1; # min function width, pixels + +# internals +my $ypad1 = $fontsize * 4; # pad top, include title +my $ypad2 = $fontsize * 2 + 10; # pad bottom, include labels +my $xpad = 10; # pad lefm and right +my $timemax = 0; +my $depthmax = 0; +my %Events; + +# SVG functions +{ package SVG; + sub new { + my $class = shift; + my $self = {}; + bless ($self, $class); + return $self; + } + + sub header { + my ($self, $w, $h) = @_; + $self->{svg} .= <<SVG; +<?xml version="1.0" standalone="no"?> +<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"> +<svg version="1.1" width="$w" height="$h" onload="init(evt)" viewBox="0 0 $w $h" xmlns="http://www.w3.org/2000/svg" > +SVG + } + + sub include { + my ($self, $content) = @_; + $self->{svg} .= $content; + } + + sub colorAllocate { + my ($self, $r, $g, $b) = @_; + return "rgb($r,$g,$b)"; + } + + sub filledRectangle { + my ($self, $x1, $y1, $x2, $y2, $fill, $extra) = @_; + $x1 = sprintf "%0.1f", $x1; + $x2 = sprintf "%0.1f", $x2; + my $w = sprintf "%0.1f", $x2 - $x1; + my $h = sprintf "%0.1f", $y2 - $y1; + $extra = defined $extra ? $extra : ""; + $self->{svg} .= qq/<rect x="$x1" y="$y1" width="$w" height="$h" fill="$fill" $extra \/>\n/; + } + + sub stringTTF { + my ($self, $color, $font, $size, $angle, $x, $y, $str, $loc, $extra) = @_; + $loc = defined $loc ? $loc : "left"; + $extra = defined $extra ? $extra : ""; + $self->{svg} .= qq/<text text-anchor="$loc" x="$x" y="$y" font-size="$size" font-family="$font" fill="$color" $extra >$str<\/text>\n/; + } + + sub svg { + my $self = shift; + return "$self->{svg}</svg>\n"; + } + 1; +} + +sub color { + my $type = shift; + if (defined $type and $type eq "hot") { + my $r = 205 + int(rand(50)); + my $g = 0 + int(rand(230)); + my $b = 0 + int(rand(55)); + return "rgb($r,$g,$b)"; + } + if (defined $type and $type eq "cold") { + my $r = 0 + int(rand(40)); + my $b = 205 + int(rand(50)); + my $g = 0 + int(rand(150)); + return "rgb($r,$g,$b)"; + } + return "rgb(0,0,0)"; +} + +my %Node; +my %Tmp; + +sub flow { + my ($a, $b, $ca, $cb, $v) = @_; + my @A = split ",", $a; + my @B = split ",", $b; + + my $len_a = $#A; + my $len_b = $#B; + $depthmax = $len_b if $len_b > $depthmax; + + my $i = 0; + my $len_same = 0; + for (; $i <= $len_a; $i++) { + last if $i > $len_b; + last if $A[$i] ne $B[$i]; + } + $len_same = $i; + $len_same = 0 if $ca != $cb; + + for ($i = $len_a; $i >= $len_same; $i--) { + my $k = "$A[$i]-$i"; + # a unique ID is constructed from func-depth-etime; + # func-depth isn't unique, it may be repeated later. + $Node{"$k-$v-$ca"}->{stime} = $Tmp{$k}->{stime}; + delete $Tmp{$k}->{stime}; + delete $Tmp{$k}; + } + + for ($i = $len_same; $i <= $len_b; $i++) { + my $k = "$B[$i]-$i"; + $Tmp{$k}->{stime} = $v; + } +} + +# Parse input +my @Data = <>; +my $laststack = ""; +my $lastcpu = 0; +my $time = 0; +foreach (sort @Data) { + chomp; + my ($stack, $cpu, $samples) = split ' '; + $stack = ",$stack"; + next unless defined $samples; + flow($laststack, $stack, $lastcpu, $cpu, $time); + $time += $samples; + $laststack = $stack; + $lastcpu = $cpu; +} +flow($laststack, "", $lastcpu, 0, $time); +$timemax = $time or die "ERROR: No stack counts found\n"; + +# Draw canvas +my $widthpertime = ($imagewidth - 2 * $xpad) / $timemax; +my $imageheight = ($depthmax * $frameheight) + $ypad1 + $ypad2; +my $im = SVG->new(); +$im->header($imagewidth, $imageheight); +my $inc = <<INC; +<defs > + <linearGradient id="background" y1="0" y2="1" x1="0" x2="0" > + <stop stop-color="#eeeeee" offset="5%" /> + <stop stop-color="#eeeeb0" offset="95%" /> + </linearGradient> +</defs> +<style type="text/css"> + rect[rx]:hover { stroke:black; stroke-width:1; } + text:hover { stroke:black; stroke-width:1; stroke-opacity:0.35; } +</style> +<script type="text/ecmascript"> +<![CDATA[ + var details; + function init(evt) { details = document.getElementById("details").firstChild; } + function s(info) { details.nodeValue = info; } + function c() { details.nodeValue = ' '; } +]]> +</script> +INC +$im->include($inc); +$im->filledRectangle(0, 0, $imagewidth, $imageheight, 'url(#background)'); +my ($white, $black, $vvdgrey, $vdgrey) = ( + $im->colorAllocate(255, 255, 255), + $im->colorAllocate(0, 0, 0), + $im->colorAllocate(40, 40, 40), + $im->colorAllocate(160, 160, 160), + ); +$im->stringTTF($black, $fonttype, $fontsize + 5, 0.0, int($imagewidth / 2), $fontsize * 2, "Flame Graph", "middle"); +$im->stringTTF($black, $fonttype, $fontsize, 0.0, $xpad, $imageheight - ($ypad2 / 2), 'Function:'); +$im->stringTTF($black, $fonttype, $fontsize, 0.0, $xpad + 60, $imageheight - ($ypad2 / 2), " ", "", 'id="details"'); + +# Draw frames +foreach my $id (keys %Node) { + my ($func, $depth, $etime, $cpu) = split "-", $id; + die "missing start for $id" if !defined $Node{$id}->{stime}; + my $stime = $Node{$id}->{stime}; + my $samples = $etime - $stime; + + my $x1 = $xpad + $stime * $widthpertime; + my $x2 = $xpad + $etime * $widthpertime; + my $width = $x2 - $x1; + next if $width < $minwidth; + + my $y1 = $imageheight - $ypad2 - ($depth + 1) * $frameheight + 1; + my $y2 = $imageheight - $ypad2 - $depth * $frameheight; + + my $info; + if ($func eq "" and $depth == 0) { + $info = "all ($samples ms, 100%)"; + } else { + my $pct = sprintf "%.2f", ((100 * $samples) / $timemax); + $info = "$func ($samples ms, $pct%)"; + } + my $color = $cpu ? "hot" : "cold"; + $im->filledRectangle($x1, $y1, $x2, $y2, color($color), 'rx="2" ry="2" onmouseover="s(' . "'$info'" . ')" onmouseout="c()"'); + + if ($width > 50) { + my $chars = int($width / (0.7 * $fontsize)); + my $text = substr $func, 0, $chars; + $text .= ".." if $chars < length $func; + $im->stringTTF($black, $fonttype, $fontsize, 0.0, $x1 + 3, 3 + ($y1 + $y2) / 2, $text, "", + 'onmouseover="s(' . "'$info'" . ')" onmouseout="c()"'); + } +} + +print $im->svg; diff --git a/contrib/tools/flame-graph/dev/thcstackcollapse.pl b/contrib/tools/flame-graph/dev/thcstackcollapse.pl new file mode 100755 index 0000000000..2aa307ef08 --- /dev/null +++ b/contrib/tools/flame-graph/dev/thcstackcollapse.pl @@ -0,0 +1,89 @@ +#!/usr/bin/perl -w +# +# thcstackcollapse.pl collapse thread hot/cold multiline stacks into +# single lines. +# +# EXPERIMENTAL: This is a work in progress, and may not work properly. +# +# Parses a multiline stack followed by thread ID, PID, TID, name, oncpu status, +# and ms on a separate line (see example below) and outputs a comma separated +# stack followed by a space and the numbers. If memory addresses (+0xd) are +# present, they are stripped, and resulting identical stacks are colased with +# their counts summed. +# +# USAGE: ./thcstackcollapse.pl infile > outfile +# +# Example input: +# +# mysqld`_Z10do_commandP3THD+0xd4 +# mysqld`handle_one_connection+0x1a6 +# libc.so.1`_thrp_setup+0x8d +# libc.so.1`_lwp_start +# thread:0x78372480 pid:826 tid:3 name:mysqld oncpu:1 ms:2664 +# +# Example output: +# +# libc.so.1`_lwp_start,libc.so.1`_thrp_setup,mysqld`handle_one_connection,mysqld`_Z10do_commandP3THD 0x78372480 826 3 mysqld 1 2664 +# +# Input may contain many stacks, and can be generated using DTrace. The +# first few lines of input are skipped (see $headerlines). +# +# Copyright 2013 Joyent, Inc. All rights reserved. +# Copyright 2013 Brendan Gregg. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE +# or http://www.opensolaris.org/os/licensing. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at usr/src/OPENSOLARIS.LICENSE. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END +# +# 14-Aug-2011 Brendan Gregg Created this. + +use strict; + +my %collapsed; +my $headerlines = 2; + +sub remember_stack { + my ($stack, $thread, $pid, $tid, $name, $oncpu, $count) = @_; + $collapsed{"$stack $thread $pid $tid $name $oncpu"} += $count; +} + +my $nr = 0; +my @stack; + +foreach (<>) { + next if $nr++ < $headerlines; + chomp; + + next if (m/^\s*$/); + + if (m/^thread:(\d+) pid:(\d+) tid:(\d+) name:(.*?) oncpu:(\d+) ms:(\d+)$/) { + remember_stack(join(",", @stack), $1, $2, $3, $4, $5, $6) if $6 > 0; + @stack = (); + next; + } + + my $frame = $_; + $frame =~ s/^\s*//; + $frame =~ s/\+.*$//; + $frame = "-" if $frame eq ""; + unshift @stack, $frame; +} + +foreach my $k (sort { $a cmp $b } keys %collapsed) { + print "$k $collapsed{$k}\n"; +} diff --git a/contrib/tools/flame-graph/difffolded.pl b/contrib/tools/flame-graph/difffolded.pl new file mode 100755 index 0000000000..4c76c2ecf3 --- /dev/null +++ b/contrib/tools/flame-graph/difffolded.pl @@ -0,0 +1,115 @@ +#!/usr/bin/perl -w +# +# difffolded.pl diff two folded stack files. Use this for generating +# flame graph differentials. +# +# USAGE: ./difffolded.pl [-hns] folded1 folded2 | ./flamegraph.pl > diff2.svg +# +# Options are described in the usage message (-h). +# +# The flamegraph will be colored based on higher samples (red) and smaller +# samples (blue). The frame widths will be based on the 2nd folded file. +# This might be confusing if stack frames disappear entirely; it will make +# the most sense to ALSO create a differential based on the 1st file widths, +# while switching the hues; eg: +# +# ./difffolded.pl folded2 folded1 | ./flamegraph.pl --negate > diff1.svg +# +# Here's what they mean when comparing a before and after profile: +# +# diff1.svg: widths show the before profile, colored by what WILL happen +# diff2.svg: widths show the after profile, colored by what DID happen +# +# INPUT: See stackcollapse* programs. +# +# OUTPUT: The full list of stacks, with two columns, one from each file. +# If a stack wasn't present in a file, the column value is zero. +# +# folded_stack_trace count_from_folded1 count_from_folded2 +# +# eg: +# +# funca;funcb;funcc 31 33 +# ... +# +# COPYRIGHT: Copyright (c) 2014 Brendan Gregg. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License +# as published by the Free Software Foundation; either version 2 +# of the License, or (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software Foundation, +# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. +# +# (http://www.gnu.org/copyleft/gpl.html) +# +# 28-Oct-2014 Brendan Gregg Created this. + +use strict; +use Getopt::Std; + +# defaults +my $normalize = 0; # make sample counts equal +my $striphex = 0; # strip hex numbers + +sub usage { + print STDERR <<USAGE_END; +USAGE: $0 [-hns] folded1 folded2 | flamegraph.pl > diff2.svg + -h # help message + -n # normalize sample counts + -s # strip hex numbers (addresses) +See stackcollapse scripts for generating folded files. +Also consider flipping the files and hues to highlight reduced paths: +$0 folded2 folded1 | ./flamegraph.pl --negate > diff1.svg +USAGE_END + exit 2; +} + +usage() if @ARGV < 2; +our($opt_h, $opt_n, $opt_s); +getopts('ns') or usage(); +usage() if $opt_h; +$normalize = 1 if defined $opt_n; +$striphex = 1 if defined $opt_s; + +my ($total1, $total2) = (0, 0); +my %Folded; + +my $file1 = $ARGV[0]; +my $file2 = $ARGV[1]; + +open FILE, $file1 or die "ERROR: Can't read $file1\n"; +while (<FILE>) { + chomp; + my ($stack, $count) = (/^(.*)\s+?(\d+(?:\.\d*)?)$/); + $stack =~ s/0x[0-9a-fA-F]+/0x.../g if $striphex; + $Folded{$stack}{1} += $count; + $total1 += $count; +} +close FILE; + +open FILE, $file2 or die "ERROR: Can't read $file2\n"; +while (<FILE>) { + chomp; + my ($stack, $count) = (/^(.*)\s+?(\d+(?:\.\d*)?)$/); + $stack =~ s/0x[0-9a-fA-F]+/0x.../g if $striphex; + $Folded{$stack}{2} += $count; + $total2 += $count; +} +close FILE; + +foreach my $stack (keys %Folded) { + $Folded{$stack}{1} = 0 unless defined $Folded{$stack}{1}; + $Folded{$stack}{2} = 0 unless defined $Folded{$stack}{2}; + if ($normalize && $total1 != $total2) { + $Folded{$stack}{1} = int($Folded{$stack}{1} * $total2 / $total1); + } + print "$stack $Folded{$stack}{1} $Folded{$stack}{2}\n"; +} diff --git a/contrib/tools/flame-graph/files.pl b/contrib/tools/flame-graph/files.pl new file mode 100755 index 0000000000..50426b2e47 --- /dev/null +++ b/contrib/tools/flame-graph/files.pl @@ -0,0 +1,62 @@ +#!/usr/bin/perl -w +# +# files.pl Print file sizes in folded format, for a flame graph. +# +# This helps you understand storage consumed by a file system, by creating +# a flame graph visualization of space consumed. This is basically a Perl +# version of the "find" command, which emits in folded format for piping +# into flamegraph.pl. +# +# Copyright (c) 2017 Brendan Gregg. +# Licensed under the Apache License, Version 2.0 (the "License") +# +# 03-Feb-2017 Brendan Gregg Created this. + +use strict; +use File::Find; + +sub usage { + print STDERR "USAGE: $0 [--xdev] [DIRECTORY]...\n"; + print STDERR " eg, $0 /Users\n"; + print STDERR " To not descend directories on other filesystems:"; + print STDERR " eg, $0 --xdev /\n"; + print STDERR "Intended to be piped to flamegraph.pl. Full example:\n"; + print STDERR " $0 /Users | flamegraph.pl " . + "--hash --countname=bytes > files.svg\n"; + print STDERR " $0 /usr /home /root /etc | flamegraph.pl " . + "--hash --countname=bytes > files.svg\n"; + print STDERR " $0 --xdev / | flamegraph.pl " . + "--hash --countname=bytes > files.svg\n"; + exit 1; +} + +usage() if @ARGV == 0 or $ARGV[0] eq "--help" or $ARGV[0] eq "-h"; + +my $filter_xdev = 0; +my $xdev_id; + +foreach my $dir (@ARGV) { + if ($dir eq "--xdev") { + $filter_xdev = 1; + } else { + find(\&wanted, $dir); + } +} + +sub wanted { + my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size) = lstat($_); + return unless defined $size; + if ($filter_xdev) { + if (!$xdev_id) { + $xdev_id = $dev; + } elsif ($xdev_id ne $dev) { + $File::Find::prune = 1; + return; + } + } + my $path = $File::Find::name; + $path =~ tr/\//;/; # delimiter + $path =~ tr/;.a-zA-Z0-9-/_/c; # ditch whitespace and other chars + $path =~ s/^;//; + print "$path $size\n"; +} diff --git a/contrib/tools/flame-graph/filter-perf-events.awk b/contrib/tools/flame-graph/filter-perf-events.awk new file mode 100755 index 0000000000..64300ff2f4 --- /dev/null +++ b/contrib/tools/flame-graph/filter-perf-events.awk @@ -0,0 +1,15 @@ +#!/usr/bin/awk -f +# Filter perf script output by event type +# Sample usage: awk -f ./filter-perf-events.awk -v event=stalled-cycles-frontend <perf.data.dump + +/^[^[:space:]]/ { + match_event = match($0, event ":"); +} + +match_event { + print +} + +NF == 0 { + match_event = 0 +} diff --git a/contrib/tools/flame-graph/flamegraph.pl b/contrib/tools/flame-graph/flamegraph.pl new file mode 100755 index 0000000000..d039dcbc29 --- /dev/null +++ b/contrib/tools/flame-graph/flamegraph.pl @@ -0,0 +1,1191 @@ +#!/usr/bin/perl -w +# +# flamegraph.pl flame stack grapher. +# +# This takes stack samples and renders a call graph, allowing hot functions +# and codepaths to be quickly identified. Stack samples can be generated using +# tools such as DTrace, perf, SystemTap, and Instruments. +# +# USAGE: ./flamegraph.pl [options] input.txt > graph.svg +# +# grep funcA input.txt | ./flamegraph.pl [options] > graph.svg +# +# Then open the resulting .svg in a web browser, for interactivity: mouse-over +# frames for info, click to zoom, and ctrl-F to search. +# +# Options are listed in the usage message (--help). +# +# The input is stack frames and sample counts formatted as single lines. Each +# frame in the stack is semicolon separated, with a space and count at the end +# of the line. These can be generated for Linux perf script output using +# stackcollapse-perf.pl, for DTrace using stackcollapse.pl, and for other tools +# using the other stackcollapse programs. Example input: +# +# swapper;start_kernel;rest_init;cpu_idle;default_idle;native_safe_halt 1 +# +# An optional extra column of counts can be provided to generate a differential +# flame graph of the counts, colored red for more, and blue for less. This +# can be useful when using flame graphs for non-regression testing. +# See the header comment in the difffolded.pl program for instructions. +# +# The input functions can optionally have annotations at the end of each +# function name, following a precedent by some tools (Linux perf's _[k]): +# _[k] for kernel +# _[i] for inlined +# _[j] for jit +# _[w] for waker +# Some of the stackcollapse programs support adding these annotations, eg, +# stackcollapse-perf.pl --kernel --jit. They are used merely for colors by +# some palettes, eg, flamegraph.pl --color=java. +# +# The output flame graph shows relative presence of functions in stack samples. +# The ordering on the x-axis has no meaning; since the data is samples, time +# order of events is not known. The order used sorts function names +# alphabetically. +# +# While intended to process stack samples, this can also process stack traces. +# For example, tracing stacks for memory allocation, or resource usage. You +# can use --title to set the title to reflect the content, and --countname +# to change "samples" to "bytes" etc. +# +# There are a few different palettes, selectable using --color. By default, +# the colors are selected at random (except for differentials). Functions +# called "-" will be printed gray, which can be used for stack separators (eg, +# between user and kernel stacks). +# +# HISTORY +# +# This was inspired by Neelakanth Nadgir's excellent function_call_graph.rb +# program, which visualized function entry and return trace events. As Neel +# wrote: "The output displayed is inspired by Roch's CallStackAnalyzer which +# was in turn inspired by the work on vftrace by Jan Boerhout". See: +# https://blogs.oracle.com/realneel/entry/visualizing_callstacks_via_dtrace_and +# +# Copyright 2016 Netflix, Inc. +# Copyright 2011 Joyent, Inc. All rights reserved. +# Copyright 2011 Brendan Gregg. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at docs/cddl1.txt or +# http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at docs/cddl1.txt. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END +# +# 11-Oct-2014 Adrien Mahieux Added zoom. +# 21-Nov-2013 Shawn Sterling Added consistent palette file option +# 17-Mar-2013 Tim Bunce Added options and more tunables. +# 15-Dec-2011 Dave Pacheco Support for frames with whitespace. +# 10-Sep-2011 Brendan Gregg Created this. + +use strict; + +use Getopt::Long; + +use open qw(:std :utf8); + +# tunables +my $encoding; +my $fonttype = "Verdana"; +my $imagewidth = 1200; # max width, pixels +my $frameheight = 16; # max height is dynamic +my $fontsize = 12; # base text size +my $fontwidth = 0.59; # avg width relative to fontsize +my $minwidth = 0.1; # min function width, pixels +my $nametype = "Function:"; # what are the names in the data? +my $countname = "samples"; # what are the counts in the data? +my $colors = "hot"; # color theme +my $bgcolors = ""; # background color theme +my $nameattrfile; # file holding function attributes +my $timemax; # (override the) sum of the counts +my $factor = 1; # factor to scale counts by +my $hash = 0; # color by function name +my $palette = 0; # if we use consistent palettes (default off) +my %palette_map; # palette map hash +my $pal_file = "palette.map"; # palette map file name +my $stackreverse = 0; # reverse stack order, switching merge end +my $inverted = 0; # icicle graph +my $flamechart = 0; # produce a flame chart (sort by time, do not merge stacks) +my $negate = 0; # switch differential hues +my $titletext = ""; # centered heading +my $titledefault = "Flame Graph"; # overwritten by --title +my $titleinverted = "Icicle Graph"; # " " +my $searchcolor = "rgb(230,0,230)"; # color for search highlighting +my $notestext = ""; # embedded notes in SVG +my $subtitletext = ""; # second level title (optional) +my $help = 0; + +sub usage { + die <<USAGE_END; +USAGE: $0 [options] infile > outfile.svg\n + --title TEXT # change title text + --subtitle TEXT # second level title (optional) + --width NUM # width of image (default 1200) + --height NUM # height of each frame (default 16) + --minwidth NUM # omit smaller functions (default 0.1 pixels) + --fonttype FONT # font type (default "Verdana") + --fontsize NUM # font size (default 12) + --countname TEXT # count type label (default "samples") + --nametype TEXT # name type label (default "Function:") + --colors PALETTE # set color palette. choices are: hot (default), mem, + # io, wakeup, chain, java, js, perl, red, green, blue, + # aqua, yellow, purple, orange + --bgcolors COLOR # set background colors. gradient choices are yellow + # (default), blue, green, grey; flat colors use "#rrggbb" + --hash # colors are keyed by function name hash + --cp # use consistent palette (palette.map) + --reverse # generate stack-reversed flame graph + --inverted # icicle graph + --flamechart # produce a flame chart (sort by time, do not merge stacks) + --negate # switch differential hues (blue<->red) + --notes TEXT # add notes comment in SVG (for debugging) + --help # this message + + eg, + $0 --title="Flame Graph: malloc()" trace.txt > graph.svg +USAGE_END +} + +GetOptions( + 'fonttype=s' => \$fonttype, + 'width=i' => \$imagewidth, + 'height=i' => \$frameheight, + 'encoding=s' => \$encoding, + 'fontsize=f' => \$fontsize, + 'fontwidth=f' => \$fontwidth, + 'minwidth=f' => \$minwidth, + 'title=s' => \$titletext, + 'subtitle=s' => \$subtitletext, + 'nametype=s' => \$nametype, + 'countname=s' => \$countname, + 'nameattr=s' => \$nameattrfile, + 'total=s' => \$timemax, + 'factor=f' => \$factor, + 'colors=s' => \$colors, + 'bgcolors=s' => \$bgcolors, + 'hash' => \$hash, + 'cp' => \$palette, + 'reverse' => \$stackreverse, + 'inverted' => \$inverted, + 'flamechart' => \$flamechart, + 'negate' => \$negate, + 'notes=s' => \$notestext, + 'help' => \$help, +) or usage(); +$help && usage(); + +# internals +my $ypad1 = $fontsize * 3; # pad top, include title +my $ypad2 = $fontsize * 2 + 10; # pad bottom, include labels +my $ypad3 = $fontsize * 2; # pad top, include subtitle (optional) +my $xpad = 10; # pad lefm and right +my $framepad = 1; # vertical padding for frames +my $depthmax = 0; +my %Events; +my %nameattr; + +if ($flamechart && $titletext eq "") { + $titletext = "Flame Chart"; +} + +if ($titletext eq "") { + unless ($inverted) { + $titletext = $titledefault; + } else { + $titletext = $titleinverted; + } +} + +if ($nameattrfile) { + # The name-attribute file format is a function name followed by a tab then + # a sequence of tab separated name=value pairs. + open my $attrfh, $nameattrfile or die "Can't read $nameattrfile: $!\n"; + while (<$attrfh>) { + chomp; + my ($funcname, $attrstr) = split /\t/, $_, 2; + die "Invalid format in $nameattrfile" unless defined $attrstr; + $nameattr{$funcname} = { map { split /=/, $_, 2 } split /\t/, $attrstr }; + } +} + +if ($notestext =~ /[<>]/) { + die "Notes string can't contain < or >" +} + +# background colors: +# - yellow gradient: default (hot, java, js, perl) +# - green gradient: mem +# - blue gradient: io, wakeup, chain +# - gray gradient: flat colors (red, green, blue, ...) +if ($bgcolors eq "") { + # choose a default + if ($colors eq "mem") { + $bgcolors = "green"; + } elsif ($colors =~ /^(io|wakeup|chain)$/) { + $bgcolors = "blue"; + } elsif ($colors =~ /^(red|green|blue|aqua|yellow|purple|orange)$/) { + $bgcolors = "grey"; + } else { + $bgcolors = "yellow"; + } +} +my ($bgcolor1, $bgcolor2); +if ($bgcolors eq "yellow") { + $bgcolor1 = "#eeeeee"; # background color gradient start + $bgcolor2 = "#eeeeb0"; # background color gradient stop +} elsif ($bgcolors eq "blue") { + $bgcolor1 = "#eeeeee"; $bgcolor2 = "#e0e0ff"; +} elsif ($bgcolors eq "green") { + $bgcolor1 = "#eef2ee"; $bgcolor2 = "#e0ffe0"; +} elsif ($bgcolors eq "grey") { + $bgcolor1 = "#f8f8f8"; $bgcolor2 = "#e8e8e8"; +} elsif ($bgcolors =~ /^#......$/) { + $bgcolor1 = $bgcolor2 = $bgcolors; +} else { + die "Unrecognized bgcolor option \"$bgcolors\"" +} + +# SVG functions +{ package SVG; + sub new { + my $class = shift; + my $self = {}; + bless ($self, $class); + return $self; + } + + sub header { + my ($self, $w, $h) = @_; + my $enc_attr = ''; + if (defined $encoding) { + $enc_attr = qq{ encoding="$encoding"}; + } + $self->{svg} .= <<SVG; +<?xml version="1.0"$enc_attr standalone="no"?> +<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"> +<svg version="1.1" width="$w" height="$h" onload="init(evt)" viewBox="0 0 $w $h" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"> +<!-- Flame graph stack visualization. See https://github.com/brendangregg/FlameGraph for latest version, and http://www.brendangregg.com/flamegraphs.html for examples. --> +<!-- NOTES: $notestext --> +SVG + } + + sub include { + my ($self, $content) = @_; + $self->{svg} .= $content; + } + + sub colorAllocate { + my ($self, $r, $g, $b) = @_; + return "rgb($r,$g,$b)"; + } + + sub group_start { + my ($self, $attr) = @_; + + my @g_attr = map { + exists $attr->{$_} ? sprintf(qq/$_="%s"/, $attr->{$_}) : () + } qw(id class); + push @g_attr, $attr->{g_extra} if $attr->{g_extra}; + if ($attr->{href}) { + my @a_attr; + push @a_attr, sprintf qq/xlink:href="%s"/, $attr->{href} if $attr->{href}; + # default target=_top else links will open within SVG <object> + push @a_attr, sprintf qq/target="%s"/, $attr->{target} || "_top"; + push @a_attr, $attr->{a_extra} if $attr->{a_extra}; + $self->{svg} .= sprintf qq/<a %s>\n/, join(' ', (@a_attr, @g_attr)); + } else { + $self->{svg} .= sprintf qq/<g %s>\n/, join(' ', @g_attr); + } + + $self->{svg} .= sprintf qq/<title>%s<\/title>/, $attr->{title} + if $attr->{title}; # should be first element within g container + } + + sub group_end { + my ($self, $attr) = @_; + $self->{svg} .= $attr->{href} ? qq/<\/a>\n/ : qq/<\/g>\n/; + } + + sub filledRectangle { + my ($self, $x1, $y1, $x2, $y2, $fill, $extra) = @_; + $x1 = sprintf "%0.1f", $x1; + $x2 = sprintf "%0.1f", $x2; + my $w = sprintf "%0.1f", $x2 - $x1; + my $h = sprintf "%0.1f", $y2 - $y1; + $extra = defined $extra ? $extra : ""; + $self->{svg} .= qq/<rect x="$x1" y="$y1" width="$w" height="$h" fill="$fill" $extra \/>\n/; + } + + sub stringTTF { + my ($self, $id, $x, $y, $str, $extra) = @_; + $x = sprintf "%0.2f", $x; + $id = defined $id ? qq/id="$id"/ : ""; + $extra ||= ""; + $self->{svg} .= qq/<text $id x="$x" y="$y" $extra>$str<\/text>\n/; + } + + sub svg { + my $self = shift; + return "$self->{svg}</svg>\n"; + } + 1; +} + +sub namehash { + # Generate a vector hash for the name string, weighting early over + # later characters. We want to pick the same colors for function + # names across different flame graphs. + my $name = shift; + my $vector = 0; + my $weight = 1; + my $max = 1; + my $mod = 10; + # if module name present, trunc to 1st char + $name =~ s/.(.*?)`//; + foreach my $c (split //, $name) { + my $i = (ord $c) % $mod; + $vector += ($i / ($mod++ - 1)) * $weight; + $max += 1 * $weight; + $weight *= 0.70; + last if $mod > 12; + } + return (1 - $vector / $max) +} + +sub color { + my ($type, $hash, $name) = @_; + my ($v1, $v2, $v3); + + if ($hash) { + $v1 = namehash($name); + $v2 = $v3 = namehash(scalar reverse $name); + } else { + $v1 = rand(1); + $v2 = rand(1); + $v3 = rand(1); + } + + # theme palettes + if (defined $type and $type eq "hot") { + my $r = 205 + int(50 * $v3); + my $g = 0 + int(230 * $v1); + my $b = 0 + int(55 * $v2); + return "rgb($r,$g,$b)"; + } + if (defined $type and $type eq "mem") { + my $r = 0; + my $g = 190 + int(50 * $v2); + my $b = 0 + int(210 * $v1); + return "rgb($r,$g,$b)"; + } + if (defined $type and $type eq "io") { + my $r = 80 + int(60 * $v1); + my $g = $r; + my $b = 190 + int(55 * $v2); + return "rgb($r,$g,$b)"; + } + + # multi palettes + if (defined $type and $type eq "java") { + # Handle both annotations (_[j], _[i], ...; which are + # accurate), as well as input that lacks any annotations, as + # best as possible. Without annotations, we get a little hacky + # and match on java|org|com, etc. + if ($name =~ m:_\[j\]$:) { # jit annotation + $type = "green"; + } elsif ($name =~ m:_\[i\]$:) { # inline annotation + $type = "aqua"; + } elsif ($name =~ m:^L?(java|javax|jdk|net|org|com|io|sun)/:) { # Java + $type = "green"; + } elsif ($name =~ m:_\[k\]$:) { # kernel annotation + $type = "orange"; + } elsif ($name =~ /::/) { # C++ + $type = "yellow"; + } else { # system + $type = "red"; + } + # fall-through to color palettes + } + if (defined $type and $type eq "perl") { + if ($name =~ /::/) { # C++ + $type = "yellow"; + } elsif ($name =~ m:Perl: or $name =~ m:\.pl:) { # Perl + $type = "green"; + } elsif ($name =~ m:_\[k\]$:) { # kernel + $type = "orange"; + } else { # system + $type = "red"; + } + # fall-through to color palettes + } + if (defined $type and $type eq "js") { + # Handle both annotations (_[j], _[i], ...; which are + # accurate), as well as input that lacks any annotations, as + # best as possible. Without annotations, we get a little hacky, + # and match on a "/" with a ".js", etc. + if ($name =~ m:_\[j\]$:) { # jit annotation + if ($name =~ m:/:) { + $type = "green"; # source + } else { + $type = "aqua"; # builtin + } + } elsif ($name =~ /::/) { # C++ + $type = "yellow"; + } elsif ($name =~ m:/.*\.js:) { # JavaScript (match "/" in path) + $type = "green"; + } elsif ($name =~ m/:/) { # JavaScript (match ":" in builtin) + $type = "aqua"; + } elsif ($name =~ m/^ $/) { # Missing symbol + $type = "green"; + } elsif ($name =~ m:_\[k\]:) { # kernel + $type = "orange"; + } else { # system + $type = "red"; + } + # fall-through to color palettes + } + if (defined $type and $type eq "wakeup") { + $type = "aqua"; + # fall-through to color palettes + } + if (defined $type and $type eq "chain") { + if ($name =~ m:_\[w\]:) { # waker + $type = "aqua" + } else { # off-CPU + $type = "blue"; + } + # fall-through to color palettes + } + + # color palettes + if (defined $type and $type eq "red") { + my $r = 200 + int(55 * $v1); + my $x = 50 + int(80 * $v1); + return "rgb($r,$x,$x)"; + } + if (defined $type and $type eq "green") { + my $g = 200 + int(55 * $v1); + my $x = 50 + int(60 * $v1); + return "rgb($x,$g,$x)"; + } + if (defined $type and $type eq "blue") { + my $b = 205 + int(50 * $v1); + my $x = 80 + int(60 * $v1); + return "rgb($x,$x,$b)"; + } + if (defined $type and $type eq "yellow") { + my $x = 175 + int(55 * $v1); + my $b = 50 + int(20 * $v1); + return "rgb($x,$x,$b)"; + } + if (defined $type and $type eq "purple") { + my $x = 190 + int(65 * $v1); + my $g = 80 + int(60 * $v1); + return "rgb($x,$g,$x)"; + } + if (defined $type and $type eq "aqua") { + my $r = 50 + int(60 * $v1); + my $g = 165 + int(55 * $v1); + my $b = 165 + int(55 * $v1); + return "rgb($r,$g,$b)"; + } + if (defined $type and $type eq "orange") { + my $r = 190 + int(65 * $v1); + my $g = 90 + int(65 * $v1); + return "rgb($r,$g,0)"; + } + + return "rgb(0,0,0)"; +} + +sub color_scale { + my ($value, $max) = @_; + my ($r, $g, $b) = (255, 255, 255); + $value = -$value if $negate; + if ($value > 0) { + $g = $b = int(210 * ($max - $value) / $max); + } elsif ($value < 0) { + $r = $g = int(210 * ($max + $value) / $max); + } + return "rgb($r,$g,$b)"; +} + +sub color_map { + my ($colors, $func) = @_; + if (exists $palette_map{$func}) { + return $palette_map{$func}; + } else { + $palette_map{$func} = color($colors, $hash, $func); + return $palette_map{$func}; + } +} + +sub write_palette { + open(FILE, ">$pal_file"); + foreach my $key (sort keys %palette_map) { + print FILE $key."->".$palette_map{$key}."\n"; + } + close(FILE); +} + +sub read_palette { + if (-e $pal_file) { + open(FILE, $pal_file) or die "can't open file $pal_file: $!"; + while ( my $line = <FILE>) { + chomp($line); + (my $key, my $value) = split("->",$line); + $palette_map{$key}=$value; + } + close(FILE) + } +} + +my %Node; # Hash of merged frame data +my %Tmp; + +# flow() merges two stacks, storing the merged frames and value data in %Node. +sub flow { + my ($last, $this, $v, $d) = @_; + + my $len_a = @$last - 1; + my $len_b = @$this - 1; + + my $i = 0; + my $len_same; + for (; $i <= $len_a; $i++) { + last if $i > $len_b; + last if $last->[$i] ne $this->[$i]; + } + $len_same = $i; + + for ($i = $len_a; $i >= $len_same; $i--) { + my $k = "$last->[$i];$i"; + # a unique ID is constructed from "func;depth;etime"; + # func-depth isn't unique, it may be repeated later. + $Node{"$k;$v"}->{stime} = delete $Tmp{$k}->{stime}; + if (defined $Tmp{$k}->{delta}) { + $Node{"$k;$v"}->{delta} = delete $Tmp{$k}->{delta}; + } + delete $Tmp{$k}; + } + + for ($i = $len_same; $i <= $len_b; $i++) { + my $k = "$this->[$i];$i"; + $Tmp{$k}->{stime} = $v; + if (defined $d) { + $Tmp{$k}->{delta} += $i == $len_b ? $d : 0; + } + } + + return $this; +} + +# parse input +my @Data; +my @SortedData; +my $last = []; +my $time = 0; +my $delta = undef; +my $ignored = 0; +my $line; +my $maxdelta = 1; + +# reverse if needed +foreach (<>) { + chomp; + $line = $_; + if ($stackreverse) { + # there may be an extra samples column for differentials + # XXX todo: redo these REs as one. It's repeated below. + my($stack, $samples) = (/^(.*)\s+?(\d+(?:\.\d*)?)$/); + my $samples2 = undef; + if ($stack =~ /^(.*)\s+?(\d+(?:\.\d*)?)$/) { + $samples2 = $samples; + ($stack, $samples) = $stack =~ (/^(.*)\s+?(\d+(?:\.\d*)?)$/); + unshift @Data, join(";", reverse split(";", $stack)) . " $samples $samples2"; + } else { + unshift @Data, join(";", reverse split(";", $stack)) . " $samples"; + } + } else { + unshift @Data, $line; + } +} + +if ($flamechart) { + # In flame chart mode, just reverse the data so time moves from left to right. + @SortedData = reverse @Data; +} else { + @SortedData = sort @Data; +} + +# process and merge frames +foreach (@SortedData) { + chomp; + # process: folded_stack count + # eg: func_a;func_b;func_c 31 + my ($stack, $samples) = (/^(.*)\s+?(\d+(?:\.\d*)?)$/); + unless (defined $samples and defined $stack) { + ++$ignored; + next; + } + + # there may be an extra samples column for differentials: + my $samples2 = undef; + if ($stack =~ /^(.*)\s+?(\d+(?:\.\d*)?)$/) { + $samples2 = $samples; + ($stack, $samples) = $stack =~ (/^(.*)\s+?(\d+(?:\.\d*)?)$/); + } + $delta = undef; + if (defined $samples2) { + $delta = $samples2 - $samples; + $maxdelta = abs($delta) if abs($delta) > $maxdelta; + } + + # for chain graphs, annotate waker frames with "_[w]", for later + # coloring. This is a hack, but has a precedent ("_[k]" from perf). + if ($colors eq "chain") { + my @parts = split ";--;", $stack; + my @newparts = (); + $stack = shift @parts; + $stack .= ";--;"; + foreach my $part (@parts) { + $part =~ s/;/_[w];/g; + $part .= "_[w]"; + push @newparts, $part; + } + $stack .= join ";--;", @parts; + } + + # merge frames and populate %Node: + $last = flow($last, [ '', split ";", $stack ], $time, $delta); + + if (defined $samples2) { + $time += $samples2; + } else { + $time += $samples; + } +} +flow($last, [], $time, $delta); + +warn "Ignored $ignored lines with invalid format\n" if $ignored; +unless ($time) { + warn "ERROR: No stack counts found\n"; + my $im = SVG->new(); + # emit an error message SVG, for tools automating flamegraph use + my $imageheight = $fontsize * 5; + $im->header($imagewidth, $imageheight); + $im->stringTTF(undef, int($imagewidth / 2), $fontsize * 2, + "ERROR: No valid input provided to flamegraph.pl."); + print $im->svg; + exit 2; +} +if ($timemax and $timemax < $time) { + warn "Specified --total $timemax is less than actual total $time, so ignored\n" + if $timemax/$time > 0.02; # only warn is significant (e.g., not rounding etc) + undef $timemax; +} +$timemax ||= $time; + +my $widthpertime = ($imagewidth - 2 * $xpad) / $timemax; +my $minwidth_time = $minwidth / $widthpertime; + +# prune blocks that are too narrow and determine max depth +while (my ($id, $node) = each %Node) { + my ($func, $depth, $etime) = split ";", $id; + my $stime = $node->{stime}; + die "missing start for $id" if not defined $stime; + + if (($etime-$stime) < $minwidth_time) { + delete $Node{$id}; + next; + } + $depthmax = $depth if $depth > $depthmax; +} + +# draw canvas, and embed interactive JavaScript program +my $imageheight = (($depthmax + 1) * $frameheight) + $ypad1 + $ypad2; +$imageheight += $ypad3 if $subtitletext ne ""; +my $titlesize = $fontsize + 5; +my $im = SVG->new(); +my ($black, $vdgrey, $dgrey) = ( + $im->colorAllocate(0, 0, 0), + $im->colorAllocate(160, 160, 160), + $im->colorAllocate(200, 200, 200), + ); +$im->header($imagewidth, $imageheight); +my $inc = <<INC; +<defs> + <linearGradient id="background" y1="0" y2="1" x1="0" x2="0" > + <stop stop-color="$bgcolor1" offset="5%" /> + <stop stop-color="$bgcolor2" offset="95%" /> + </linearGradient> +</defs> +<style type="text/css"> + text { font-family:$fonttype; font-size:${fontsize}px; fill:$black; } + #search, #ignorecase { opacity:0.1; cursor:pointer; } + #search:hover, #search.show, #ignorecase:hover, #ignorecase.show { opacity:1; } + #subtitle { text-anchor:middle; font-color:$vdgrey; } + #title { text-anchor:middle; font-size:${titlesize}px} + #unzoom { cursor:pointer; } + #frames > *:hover { stroke:black; stroke-width:0.5; cursor:pointer; } + .hide { display:none; } + .parent { opacity:0.5; } +</style> +<script type="text/ecmascript"> +<![CDATA[ + "use strict"; + var details, searchbtn, unzoombtn, matchedtxt, svg, searching, currentSearchTerm, ignorecase, ignorecaseBtn; + function init(evt) { + details = document.getElementById("details").firstChild; + searchbtn = document.getElementById("search"); + ignorecaseBtn = document.getElementById("ignorecase"); + unzoombtn = document.getElementById("unzoom"); + matchedtxt = document.getElementById("matched"); + svg = document.getElementsByTagName("svg")[0]; + searching = 0; + currentSearchTerm = null; + } + + window.addEventListener("click", function(e) { + var target = find_group(e.target); + if (target) { + if (target.nodeName == "a") { + if (e.ctrlKey === false) return; + e.preventDefault(); + } + if (target.classList.contains("parent")) unzoom(); + zoom(target); + } + else if (e.target.id == "unzoom") unzoom(); + else if (e.target.id == "search") search_prompt(); + else if (e.target.id == "ignorecase") toggle_ignorecase(); + }, false) + + // mouse-over for info + // show + window.addEventListener("mouseover", function(e) { + var target = find_group(e.target); + if (target) details.nodeValue = "$nametype " + g_to_text(target); + }, false) + + // clear + window.addEventListener("mouseout", function(e) { + var target = find_group(e.target); + if (target) details.nodeValue = ' '; + }, false) + + // ctrl-F for search + window.addEventListener("keydown",function (e) { + if (e.keyCode === 114 || (e.ctrlKey && e.keyCode === 70)) { + e.preventDefault(); + search_prompt(); + } + }, false) + + // ctrl-I to toggle case-sensitive search + window.addEventListener("keydown",function (e) { + if (e.ctrlKey && e.keyCode === 73) { + e.preventDefault(); + toggle_ignorecase(); + } + }, false) + + // functions + function find_child(node, selector) { + var children = node.querySelectorAll(selector); + if (children.length) return children[0]; + return; + } + function find_group(node) { + var parent = node.parentElement; + if (!parent) return; + if (parent.id == "frames") return node; + return find_group(parent); + } + function orig_save(e, attr, val) { + if (e.attributes["_orig_" + attr] != undefined) return; + if (e.attributes[attr] == undefined) return; + if (val == undefined) val = e.attributes[attr].value; + e.setAttribute("_orig_" + attr, val); + } + function orig_load(e, attr) { + if (e.attributes["_orig_"+attr] == undefined) return; + e.attributes[attr].value = e.attributes["_orig_" + attr].value; + e.removeAttribute("_orig_"+attr); + } + function g_to_text(e) { + var text = find_child(e, "title").firstChild.nodeValue; + return (text) + } + function g_to_func(e) { + var func = g_to_text(e); + // if there's any manipulation we want to do to the function + // name before it's searched, do it here before returning. + return (func); + } + function update_text(e) { + var r = find_child(e, "rect"); + var t = find_child(e, "text"); + var w = parseFloat(r.attributes.width.value) -3; + var txt = find_child(e, "title").textContent.replace(/\\([^(]*\\)\$/,""); + t.attributes.x.value = parseFloat(r.attributes.x.value) + 3; + + // Smaller than this size won't fit anything + if (w < 2 * $fontsize * $fontwidth) { + t.textContent = ""; + return; + } + + t.textContent = txt; + // Fit in full text width + if (/^ *\$/.test(txt) || t.getSubStringLength(0, txt.length) < w) + return; + + for (var x = txt.length - 2; x > 0; x--) { + if (t.getSubStringLength(0, x + 2) <= w) { + t.textContent = txt.substring(0, x) + ".."; + return; + } + } + t.textContent = ""; + } + + // zoom + function zoom_reset(e) { + if (e.attributes != undefined) { + orig_load(e, "x"); + orig_load(e, "width"); + } + if (e.childNodes == undefined) return; + for (var i = 0, c = e.childNodes; i < c.length; i++) { + zoom_reset(c[i]); + } + } + function zoom_child(e, x, ratio) { + if (e.attributes != undefined) { + if (e.attributes.x != undefined) { + orig_save(e, "x"); + e.attributes.x.value = (parseFloat(e.attributes.x.value) - x - $xpad) * ratio + $xpad; + if (e.tagName == "text") + e.attributes.x.value = find_child(e.parentNode, "rect[x]").attributes.x.value + 3; + } + if (e.attributes.width != undefined) { + orig_save(e, "width"); + e.attributes.width.value = parseFloat(e.attributes.width.value) * ratio; + } + } + + if (e.childNodes == undefined) return; + for (var i = 0, c = e.childNodes; i < c.length; i++) { + zoom_child(c[i], x - $xpad, ratio); + } + } + function zoom_parent(e) { + if (e.attributes) { + if (e.attributes.x != undefined) { + orig_save(e, "x"); + e.attributes.x.value = $xpad; + } + if (e.attributes.width != undefined) { + orig_save(e, "width"); + e.attributes.width.value = parseInt(svg.width.baseVal.value) - ($xpad * 2); + } + } + if (e.childNodes == undefined) return; + for (var i = 0, c = e.childNodes; i < c.length; i++) { + zoom_parent(c[i]); + } + } + function zoom(node) { + var attr = find_child(node, "rect").attributes; + var width = parseFloat(attr.width.value); + var xmin = parseFloat(attr.x.value); + var xmax = parseFloat(xmin + width); + var ymin = parseFloat(attr.y.value); + var ratio = (svg.width.baseVal.value - 2 * $xpad) / width; + + // XXX: Workaround for JavaScript float issues (fix me) + var fudge = 0.0001; + + unzoombtn.classList.remove("hide"); + + var el = document.getElementById("frames").children; + for (var i = 0; i < el.length; i++) { + var e = el[i]; + var a = find_child(e, "rect").attributes; + var ex = parseFloat(a.x.value); + var ew = parseFloat(a.width.value); + var upstack; + // Is it an ancestor + if ($inverted == 0) { + upstack = parseFloat(a.y.value) > ymin; + } else { + upstack = parseFloat(a.y.value) < ymin; + } + if (upstack) { + // Direct ancestor + if (ex <= xmin && (ex+ew+fudge) >= xmax) { + e.classList.add("parent"); + zoom_parent(e); + update_text(e); + } + // not in current path + else + e.classList.add("hide"); + } + // Children maybe + else { + // no common path + if (ex < xmin || ex + fudge >= xmax) { + e.classList.add("hide"); + } + else { + zoom_child(e, xmin, ratio); + update_text(e); + } + } + } + search(); + } + function unzoom() { + unzoombtn.classList.add("hide"); + var el = document.getElementById("frames").children; + for(var i = 0; i < el.length; i++) { + el[i].classList.remove("parent"); + el[i].classList.remove("hide"); + zoom_reset(el[i]); + update_text(el[i]); + } + search(); + } + + // search + function toggle_ignorecase() { + ignorecase = !ignorecase; + if (ignorecase) { + ignorecaseBtn.classList.add("show"); + } else { + ignorecaseBtn.classList.remove("show"); + } + reset_search(); + search(); + } + function reset_search() { + var el = document.querySelectorAll("#frames rect"); + for (var i = 0; i < el.length; i++) { + orig_load(el[i], "fill") + } + } + function search_prompt() { + if (!searching) { + var term = prompt("Enter a search term (regexp " + + "allowed, eg: ^ext4_)" + + (ignorecase ? ", ignoring case" : "") + + "\\nPress Ctrl-i to toggle case sensitivity", ""); + if (term != null) { + currentSearchTerm = term; + search(); + } + } else { + reset_search(); + searching = 0; + currentSearchTerm = null; + searchbtn.classList.remove("show"); + searchbtn.firstChild.nodeValue = "Search" + matchedtxt.classList.add("hide"); + matchedtxt.firstChild.nodeValue = "" + } + } + function search(term) { + if (currentSearchTerm === null) return; + var term = currentSearchTerm; + + var re = new RegExp(term, ignorecase ? 'i' : ''); + var el = document.getElementById("frames").children; + var matches = new Object(); + var maxwidth = 0; + for (var i = 0; i < el.length; i++) { + var e = el[i]; + var func = g_to_func(e); + var rect = find_child(e, "rect"); + if (func == null || rect == null) + continue; + + // Save max width. Only works as we have a root frame + var w = parseFloat(rect.attributes.width.value); + if (w > maxwidth) + maxwidth = w; + + if (func.match(re)) { + // highlight + var x = parseFloat(rect.attributes.x.value); + orig_save(rect, "fill"); + rect.attributes.fill.value = "$searchcolor"; + + // remember matches + if (matches[x] == undefined) { + matches[x] = w; + } else { + if (w > matches[x]) { + // overwrite with parent + matches[x] = w; + } + } + searching = 1; + } + } + if (!searching) + return; + + searchbtn.classList.add("show"); + searchbtn.firstChild.nodeValue = "Reset Search"; + + // calculate percent matched, excluding vertical overlap + var count = 0; + var lastx = -1; + var lastw = 0; + var keys = Array(); + for (k in matches) { + if (matches.hasOwnProperty(k)) + keys.push(k); + } + // sort the matched frames by their x location + // ascending, then width descending + keys.sort(function(a, b){ + return a - b; + }); + // Step through frames saving only the biggest bottom-up frames + // thanks to the sort order. This relies on the tree property + // where children are always smaller than their parents. + var fudge = 0.0001; // JavaScript floating point + for (var k in keys) { + var x = parseFloat(keys[k]); + var w = matches[keys[k]]; + if (x >= lastx + lastw - fudge) { + count += w; + lastx = x; + lastw = w; + } + } + // display matched percent + matchedtxt.classList.remove("hide"); + var pct = 100 * count / maxwidth; + if (pct != 100) pct = pct.toFixed(1) + matchedtxt.firstChild.nodeValue = "Matched: " + pct + "%"; + } +]]> +</script> +INC +$im->include($inc); +$im->filledRectangle(0, 0, $imagewidth, $imageheight, 'url(#background)'); +$im->stringTTF("title", int($imagewidth / 2), $fontsize * 2, $titletext); +$im->stringTTF("subtitle", int($imagewidth / 2), $fontsize * 4, $subtitletext) if $subtitletext ne ""; +$im->stringTTF("details", $xpad, $imageheight - ($ypad2 / 2), " "); +$im->stringTTF("unzoom", $xpad, $fontsize * 2, "Reset Zoom", 'class="hide"'); +$im->stringTTF("search", $imagewidth - $xpad - 100, $fontsize * 2, "Search"); +$im->stringTTF("ignorecase", $imagewidth - $xpad - 16, $fontsize * 2, "ic"); +$im->stringTTF("matched", $imagewidth - $xpad - 100, $imageheight - ($ypad2 / 2), " "); + +if ($palette) { + read_palette(); +} + +# draw frames +$im->group_start({id => "frames"}); +while (my ($id, $node) = each %Node) { + my ($func, $depth, $etime) = split ";", $id; + my $stime = $node->{stime}; + my $delta = $node->{delta}; + + $etime = $timemax if $func eq "" and $depth == 0; + + my $x1 = $xpad + $stime * $widthpertime; + my $x2 = $xpad + $etime * $widthpertime; + my ($y1, $y2); + unless ($inverted) { + $y1 = $imageheight - $ypad2 - ($depth + 1) * $frameheight + $framepad; + $y2 = $imageheight - $ypad2 - $depth * $frameheight; + } else { + $y1 = $ypad1 + $depth * $frameheight; + $y2 = $ypad1 + ($depth + 1) * $frameheight - $framepad; + } + + my $samples = sprintf "%.0f", ($etime - $stime) * $factor; + (my $samples_txt = $samples) # add commas per perlfaq5 + =~ s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g; + + my $info; + if ($func eq "" and $depth == 0) { + $info = "all ($samples_txt $countname, 100%)"; + } else { + my $pct = sprintf "%.2f", ((100 * $samples) / ($timemax * $factor)); + my $escaped_func = $func; + # clean up SVG breaking characters: + $escaped_func =~ s/&/&/g; + $escaped_func =~ s/</</g; + $escaped_func =~ s/>/>/g; + $escaped_func =~ s/"/"/g; + $escaped_func =~ s/_\[[kwij]\]$//; # strip any annotation + unless (defined $delta) { + $info = "$escaped_func ($samples_txt $countname, $pct%)"; + } else { + my $d = $negate ? -$delta : $delta; + my $deltapct = sprintf "%.2f", ((100 * $d) / ($timemax * $factor)); + $deltapct = $d > 0 ? "+$deltapct" : $deltapct; + $info = "$escaped_func ($samples_txt $countname, $pct%; $deltapct%)"; + } + } + + my $nameattr = { %{ $nameattr{$func}||{} } }; # shallow clone + $nameattr->{title} ||= $info; + $im->group_start($nameattr); + + my $color; + if ($func eq "--") { + $color = $vdgrey; + } elsif ($func eq "-") { + $color = $dgrey; + } elsif (defined $delta) { + $color = color_scale($delta, $maxdelta); + } elsif ($palette) { + $color = color_map($colors, $func); + } else { + $color = color($colors, $hash, $func); + } + $im->filledRectangle($x1, $y1, $x2, $y2, $color, 'rx="2" ry="2"'); + + my $chars = int( ($x2 - $x1) / ($fontsize * $fontwidth)); + my $text = ""; + if ($chars >= 3) { # room for one char plus two dots + $func =~ s/_\[[kwij]\]$//; # strip any annotation + $text = substr $func, 0, $chars; + substr($text, -2, 2) = ".." if $chars < length $func; + $text =~ s/&/&/g; + $text =~ s/</</g; + $text =~ s/>/>/g; + } + $im->stringTTF(undef, $x1 + 3, 3 + ($y1 + $y2) / 2, $text); + + $im->group_end($nameattr); +} +$im->group_end(); + +print $im->svg; + +if ($palette) { + write_palette(); +} + +# vim: ts=8 sts=8 sw=8 noexpandtab diff --git a/contrib/tools/flame-graph/jmaps b/contrib/tools/flame-graph/jmaps new file mode 100755 index 0000000000..f8014f5a82 --- /dev/null +++ b/contrib/tools/flame-graph/jmaps @@ -0,0 +1,104 @@ +#!/bin/bash +# +# jmaps - creates java /tmp/perf-PID.map symbol maps for all java processes. +# +# This is a helper script that finds all running "java" processes, then executes +# perf-map-agent on them all, creating symbol map files in /tmp. These map files +# are read by perf_events (aka "perf") when doing system profiles (specifically, +# the "report" and "script" subcommands). +# +# USAGE: jmaps [-u] +# -u # unfoldall: include inlined symbols +# +# My typical workflow is this: +# +# perf record -F 99 -a -g -- sleep 30; jmaps +# perf script > out.stacks +# ./stackcollapse-perf.pl out.stacks | ./flamegraph.pl --color=java --hash > out.stacks.svg +# +# The stackcollapse-perf.pl and flamegraph.pl programs come from: +# https://github.com/brendangregg/FlameGraph +# +# REQUIREMENTS: +# Tune two environment settings below. +# +# 13-Feb-2015 Brendan Gregg Created this. +# 20-Feb-2017 " " Added -u for unfoldall. + +JAVA_HOME=${JAVA_HOME:-/usr/lib/jvm/java-8-oracle} +AGENT_HOME=${AGENT_HOME:-/usr/lib/jvm/perf-map-agent} # from https://github.com/jvm-profiling-tools/perf-map-agent +debug=0 + +if [[ "$USER" != root ]]; then + echo "ERROR: not root user? exiting..." + exit +fi + +if [[ ! -x $JAVA_HOME ]]; then + echo "ERROR: JAVA_HOME not set correctly; edit $0 and fix" + exit +fi + +if [[ ! -x $AGENT_HOME ]]; then + echo "ERROR: AGENT_HOME not set correctly; edit $0 and fix" + exit +fi + +if [[ "$1" == "-u" ]]; then + opts=unfoldall +fi + +# figure out where the agent files are: +AGENT_OUT="" +AGENT_JAR="" +if [[ -e $AGENT_HOME/out/attach-main.jar ]]; then + AGENT_JAR=$AGENT_HOME/out/attach-main.jar +elif [[ -e $AGENT_HOME/attach-main.jar ]]; then + AGENT_JAR=$AGENT_HOME/attach-main.jar +fi +if [[ -e $AGENT_HOME/out/libperfmap.so ]]; then + AGENT_OUT=$AGENT_HOME/out +elif [[ -e $AGENT_HOME/libperfmap.so ]]; then + AGENT_OUT=$AGENT_HOME +fi +if [[ "$AGENT_OUT" == "" || "$AGENT_JAR" == "" ]]; then + echo "ERROR: Missing perf-map-agent files in $AGENT_HOME. Check installation." + exit +fi + +# Fetch map for all "java" processes +echo "Fetching maps for all java processes..." +for pid in $(pgrep -x java); do + mapfile=/tmp/perf-$pid.map + [[ -e $mapfile ]] && rm $mapfile + + cmd="cd $AGENT_OUT; $JAVA_HOME/bin/java -Xms32m -Xmx128m -cp $AGENT_JAR:$JAVA_HOME/lib/tools.jar net.virtualvoid.perf.AttachOnce $pid $opts" + (( debug )) && echo $cmd + + user=$(ps ho user -p $pid) + group=$(ps ho group -p $pid) + if [[ "$user" != root ]]; then + if [[ "$user" == [0-9]* ]]; then + # UID only, likely GID too, run sudo with #UID: + cmd="sudo -u '#'$user -g '#'$group sh -c '$cmd'" + else + cmd="sudo -u $user -g $group sh -c '$cmd'" + fi + fi + + echo "Mapping PID $pid (user $user):" + if (( debug )); then + time eval $cmd + else + eval $cmd + fi + if [[ -e "$mapfile" ]]; then + chown root $mapfile + chmod 666 $mapfile + else + echo "ERROR: $mapfile not created." + fi + + echo "wc(1): $(wc $mapfile)" + echo +done diff --git a/contrib/tools/flame-graph/pkgsplit-perf.pl b/contrib/tools/flame-graph/pkgsplit-perf.pl new file mode 100755 index 0000000000..3a9902da49 --- /dev/null +++ b/contrib/tools/flame-graph/pkgsplit-perf.pl @@ -0,0 +1,86 @@ +#!/usr/bin/perl -w +# +# pkgsplit-perf.pl Split IP samples on package names "/", eg, Java. +# +# This is for the creation of Java package flame graphs. Example steps: +# +# perf record -F 199 -a -- sleep 30; ./jmaps +# perf script | ./pkgsplit-perf.pl | ./flamegraph.pl > out.svg +# +# Note that stack traces are not sampled (no -g), as we split Java package +# names into frames rather than stack frames. +# +# (jmaps is a helper script for automating perf-map-agent: Java symbol dumps.) +# +# The default output of "perf script" varies between kernel versions, so we'll +# need to deal with that here. I could make people use the perf script option +# to pick fields, so our input is static, but A) I prefer the simplicity of +# just saying: run "perf script", and B) the option to choose fields itself +# changed between kernel versions! -f became -F. +# +# Copyright 2017 Netflix, Inc. +# Licensed under the Apache License, Version 2.0 (the "License") +# +# 20-Sep-2016 Brendan Gregg Created this. + +use strict; + +my $include_pname = 1; # include process names in stacks +my $include_pid = 0; # include process ID with process name +my $include_tid = 0; # include process & thread ID with process name + +while (<>) { + # filter comments + next if /^#/; + + # filter idle events + next if /xen_hypercall_sched_op|cpu_idle|native_safe_halt/; + + my ($pid, $tid, $pname); + + # Linux 3.13: + # java 13905 [000] 8048.096572: cpu-clock: 7fd781ac3053 Ljava/util/Arrays$ArrayList;::toArray (/tmp/perf-12149.map) + # java 8301 [050] 13527.392454: cycles: 7fa8a80d9bff Dictionary::find(int, unsigned int, Symbol*, ClassLoaderData*, Handle, Thread*) (/usr/lib/jvm/java-8-oracle-1.8.0.121/jre/lib/amd64/server/libjvm.so) + # java 4567/8603 [023] 13527.389886: cycles: 7fa863349895 Lcom/google/gson/JsonObject;::add (/tmp/perf-4567.map) + # + # Linux 4.8: + # java 30894 [007] 452884.077440: 10101010 cpu-clock: 7f0acc8eff67 Lsun/nio/ch/SocketChannelImpl;::read+0x27 (/tmp/perf-30849.map) + # bash 26858/26858 [006] 5440237.995639: cpu-clock: 433573 [unknown] (/bin/bash) + # + if (/^\s+(\S.+?)\s+(\d+)\/*(\d+)*\s.*?:.*:/) { + # parse process name and pid/tid + if ($3) { + ($pid, $tid) = ($2, $3); + } else { + ($pid, $tid) = ("?", $2); + } + + if ($include_tid) { + $pname = "$1-$pid/$tid"; + } elsif ($include_pid) { + $pname = "$1-$pid"; + } else { + $pname = $1; + } + $pname =~ tr/ /_/; + } else { + # not a match + next; + } + + # parse rest of line + s/^.*?:.*?:\s+//; + s/ \(.*?\)$//; + chomp; + my ($addr, $func) = split(' ', $_, 2); + + # strip Java's leading "L" + $func =~ s/^L//; + + # replace numbers with X + $func =~ s/[0-9]/X/g; + + # colon delimitered + $func =~ s:/:;:g; + print "$pname;$func 1\n"; +} diff --git a/contrib/tools/flame-graph/range-perf.pl b/contrib/tools/flame-graph/range-perf.pl new file mode 100755 index 0000000000..0fca6decd2 --- /dev/null +++ b/contrib/tools/flame-graph/range-perf.pl @@ -0,0 +1,137 @@ +#!/usr/bin/perl -w +# +# range-perf Extract a time range from Linux "perf script" output. +# +# USAGE EXAMPLE: +# +# perf record -F 100 -a -- sleep 60 +# perf script | ./perf2range.pl 10 20 # range 10 to 20 seconds only +# perf script | ./perf2range.pl 0 0.5 # first half second only +# +# MAKING A SERIES OF FLAME GRAPHS: +# +# Let's say you had the output of "perf script" in a file, out.stacks01, which +# was for a 180 second profile. The following command creates a series of +# flame graphs for each 10 second interval: +# +# for i in `seq 0 10 170`; do cat out.stacks01 | \ +# ./perf2range.pl $i $((i + 10)) | ./stackcollapse-perf.pl | \ +# grep -v cpu_idle | ./flamegraph.pl --hash --color=java \ +# --title="range $i $((i + 10))" > out.range_$i.svg; echo $i done; done +# +# In that example, I used "--color=java" for the Java palette, and excluded +# the idle CPU task. Customize as needed. +# +# Copyright 2017 Netflix, Inc. +# Licensed under the Apache License, Version 2.0 (the "License") +# +# 21-Feb-2017 Brendan Gregg Created this. + +use strict; +use Getopt::Long; +use POSIX 'floor'; + +sub usage { + die <<USAGE_END; +USAGE: $0 [options] min_seconds max_seconds + --timeraw # use raw timestamps from perf + --timezerosecs # time starts at 0 secs, but keep offset from perf + eg, + $0 10 20 # only include samples between 10 and 20 seconds +USAGE_END +} + +my $timeraw = 0; +my $timezerosecs = 0; +GetOptions( + 'timeraw' => \$timeraw, + 'timezerosecs' => \$timezerosecs, +) or usage(); + +if (@ARGV < 2 || $ARGV[0] eq "-h" || $ARGV[0] eq "--help") { + usage(); + exit; +} +my $begin = $ARGV[0]; +my $end = $ARGV[1]; + +# +# Parsing +# +# IP only examples: +# +# java 52025 [026] 99161.926202: cycles: +# java 14341 [016] 252732.474759: cycles: 7f36571947c0 nmethod::is_nmethod() const (/... +# java 14514 [022] 28191.353083: cpu-clock: 7f92b4fdb7d4 Ljava_util_List$size$0;::call (/tmp/perf-11936.map) +# swapper 0 [002] 6035557.056977: 10101010 cpu-clock: ffffffff810013aa xen_hypercall_sched_op+0xa (/lib/modules/4.9-virtual/build/vmlinux) +# bash 25370 603are 6036.991603: 10101010 cpu-clock: 4b931e [unknown] (/bin/bash) +# bash 25370/25370 6036036.799684: cpu-clock: 4b913b [unknown] (/bin/bash) +# other combinations are possible. +# +# Stack examples (-g): +# +# swapper 0 [021] 28648.467059: cpu-clock: +# ffffffff810013aa xen_hypercall_sched_op ([kernel.kallsyms]) +# ffffffff8101cb2f default_idle ([kernel.kallsyms]) +# ffffffff8101d406 arch_cpu_idle ([kernel.kallsyms]) +# ffffffff810bf475 cpu_startup_entry ([kernel.kallsyms]) +# ffffffff81010228 cpu_bringup_and_idle ([kernel.kallsyms]) +# +# java 14375 [022] 28648.467079: cpu-clock: +# 7f92bdd98965 Ljava/io/OutputStream;::write (/tmp/perf-11936.map) +# 7f8808cae7a8 [unknown] ([unknown]) +# +# swapper 0 [005] 5076.836336: cpu-clock: +# ffffffff81051586 native_safe_halt ([kernel.kallsyms]) +# ffffffff8101db4f default_idle ([kernel.kallsyms]) +# ffffffff8101e466 arch_cpu_idle ([kernel.kallsyms]) +# ffffffff810c2b31 cpu_startup_entry ([kernel.kallsyms]) +# ffffffff810427cd start_secondary ([kernel.kallsyms]) +# +# swapper 0 [002] 6034779.719110: 10101010 cpu-clock: +# 2013aa xen_hypercall_sched_op+0xfe20000a (/lib/modules/4.9-virtual/build/vmlinux) +# a72f0e default_idle+0xfe20001e (/lib/modules/4.9-virtual/build/vmlinux) +# 2392bf arch_cpu_idle+0xfe20000f (/lib/modules/4.9-virtual/build/vmlinux) +# a73333 default_idle_call+0xfe200023 (/lib/modules/4.9-virtual/build/vmlinux) +# 2c91a4 cpu_startup_entry+0xfe2001c4 (/lib/modules/4.9-virtual/build/vmlinux) +# 22b64a cpu_bringup_and_idle+0xfe20002a (/lib/modules/4.9-virtual/build/vmlinux) +# +# bash 25370/25370 6035935.188539: cpu-clock: +# b9218 [unknown] (/bin/bash) +# 2037fe8 [unknown] ([unknown]) +# other combinations are possible. +# +# This regexp matches the event line, and puts time in $1, and the event name +# in $2: +# +my $event_regexp = qr/ +([0-9\.]+): *\S* *(\S+):/; + +my $line; +my $start = 0; +my $ok = 0; +my $time; + +while (1) { + $line = <STDIN>; + last unless defined $line; + next if $line =~ /^#/; # skip comments + + if ($line =~ $event_regexp) { + my ($ts, $event) = ($1, $2, $3); + $start = $ts if $start == 0; + + if ($timezerosecs) { + $time = $ts - floor($start); + } elsif (!$timeraw) { + $time = $ts - $start; + } else { + $time = $ts; # raw times + } + + $ok = 1 if $time >= $begin; + # assume samples are in time order: + exit if $time > $end; + } + + print $line if $ok; +} diff --git a/contrib/tools/flame-graph/stackcollapse-aix.pl b/contrib/tools/flame-graph/stackcollapse-aix.pl new file mode 100755 index 0000000000..8456d56b91 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-aix.pl @@ -0,0 +1,61 @@ +#!/usr/bin/perl -ws +# +# stackcollapse-aix Collapse AIX /usr/bin/procstack backtraces +# +# Parse a list of backtraces as generated with the poor man's aix-perf.pl +# profiler +# + +use strict; + +my $process = ""; +my $current = ""; +my $previous_function = ""; + +my %stacks; + +while(<>) { + chomp; + if (m/^\d+:/) { + if(!($current eq "")) { + $current = $process . ";" . $current; + $stacks{$current} += 1; + $current = ""; + } + m/^\d+: ([^ ]*)/; + $process = $1; + $current = ""; + } + elsif(m/^---------- tid# \d+/){ + if(!($current eq "")) { + $current = $process . ";" . $current; + $stacks{$current} += 1; + } + $current = ""; + } + elsif(m/^(0x[0-9abcdef]*) *([^ ]*) ([^ ]*) ([^ ]*)/) { + my $function = $2; + my $alt = $1; + $function=~s/\(.*\)?//; + if($function =~ /^\[.*\]$/) { + $function = $alt; + } + if ($current) { + $current = $function . ";" . $current; + } + else { + $current = $function; + } + } +} + +if(!($current eq "")) { + $current = $process . ";" . $current; + $stacks{$current} += 1; + $current = ""; + $process = ""; +} + +foreach my $k (sort { $a cmp $b } keys %stacks) { + print "$k $stacks{$k}\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-bpftrace.pl b/contrib/tools/flame-graph/stackcollapse-bpftrace.pl new file mode 100755 index 0000000000..e5d32a5769 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-bpftrace.pl @@ -0,0 +1,66 @@ +#!/usr/bin/perl -w +# +# stackcollapse-bpftrace.pl collapse bpftrace samples into single lines. +# +# USAGE ./stackcollapse-bpftrace.pl infile > outfile +# +# Example input: +# +# @[ +# _raw_spin_lock_bh+0 +# tcp_recvmsg+808 +# inet_recvmsg+81 +# sock_recvmsg+67 +# sock_read_iter+144 +# new_sync_read+228 +# __vfs_read+41 +# vfs_read+142 +# sys_read+85 +# do_syscall_64+115 +# entry_SYSCALL_64_after_hwframe+61 +# ]: 3 +# +# Example output: +# +# entry_SYSCALL_64_after_hwframe+61;do_syscall_64+115;sys_read+85;vfs_read+142;__vfs_read+41;new_sync_read+228;sock_read_iter+144;sock_recvmsg+67;inet_recvmsg+81;tcp_recvmsg+808;_raw_spin_lock_bh+0 3 +# +# Copyright 2018 Peter Sanford. All rights reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License +# as published by the Free Software Foundation; either version 2 +# of the License, or (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software Foundation, +# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. +# +# (http://www.gnu.org/copyleft/gpl.html) +# + +use strict; + +my @stack; +my $in_stack = 0; + +foreach (<>) { + chomp; + if (!$in_stack) { + if (/^@\[/) { + $in_stack = 1; + } + } else { + if (m/^\]: (\d+)/) { + print join(';', reverse(@stack)) . " $1\n"; + $in_stack = 0; + @stack = (); + } else { + push(@stack, $_); + } + } +} diff --git a/contrib/tools/flame-graph/stackcollapse-elfutils.pl b/contrib/tools/flame-graph/stackcollapse-elfutils.pl new file mode 100755 index 0000000000..c5e6b17e44 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-elfutils.pl @@ -0,0 +1,98 @@ +#!/usr/bin/perl -w +# +# stackcollapse-elfutils Collapse elfutils stack (eu-stack) backtraces +# +# Parse a list of elfutils backtraces as generated with the poor man's +# profiler [1]: +# +# for x in $(seq 1 "$nsamples"); do +# eu-stack -p "$pid" "$@" +# sleep "$sleeptime" +# done +# +# [1] http://poormansprofiler.org/ +# +# Copyright 2014 Gabriel Corona. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at docs/cddl1.txt or +# http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at docs/cddl1.txt. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END + +use strict; +use Getopt::Long; + +my $with_pid = 0; +my $with_tid = 0; + +GetOptions('pid' => \$with_pid, + 'tid' => \$with_tid) +or die <<USAGE_END; +USAGE: $0 [options] infile > outfile\n + --pid # include PID + --tid # include TID +USAGE_END + +my $pid = ""; +my $tid = ""; +my $current = ""; +my $previous_function = ""; + +my %stacks; + +sub add_current { + if(!($current eq "")) { + my $entry; + if ($with_tid) { + $current = "TID=$tid;$current"; + } + if ($with_pid) { + $current = "PID=$pid;$current"; + } + $stacks{$current} += 1; + $current = ""; + } +} + +while(<>) { + chomp; + if (m/^PID ([0-9]*)/) { + add_current(); + $pid = $1; + } + elsif(m/^TID ([0-9]*)/) { + add_current(); + $tid = $1; + } elsif(m/^#[0-9]* *0x[0-9a-f]* (.*)/) { + if ($current eq "") { + $current = $1; + } else { + $current = "$1;$current"; + } + } elsif(m/^#[0-9]* *0x[0-9a-f]*/) { + if ($current eq "") { + $current = "[unknown]"; + } else { + $current = "[unknown];$current"; + } + } +} +add_current(); + +foreach my $k (sort { $a cmp $b } keys %stacks) { + print "$k $stacks{$k}\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-gdb.pl b/contrib/tools/flame-graph/stackcollapse-gdb.pl new file mode 100755 index 0000000000..8e9831b22e --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-gdb.pl @@ -0,0 +1,72 @@ +#!/usr/bin/perl -ws +# +# stackcollapse-gdb Collapse GDB backtraces +# +# Parse a list of GDB backtraces as generated with the poor man's +# profiler [1]: +# +# for x in $(seq 1 500); do +# gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p $pid 2> /dev/null +# sleep 0.01 +# done +# +# [1] http://poormansprofiler.org/ +# +# Copyright 2014 Gabriel Corona. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at docs/cddl1.txt or +# http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at docs/cddl1.txt. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END + +use strict; + +my $current = ""; +my $previous_function = ""; + +my %stacks; + +while(<>) { + chomp; + if (m/^Thread/) { + $current="" + } + elsif(m/^#[0-9]* *([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*)/) { + my $function = $3; + my $alt = $1; + if(not($1 =~ /0x[a-zA-Z0-9]*/)) { + $function = $alt; + } + if ($current eq "") { + $current = $function; + } else { + $current = $function . ";" . $current; + } + } elsif(!($current eq "")) { + $stacks{$current} += 1; + $current = ""; + } +} + +if(!($current eq "")) { + $stacks{$current} += 1; + $current = ""; +} + +foreach my $k (sort { $a cmp $b } keys %stacks) { + print "$k $stacks{$k}\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-go.pl b/contrib/tools/flame-graph/stackcollapse-go.pl new file mode 100755 index 0000000000..3b2ce3c552 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-go.pl @@ -0,0 +1,150 @@ +#!/usr/bin/perl -w +# +# stackcollapse-go.pl collapse golang samples into single lines. +# +# Parses golang smaples generated by "go tool pprof" and outputs stacks as +# single lines, with methods separated by semicolons, and then a space and an +# occurrence count. For use with flamegraph.pl. +# +# USAGE: ./stackcollapse-go.pl infile > outfile +# +# Example Input: +# ... +# Samples: +# samples/count cpu/nanoseconds +# 1 10000000: 1 2 +# 2 10000000: 3 2 +# 1 10000000: 4 2 +# ... +# Locations +# 1: 0x58b265 scanblock :0 s=0 +# 2: 0x599530 GC :0 s=0 +# 3: 0x58a999 flushptrbuf :0 s=0 +# 4: 0x58d6a8 runtime.MSpan_Sweep :0 s=0 +# ... +# Mappings +# ... +# +# Example Output: +# +# GC;flushptrbuf 2 +# GC;runtime.MSpan_Sweep 1 +# GC;scanblock 1 +# +# Input may contain many stacks as generated from go tool pprof: +# +# go tool pprof -seconds=60 -raw -output=a.pprof http://$ADDR/debug/pprof/profile +# +# For format of text profile, See golang/src/internal/pprof/profile/profile.go +# +# Copyright 2017 Sijie Yang (yangsijie@baidu.com). All rights reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License +# as published by the Free Software Foundation; either version 2 +# of the License, or (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software Foundation, +# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. +# +# (http://www.gnu.org/copyleft/gpl.html) +# +# 16-Jan-2017 Sijie Yang Created this. + +use strict; + +use Getopt::Long; + +# tunables +my $help = 0; + +sub usage { + die <<USAGE_END; +USAGE: $0 infile > outfile\n +USAGE_END +} + +GetOptions( + 'help' => \$help, +) or usage(); +$help && usage(); + +# internals +my $state = "ignore"; +my %stacks; +my %frames; +my %collapsed; + +sub remember_stack { + my ($stack, $count) = @_; + $stacks{$stack} += $count; +} + +# +# Output stack string in required format. For example, for the following samples, +# format_statck() would return GC;runtime.MSpan_Sweep for stack "4 2" +# +# Locations +# 1: 0x58b265 scanblock :0 s=0 +# 2: 0x599530 GC :0 s=0 +# 3: 0x58a999 flushptrbuf :0 s=0 +# 4: 0x58d6a8 runtime.MSpan_Sweep :0 s=0 +# +sub format_statck { + my ($stack) = @_; + my @loc_list = split(/ /, $stack); + + for (my $i=0; $i<=$#loc_list; $i++) { + my $loc_name = $frames{$loc_list[$i]}; + $loc_list[$i] = $loc_name if ($loc_name); + } + return join(";", reverse(@loc_list)); +} + +foreach (<>) { + next if m/^#/; + chomp; + + if ($state eq "ignore") { + if (/Samples:/) { + $state = "sample"; + next; + } + + } elsif ($state eq "sample") { + if (/^\s*([0-9]+)\s*[0-9]+: ([0-9 ]+)/) { + my $samples = $1; + my $stack = $2; + remember_stack($stack, $samples); + } elsif (/Locations/) { + $state = "location"; + next; + } + + } elsif ($state eq "location") { + if (/^\s*([0-9]*): 0x[0-9a-f]+ (M=[0-9]+ )?([^ ]+) .*/) { + my $loc_id = $1; + my $loc_name = $3; + $frames{$loc_id} = $loc_name; + } elsif (/Mappings/) { + $state = "mapping"; + last; + } + } +} + +foreach my $k (keys %stacks) { + my $stack = format_statck($k); + my $count = $stacks{$k}; + $collapsed{$stack} += $count; +} + +foreach my $k (sort { $a cmp $b } keys %collapsed) { + print "$k $collapsed{$k}\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-instruments.pl b/contrib/tools/flame-graph/stackcollapse-instruments.pl new file mode 100755 index 0000000000..fb017b9a1e --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-instruments.pl @@ -0,0 +1,26 @@ +#!/usr/bin/perl -w +# +# stackcollapse-instruments.pl +# +# Parses a CSV file containing a call tree as produced by XCode +# Instruments and produces output suitable for flamegraph.pl. +# +# USAGE: ./stackcollapse-instruments.pl infile > outfile + +use strict; + +my @stack = (); + +<>; +foreach (<>) { + chomp; + /\d+\.\d+ms[^,]+,(\d+(?:\.\d*)?),\s+,(\s*)(.+)/ or die; + my $func = $3; + my $depth = length ($2); + $stack [$depth] = $3; + foreach my $i (0 .. $depth - 1) { + print $stack [$i]; + print ";"; + } + print "$func $1\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-java-exceptions.pl b/contrib/tools/flame-graph/stackcollapse-java-exceptions.pl new file mode 100755 index 0000000000..19badbca6c --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-java-exceptions.pl @@ -0,0 +1,72 @@ +#!/usr/bin/perl -w +# +# stackcolllapse-java-exceptions.pl collapse java exceptions (found in logs) into single lines. +# +# Parses Java error stacks found in a log file and outputs them as +# single lines, with methods separated by semicolons, and then a space and an +# occurrence count. Inspired by stackcollapse-jstack.pl except that it does +# not act as a performance profiler. +# +# It can be useful if a Java process dumps a lot of different stacks in its logs +# and you want to quickly identify the biggest culprits. +# +# USAGE: ./stackcollapse-java-exceptions.pl infile > outfile +# +# Copyright 2018 Paul de Verdiere. All rights reserved. + +use strict; +use Getopt::Long; + +# tunables +my $shorten_pkgs = 0; # shorten package names +my $no_pkgs = 0; # really shorten package names!! +my $help = 0; + +sub usage { + die <<USAGE_END; +USAGE: $0 [options] infile > outfile\n + --shorten-pkgs : shorten package names + --no-pkgs : suppress package names (makes SVG much more readable) + +USAGE_END +} + +GetOptions( + 'shorten-pkgs!' => \$shorten_pkgs, + 'no-pkgs!' => \$no_pkgs, + 'help' => \$help, +) or usage(); +$help && usage(); + +my %collapsed; + +sub remember_stack { + my ($stack, $count) = @_; + $collapsed{$stack} += $count; +} + +my @stack; + +foreach (<>) { + chomp; + + if (/^\s*at ([^\(]*)/) { + my $func = $1; + if ($shorten_pkgs || $no_pkgs) { + my ($pkgs, $clsFunc) = ( $func =~ m/(.*\.)([^.]+\.[^.]+)$/ ); + $pkgs =~ s/(\w)\w*/$1/g; + $func = $no_pkgs ? $clsFunc: $pkgs . $clsFunc; + } + unshift @stack, $func; + } elsif (@stack ) { + next if m/.*waiting on .*/; + remember_stack(join(";", @stack), 1) if @stack; + undef @stack; + } +} + +remember_stack(join(";", @stack), 1) if @stack; + +foreach my $k (sort { $a cmp $b } keys %collapsed) { + print "$k $collapsed{$k}\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-jstack.pl b/contrib/tools/flame-graph/stackcollapse-jstack.pl new file mode 100755 index 0000000000..da5740b6ee --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-jstack.pl @@ -0,0 +1,176 @@ +#!/usr/bin/perl -w +# +# stackcollapse-jstack.pl collapse jstack samples into single lines. +# +# Parses Java stacks generated by jstack(1) and outputs RUNNABLE stacks as +# single lines, with methods separated by semicolons, and then a space and an +# occurrence count. This also filters some other "RUNNABLE" states that we +# know are probably not running, such as epollWait. For use with flamegraph.pl. +# +# You want this to process the output of at least 100 jstack(1)s. ie, run it +# 100 times with a sleep interval, and append to a file. This is really a poor +# man's Java profiler, due to the overheads of jstack(1), and how it isn't +# capturing stacks asynchronously. For a better profiler, see: +# http://www.brendangregg.com/blog/2014-06-12/java-flame-graphs.html +# +# USAGE: ./stackcollapse-jstack.pl infile > outfile +# +# Example input: +# +# "MyProg" #273 daemon prio=9 os_prio=0 tid=0x00007f273c038800 nid=0xe3c runnable [0x00007f28a30f2000] +# java.lang.Thread.State: RUNNABLE +# at java.net.SocketInputStream.socketRead0(Native Method) +# at java.net.SocketInputStream.read(SocketInputStream.java:121) +# ... +# at java.lang.Thread.run(Thread.java:744) +# +# Example output: +# +# MyProg;java.lang.Thread.run;java.net.SocketInputStream.read;java.net.SocketInputStream.socketRead0 1 +# +# Input may be created and processed using: +# +# i=0; while (( i++ < 200 )); do jstack PID >> out.jstacks; sleep 10; done +# cat out.jstacks | ./stackcollapse-jstack.pl > out.stacks-folded +# +# WARNING: jstack(1) incurs overheads. Test before use, or use a real profiler. +# +# Copyright 2014 Brendan Gregg. All rights reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License +# as published by the Free Software Foundation; either version 2 +# of the License, or (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software Foundation, +# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. +# +# (http://www.gnu.org/copyleft/gpl.html) +# +# 14-Sep-2014 Brendan Gregg Created this. + +use strict; + +use Getopt::Long; + +# tunables +my $include_tname = 1; # include thread names in stacks +my $include_tid = 0; # include thread IDs in stacks +my $shorten_pkgs = 0; # shorten package names +my $help = 0; + +sub usage { + die <<USAGE_END; +USAGE: $0 [options] infile > outfile\n + --include-tname + --no-include-tname # include/omit thread names in stacks (default: include) + --include-tid + --no-include-tid # include/omit thread IDs in stacks (default: omit) + --shorten-pkgs + --no-shorten-pkgs # (don't) shorten package names (default: don't shorten) + + eg, + $0 --no-include-tname stacks.txt > collapsed.txt +USAGE_END +} + +GetOptions( + 'include-tname!' => \$include_tname, + 'include-tid!' => \$include_tid, + 'shorten-pkgs!' => \$shorten_pkgs, + 'help' => \$help, +) or usage(); +$help && usage(); + + +# internals +my %collapsed; + +sub remember_stack { + my ($stack, $count) = @_; + $collapsed{$stack} += $count; +} + +my @stack; +my $tname; +my $state = "?"; + +foreach (<>) { + next if m/^#/; + chomp; + + if (m/^$/) { + # only include RUNNABLE states + goto clear if $state ne "RUNNABLE"; + + # save stack + if (defined $tname) { unshift @stack, $tname; } + remember_stack(join(";", @stack), 1) if @stack; +clear: + undef @stack; + undef $tname; + $state = "?"; + next; + } + + # + # While parsing jstack output, the $state variable may be altered from + # RUNNABLE to other states. This causes the stacks to be filtered later, + # since only RUNNABLE stacks are included. + # + + if (/^"([^"]*)/) { + my $name = $1; + + if ($include_tname) { + $tname = $name; + unless ($include_tid) { + $tname =~ s/-\d+$//; + } + } + + # set state for various background threads + $state = "BACKGROUND" if $name =~ /C. CompilerThread/; + $state = "BACKGROUND" if $name =~ /Signal Dispatcher/; + $state = "BACKGROUND" if $name =~ /Service Thread/; + $state = "BACKGROUND" if $name =~ /Attach Listener/; + + } elsif (/java.lang.Thread.State: (\S+)/) { + $state = $1 if $state eq "?"; + } elsif (/^\s*at ([^\(]*)/) { + my $func = $1; + if ($shorten_pkgs) { + my ($pkgs, $clsFunc) = ( $func =~ m/(.*\.)([^.]+\.[^.]+)$/ ); + $pkgs =~ s/(\w)\w*/$1/g; + $func = $pkgs . $clsFunc; + } + unshift @stack, $func; + + # fix state for epollWait + $state = "WAITING" if $func =~ /epollWait/; + $state = "WAITING" if $func =~ /EPoll\.wait/; + + + # fix state for various networking functions + $state = "NETWORK" if $func =~ /socketAccept$/; + $state = "NETWORK" if $func =~ /Socket.*accept0$/; + $state = "NETWORK" if $func =~ /socketRead0$/; + + } elsif (/^\s*-/ or /^2\d\d\d-/ or /^Full thread dump/ or + /^JNI global references:/) { + # skip these info lines + next; + } else { + warn "Unrecognized line: $_"; + } +} + +foreach my $k (sort { $a cmp $b } keys %collapsed) { + print "$k $collapsed{$k}\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-ljp.awk b/contrib/tools/flame-graph/stackcollapse-ljp.awk new file mode 100755 index 0000000000..59aaae3d73 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-ljp.awk @@ -0,0 +1,74 @@ +#!/usr/bin/awk -f +# +# stackcollapse-ljp.awk collapse lightweight java profile reports +# into single lines stacks. +# +# Parses a list of multiline stacks generated by: +# +# https://code.google.com/p/lightweight-java-profiler +# +# and outputs a semicolon separated stack followed by a space and a count. +# +# USAGE: ./stackcollapse-ljp.pl infile > outfile +# +# Example input: +# +# 42 3 my_func_b(prog.java:455) +# my_func_a(prog.java:123) +# java.lang.Thread.run(Thread.java:744) +# [...] +# +# Example output: +# +# java.lang.Thread.run;my_func_a;my_func_b 42 +# +# The unused number is the number of frames in each stack. +# +# Copyright 2014 Brendan Gregg. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at docs/cddl1.txt or +# http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at docs/cddl1.txt. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END +# +# 12-Jun-2014 Brendan Gregg Created this. + +$1 == "Total" { + # We're done. Print last stack and exit. + print stack, count + exit +} + +{ + # Strip file location. Comment this out to keep. + gsub(/\(.*\)/, "") +} + +NF == 3 { + # New stack begins. Print previous buffered stack. + if (count) + print stack, count + + # Begin a new stack. + count = $1 + stack = $3 +} + +NF == 1 { + # Build stack. + stack = $1 ";" stack +} diff --git a/contrib/tools/flame-graph/stackcollapse-perf-sched.awk b/contrib/tools/flame-graph/stackcollapse-perf-sched.awk new file mode 100755 index 0000000000..142a2a18dc --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-perf-sched.awk @@ -0,0 +1,228 @@ +#!/usr/bin/awk -f + +# +# This program generates collapsed off-cpu stacks fit for use by flamegraph.pl +# from scheduler data collected via perf_events. +# +# Outputs the cumulative time off cpu in us for each distinct stack observed. +# +# Some awk variables further control behavior: +# +# record_tid If truthy, causes all stack traces to include the +# command and LWP id. +# +# record_wake_stack If truthy, stacks include the frames from the wakeup +# event in addition to the sleep event. +# See http://www.brendangregg.com/FlameGraphs/offcpuflamegraphs.html#Wakeup +# +# recurse If truthy, attempt to recursively identify and +# visualize the full wakeup stack chain. +# See http://www.brendangregg.com/FlameGraphs/offcpuflamegraphs.html#ChainGraph +# +# Note that this is only an approximation, as only the +# last sleep event is recorded (e.g. if a thread slept +# multiple times before waking another thread, only the +# last sleep event is used). Implies record_wake_stack=1 +# +# To set any of these variables from the command line, run via: +# +# stackcollapse-perf-sched.awk -v recurse=1 +# +# == Important warning == +# +# WARNING: tracing all scheduler events is very high overhead in perf. Even +# more alarmingly, there appear to be bugs in perf that prevent it from reliably +# getting consistent traces (even with large trace buffers), causing it to +# produce empty perf.data files with error messages of the form: +# +# 0x952790 [0x736d]: failed to process type: 3410 +# +# This failure is not determinisitic, so re-executing perf record will +# eventually succeed. +# +# == Usage == +# +# First, record data via perf_events: +# +# sudo perf record -g -e 'sched:sched_switch' \ +# -e 'sched:sched_stat_sleep' -e 'sched:sched_stat_blocked' \ +# -p <pid> -o perf.data -- sleep 1 +# +# Then post process with this script: +# +# sudo perf script -f time,comm,pid,tid,event,ip,sym,dso,trace -i perf.data | \ +# stackcollapse-perf-sched.awk -v recurse=1 | \ +# flamegraph.pl --color=io --countname=us >out.svg +# + +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at docs/cddl1.txt or +# http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at docs/cddl1.txt. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END +# + +# +# Copyright (c) 2015 by MemSQL. All rights reserved. +# + +# +# Match a perf captured variable, returning just the contents. For example, for +# the following line, get_perf_captured_variable("pid") would return "27235": +# +# swapper 0 [006] 708189.626415: sched:sched_stat_sleep: comm=memsqld pid=27235 delay=100078421 [ns +# +function get_perf_captured_variable(variable) +{ + match($0, variable "=[^[:space:]]+") + return substr($0, RSTART + length(variable) + 1, + RLENGTH - length(variable) - 1) +} + +# +# The timestamp is the first field that ends in a colon, e.g.: +# +# swapper 0 [006] 708189.626415: sched:sched_stat_sleep: comm=memsqld pid=27235 delay=100078421 [ns +# +# or +# +# swapper 0/0 708189.626415: sched:sched_stat_sleep: comm=memsqld pid=27235 delay=100078421 [ns] +# +function get_perf_timestamp() +{ + match($0, " [^ :]+:") + return substr($0, RSTART + 1, RLENGTH - 2) +} + +!/^#/ && /sched:sched_switch/ { + switchcommand = get_perf_captured_variable("comm") + + switchpid = get_perf_captured_variable("prev_pid") + + switchtime=get_perf_timestamp() + + switchstack="" +} + +# +# Strip the function name from a stack entry +# +# Stack entry is expected to be formatted like: +# c60849 MyClass::Foo(unsigned long) (/home/areece/a.out) +# +function get_function_name() +{ + # We start from 2 since we don't need the hex offset. + # We stop at NF - 1 since we don't need the library path. + funcname = $2 + for (i = 3; i <= NF - 1; i++) { + funcname = funcname " " $i + } + return funcname +} + +(switchpid != 0 && /^[[:space:]]/) { + if (switchstack == "") { + switchstack = get_function_name() + } else { + switchstack = get_function_name() ";" switchstack + } +} + +(switchpid != 0 && /^$/) { + switch_stacks[switchpid] = switchstack + delete last_switch_stacks[switchpid] + switch_time[switchpid] = switchtime + + switchpid=0 + switchcommand="" + switchstack="" +} + +!/^#/ && (/sched:sched_stat_sleep/ || /sched:sched_stat_blocked/) { + wakecommand=$1 + wakepid=$2 + + waketime=get_perf_timestamp() + + stat_next_command = get_perf_captured_variable("comm") + + stat_next_pid = get_perf_captured_variable("pid") + + stat_delay_ns = int(get_perf_captured_variable("delay")) + + wakestack="" +} + +(stat_next_pid != 0 && /^[[:space:]]/) { + if (wakestack == "") { + wakestack = get_function_name() + } else { + # We build the wakestack in reverse order. + wakestack = wakestack ";" get_function_name() + } +} + +(stat_next_pid != 0 && /^$/) { + # + # For some reason, perf appears to output duplicate + # sched:sched_stat_sleep and sched:sched_stat_blocked events. We only + # handle the first event. + # + if (stat_next_pid in switch_stacks) { + last_wake_time[stat_next_pid] = waketime + + stack = switch_stacks[stat_next_pid] + if (recurse || record_wake_stack) { + stack = stack ";" wakestack + if (record_tid) { + stack = stack ";" wakecommand "-" wakepid + } else { + stack = stack ";" wakecommand + } + } + + if (recurse) { + if (last_wake_time[wakepid] > last_switch_time[stat_next_pid]) { + stack = stack ";-;" last_switch_stacks[wakepid] + } + last_switch_stacks[stat_next_pid] = stack + } + + delete switch_stacks[stat_next_pid] + + if (record_tid) { + stack_times[stat_next_command "-" stat_next_pid ";" stack] += stat_delay_ns + } else { + stack_times[stat_next_command ";" stack] += stat_delay_ns + } + } + + wakecommand="" + wakepid=0 + stat_next_pid=0 + stat_next_command="" + stat_delay_ms=0 +} + +END { + for (stack in stack_times) { + if (int(stack_times[stack] / 1000) > 0) { + print stack, int(stack_times[stack] / 1000) + } + } +} diff --git a/contrib/tools/flame-graph/stackcollapse-perf.pl b/contrib/tools/flame-graph/stackcollapse-perf.pl new file mode 100755 index 0000000000..5cefa82a0e --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-perf.pl @@ -0,0 +1,345 @@ +#!/usr/bin/perl -w +# +# stackcollapse-perf.pl collapse perf samples into single lines. +# +# Parses a list of multiline stacks generated by "perf script", and +# outputs a semicolon separated stack followed by a space and a count. +# If memory addresses (+0xd) are present, they are stripped, and resulting +# identical stacks are colased with their counts summed. +# +# USAGE: ./stackcollapse-perf.pl [options] infile > outfile +# +# Run "./stackcollapse-perf.pl -h" to list options. +# +# Example input: +# +# swapper 0 [000] 158665.570607: cpu-clock: +# ffffffff8103ce3b native_safe_halt ([kernel.kallsyms]) +# ffffffff8101c6a3 default_idle ([kernel.kallsyms]) +# ffffffff81013236 cpu_idle ([kernel.kallsyms]) +# ffffffff815bf03e rest_init ([kernel.kallsyms]) +# ffffffff81aebbfe start_kernel ([kernel.kallsyms].init.text) +# [...] +# +# Example output: +# +# swapper;start_kernel;rest_init;cpu_idle;default_idle;native_safe_halt 1 +# +# Input may be created and processed using: +# +# perf record -a -g -F 997 sleep 60 +# perf script | ./stackcollapse-perf.pl > out.stacks-folded +# +# The output of "perf script" should include stack traces. If these are missing +# for you, try manually selecting the perf script output; eg: +# +# perf script -f comm,pid,tid,cpu,time,event,ip,sym,dso,trace | ... +# +# This is also required for the --pid or --tid options, so that the output has +# both the PID and TID. +# +# Copyright 2012 Joyent, Inc. All rights reserved. +# Copyright 2012 Brendan Gregg. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at docs/cddl1.txt or +# http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at docs/cddl1.txt. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END +# +# 02-Mar-2012 Brendan Gregg Created this. +# 02-Jul-2014 " " Added process name to stacks. + +use strict; +use Getopt::Long; + +my %collapsed; + +sub remember_stack { + my ($stack, $count) = @_; + $collapsed{$stack} += $count; +} +my $annotate_kernel = 0; # put an annotation on kernel function +my $annotate_jit = 0; # put an annotation on jit symbols +my $annotate_all = 0; # enale all annotations +my $include_pname = 1; # include process names in stacks +my $include_pid = 0; # include process ID with process name +my $include_tid = 0; # include process & thread ID with process name +my $include_addrs = 0; # include raw address where a symbol can't be found +my $tidy_java = 1; # condense Java signatures +my $tidy_generic = 1; # clean up function names a little +my $target_pname; # target process name from perf invocation +my $event_filter = ""; # event type filter, defaults to first encountered event +my $event_defaulted = 0; # whether we defaulted to an event (none provided) +my $event_warning = 0; # if we printed a warning for the event + +my $show_inline = 0; +my $show_context = 0; +GetOptions('inline' => \$show_inline, + 'context' => \$show_context, + 'pid' => \$include_pid, + 'kernel' => \$annotate_kernel, + 'jit' => \$annotate_jit, + 'all' => \$annotate_all, + 'tid' => \$include_tid, + 'addrs' => \$include_addrs, + 'event-filter=s' => \$event_filter) +or die <<USAGE_END; +USAGE: $0 [options] infile > outfile\n + --pid # include PID with process names [1] + --tid # include TID and PID with process names [1] + --inline # un-inline using addr2line + --all # all annotations (--kernel --jit) + --kernel # annotate kernel functions with a _[k] + --jit # annotate jit functions with a _[j] + --context # adds source context to --inline + --addrs # include raw addresses where symbols can't be found + --event-filter=EVENT # event name filter\n +[1] perf script must emit both PID and TIDs for these to work; eg, Linux < 4.1: + perf script -f comm,pid,tid,cpu,time,event,ip,sym,dso,trace + for Linux >= 4.1: + perf script -F comm,pid,tid,cpu,time,event,ip,sym,dso,trace + If you save this output add --header on Linux >= 3.14 to include perf info. +USAGE_END + +if ($annotate_all) { + $annotate_kernel = $annotate_jit = 1; +} + +# for the --inline option +sub inline { + my ($pc, $mod) = @_; + + # capture addr2line output + my $a2l_output = `addr2line -a $pc -e $mod -i -f -s -C`; + + # remove first line + $a2l_output =~ s/^(.*\n){1}//; + + my @fullfunc; + my $one_item = ""; + for (split /^/, $a2l_output) { + chomp $_; + + # remove discriminator info if exists + $_ =~ s/ \(discriminator \S+\)//; + + if ($one_item eq "") { + $one_item = $_; + } else { + if ($show_context == 1) { + unshift @fullfunc, $one_item . ":$_"; + } else { + unshift @fullfunc, $one_item; + } + $one_item = ""; + } + } + + return join(";", @fullfunc); +} + +my @stack; +my $pname; +my $m_pid; +my $m_tid; + +# +# Main loop +# +while (defined($_ = <>)) { + + # find the name of the process launched by perf, by stepping backwards + # over the args to find the first non-option (no dash): + if (/^# cmdline/) { + my @args = split ' ', $_; + foreach my $arg (reverse @args) { + if ($arg !~ /^-/) { + $target_pname = $arg; + $target_pname =~ s:.*/::; # strip pathname + last; + } + } + } + + # skip remaining comments + next if m/^#/; + chomp; + + # end of stack. save cached data. + if (m/^$/) { + # ignore filtered samples + next if not $pname; + + if ($include_pname) { + if (defined $pname) { + unshift @stack, $pname; + } else { + unshift @stack, ""; + } + } + remember_stack(join(";", @stack), 1) if @stack; + undef @stack; + undef $pname; + next; + } + + # + # event record start + # + if (/^(\S.+?)\s+(\d+)\/*(\d+)*\s+/) { + # default "perf script" output has TID but not PID + # eg, "java 25607 4794564.109216: cycles:" + # eg, "java 12688 [002] 6544038.708352: cpu-clock:" + # eg, "V8 WorkerThread 25607 4794564.109216: cycles:" + # eg, "java 24636/25607 [000] 4794564.109216: cycles:" + # eg, "java 12688/12764 6544038.708352: cpu-clock:" + # eg, "V8 WorkerThread 24636/25607 [000] 94564.109216: cycles:" + # other combinations possible + my ($comm, $pid, $tid) = ($1, $2, $3); + if (not $tid) { + $tid = $pid; + $pid = "?"; + } + + if (/(\S+):\s*$/) { + my $event = $1; + + if ($event_filter eq "") { + # By default only show events of the first encountered + # event type. Merging together different types, such as + # instructions and cycles, produces misleading results. + $event_filter = $event; + $event_defaulted = 1; + } elsif ($event ne $event_filter) { + if ($event_defaulted and $event_warning == 0) { + # only print this warning if necessary: + # when we defaulted and there was + # multiple event types. + print STDERR "Filtering for events of type: $event\n"; + $event_warning = 1; + } + next; + } + } + + ($m_pid, $m_tid) = ($pid, $tid); + + if ($include_tid) { + $pname = "$comm-$m_pid/$m_tid"; + } elsif ($include_pid) { + $pname = "$comm-$m_pid"; + } else { + $pname = "$comm"; + } + $pname =~ tr/ /_/; + + # + # stack line + # + } elsif (/^\s*(\w+)\s*(.+) \((\S*)\)/) { + # ignore filtered samples + next if not $pname; + + my ($pc, $rawfunc, $mod) = ($1, $2, $3); + + # Linux 4.8 included symbol offsets in perf script output by default, eg: + # 7fffb84c9afc cpu_startup_entry+0x800047c022ec ([kernel.kallsyms]) + # strip these off: + $rawfunc =~ s/\+0x[\da-f]+$//; + + if ($show_inline == 1 && $mod !~ m/(perf-\d+.map|kernel\.|\[[^\]]+\])/) { + unshift @stack, inline($pc, $mod); + next; + } + + next if $rawfunc =~ /^\(/; # skip process names + + my @inline; + for (split /\->/, $rawfunc) { + my $func = $_; + + if ($func eq "[unknown]") { + if ($mod ne "[unknown]") { # use module name instead, if known + $func = $mod; + $func =~ s/.*\///; + } else { + $func = "unknown"; + } + + if ($include_addrs) { + $func = "\[$func \<$pc\>\]"; + } else { + $func = "\[$func\]"; + } + } + + if ($tidy_generic) { + $func =~ s/;/:/g; + if ($func !~ m/\.\(.*\)\./) { + # This doesn't look like a Go method name (such as + # "net/http.(*Client).Do"), so everything after the first open + # paren (that is not part of an "(anonymous namespace)") is + # just noise. + $func =~ s/\((?!anonymous namespace\)).*//; + } + # now tidy this horrible thing: + # 13a80b608e0a RegExp:[&<>\"\'] (/tmp/perf-7539.map) + $func =~ tr/"\'//d; + # fall through to $tidy_java + } + + if ($tidy_java and $pname eq "java") { + # along with $tidy_generic, converts the following: + # Lorg/mozilla/javascript/ContextFactory;.call(Lorg/mozilla/javascript/ContextAction;)Ljava/lang/Object; + # Lorg/mozilla/javascript/ContextFactory;.call(Lorg/mozilla/javascript/C + # Lorg/mozilla/javascript/MemberBox;.<init>(Ljava/lang/reflect/Method;)V + # into: + # org/mozilla/javascript/ContextFactory:.call + # org/mozilla/javascript/ContextFactory:.call + # org/mozilla/javascript/MemberBox:.init + $func =~ s/^L// if $func =~ m:/:; + } + + # + # Annotations + # + # detect inlined from the @inline array + # detect kernel from the module name; eg, frames to parse include: + # ffffffff8103ce3b native_safe_halt ([kernel.kallsyms]) + # 8c3453 tcp_sendmsg (/lib/modules/4.3.0-rc1-virtual/build/vmlinux) + # 7d8 ipv4_conntrack_local+0x7f8f80b8 ([nf_conntrack_ipv4]) + # detect jit from the module name; eg: + # 7f722d142778 Ljava/io/PrintStream;::print (/tmp/perf-19982.map) + if (scalar(@inline) > 0) { + $func .= "_[i]"; # inlined + } elsif ($annotate_kernel == 1 && $mod =~ m/(^\[|vmlinux$)/ && $mod !~ /unknown/) { + $func .= "_[k]"; # kernel + } elsif ($annotate_jit == 1 && $mod =~ m:/tmp/perf-\d+\.map:) { + $func .= "_[j]"; # jitted + } + push @inline, $func; + } + + unshift @stack, @inline; + } else { + warn "Unrecognized line: $_"; + } +} + +foreach my $k (sort { $a cmp $b } keys %collapsed) { + print "$k $collapsed{$k}\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-pmc.pl b/contrib/tools/flame-graph/stackcollapse-pmc.pl new file mode 100755 index 0000000000..c78fb96c68 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-pmc.pl @@ -0,0 +1,74 @@ +#!/usr/bin/env perl +# +# Copyright (c) 2014 Ed Maste. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND +# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +# SUCH DAMAGE. +# +# stackcollapse-pmc.pl collapse hwpmc samples into single lines. +# +# Parses a list of multiline stacks generated by "hwpmc -G", and outputs a +# semicolon-separated stack followed by a space and a count. +# +# Usage: +# pmcstat -S unhalted-cycles -O pmc.out +# pmcstat -R pmc.out -z16 -G pmc.graph +# stackcollapse-pmc.pl pmc.graph > pmc.stack +# +# Example input: +# +# 03.07% [17] witness_unlock @ /boot/kernel/kernel +# 70.59% [12] __mtx_unlock_flags +# 16.67% [2] selfdfree +# 100.0% [2] sys_poll +# 100.0% [2] amd64_syscall +# 08.33% [1] pmap_ts_referenced +# 100.0% [1] vm_pageout +# 100.0% [1] fork_exit +# ... +# +# Example output: +# +# amd64_syscall;sys_poll;selfdfree;__mtx_unlock_flags;witness_unlock 2 +# amd64_syscall;sys_poll;pmap_ts_referenced;__mtx_unlock_flagsgeout;fork_exit 1 +# ... + +use warnings; +use strict; + +my @stack; +my $prev_count; +my $prev_indent = -1; + +foreach (<>) { + if (m/^( *)[0-9.]+% \[([0-9]+)\]\s*(\S+)/) { + my $indent = length($1); + if ($indent <= $prev_indent) { + print join(';', reverse(@stack[0 .. $prev_indent])) . + " $prev_count\n"; + } + $stack[$indent] = $3; + $prev_count = $2; + $prev_indent = $indent; + } +} +print join(';', reverse(@stack[0 .. $prev_indent])) . " $prev_count\n"; diff --git a/contrib/tools/flame-graph/stackcollapse-recursive.pl b/contrib/tools/flame-graph/stackcollapse-recursive.pl new file mode 100755 index 0000000000..9eae54592c --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-recursive.pl @@ -0,0 +1,60 @@ +#!/usr/bin/perl -ws +# +# stackcollapse-recursive Collapse direct recursive backtraces +# +# Post-process a stack list and merge direct recursive calls: +# +# Example input: +# +# main;recursive;recursive;recursive;helper 1 +# +# Output: +# +# main;recursive;helper 1 +# +# Copyright 2014 Gabriel Corona. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at docs/cddl1.txt or +# http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at docs/cddl1.txt. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END + +my %stacks; + +while(<>) { + chomp; + my ($stack_, $value) = (/^(.*)\s+?(\d+(?:\.\d*)?)$/); + if ($stack_) { + my @stack = split(/;/, $stack_); + + my @result = (); + my $i; + my $last=""; + for($i=0; $i!=@stack; ++$i) { + if(!($stack[$i] eq $last)) { + $result[@result] = $stack[$i]; + $last = $stack[$i]; + } + } + + $stacks{join(";", @result)} += $value; + } +} + +foreach my $k (sort { $a cmp $b } keys %stacks) { + print "$k $stacks{$k}\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-sample.awk b/contrib/tools/flame-graph/stackcollapse-sample.awk new file mode 100755 index 0000000000..bafc4af346 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-sample.awk @@ -0,0 +1,231 @@ +#!/usr/bin/awk -f +# +# Uses MacOS' /usr/bin/sample to generate a flamegraph of a process +# +# Usage: +# +# sudo sample [pid] -file /dev/stdout | stackcollapse-sample.awk | flamegraph.pl +# +# Options: +# +# The output will show the name of the library/framework at the call-site +# with the form AppKit`NSApplication or libsystem`start_wqthread. +# +# If showing the framework or library name is not required, pass +# MODULES=0 as an argument of the sample program. +# +# The generated SVG will be written to the output stream, and can be piped +# into flamegraph.pl directly, or written to a file for conversion later. +# +# --- +# +# Copyright (c) 2017, Apple Inc. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are met: +# +# 1. Redistributions of source code must retain the above copyright notice, +# this list of conditions and the following disclaimer. +# +# 2. Redistributions in binary form must reproduce the above copyright notice, +# this list of conditions and the following disclaimer in the documentation +# and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +# POSSIBILITY OF SUCH DAMAGE. +# + +BEGIN { + + # Command line options + MODULES = 1 # Allows the user to enable/disable printing of modules. + + # Internal variables + _FOUND_STACK = 0 # Found the stack traces in the output. + _LEVEL = -1 # The current level of indentation we are running. + + # The set of symbols to ignore for 'waiting' threads, for ease of use. + # This will hide waiting threads from the view, making it easier to + # see what is actually running in the sample. These may be adjusted + # as necessary or appended to if other symbols need to be filtered out. + + _IGNORE["libsystem_kernel`__psynch_cvwait"] = 1 + _IGNORE["libsystem_kernel`__select"] = 1 + _IGNORE["libsystem_kernel`__semwait_signal"] = 1 + _IGNORE["libsystem_kernel`__ulock_wait"] = 1 + _IGNORE["libsystem_kernel`__wait4"] = 1 + _IGNORE["libsystem_kernel`__workq_kernreturn"] = 1 + _IGNORE["libsystem_kernel`kevent"] = 1 + _IGNORE["libsystem_kernel`mach_msg_trap"] = 1 + _IGNORE["libsystem_kernel`read"] = 1 + _IGNORE["libsystem_kernel`semaphore_wait_trap"] = 1 + + # The same set of symbols as above, without the module name. + _IGNORE["__psynch_cvwait"] = 1 + _IGNORE["__select"] = 1 + _IGNORE["__semwait_signal"] = 1 + _IGNORE["__ulock_wait"] = 1 + _IGNORE["__wait4"] = 1 + _IGNORE["__workq_kernreturn"] = 1 + _IGNORE["kevent"] = 1 + _IGNORE["mach_msg_trap"] = 1 + _IGNORE["read"] = 1 + _IGNORE["semaphore_wait_trap"] = 1 + +} + +# This is the first line in the /usr/bin/sample output that indicates the +# samples follow subsequently. Until we see this line, the rest is ignored. + +/^Call graph/ { + _FOUND_STACK = 1 +} + +# This is found when we have reached the end of the stack output. +# Identified by the string "Total number in stack (...)". + +/^Total number/ { + _FOUND_STACK = 0 + printStack(_NEST,0) +} + +# Prints the stack from FROM to TO (where FROM > TO) +# Called when indenting back from a previous level, or at the end +# of processing to flush the last recorded sample + +function printStack(FROM,TO) { + + # We ignore certain blocking wait states, in the absence of being + # able to filter these threads from collection, otherwise + # we'll end up with many threads of equal length that represent + # the total time the sample was collected. + # + # Note that we need to collect the information to ensure that the + # timekeeping for the parental functions is appropriately adjusted + # so we just avoid printing it out when that occurs. + _PRINT_IT = !_IGNORE[_NAMES[FROM]] + + # We run through all the names, from the root to the leaf, so that + # we generate a line that flamegraph.pl will like, of the form: + # Thread1234;example`main;example`otherFn 1234 + + for(l = FROM; l>=TO; l--) { + if (_PRINT_IT) { + printf("%s", _NAMES[0]) + for(i=1; i<=l; i++) { + printf(";%s", _NAMES[i]) + } + print " " _TIMES[l] + } + + # We clean up our current state to avoid bugs. + delete _NAMES[l] + delete _TIMES[l] + } +} + +# This is where we process each line, of the form: +# 5130 Thread_8749954 +# + 5130 start_wqthread (in libsystem_pthread.dylib) ... +# + 4282 _pthread_wqthread (in libsystem_pthread.dylib) ... +# + ! 4282 __doworkq_kernreturn (in libsystem_kernel.dylib) ... +# + 848 _pthread_wqthread (in libsystem_pthread.dylib) ... +# + 848 __doworkq_kernreturn (in libsystem_kernel.dylib) ... + +_FOUND_STACK && match($0,/^ [^0-9]*[0-9]/) { + + # We maintain two counters: + # _LEVEL: the high water mark of the indentation level we have seen. + # _NEST: the current indentation level. + # + # We keep track of these two levels such that when the nesting level + # decreases, we print out the current state of where we are. + + _NEST=(RLENGTH-5)/2 + sub(/^[^0-9]*/,"") # Normalise the leading content so we start with time. + _TIME=$1 # The time recorded by 'sample', first integer value. + + # The function name is in one or two parts, depending on what kind of + # function it is. + # + # If it is a standard C or C++ function, it will be of the form: + # exampleFunction + # Example::Function + # + # If it is an Objective-C funtion, it will be of the form: + # -[NSExample function] + # +[NSExample staticFunction] + # -[NSExample function:withParameter] + # +[NSExample staticFunction:withParameter:andAnother] + + _FN1 = $2 + _FN2 = $3 + + # If it is a standard C or C++ function then the following word will + # either be blank, or the text '(in', so we jut use the first one: + + if (_FN2 == "(in" || _FN2 == "") { + _FN =_FN1 + } else { + # Otherwise we concatenate the first two parts with . + _FN = _FN1 "." _FN2 + } + + # Modules are shown with '(in libfoo.dylib)' or '(in AppKit)' + + _MODULE = "" + match($0, /\(in [^)]*\)/) + + if (RSTART > 0 && MODULES) { + + # Strip off the '(in ' (4 chars) and the final ')' char (1 char) + _MODULE = substr($0, RSTART+4, RLENGTH-5) + + # Remove the .dylib function, since it adds no value. + gsub(/\.dylib/, "", _MODULE) + + # The function name is 'module`functionName' + _FN = _MODULE "`" _FN + } + + # Now we have set up the variables, we can decide how to apply it + # If we are descending in the nesting, we don't print anything out: + # a + # ab + # abc + # + # We only print out something when we go back a level, or hit the end: + # abcd + # abe < prints out the stack up until this point, i.e. abcd + + # We store a pair of arrays, indexed by the nesting level: + # + # _TIMES - a list of the time reported to that function + # _NAMES - a list of the function names for each current stack trace + + # If we are backtracking, we need to flush the current output. + if (_NEST <= _LEVEL) { + printStack(_LEVEL,_NEST) + } + + # Record the name and time of the function where we are. + _NAMES[_NEST] = _FN + _TIMES[_NEST] = _TIME + + # We subtract the time we took from our parent so we don't double count. + if (_NEST > 0) { + _TIMES[_NEST-1] -= _TIME + } + + # Raise the high water mark of the level we have reached. + _LEVEL = _NEST +} diff --git a/contrib/tools/flame-graph/stackcollapse-stap.pl b/contrib/tools/flame-graph/stackcollapse-stap.pl new file mode 100755 index 0000000000..bca4046192 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-stap.pl @@ -0,0 +1,84 @@ +#!/usr/bin/perl -w +# +# stackcollapse-stap.pl collapse multiline SystemTap stacks +# into single lines. +# +# Parses a multiline stack followed by a number on a separate line, and +# outputs a semicolon separated stack followed by a space and the number. +# If memory addresses (+0xd) are present, they are stripped, and resulting +# identical stacks are colased with their counts summed. +# +# USAGE: ./stackcollapse.pl infile > outfile +# +# Example input: +# +# 0xffffffff8103ce3b : native_safe_halt+0xb/0x10 [kernel] +# 0xffffffff8101c6a3 : default_idle+0x53/0x1d0 [kernel] +# 0xffffffff81013236 : cpu_idle+0xd6/0x120 [kernel] +# 0xffffffff815bf03e : rest_init+0x72/0x74 [kernel] +# 0xffffffff81aebbfe : start_kernel+0x3ba/0x3c5 [kernel] +# 2404 +# +# Example output: +# +# start_kernel;rest_init;cpu_idle;default_idle;native_safe_halt 2404 +# +# Input may contain many stacks as generated from SystemTap. +# +# Copyright 2011 Joyent, Inc. All rights reserved. +# Copyright 2011 Brendan Gregg. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at docs/cddl1.txt or +# http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at docs/cddl1.txt. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END +# +# 16-Feb-2012 Brendan Gregg Created this. + +use strict; + +my %collapsed; + +sub remember_stack { + my ($stack, $count) = @_; + $collapsed{$stack} += $count; +} + +my @stack; + +foreach (<>) { + chomp; + + if (m/^\s*(\d+)+$/) { + remember_stack(join(";", @stack), $1); + @stack = (); + next; + } + + next if (m/^\s*$/); + + my $frame = $_; + $frame =~ s/^\s*//; + $frame =~ s/\+[^+]*$//; + $frame =~ s/.* : //; + $frame = "-" if $frame eq ""; + unshift @stack, $frame; +} + +foreach my $k (sort { $a cmp $b } keys %collapsed) { + printf "$k $collapsed{$k}\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-vsprof.pl b/contrib/tools/flame-graph/stackcollapse-vsprof.pl new file mode 100755 index 0000000000..a13c1daab3 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-vsprof.pl @@ -0,0 +1,98 @@ +#!/usr/bin/perl -w +# +# stackcollapse-vsprof.pl +# +# Parses the CSV file containing a call tree from a visual studio profiler and produces an output suitable for flamegraph.pl. +# +# USAGE: perl stackcollapse-vsprof.pl infile > outfile +# +# WORKFLOW: +# +# This example assumes you have visual studio 2015 installed. +# +# 1. Profile C++ your application using visual studio +# 2. On visual studio, choose export the call tree as csv +# 3. Generate a flamegraph: perl stackcollapse-vsprof CallTreeSummary.csv | perl flamegraph.pl > result_vsprof.svg +# +# INPUT EXAMPLE : +# +# Level,Function Name,Inclusive Samples,Exclusive Samples,Inclusive Samples %,Exclusive Samples %,Module Name, +# 1,"main","8,735",0,100.00,0.00,"an_executable.exe", +# 2,"testing::UnitTest::Run","8,735",0,100.00,0.00,"an_executable.exe", +# 3,"boost::trim_end_iter_select<std::iterator<std::val<std::types<char> > >,boost::is_classifiedF>",306,16,3.50,0.18,"an_executable.exe", +# +# OUTPUT EXAMPLE : +# +# main;testing::UnitTest::Run;boost::trim_end_iter_select<std::iterator<std::val<std::types<char>>>,boost::is_classifiedF> 306 + +use strict; + +sub massage_function_names; +sub parse_integer; +sub print_stack_trace; + +# data initialization +my @stack = (); +my $line_number = 0; +my $previous_samples = 0; + +my $num_args = $#ARGV + 1; +if ($num_args != 1) { + print "$ARGV[0]\n"; + print "Usage : stackcollapse-vsprof.pl <in.cvs> > out.txt\n"; + exit; +} + +my $input_csv_file = $ARGV[0]; +my $line_parser_rx = qr{ + ^\s*(\d+?), # level in the stack + ("[^"]+" | [^,]+), # function name (beware of spaces) + ("[^"]+" | [^,]+), # number of samples (beware of locale number formatting) +}ox; + +open(my $fh, '<', $input_csv_file) or die "Can't read file '$input_csv_file' [$!]\n"; + +while (my $current_line = <$fh>){ + $line_number = $line_number + 1; + + # to discard first line which typically contains headers + next if $line_number == 1; + next if $current_line =~ /^\s*$/o; + + ($current_line =~ $line_parser_rx) or die "Error in regular expression at line $line_number : $current_line\n"; + + my $level = int $1; + my $function = massage_function_names($2); + my $samples = parse_integer($3); + my $stack_len = @stack; + + #print "[DEBUG] $line_number : $level $function $samples $stack_len\n"; + + next if not $level; + ($level <= $stack_len + 1) or die "Error in stack at line $line_number : $current_line\n"; + + if ($level <= $stack_len) { + print_stack_trace(\@stack, $previous_samples); + my $to_remove = $level - $stack_len - 1; + splice(@stack, $to_remove); + } + + $stack_len < 1000 or die "Stack overflow at line $line_number"; + push(@stack, $function); + $previous_samples = $samples; +} +print_stack_trace(\@stack, $previous_samples); + +sub massage_function_names { + return ($_[0] =~ s/\s*|^"|"$//gro); +} + +sub parse_integer { + return int ($_[0] =~ s/[., ]|^"|"$//gro); +} + +sub print_stack_trace { + my ($stack_ref, $sample) = @_; + my $stack_trace = join(";", @$stack_ref); + print "$stack_trace $sample\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse-vtune.pl b/contrib/tools/flame-graph/stackcollapse-vtune.pl new file mode 100644 index 0000000000..9d22731f78 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-vtune.pl @@ -0,0 +1,78 @@ +#!/usr/bin/perl -w
+#
+# stackcollapse-vtune.pl
+#
+# Parses the CSV file containing a call tree from Intel VTune hotspots profiler and produces an output suitable for flamegraph.pl.
+#
+# USAGE: perl stackcollapse-vtune.pl infile > outfile
+#
+# WORKFLOW:
+#
+# This assumes you have Intel VTune installed and on path (using Command Line)
+#
+# 1. Profile C++ application tachyon_find_hotspots (example shipped with Intel VTune 2013):
+#
+# amplxe-cl -collect hotspots -r result_vtune_tachyon -- ./tachyon_find_hotspots
+#
+# 2. Export raw VTune data to csv file:
+# ###for Intel VTune 2013
+# amplxe-cl -R top-down -call-stack-mode all -report-out result_vtune_tachyon.csv -filter "Function Stack" -format csv -csv-delimiter comma -r result_vtune_tachyon
+#
+# ###for Intel VTune 2015 & 2016
+# amplxe-cl -R top-down -call-stack-mode all -column="CPU Time:Self","Module" -report-out result_vtune_tachyon.csv -filter "Function Stack" -format csv -csv-delimiter comma -r result_vtune_tachyon
+#
+# 3. Generate a flamegraph:
+#
+# perl stackcollapse-vtune result_vtune_tachyon.csv | perl flamegraph.pl > result_vtune_tachyon.svg
+#
+# AUTHOR: Rohith Bakkannagari
+
+use strict;
+
+# data initialization
+my @stack = ();
+my $rowCounter = 0; #flag for row number
+
+my $numArgs = $#ARGV + 1;
+if ($numArgs != 1)
+{
+print "$ARGV[0]\n";
+print "Usage : stackcollapse-vtune.pl <out.cvs> > out.txt\n";
+exit;
+}
+
+my $inputCSVFile = $ARGV[0];
+
+open(my $fh, '<', $inputCSVFile) or die "Can't read file '$inputCSVFile' [$!]\n";
+
+while (my $currLine = <$fh>){
+ $rowCounter = $rowCounter + 1;
+ # to discard first row which typically contains headers
+ next if $rowCounter == 1;
+ chomp $currLine;
+ #VTune - sometimes the call stack information is enclosed in double quotes (?). To remove double quotes.
+ $currLine =~ s/\"//g;
+
+ ### for Intel VTune 2013
+ #$currLine =~ /(\s*)(.*),(.*),[0-9]*\.?[0-9]+[%],([0-9]*\.?[0-9]+)/ or die "Error in regular expression on the current line\n";
+ #my $func = $3.'!'.$2;
+ #my $depth = length ($1);
+ #my $selfTime = $4*1000; # selfTime in msec
+
+ ### for Intel VTune 2015 & 2016
+ $currLine =~ /(\s*)(.*?),([0-9]*\.?[0-9]+?),(.*)/ or die "Error in regular expression on the current line $currLine\n";
+ my $func = $4.'!'.$2;
+ my $depth = length ($1);
+ my $selfTime = $3*1000; # selfTime in msec
+
+ my $tempString = '';
+ $stack [$depth] = $func;
+ foreach my $i (0 .. $depth - 1) {
+ $tempString = $tempString.$stack[$i].";";
+ }
+ $tempString = $tempString.$func." $selfTime\n";
+
+ if ($selfTime != 0){
+ print "$tempString";
+ }
+}
diff --git a/contrib/tools/flame-graph/stackcollapse-xdebug.php b/contrib/tools/flame-graph/stackcollapse-xdebug.php new file mode 100755 index 0000000000..6548903db9 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse-xdebug.php @@ -0,0 +1,197 @@ +#!/usr/bin/php +# +# Copyright 2018 Miriam Lauter (lauter.miriam@gmail.com). All rights reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License +# as published by the Free Software Foundation; either version 2 +# of the License, or (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software Foundation, +# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. +# +# (http://www.gnu.org/copyleft/gpl.html) +# +# 13-Apr-2018 Miriam Lauter Created this. + +<?php +ini_set('error_log', null); +$optind = null; +$args = getopt("htc", ["help"], $optind); +if (isset($args['h']) || isset($args['help'])) { + usage(); +} + +function usage($exit = 0) { + echo <<<EOT +stackcollapse-php.php collapse php function traces into single lines. + +Parses php samples generated by xdebug with xdebug.trace_format = 1 +and outputs stacks as single lines, with methods separated by semicolons, +and then a space and an occurrence count. For use with flamegraph.pl. +See https://github.com/brendangregg/FlameGraph. + +USAGE: ./stackcollapse-php.php [OPTIONS] infile > outfile + -h --help Show this message + -t Weight stack counts by duration using the time index in the trace (default) + -c Invocation counts only. Simply count stacks in the trace and sum duplicates, don't weight by duration. + +Example input: +For more info on xdebug and generating traces see +https://xdebug.org/docs/execution_trace. + +Version: 2.0.0RC4-dev +TRACE START [2007-05-06 18:29:01] +1 0 0 0.010870 114112 {main} 1 ../trace.php 0 +2 1 0 0.032009 114272 str_split 0 ../trace.php 8 +2 1 1 0.032073 116632 +2 2 0 0.033505 117424 ret_ord 1 ../trace.php 10 +3 3 0 0.033531 117584 ord 0 ../trace.php 5 +3 3 1 0.033551 117584 +... +TRACE END [2007-05-06 18:29:01] + +Example output: + +- c +{main};str_split 1 +{main};ret_ord;ord 6 + +-t +{main} 23381 +{main};str_split 64 +{main};ret_ord 215 +{main};ret_ord;ord 106 + +EOT; + + exit($exit); +} + +function collapseStack(array $stack, string $func_name_key): string { + return implode(';', array_column($stack, $func_name_key)); +} + +function addCurrentStackToStacks(array $stack, float $dur, array &$stacks) { + $collapsed = implode(';', $stack); + $duration = SCALE_FACTOR * $dur; + + if (array_key_exists($collapsed, $stacks)) { + $stacks[$collapsed] += $duration; + } else { + $stacks[$collapsed] = $duration; + } +} + +function isEOTrace(string $l) { + $pattern = "/^(\\t|TRACE END)/"; + return preg_match($pattern, $l); +} + +$filename = $argv[$optind] ?? null; +if ($filename === null) { + usage(1); +} + +$do_time = !isset($args['c']); + +// First make sure our file is consistently formatted with only one \t delimiting each field +$out = []; +$retval = null; +exec("sed -in 's/\t\+/\t/g' " . escapeshellarg($filename), $out, $retval); +if ($retval !== 0) { + usage(1); +} + +$handle = fopen($filename, 'r'); + +if ($handle === false) { + echo "Unable to open $filename \n\n"; + usage(1); +} + +// Loop till we find TRACE START +while ($l = fgets($handle)) { + if (strpos($l, "TRACE START") === 0) { + break; + } +} + +const SCALE_FACTOR = 1000000; +$stacks = []; +$current_stack = []; +$was_exit = false; +$prev_start_time = 0; + +if ($do_time) { + // Weight counts by duration + // Xdebug trace time indices have 6 sigfigs of precision + // We have a perfect trace, but let's instead pretend that + // this was collected by sampling at 10^6 Hz + // then each millionth of a second this stack took to execute is 1 count + while ($l = fgets($handle)) { + if (isEOTrace($l)) { + break; + } + + $parts = explode("\t", $l); + list($level, $fn_no, $is_exit, $time) = $parts; + + if ($is_exit) { + if (empty($current_stack)) { + echo "[WARNING] Found function exit without corresponding entrance. Discarding line. Check your input.\n"; + continue; + } + + addCurrentStackToStacks($current_stack, $time - $prev_start_time, $stacks); + array_pop($current_stack); + } else { + $func_name = $parts[5]; + + if (!empty($current_stack)) { + addCurrentStackToStacks($current_stack, $time - $prev_start_time, $stacks); + } + + $current_stack[] = $func_name; + } + $prev_start_time = $time; + } +} else { + // Counts only + while ($l = fgets($handle)) { + if (isEOTrace($l)) { + break; + } + + $parts = explode("\t", $l); + list($level, $fn_no, $is_exit) = $parts; + + if ($is_exit === "1") { + if (!$was_exit) { + $collapsed = implode(";", $current_stack); + if (array_key_exists($collapsed, $stacks)) { + $stacks[$collapsed]++; + } else { + $stacks[$collapsed] = 1; + } + } + + array_pop($current_stack); + $was_exit = true; + } else { + $func_name = $parts[5]; + $current_stack[] = $func_name; + $was_exit = false; + } + } +} + +foreach ($stacks as $stack => $count) { + echo "$stack $count\n"; +} diff --git a/contrib/tools/flame-graph/stackcollapse.pl b/contrib/tools/flame-graph/stackcollapse.pl new file mode 100755 index 0000000000..1e00c52136 --- /dev/null +++ b/contrib/tools/flame-graph/stackcollapse.pl @@ -0,0 +1,109 @@ +#!/usr/bin/perl -w +# +# stackcollapse.pl collapse multiline stacks into single lines. +# +# Parses a multiline stack followed by a number on a separate line, and +# outputs a semicolon separated stack followed by a space and the number. +# If memory addresses (+0xd) are present, they are stripped, and resulting +# identical stacks are colased with their counts summed. +# +# USAGE: ./stackcollapse.pl infile > outfile +# +# Example input: +# +# unix`i86_mwait+0xd +# unix`cpu_idle_mwait+0xf1 +# unix`idle+0x114 +# unix`thread_start+0x8 +# 1641 +# +# Example output: +# +# unix`thread_start;unix`idle;unix`cpu_idle_mwait;unix`i86_mwait 1641 +# +# Input may contain many stacks, and can be generated using DTrace. The +# first few lines of input are skipped (see $headerlines). +# +# Copyright 2011 Joyent, Inc. All rights reserved. +# Copyright 2011 Brendan Gregg. All rights reserved. +# +# CDDL HEADER START +# +# The contents of this file are subject to the terms of the +# Common Development and Distribution License (the "License"). +# You may not use this file except in compliance with the License. +# +# You can obtain a copy of the license at docs/cddl1.txt or +# http://opensource.org/licenses/CDDL-1.0. +# See the License for the specific language governing permissions +# and limitations under the License. +# +# When distributing Covered Code, include this CDDL HEADER in each +# file and include the License file at docs/cddl1.txt. +# If applicable, add the following below this CDDL HEADER, with the +# fields enclosed by brackets "[]" replaced with your own identifying +# information: Portions Copyright [yyyy] [name of copyright owner] +# +# CDDL HEADER END +# +# 14-Aug-2011 Brendan Gregg Created this. + +use strict; + +my $headerlines = 3; # number of input lines to skip +my $includeoffset = 0; # include function offset (except leafs) +my %collapsed; + +sub remember_stack { + my ($stack, $count) = @_; + $collapsed{$stack} += $count; +} + +my $nr = 0; +my @stack; + +foreach (<>) { + next if $nr++ < $headerlines; + chomp; + + if (m/^\s*(\d+)+$/) { + my $count = $1; + my $joined = join(";", @stack); + + # trim leaf offset if these were retained: + $joined =~ s/\+[^+]*$// if $includeoffset; + + remember_stack($joined, $count); + @stack = (); + next; + } + + next if (m/^\s*$/); + + my $frame = $_; + $frame =~ s/^\s*//; + $frame =~ s/\+[^+]*$// unless $includeoffset; + + # Remove arguments from C++ function names: + $frame =~ s/(::.*)[(<].*/$1/; + + $frame = "-" if $frame eq ""; + + my @inline; + for (split /\->/, $frame) { + my $func = $_; + + # Strip out L and ; included in java stacks + $func =~ tr/\;/:/; + $func =~ s/^L//; + $func .= "_[i]" if scalar(@inline) > 0; #inlined + + push @inline, $func; + } + + unshift @stack, @inline; +} + +foreach my $k (sort { $a cmp $b } keys %collapsed) { + print "$k $collapsed{$k}\n"; +} diff --git a/contrib/tools/flame-graph/ya.make b/contrib/tools/flame-graph/ya.make index 1a5e7a28c2..fe923481df 100644 --- a/contrib/tools/flame-graph/ya.make +++ b/contrib/tools/flame-graph/ya.make @@ -1,2 +1,6 @@ VERSION(2020-07-30-a258e78f17abdf2ce21c2515cfe8306a44774e2a) +SUBSCRIBER( + g:contrib + yazevnul +) |