diff options
author | Måns Rullgård <mans@mansr.com> | 2010-06-26 22:23:35 +0000 |
---|---|---|
committer | Måns Rullgård <mans@mansr.com> | 2010-06-26 22:23:35 +0000 |
commit | 2829ce4b402a2be9d4c01982ffd29cab6be8e09a (patch) | |
tree | 619a98f4ae16ceab98e385cc9deab85e3ad25c6b /doc | |
parent | a788196e2041b431731b3441b27d4b5e8bc8df2e (diff) | |
download | ffmpeg-2829ce4b402a2be9d4c01982ffd29cab6be8e09a.tar.gz |
Remove PPC perf counter support
This functionality is better accessed through tools like oprofile.
Originally committed as revision 23808 to svn://svn.ffmpeg.org/ffmpeg/trunk
Diffstat (limited to 'doc')
-rw-r--r-- | doc/ffmpeg_powerpc_performance_evaluation_howto.txt | 172 |
1 files changed, 0 insertions, 172 deletions
diff --git a/doc/ffmpeg_powerpc_performance_evaluation_howto.txt b/doc/ffmpeg_powerpc_performance_evaluation_howto.txt deleted file mode 100644 index 2eb4ee71a5..0000000000 --- a/doc/ffmpeg_powerpc_performance_evaluation_howto.txt +++ /dev/null @@ -1,172 +0,0 @@ -FFmpeg & evaluating performance on the PowerPC Architecture HOWTO - -(c) 2003-2004 Romain Dolbeau <romain@dolbeau.org> - - - -I - Introduction - -The PowerPC architecture and its SIMD extension AltiVec offer some -interesting tools to evaluate performance and improve the code. -This document tries to explain how to use those tools with FFmpeg. - -The architecture itself offers two ways to evaluate the performance of -a given piece of code: - -1) The Time Base Registers (TBL) -2) The Performance Monitor Counter Registers (PMC) - -The first ones are always available, always active, but they're not very -accurate: the registers increment by one every four *bus* cycles. On -my 667 Mhz tiBook (ppc7450), this means once every twenty *processor* -cycles. So we won't use that. - -The PMC are much more useful: not only can they report cycle-accurate -timing, but they can also be used to monitor many other parameters, -such as the number of AltiVec stalls for every kind of instruction, -or instruction cache misses. The downside is that not all processors -support the PMC (all G3, all G4 and the 970 do support them), and -they're inactive by default - you need to activate them with a -dedicated tool. Also, the number of available PMC depends on the -procesor: the various 604 have 2, the various 75x (aka. G3) have 4, -and the various 74xx (aka G4) have 6. - -*WARNING*: The PowerPC 970 is not very well documented, and its PMC -registers are 64 bits wide. To properly notify the code, you *must* -tune for the 970 (using --tune=970), or the code will assume 32 bit -registers. - - -II - Enabling FFmpeg PowerPC performance support - -This needs to be done by hand. First, you need to configure FFmpeg as -usual, but add the "--powerpc-perf-enable" option. For instance: - -##### -./configure --prefix=/usr/local/ffmpeg-svn --cc=gcc-3.3 --tune=7450 --powerpc-perf-enable -##### - -This will configure FFmpeg to install inside /usr/local/ffmpeg-svn, -compiling with gcc-3.3 (you should try to use this one or a newer -gcc), and tuning for the PowerPC 7450 (i.e. the newer G4; as a rule of -thumb, those at 550Mhz and more). It will also enable the PMC. - -You may also edit the file "config.h" to enable the following line: - -##### -// #define ALTIVEC_USE_REFERENCE_C_CODE 1 -##### - -If you enable this line, then the code will not make use of AltiVec, -but will use the reference C code instead. This is useful to compare -performance between two versions of the code. - -Also, the number of enabled PMC is defined in "libavcodec/ppc/dsputil_ppc.h": - -##### -#define POWERPC_NUM_PMC_ENABLED 4 -##### - -If you have a G4 CPU, you can enable all 6 PMC. DO NOT enable more -PMC than available on your CPU! - -Then, simply compile FFmpeg as usual (make && make install). - - - -III - Using FFmpeg PowerPC performance support - -This FFmeg can be used exactly as usual. But before exiting, FFmpeg -will dump a per-function report that looks like this: - -##### -PowerPC performance report - Values are from the PMC registers, and represent whatever the - registers are set to record. - Function "gmc1_altivec" (pmc1): - min: 231 - max: 1339867 - avg: 558.25 (255302) - Function "gmc1_altivec" (pmc2): - min: 93 - max: 2164 - avg: 267.31 (255302) - Function "gmc1_altivec" (pmc3): - min: 72 - max: 1987 - avg: 276.20 (255302) -(...) -##### - -In this example, PMC1 was set to record CPU cycles, PMC2 was set to -record AltiVec Permute Stall Cycles, and PMC3 was set to record AltiVec -Issue Stalls. - -The function "gmc1_altivec" was monitored 255302 times, and the -minimum execution time was 231 processor cycles. The max and average -aren't much use, as it's very likely the OS interrupted execution for -reasons of its own :-( - -With the exact same settings and source file, but using the reference C -code we get: - -##### -PowerPC performance report - Values are from the PMC registers, and represent whatever the - registers are set to record. - Function "gmc1_altivec" (pmc1): - min: 592 - max: 2532235 - avg: 962.88 (255302) - Function "gmc1_altivec" (pmc2): - min: 0 - max: 33 - avg: 0.00 (255302) - Function "gmc1_altivec" (pmc3): - min: 0 - max: 350 - avg: 0.03 (255302) -(...) -##### - -592 cycles, so the fastest AltiVec execution is about 2.5x faster than -the fastest C execution in this example. It's not perfect but it's not -bad (well I wrote this function so I can't say otherwise :-). - -Once you have that kind of report, you can try to improve things by -finding what goes wrong and fixing it; in the example above, one -should try to diminish the number of AltiVec stalls, as this *may* -improve performance. - - - -IV) Enabling the PMC in Mac OS X - -This is easy. Use "Monster" and "monster". Those tools come from -Apple's CHUD package, and can be found hidden in the developer web -site & FTP site. "MONster" is the graphical application, use it to -generate a config file specifying what each register should -monitor. Then use the command-line application "monster" to use that -config file, and enjoy the results. - -Note that "MONster" can be used for many other things, but it's -documented by Apple, it's not my subject. - -If you are using CHUD 4.4.2 or later, you'll notice that MONster is -no longer available. It's been superseeded by Shark, where -configuration of PMCs is available as a plugin. - - - -V) Enabling the PMC on Linux - -On linux you may use oprofile from http://oprofile.sf.net, depending on the -version and the cpu you may need to apply a patch[1] to access a set of the -possibile counters from the userspace application. You can always define them -using the kernel interface /dev/oprofile/* . - -[1] http://dev.gentoo.org/~lu_zero/development/oprofile-g4-20060423.patch - --- -Romain Dolbeau <romain@dolbeau.org> -Luca Barbato <lu_zero@gentoo.org> |