Software Methods and Tools

Kenneth M. Anderson <kena@cs.colorado.edu>

Lecture 29: Profiling

Today's Lecture

Need Volunteers!

Software Performance Problems

80/20 Rule

Profiling

Software Profiling

Hardware Profiling

gprof

Using gprof

Example Flat Profile

granularity: each sample hit covers 4 byte(s) for 5.56% of 0.18 seconds

%
time
cumulative
seconds
self
seconds

calls
self
ms/call
total
ms/call

name
83.3 0.15 0.15 203 0.74 0.88 _find_alphabetical_location
16.7 0.18 0.03 15925 0.00 0.00 _stringToUpper
0.0 0.18 0.00 204 0.00 0.00 _format_currency
0.0 0.18 0.00 203 0.00 0.00 _calculate_wages
0.0 0.18 0.00 203 0.00 0.89 _insert_alphabetical
0.0 0.18 0.00 1 0.00 0.00 _free_queue
0.0 0.18 0.00 1 0.00 180.00 _main
0.0 0.18 0.00 1 0.00 0.00 _print_paycheck_summary
0.0 0.18 0.00 1 0.00 180.00 _read_employees

Flat Profile Columns

% time the percentage of the total running time of the program used by this function.
cumulative seconds a running sum of the number of seconds accounted for by this function and those listed above it.
self seconds the number of seconds accounted for by this function alone. This is the major sort for this listing.
calls the number of times this function was invoked, if this function is profiled, else blank.
self ms/call the average number of milliseconds spent in this function per call, if this function is profiled, else blank.
total ms/call the average number of milliseconds spent in this function and its descendants per call, if this function is profiled, else blank.
name the name of the function. This is the minor sort for this listing.

Call Graph (modified)

granularity: each sample hit covers 4 byte(s) for 5.56% of 0.18 seconds

index %time self descendants called/total
called+self
called/total
    parents
name
    children
index
    0.00 0.18 203/203     _read_employees [3]
[1] 100.0 0.00 0.18 203 _insert_alphabetical [1]
    0.15 0.03 203/203     _find_alphabetical_location [5]
    0.00 0.00 808/15925     _stringToUpper [6]
 
    0.15 0.03 203/203     _insert_alphabetical [1]
[5] 99.2 0.15 0.03 203 _find_alphabetical_location [5]
    0.03 0.00 15117/15925     _stringToUpper [6]
 
    0.00 0.00 808/15925     _insert_alphabetical [1]
    0.03 0.00 15117/15925     _find_alphabetical_location [5]
[6] 16.7 0.03 0.00 15925 _stringToUpper [6]

Call Graph Description

Call Graph Columns

Functions

index A unique number given to each element of the table
% time percentage of the “total” time that was spent in this function and its children
self total amount of time spent in this function
descendants total amount of time propagated into this function by its children
called number of times the function was called (plus recursive calls)
name name of function

Parents

self amount of time that was propagated directly from the function into this parent
descendants amount of time propagated from the function's children into this parent
called number of times this parent called the function / total number of times this parent was called
descendants total amount of time propagated into this function by its children
called number of times the function was called (plus recursive calls)
name name of parent; If the parents of a function cannot be determined, the word <spontaneous> is printed in this field

Children

self amount of time propagated directly from the child into the function
descendants amount of time propagated from the child's children to the function
called number of times this parent called the function / total number of times this parent was called
descendants total amount of time propagated into this function by its children
called number of times the function called this child / total number of times the child was called
name name of child

Improving Performance

Demo: Graphics Application and gprof

Coming Up Next