Python comes with a built-in way of collecting profile information, documented at https://docs.python.org/3/library/profile.html.
In a nutshell, it boils down to:
python3 -m cProfile -o profile.out command.py arguments…
This post is an attempt to document what's in profile.out
, since the python
documentation does not cover it. I have tried to figure this out by looking at
the sources of the pstats
module,
the sources of gprof2dot
,
and trying to make sense of the decoded structure, and navigate it.
Decoding Python profile data
The data collected by python's profile
/cProfile
is a data structure
encoding using the marshal
module1.
Loading the data is simple:
>>> import marshal >>> with open("profile.out", "rb") as fd: >>> data = marshal.load(fd)
Structure of Python profile data
Decoded profile data is a dictionary indexed by tuples representing functions or scopes:
# File name, Line number, Function or scope name Func = Tuple[str, int, str]
For example:
("./staticsite/page.py", 1, "<module>") ("./staticsite/features/markdown.py", 138, "load_dir") ('~', 0, "<method 'items' of 'collections.OrderedDict' objects>") ('/usr/lib/python3.7/pathlib.py', 1156, 'stat')
('~', 0, …)
represents a built-in function.
The signature of the decoded dict seems to be:
Dict[ # Function of scope Func, Tuple[ # Aggregated statistics int, int, float, float, # Callers Dict[ # Caller function or scope Func, # Call statistics collected for this caller → callee Tuple[int, int, float, float] ] ] ]
The four int, int, float, float
statistics are normally unpacked as
cc, nc, tt, ct
, and stand for:
cc
: number of primitive (non recursive) calls. Sometimes listed aspcalls
.nc
: total number of calls (inculding recursive), sometimes listed asncalls
.tt
: total time, sometimes listed astottime
: total time spent in the given function, and excluding time made in calls to sub-functions.ct
: cumulative time, Sometimes listed ascumtime
: cumulative time spent in this and all subfunctions (from invocation till exit). This figure is accurate even for recursive functions.
Phrases in italic are quotations from profile's documentation.
Building a call graph
This is some example code to turn the collected statistics into a call graph structure:
class Node: """ Node in the caller → callee graph """ def __init__(self, func: Func): self.fname, self.lineno, self.scope = func self.callees: List["Call"] = [] self.callers: List["Call"] = [] self.cc = 0 self.nc = 0 self.tt = 0.0 self.ct = 0.0 def __str__(self): # Builtin function if self.fname == "~" and self.lineno == 0: return f"[builtin]:{self.scope}" # Shorten file names from system libraries self.shortname = self.fname for path in sorted(sys.path, key=lambda x: -len(x)): if not path: continue if self.fname.startswith(path): return f"[sys]{self.fname[len(path):]}:{self.lineno}:{self.scope}" # File in the local project return f"{self.fname}:{self.lineno}:{self.scope}" class Call: """ Arc in the caller → callee graph """ def __init__(self, caller: Node, callee: Node, cc: int, nc: int, tt: float, ct: float): self.caller = caller self.callee = callee self.cc = cc self.nc = nc self.tt = tt self.ct = ct class Graph: """ Graph of callers and callees. Each node in the graph represents a function, with its aggregated statistics. Each arc in the graph represents each collected caller → callee statistics """ def __init__(self, stats: Stats): # Index of all nodes in the graph self.nodes: Dict[Func, Node] = {} # Total execution time self.total_time = 0.0 # Build the graph for callee, (cc, nc, tt, ct, callers) in stats.items(): self.total_time += tt # Get the callee and fill its aggregated stats ncallee = self.node(callee) ncallee.cc = cc ncallee.nc = nc ncallee.tt = tt ncallee.ct = ct # Create caller → callee arcs for caller, (cc, nc, tt, ct) in callers.items(): ncaller = self.node(caller) call = Call(ncaller, ncallee, cc, nc, tt, ct) ncallee.callers.append(call) ncaller.callees.append(call) def node(self, fun: Func) -> Node: """ Lookup or create a node """ res = self.nodes.get(fun) if res is None: res = Node(fun) self.nodes[fun] = res return res @classmethod def load(cls, pathname: str) -> "Graph": """ Builds a Graph from profile statistics saved on a file """ with open(pathname, "rb") as fd: return cls(marshal.load(fd)) @classmethod def from_pstats(cls, stats: pstats.Stats) -> "Graph": """ Builds a Graph from an existing pstats.Stats structure """ return cls(stats.stats)
Other python profiling links
- pyinstrument: call stack profiler for Python
- austin: Python frame stack sampler for CPython
- py-spy: Sampling profiler for Python programs
- FlameGraph: stack trace visualizer