aboutsummaryrefslogtreecommitdiffstats
path: root/contrib/python/beniget/README.rst
blob: 85746cab88b83d86e10508ebf107f8340b6defbf (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
Gast, Beniget!
==============

Beniget is a collection of Compile-time analyse on Python Abstract Syntax Tree(AST).
It's a building block to write static analyzer or compiler for Python.

Beniget relies on `gast <https://pypi.org/project/gast/>`_ to provide a cross
version abstraction of the AST, effectively working across all Python 3 versions greater than 3.6.

API
---

Basically Beniget provides three analyse:

- ``beniget.Ancestors`` that maps each node to the list of enclosing nodes;
- ``beniget.DefUseChains`` that maps each node to the list of definition points in that node;
- ``beniget.UseDefChains`` that maps each node to the list of possible definition of that node.

See sample usages and/or run ``pydoc beniget`` for more information :-).


Sample Usages
-------------

Detect unused imports
*********************

This is a very basic usage: look for def without any use, and warn about them, focusing on imported values.

.. code:: python

    >>> import beniget, gast as ast

    # parse some simple statements
    >>> code = "from math import cos, sin; print(cos(3))"
    >>> module = ast.parse(code)

    # compute the def-use chains at module level
    >>> duc = beniget.DefUseChains()
    >>> duc.visit(module)

    # grab the import statement
    >>> imported = module.body[0].names

    # inspect the users of each imported name
    >>> for name in imported:
    ...   ud = duc.chains[name]
    ...   if not ud.users():
    ...     print("Unused import: {}".format(ud.name()))
    Unused import: sin

*NOTE*: Due to the dynamic nature of Python, one can fool this analysis by
calling the ``eval`` function, eventually through an indirection, or by performing a lookup
into ``globals()``.

Find all functions marked with a given decorator
************************************************

Let's assume we've got a ``@nice`` decorator applied to some functions. We can traverse the users
of this decorator to find which functions are decorated.

.. code:: python

    # parse some simple statements
    >>> code = """
    ... nice = lambda x: x
    ... @nice
    ... def aw(): pass
    ... def some(): pass"""
    >>> module = ast.parse(code)

    # compute the def-use chains at module level
    >>> duc = beniget.DefUseChains()
    >>> duc.visit(module)

    # analysis to find parent of a node
    >>> ancestors = beniget.Ancestors()
    >>> ancestors.visit(module)

    # find the nice definition
    >>> nice = [d for d in duc.locals[module] if d.name() == "nice"][0]

    # walkthrough its users
    >>> for use in nice.users():
    ...   # we're interested in the parent of the decorator
    ...   parents = ancestors.parents(use.node)
    ...   # direct parent of the decorator is the function
    ...   fdef = parents[-1]
    ...   print(fdef.name)
    aw

Gather attributes of ``self``
*****************************

This analysis gathers all attributes of a class, by going through all methods and checking
the users of the first method parameter, investigating the one used in attribute lookup.

.. code:: python

    >>> import gast as ast
    >>> import beniget

    >>> class Attributes(ast.NodeVisitor):
    ...
    ...     def __init__(self, module_node):
    ...         # compute the def-use of the module
    ...         self.chains = beniget.DefUseChains()
    ...         self.chains.visit(module_node)
    ...         self.users = set()  # all users of `self`
    ...         self.attributes = set()  # attributes of current class
    ...
    ...     def visit_ClassDef(self, node):
    ...         # walk methods and fill users of `self`
    ...         for stmt in node.body:
    ...             if isinstance(stmt, ast.FunctionDef):
    ...                 self_def = self.chains.chains[stmt.args.args[0]]
    ...                 self.users.update(use.node for use in self_def.users())
    ...         self.generic_visit(node)
    ...
    ...     def visit_Attribute(self, node):
    ...         # any attribute of `self` is registered
    ...         if node.value in self.users:
    ...             self.attributes.add(node.attr)

    >>> code = "class My(object):\n def __init__(self, x): self.x = x"
    >>> module = ast.parse(code)
    >>> classdef = module.body[0]
    >>> attr = Attributes(module)
    >>> attr.visit(classdef)
    >>> list(attr.attributes)
    ['x']

*NOTE*: This is *not* an alias analysis, so assigning ``self`` to another variable, or
setting it in a tuple is not captured by this analysis. It's still possible to write such an
a analysis using def-use chains though ;-)

Compute the identifiers captured by a function
**********************************************

In Python, inner functions (and lambdas) can capture identifiers defined in the outer scope.
This analysis computes such identifiers by registering each identifier defined in the function,
then walking through all loaded identifier and checking whether it's local or not.

.. code:: python

    >>> import gast as ast
    >>> import beniget
    >>> class Capture(ast.NodeVisitor):
    ...
    ...     def __init__(self, module_node):
    ...         # initialize def-use chains
    ...         self.chains = beniget.DefUseChains()
    ...         self.chains.visit(module_node)
    ...         self.users = set()  # users of local definitions
    ...         self.captured = set()  # identifiers that don't belong to local users
    ...
    ...     def visit_FunctionDef(self, node):
    ...         # initialize the set of node using a local variable
    ...         for def_ in self.chains.locals[node]:
    ...             self.users.update(use.node for use in def_.users())
    ...         self.generic_visit(node)
    ...
    ...     def visit_Name(self, node):
    ...         # register load of identifiers not locally definied
    ...         if isinstance(node.ctx, ast.Load):
    ...             if node not in self.users:
    ...                 self.captured.add(node.id)

    >>> code = 'def foo(x):\n def bar(): return x\n return bar'
    >>> module = ast.parse(code)
    >>> inner_function = module.body[0].body[0]
    >>> capture = Capture(module)
    >>> capture.visit(inner_function)
    >>> list(capture.captured)
    ['x']

Compute the set of instructions required to compute a function
**************************************************************

This is actually very similar to the computation of the closure, but this time
let's use the UseDef chains combined with the ancestors.

.. code:: python

    >>> import gast as ast
    >>> import beniget
    >>> class CaptureX(ast.NodeVisitor):
    ...
    ...     def __init__(self, module_node, fun):
    ...         self.fun = fun
    ...         # initialize use-def chains
    ...         du = beniget.DefUseChains()
    ...         du.visit(module_node)
    ...         self.chains = beniget.UseDefChains(du)
    ...         self.ancestors = beniget.Ancestors()
    ...         self.ancestors.visit(module_node)
    ...         self.external = list()
    ...         self.visited_external = set()
    ...
    ...     def visit_Name(self, node):
    ...         # register load of identifiers not locally defined
    ...         if isinstance(node.ctx, ast.Load):
    ...             uses = self.chains.chains[node]
    ...             for use in uses:
    ...                 try:
    ...                     parents = self.ancestors.parents(use.node)
    ...                 except KeyError:
    ...                     return # a builtin
    ...                 if self.fun not in parents:
    ...                         parent = self.ancestors.parentStmt(use.node)
    ...                         if parent not in self.visited_external:
    ...                             self.visited_external.add(parent)
    ...                             self.external.append(parent)
    ...                             self.rec(parent)
    ...
    ...     def rec(self, node):
    ...         "walk definitions to find their operands's def"
    ...         if isinstance(node, ast.Assign):
    ...             self.visit(node.value)
    ...         # TODO: implement this for AugAssign etc


    >>> code = 'a = 1; b = [a, a]; c = len(b)\ndef foo():\n return c'
    >>> module = ast.parse(code)
    >>> function = module.body[3]
    >>> capturex = CaptureX(module, function)
    >>> capturex.visit(function)
    >>> # the three top level assignments have been captured!
    >>> list(map(type, capturex.external))
    [<class 'gast.gast.Assign'>, <class 'gast.gast.Assign'>, <class 'gast.gast.Assign'>]

Acknowledgments
---------------

Beniget is in Pierre Augier's debt, for he triggered the birth of beniget and provided
countless meaningful bug reports and advices. Trugarez!