contrib/libs/openssl/crypto/engine/README


Notes: 2001-09-24 
----------------- 
 
This "description" (if one chooses to call it that) needed some major updating 
so here goes. This update addresses a change being made at the same time to 
OpenSSL, and it pretty much completely restructures the underlying mechanics of 
the "ENGINE" code. So it serves a double purpose of being an "ENGINE internals
for masochists" document *and* a rather extensive commit log message. (I'd get 
lynched for sticking all this in CHANGES or the commit mails :-). 
 
ENGINE_TABLE underlies this restructuring, as described in the internal header 
"eng_local.h", implemented in eng_table.c, and used in each of the "class" files; 
tb_rsa.c, tb_dsa.c, etc. 
 
However, "EVP_CIPHER" underlies the motivation and design of ENGINE_TABLE so 
I'll mention a bit about that first. EVP_CIPHER (and most of this applies 
equally to EVP_MD for digests) is both a "method" and an algorithm/mode
identifier that, in the current API, "lingers". These cipher description + 
implementation structures can be defined or obtained directly by applications, 
or can be loaded "en masse" into EVP storage so that they can be catalogued and 
searched in various ways, ie. two ways of encrypting with the "des_cbc" 
algorithm/mode pair are; 
 
(i) directly; 
     const EVP_CIPHER *cipher = EVP_des_cbc(); 
     EVP_EncryptInit(&ctx, cipher, key, iv); 
     [ ... use EVP_EncryptUpdate() and EVP_EncryptFinal() ...] 
 
(ii) indirectly; 
     OpenSSL_add_all_ciphers(); 
     cipher = EVP_get_cipherbyname("des_cbc"); 
     EVP_EncryptInit(&ctx, cipher, key, iv); 
     [ ... etc ... ] 
 
The latter is more generally used because it also allows ciphers/digests to be 
looked up based on other identifiers which can be useful for automatic cipher 
selection, eg. in SSL/TLS, or by user-controllable configuration. 
 
The important point about this is that EVP_CIPHER definitions and structures are 
passed around with impunity and there is no safe way, without requiring massive 
rewrites of many applications, to assume that EVP_CIPHERs can be reference 
counted. Once an EVP_CIPHER is exposed to the caller, neither it nor anything it
comes from can "safely" be destroyed. Unless of course the way of getting to 
such ciphers is via entirely distinct API calls that didn't exist before. 
However existing API usage cannot be made to understand when an EVP_CIPHER 
pointer, that has been passed to the caller, is no longer being used. 
 
The other problem with the existing API w.r.t. hooking EVP_CIPHER support
into ENGINE is storage - the OBJ_NAME-based storage used by EVP to register 
ciphers simultaneously registers cipher *types* and cipher *implementations* - 
they are effectively the same thing, an "EVP_CIPHER" pointer. The problem with 
hooking in ENGINEs is that multiple ENGINEs may implement the same ciphers. The 
solution is necessarily that ENGINE-provided ciphers simply are not registered, 
stored, or exposed to the caller in the same manner as existing ciphers. This is 
especially necessary considering the fact ENGINE uses reference counts to allow 
for cleanup, modularity, and DSO support - yet EVP_CIPHERs, as exposed to 
callers in the current API, support no such controls. 
 
Another sticking point for integrating cipher support into ENGINE is linkage. 
Already there is a problem with the way ENGINE supports RSA, DSA, etc whereby 
they are available *because* they're part of a giant ENGINE called "openssl". 
Ie. all implementations *have* to come from an ENGINE, but we get round that by 
having a giant ENGINE with all the software support encapsulated. This creates 
linker hassles if nothing else - linking a 1-line application that calls 2 basic 
RSA functions (eg. "RSA_free(RSA_new());") will result in large quantities of 
ENGINE code being linked in *and* because of that DSA, DH, and RAND also. If we 
continue with this approach for EVP_CIPHER support (even if it *was* possible) 
we would lose our ability to link selectively by selectively loading certain 
implementations of certain functionality. Touching any part of any kind of 
crypto would result in massive static linkage of everything else. So the 
solution is to change the way ENGINE feeds existing "classes", ie. how the 
hooking to ENGINE works from RSA, DSA, DH, RAND, as well as adding new hooking 
for EVP_CIPHER, and EVP_MD. 
 
The way this is now being done is by mostly reverting back to how things used to 
work prior to ENGINE :-). Ie. RSA now has a "RSA_METHOD" pointer again - this 
was previously replaced by an "ENGINE" pointer and all RSA code that required 
the RSA_METHOD would call ENGINE_get_RSA() each time on its ENGINE handle to 
temporarily get and use the ENGINE's RSA implementation. Apart from being more 
efficient, switching back to each RSA having an RSA_METHOD pointer also allows 
us to conceivably operate with *no* ENGINE. As we'll see, this removes any need 
for a fallback ENGINE that encapsulates default implementations - we can simply 
have our RSA structure pointing its RSA_METHOD pointer to the software 
implementation and have its ENGINE pointer set to NULL. 
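
As a rough illustration of the paragraph above, here is a minimal sketch of an
object that caches its method pointer once and treats a NULL ENGINE as "use the
software default". Every name in it (TOY_RSA, TOY_ENGINE, toy_rsa_init, and so
on) is a hypothetical stand-in invented for this example; none of them are the
real OpenSSL structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real OpenSSL definitions. */
typedef struct toy_rsa_method {
    const char *name;
    int (*priv_op)(int input);      /* stand-in for a private-key operation */
} TOY_RSA_METHOD;

typedef struct toy_engine {
    const char *id;
    const TOY_RSA_METHOD *rsa_meth; /* NULL if the engine has no RSA support */
} TOY_ENGINE;

typedef struct toy_rsa {
    const TOY_RSA_METHOD *meth;     /* resolved once, at creation time */
    TOY_ENGINE *engine;             /* NULL means "no ENGINE involved" */
} TOY_RSA;

/* The built-in software implementation used when no engine applies. */
static int soft_priv_op(int input) { return input + 1; }
static const TOY_RSA_METHOD soft_method = { "software", soft_priv_op };

/* Resolve the method once; a NULL engine selects the software default. */
void toy_rsa_init(TOY_RSA *rsa, TOY_ENGINE *engine)
{
    rsa->engine = engine;
    rsa->meth = (engine != NULL && engine->rsa_meth != NULL)
                    ? engine->rsa_meth
                    : &soft_method;
    assert(rsa->meth != NULL);
}

/* Every operation goes straight through the cached method pointer. */
int toy_rsa_priv_op(const TOY_RSA *rsa, int input)
{
    return rsa->meth->priv_op(input);
}
```

Calling toy_rsa_init(&rsa, NULL) leaves the engine pointer NULL and routes
toy_rsa_priv_op() to the software method, with no per-call engine lookup.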
 
A look at the EVP_CIPHER hooking is most explanatory; the RSA, DSA (etc) cases
turn out to be degenerate forms of the same thing. The EVP storage of ciphers, 
and the existing EVP API functions that return "software" implementations and 
descriptions remain untouched. However, the storage takes more meaning in terms 
of "cipher description" and less meaning in terms of "implementation". When an 
EVP_CIPHER_CTX is actually initialised with an EVP_CIPHER method and is about to 
begin en/decryption, the hooking to ENGINE comes into play. What happens is that 
cipher-specific ENGINE code is asked for an ENGINE pointer (a functional 
reference) for any ENGINE that is registered to perform the algo/mode that the 
provided EVP_CIPHER structure represents. Under normal circumstances, that 
ENGINE code will return NULL because no ENGINEs will have had any cipher 
implementations *registered*. As such, a NULL ENGINE pointer is stored in the 
EVP_CIPHER_CTX context, and the EVP_CIPHER structure is left hooked into the 
context and so is used as the implementation. Pretty much how things work now 
except we'd have a redundant ENGINE pointer set to NULL and doing nothing. 
 
Conversely, if an ENGINE *has* been registered to perform the algorithm/mode 
combination represented by the provided EVP_CIPHER, then a functional reference 
to that ENGINE will be returned to the EVP_CIPHER_CTX during initialisation. 
That functional reference will be stored in the context (and released on 
cleanup) - and having that reference provides a *safe* way to use an EVP_CIPHER 
definition that is private to the ENGINE. Ie. the EVP_CIPHER provided by the 
application will actually be replaced by an EVP_CIPHER from the registered 
ENGINE - it will support the same algorithm/mode as the original but will be a 
completely different implementation. Because this EVP_CIPHER isn't stored in the 
EVP storage, nor is it returned to applications from traditional API functions, 
there is no associated problem with it not having reference counts. And of 
course, when one of these "private" cipher implementations is hooked into 
EVP_CIPHER_CTX, it is done whilst the EVP_CIPHER_CTX holds a functional 
reference to the ENGINE that owns it, thus the use of the ENGINE's EVP_CIPHER is 
safe. 
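
The two cases described above (no ENGINE registered, or an ENGINE registered for
the nid) can be sketched roughly as follows. This is a toy model under stated
assumptions: the types and functions (TOY_CIPHER_CTX, toy_get_cipher_engine,
toy_ctx_init, ...) are hypothetical, and at most one engine can be registered,
purely to keep the example short. It is not the actual EVP or tb_cipher.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real OpenSSL definitions. */
typedef struct toy_cipher {
    int nid;                  /* identifies the algorithm/mode */
    const char *impl;         /* label for the implementation */
} TOY_CIPHER;

typedef struct toy_engine {
    int funct_ref;            /* functional reference count */
    int nid;                  /* the single nid this toy engine handles */
    TOY_CIPHER cipher;        /* implementation private to the engine */
} TOY_ENGINE;

typedef struct toy_cipher_ctx {
    const TOY_CIPHER *cipher; /* implementation actually used */
    TOY_ENGINE *engine;       /* NULL unless an engine was selected */
} TOY_CIPHER_CTX;

/* At most one registered engine, for brevity. NULL means "none". */
static TOY_ENGINE *registered = NULL;

/* Return a functional reference to an engine for 'nid', or NULL. */
TOY_ENGINE *toy_get_cipher_engine(int nid)
{
    if (registered == NULL || registered->nid != nid)
        return NULL;
    registered->funct_ref++;
    return registered;
}

/* Swap in the engine's private cipher if one is registered for this nid. */
void toy_ctx_init(TOY_CIPHER_CTX *ctx, const TOY_CIPHER *cipher)
{
    ctx->engine = toy_get_cipher_engine(cipher->nid);
    ctx->cipher = (ctx->engine != NULL) ? &ctx->engine->cipher : cipher;
}

/* Release the functional reference held by the context, if any. */
void toy_ctx_cleanup(TOY_CIPHER_CTX *ctx)
{
    if (ctx->engine != NULL) {
        assert(ctx->engine->funct_ref > 0);
        ctx->engine->funct_ref--;
    }
    ctx->engine = NULL;
    ctx->cipher = NULL;
}
```

The engine's private cipher is only reachable through a context that holds a
functional reference, which is the property that makes using it safe.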
 
The "cipher-specific ENGINE code" I mentioned is implemented in tb_cipher.c but 
in essence it is simply an instantiation of "ENGINE_TABLE" code for use by 
EVP_CIPHER code. tb_digest.c is virtually identical but, of course, it is for 
use by EVP_MD code. Ditto for tb_rsa.c, tb_dsa.c, etc. These instantiations of 
ENGINE_TABLE essentially provide linker-separation of the classes so that even 
if ENGINEs implement *all* possible algorithms, an application using only 
EVP_CIPHER code will link at most code relating to EVP_CIPHER, tb_cipher.c, core 
ENGINE code that is independent of class, and of course the ENGINE 
implementation that the application loaded. It will *not* however link any 
class-specific ENGINE code for digests, RSA, etc nor will it bleed over into 
other APIs, such as the RSA/DSA/etc library code. 
 
ENGINE_TABLE is a little more complicated than may seem necessary but this is 
mostly to avoid a lot of "init()"-thrashing on ENGINEs (that may have to load 
DSOs, and other expensive setup that shouldn't be thrashed unnecessarily) *and* 
to duplicate "default" behaviour. Basically an ENGINE_TABLE instantiation, for 
example tb_cipher.c, implements a hash-table keyed by integer "nid" values. 
These nids provide the uniqueness of an algorithm/mode - and each nid will hash
to a potentially NULL "ENGINE_PILE". An ENGINE_PILE is essentially a list of 
pointers to ENGINEs that implement that particular 'nid'. Each "pile" uses some 
caching tricks such that requests on that 'nid' will be cached and all future 
requests will return immediately (well, at least with minimal operation) unless 
a change is made to the pile, eg. perhaps an ENGINE was unloaded. The reason is 
that an application could have support for 10 ENGINEs statically linked 
in, and the machine in question may not have any of the hardware those 10 
ENGINEs support. If each of those ENGINEs has a "des_cbc" implementation, we 
want to avoid every EVP_CIPHER_CTX setup from trying (and failing) to initialise 
each of those 10 ENGINEs. Instead, the first such request will try to do that 
and will either return (and cache) a NULL ENGINE pointer or will return a 
functional reference to the first that successfully initialised. In the latter 
case it will also cache an extra functional reference to the ENGINE as a
"default" for that 'nid'. The caching is acknowledged by an 'uptodate' variable
that is unset only if un/registration takes place on that pile. Ie. if 
implementations of "des_cbc" are added or removed. This behaviour can be 
tweaked; the ENGINE_TABLE_FLAG_NOINIT value can be passed to 
ENGINE_set_table_flags(), in which case the only ENGINEs that tb_cipher.c will 
try to initialise from the "pile" will be those that are already initialised 
(ie. it's simply an increment of the functional reference count, and no real 
"initialisation" will take place). 
 
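The caching behaviour described above might be modelled along these lines. This
is only a sketch: TOY_PILE, toy_pile_register and toy_pile_select are
hypothetical names, the pile is a fixed-size array rather than whatever
eng_table.c really uses, and "initialisation" is reduced to a flag check plus a
counter so that the number of attempts can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ENGINES 10

/* Hypothetical stand-ins; these are NOT the real OpenSSL definitions. */
typedef struct toy_engine {
    int hardware_present;     /* whether initialisation can succeed */
    int init_attempts;        /* counts (expensive) initialisation calls */
    int funct_ref;            /* functional reference count */
} TOY_ENGINE;

typedef struct toy_pile {
    TOY_ENGINE *engines[TOY_MAX_ENGINES];
    int count;
    TOY_ENGINE *funct;        /* cached default for this nid, may be NULL */
    int uptodate;             /* cleared whenever the pile changes */
} TOY_PILE;

/* Pretend-expensive initialisation; takes a functional ref on success. */
int toy_engine_init(TOY_ENGINE *e)
{
    e->init_attempts++;
    if (!e->hardware_present)
        return 0;
    e->funct_ref++;
    return 1;
}

/* Registration invalidates the cache so the next query rescans the pile. */
int toy_pile_register(TOY_PILE *p, TOY_ENGINE *e)
{
    if (p->count >= TOY_MAX_ENGINES)
        return 0;
    p->engines[p->count++] = e;
    p->uptodate = 0;
    return 1;
}

/* Return a functional reference to the default engine, or NULL. */
TOY_ENGINE *toy_pile_select(TOY_PILE *p)
{
    int i;

    assert(p != NULL);
    if (!p->uptodate) {
        if (p->funct != NULL) {           /* drop the stale cached reference */
            p->funct->funct_ref--;
            p->funct = NULL;
        }
        for (i = 0; i < p->count; i++) {
            if (toy_engine_init(p->engines[i])) {
                p->funct = p->engines[i]; /* the pile keeps this reference */
                break;
            }
        }
        p->uptodate = 1;                  /* a NULL result is cached too */
    }
    if (p->funct != NULL)
        p->funct->funct_ref++;            /* the caller's own reference */
    return p->funct;
}
```

Once the pile is up to date, a repeated toy_pile_select() only increments a
reference count; engines whose hardware is absent are not retried until the
pile is modified again.
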
RSA, DSA, DH, and RAND all have their own ENGINE_TABLE code as well, and the 
difference is that they all use an implicit 'nid' of 1. Whereas EVP_CIPHERs are 
actually qualitatively different depending on 'nid' (the "des_cbc" EVP_CIPHER is 
not an interoperable implementation of "aes_256_cbc"), RSA_METHODs are 
necessarily interoperable and don't have different flavours, only different 
implementations. In other words, the ENGINE_TABLE for RSA will either be empty, 
or will have a single ENGINE_PILE hashed to by the 'nid' 1 and that pile 
represents ENGINEs that implement the single "type" of RSA there is. 
 
Cleanup - the registration and unregistration may pose questions about how 
cleanup works with the ENGINE_PILE doing all this caching nonsense (ie. when the 
application or EVP_CIPHER code releases its last reference to an ENGINE, the 
ENGINE_PILE code may still have references and thus those ENGINEs will stay 
hooked in forever). The way this is handled is via "unregistration". With these 
new ENGINE changes, an abstract ENGINE can be loaded and initialised, but that 
is an algorithm-agnostic process. Even if initialised, it will not have 
registered any of its implementations (to do so would link all class "table" 
code despite the fact the application may use only ciphers, for example). This 
is deliberately a distinct step. Moreover, registration and unregistration have
nothing to do with whether an ENGINE is *functional* or not (ie. you can even 
register an ENGINE and its implementations without it being operational, you may 
not even have the drivers to make it operate). What actually happens with 
respect to cleanup is managed inside eng_lib.c with the "engine_cleanup_***" 
functions. These functions are internal-only and each part of ENGINE code that 
could require cleanup will, upon performing its first allocation, register a 
callback with the "engine_cleanup" code. The other part of this that makes it 
tick is that the ENGINE_TABLE instantiations (tb_***.c) use NULL as their 
initialised state. So if RSA code asks for an ENGINE and no ENGINE has 
registered an implementation, the code will simply return NULL and the tb_rsa.c 
state will be unchanged. Thus, no cleanup is required unless registration takes 
place. ENGINE_cleanup() will simply iterate across a list of registered cleanup 
callbacks calling each in turn, and will then internally delete its own storage 
(a STACK). When a cleanup callback is next registered (eg. if the cleanup() is 
part of a graceful restart and the application wants to cleanup all state then 
start again), the internal STACK storage will be freshly allocated. This is much 
the same as the situation in the ENGINE_TABLE instantiations ... NULL is the 
initialised state, so only modification operations (not queries) will cause that 
code to have to register a cleanup. 
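
The "NULL is the initialised state" idea and the lazily registered cleanup
callbacks can be illustrated with a small sketch. Everything here is
hypothetical (toy_cleanup_add, toy_cleanup, the one-slot rsa_table); the real
code lives in eng_lib.c and the tb_***.c files and uses a STACK rather than
the hand-rolled linked list assumed below.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*toy_cleanup_cb)(void);

/* Hypothetical stand-in for the internal cleanup storage. */
typedef struct toy_cleanup_item {
    toy_cleanup_cb cb;
    struct toy_cleanup_item *next;
} TOY_CLEANUP_ITEM;

/* NULL is the initialised (empty) state; nothing is allocated up front. */
static TOY_CLEANUP_ITEM *cleanup_stack = NULL;

/* Register a callback; storage is allocated only on demand. */
int toy_cleanup_add(toy_cleanup_cb cb)
{
    TOY_CLEANUP_ITEM *item;

    assert(cb != NULL);
    item = malloc(sizeof(*item));
    if (item == NULL)
        return 0;
    item->cb = cb;
    item->next = cleanup_stack;
    cleanup_stack = item;
    return 1;
}

/* Call each callback in turn, then leave the storage back at NULL. */
void toy_cleanup(void)
{
    while (cleanup_stack != NULL) {
        TOY_CLEANUP_ITEM *item = cleanup_stack;
        cleanup_stack = item->next;
        item->cb();
        free(item);
    }
}

/* A toy one-slot "table": NULL until something is registered. */
static int *rsa_table = NULL;

static void rsa_table_cleanup(void)
{
    free(rsa_table);
    rsa_table = NULL;
}

/* Modification: the first allocation also registers the cleanup callback. */
int toy_rsa_register(int engine_id)
{
    if (rsa_table == NULL) {
        rsa_table = malloc(sizeof(*rsa_table));
        if (rsa_table == NULL)
            return 0;
        if (!toy_cleanup_add(rsa_table_cleanup)) {
            free(rsa_table);
            rsa_table = NULL;
            return 0;
        }
    }
    *rsa_table = engine_id;
    return 1;
}

/* Query: never allocates, so it never needs a cleanup. 0 means "none". */
int toy_rsa_query(void)
{
    return rsa_table != NULL ? *rsa_table : 0;
}
```

After toy_cleanup() both the table and the callback list are back at NULL, so
a later toy_rsa_register() simply allocates and registers afresh, which is the
"graceful restart" case mentioned above.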
 
What else? The bignum callbacks and associated ENGINE functions have been 
removed for two obvious reasons; (i) there was no way to generalise them to the 
mechanism now used by RSA/DSA/..., because there's no such thing as a BIGNUM 
method, and (ii) because of (i), there was no meaningful way for library or 
application code to automatically hook and use ENGINE supplied bignum functions 
anyway. Also, ENGINE_cpy() has been removed (although an internal-only version 
exists) - the idea of providing an ENGINE_cpy() function probably wasn't a good 
one and now certainly doesn't make sense in any generalised way. Some of the 
RSA, DSA, DH, and RAND functions that were fiddled during the original ENGINE 
changes have now, as a consequence, been reverted back. This is because the 
hooking of ENGINE is now automatic (and passive, it can internally use a NULL 
ENGINE pointer to simply ignore ENGINE from then on). 
 
Hell, that should be enough for now ... comments welcome.