=====================[ Ghidra's Sleigh and P-code notes ]=====================
We encourage you to reproduce these steps in Ghidra and Sleigh. Remember that fluency only
comes with practice. With complex tools, everybody stumbles around a lot---the difference
between the expert user and a novice is how quickly they try things and recover.
Today we saw one of the most exciting designs of Ghidra and RE tools: the Sleigh
language for describing processor instruction sets, and translating them to a
universal set of virtual operations, P-code. This design not only enables us to
quickly add an instruction encoding (and operational semantics) to Ghidra's ARMv4t
disassembler, but also allows one to quickly create decompilers---much more complex
than disassemblers!---for new processors, by making the decompiler work off of the
universal P-code, into which specific ISAs are translated. Although some details
are lost, it's decompilers almost for free! (see notes below)
--------------[ Teaching Ghidra's disassembler to recognize a GBA instruction ]-------
Recall that we ran into a problem with one of the ROM's functions: it contained
several instructions that weren't recognized as valid. Disassembly failed, and
with it also failed decompilation and data flow analyses.
The failing instructions are at 08000e20, 08000e3c, 08000e48, and 08000e5c:
**************************************************************
* *
* FUNCTION *
**************************************************************
undefined process_screen_string()
undefined r0:1 <RETURN>
process_screen_string XREF[10]: 08442860(c), 08442894(c),
0844291c(c), 08442950(c),
08442a0c(c), 08442abc(c),
08442afc(c),
FUN_0844897c:0844897c(c),
084489b4(c), 084489ec(c)
08000e14 04 30 bd e4 ldrt r3,[sp],#0x4
08000e18 00 b0 a0 e1 mov r11,r0
08000e1c 04 00 bd e4 ldrt r0,[sp],#0x4
08000e20 b2 ?? B2h // Ouch!
08000e21 20 ?? 20h
08000e22 fb ?? FBh
08000e23 e0 ?? E0h
08000e24 ff c0 12 e2 ands r12,r2,#0xff
08000e28 1e ff 2f 01 bxeq lr
LAB_08000e2c XREF[1]: 08000e64(j)
08000e2c 22 94 a0 e1 mov r9,r2, lsr #0x8
08000e30 b0 40 53 e0 ldrh r4,[r3],#0x0
08000e34 3f 4b 04 e2 and r4,r4,#0xfc00
08000e38 09 40 84 e0 add r4,r4,r9
08000e3c b2 ?? B2h // Ouch!!
08000e3d 40 ?? 40h @
08000e3e e3 ?? E3h
08000e3f e0 ?? E0h
08000e40 01 c0 5c e2 subs r12,r12,#0x1
08000e44 1e ff 2f 01 bxeq lr
08000e48 b2 ?? B2h // Ouch!!!
08000e49 20 ?? 20h
08000e4a fb ?? FBh
08000e4b e0 ?? E0h
08000e4c ff 90 02 e2 and r9,r2,#0xff
08000e50 b0 40 53 e0 ldrh r4,[r3],#0x0
08000e54 3f 4b 04 e2 and r4,r4,#0xfc00
08000e58 09 40 84 e0 add r4,r4,r9
08000e5c b2 ?? B2h // Where's my instruction??
08000e5d 40 ?? 40h @
08000e5e e3 ?? E3h
08000e5f e0 ?? E0h
08000e60 01 c0 5c e2 subs r12,r12,#0x1
08000e64 f0 ff ff ca bgt LAB_08000e2c
08000e68 1e ff 2f e1 bx lr
On the other hand, our GBA emulator shows these instructions as valid: the first one is
ldrh r2, [r11], #0x2, the second one is
strh r4, [r3], #0x2, and so on.
At the same time, there's an ldrh instruction at 0x08000e30 that gets disassembled correctly.
Something isn't right---it seems that the disassembler misses a form of this load instruction,
and another of its corresponding store instruction, strh.
Looking up ldrh in the ARMv4t instruction manual (Ghidra's "Tools >> Processor Manual"),
we can see that the instruction matches the bit layout (p. 464, "A8.6.74 LDRH (immediate, ARM)").
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond | 0 0 0| P| U| 1| W| 1| Rn | Rt | imm4H |1 0 1 1| imm4L
Our first "bad" instruction is "e0 fb 20 b2" (remember: it's little-endian, but the GBA emulator
shows us the correctly ordered 4-byte word). The encoding looks dense, but in fact it's
really simple: the P, U, and W flags control variants of the instruction, as per pseudo-code
on this and the next page. For example, U controls whether the offset "imm" is added or subtracted.
So "cond" is 0xe (Always/unconditional, cf. p.320, "A8.3 Conditional execution"), P is 0
(the ARM post-indexing form rather than Thumb's pre-indexing: the address in Rn will be used
for the memory load, and the imm offset will be added to Rn thereafter), U is 1 (imm offset is added,
not subtracted), W is 1 (the base register Rn will be updated to Rn + imm), Rn is 0xb (r11),
Rt is 2, imm4H is 0 and imm4L is 2. All seems to be in order.
We note that the pseudocode in the manual directs us to the LDRHT form of the instruction,
p.470, but that instruction operates the same way, except for some memory protection details
likely not relevant for GBA.
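The field-by-field decoding above can be reproduced with a few shifts and masks. The following
is a minimal Python sketch, not part of Ghidra; the function name decode_halfword_imm and the
dictionary keys are made up for illustration and simply follow the field names of the manual's
bit layout.

```python
def decode_halfword_imm(word):
    """Split a 32-bit ARM word into the fields of the LDRH/STRH (immediate) layout."""
    return {
        "cond":  (word >> 28) & 0xF,
        "P":     (word >> 24) & 1,
        "U":     (word >> 23) & 1,
        "W":     (word >> 21) & 1,
        "L":     (word >> 20) & 1,   # 1 = load, 0 = store
        "Rn":    (word >> 16) & 0xF,
        "Rt":    (word >> 12) & 0xF,
        "imm4H": (word >> 8) & 0xF,
        "imm4L": word & 0xF,
    }

# Bytes as they appear in the listing at 08000e20 (little-endian in memory).
raw = bytes.fromhex("b220fbe0")
word = int.from_bytes(raw, "little")
fields = decode_halfword_imm(word)
imm8 = (fields["imm4H"] << 4) | fields["imm4L"]

print(hex(word))   # 0xe0fb20b2
print(fields)
print(imm8)        # 2
```

The same function can be used to check your answer to the exercise below; only the input bytes change.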
Suggested exercise: manually disassemble the STRH instructions from the above sample.
----------------[ Defining ISA parsing and semantics in Sleigh ]----------------
Fortunately, Ghidra's Sleigh language and tool have a very readable and easily adjustable
description of the ARMv4t assembly. These will allow us to teach Ghidra to parse the "bad"
instructions with a few simple edits (once we understand the structure of the language).
Processor/ISA descriptions are found in ghidra_10.1.1_PUBLIC/Ghidra/Processors/ARM/data/languages/.
We are specifically interested in ARMv4t, but the file ARM.ldefs defines all the ARM variations
that you can select in the "Languages" field when loading a Raw Binary image.
Looking inside this file, you'll see it reference processor specs (ARMt_v45.pspec for ARMv4t),
and the (compiled) ISA specs (e.g., ARM4t_le.sla). The latter is an XML file that can
be conveniently diff-ed, but not so conveniently read or written. This .sla file is compiled
by the tool ~/ghidra_10.1.1_PUBLIC/support/sleigh from ARM4t_le.slaspec .
The file ARM4t_le.slaspec is actually just a wrapper around ARM.sinc, which in turn
wraps ARMinstructions.sinc, where both the bitwise layout and semantics of actual
ARM instructions are defined:
% cat ARM4t_le.slaspec
@define ENDIAN "little"
@define T_VARIANT ""
@include "ARM.sinc"
% tail ARM.sinc
@include "ARMinstructions.sinc"
# THUMB instructions
@ifdef T_VARIANT
@include "ARMTHUMBinstructions.sinc"
@endif
When we look for ldrh in ARMinstructions.sinc, we find a reasonable-looking definition. The operational
semantics description ends with zero-extending 2 bytes retrieved from an address, as expected:
:ldrh^COND Rd,addrmode3 is $(AMODE) & COND & c2527=0 & L20=1 & c0407=11 & Rd & addrmode3
{
build COND;
build addrmode3;
Rd = zext( *:2 addrmode3);
}
The first line is a pattern for the disassembler to match to produce an ldrh instruction from
the binary bytes. The bitwise predicates to test are that the address is in the ARM mode ($(AMODE)),
the COND prefix looks OK, and several bit spans of the instruction match, such as c2527, L20, and c0407.
These are defined at the top of the file, and are quite obviously just bit ranges:
define token instrArm (32)
cond=(28,31)
I25=(25,25)
P24=(24,24) // that's our P flag in the ldrh instruction encoding
H24=(24,24)
L24=(24,24)
U23=(23,23) // that's our U flag
B22=(22,22)
N22=(22,22)
S22=(22,22)
op=(21,24)
W21=(21,21) // ..and W
S20=(20,20)
L20=(20,20)
Rn=(16,19) // ..and the base register number
<skip>
Rd=(12,15) // ..and the destination register, a.k.a. Rt in the processor manual
<skip>
c2527=(25,27)
<skip>
c2122=(21,22)
<skip>
c0407=(4,7)
... etc.
These bit ranges all seem to match perfectly. So the tricky part that doesn't match
our "e0 fb 20 b2" instruction is likely in the addrmode3.
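This check can also be done mechanically. The sketch below is plain Python, not Sleigh: it copies
the bit ranges from the token excerpt above into a table and tests the constraints of the ldrh
constructor against our word. The helper names (field, token_fields, matches_ldrh_constructor)
are hypothetical, and the $(AMODE), COND, and addrmode3 parts of the pattern are deliberately
left out.

```python
def field(word, lo, hi):
    """Return bits lo..hi (inclusive) of a 32-bit word, like a Sleigh token field."""
    width = hi - lo + 1
    return (word >> lo) & ((1 << width) - 1)

# Bit ranges copied from the "define token instrArm (32)" excerpt.
FIELDS = {
    "cond": (28, 31), "c2527": (25, 27), "P24": (24, 24), "U23": (23, 23),
    "c2122": (21, 22), "L20": (20, 20), "Rn": (16, 19), "Rd": (12, 15),
    "c0407": (4, 7),
}

def token_fields(word):
    """Extract every field in FIELDS from the word."""
    return {name: field(word, lo, hi) for name, (lo, hi) in FIELDS.items()}

def matches_ldrh_constructor(f):
    """Constraints of ":ldrh^COND", without $(AMODE), COND and addrmode3."""
    return f["c2527"] == 0 and f["L20"] == 1 and f["c0407"] == 11

ldrh = token_fields(0xE0FB20B2)
print(matches_ldrh_constructor(ldrh))            # True
print(ldrh["P24"], ldrh["U23"], ldrh["c2122"])   # 0 1 3
```

The constructor-level constraints hold, so whatever rejects the instruction has to be in the
P24/U23/c2122 combination that addrmode3 is asked to match.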
Looking at the definitions of addrmode3 at the top of the file, there are multiple
patterns to match based on the values of the P, U, and W bits (W is included in c2122
in these definitions).
For example, in the pre-index mode (P==1) for positive and negative offsets, without writeback
to the base register Rn:
addrmode3: [rn,"#"^off8] is P24=1 & U23=1 & c2122=2 & rn & immedH & c0707=1 & c0404=1 & immedL
[ off8=(immedH<<4)|immedL;]
{
local tmp = rn + off8; export tmp;
}
addrmode3: [rn,"#"^noff8] is P24=1 & U23=0 & c2122=2 & rn & immedH & c0707=1 & c0404=1 & immedL
[ noff8=-((immedH<<4)|immedL);]
{
local tmp = rn + noff8; export tmp;
}
Notably, the addrmode3 definitions have 10 cases for P24=1 but only 8 for P24=0 (find these in the code!):
% grep P24=0 ARMinstructions.sinc | grep addrmode3 | wc
8 156 810
% grep P24=1 ARMinstructions.sinc | grep addrmode3 | wc
10 180 934
So the reason for the "bad" instruction may be that we are missing some bitwise cases in these Sleigh
definitions.
Looking at the pseudocode for LDRH and LDRHT in the processor manual, it seems that the following
additional definitions would cover our "bad" instructions. Note that P is 0, but W (inside c2122) is 1,
just as we have it.
###### SB: postindex, with writeback, add
addrmode3: [rn],"#"^noff8 is P24=0 & U23=1 & c2122=3 & rn & immedH & c0707=1 & c0404=1 & immedL
[ noff8=(immedH<<4)|immedL;]
{
local tmp=rn; rn=rn + noff8; export tmp;
}
###### SB: postindex, with writeback, subtract
addrmode3: [rn],"#"^noff8 is P24=0 & U23=0 & c2122=3 & rn & immedH & c0707=1 & c0404=1 & immedL
[ noff8=-((immedH<<4)|immedL);]
{
local tmp=rn; rn=rn + noff8; export tmp;
}
Note that this code gives the postindex semantics: the original address in Rn is saved and used for the
memory access, and (only) then the value in Rn is updated.
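To make the ordering concrete, here is a small Python model of what the two new addrmode3 cases
plus the ldrh body are meant to do. It is only a sketch under simplifying assumptions: registers
are a dict, memory is a bytes object starting at address 0, and the function name ldrh_postindex
is invented for this note.

```python
def ldrh_postindex(regs, mem, rd, rn, offset):
    """Model of 'ldrh rd,[rn],#offset' with postindex semantics."""
    addr = regs[rn]                                   # local tmp = rn
    regs[rn] = (regs[rn] + offset) & 0xFFFFFFFF       # rn = rn + noff8
    # Rd = zext(*:2 tmp): read 2 bytes, little-endian, zero-extended
    regs[rd] = int.from_bytes(mem[addr:addr + 2], "little")

mem = bytes([0x34, 0x12, 0x78, 0x56])
regs = {"r2": 0, "r11": 0}

ldrh_postindex(regs, mem, "r2", "r11", 2)
first = (regs["r2"], regs["r11"])     # (0x1234, 2): loaded from the old address 0

ldrh_postindex(regs, mem, "r2", "r11", 2)
second = (regs["r2"], regs["r11"])    # (0x5678, 4): loaded from address 2

print([hex(v) for v in first], [hex(v) for v in second])
```

A preindex variant would compute the new address first and load from it; the first call would
then return 0x5678 instead of 0x1234.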
This seems to work!
% ~/ghidra_10.1.1_PUBLIC/support/sleigh ARM4t_le.slaspec
openjdk version "17.0.1" 2021-10-19
OpenJDK Runtime Environment Homebrew (build 17.0.1+1)
OpenJDK 64-Bit Server VM Homebrew (build 17.0.1+1, mixed mode)
INFO Using log config file: jar:file:/Users/user/ghidra_10.1.1_PUBLIC/Ghidra/Framework/Generic/lib/Generic.jar!/generic.log4j.xml (LoggingInitialization)
INFO Using log file: /Users/user/.ghidra/.ghidra_10.1.1_PUBLIC/application.log (LoggingInitialization)
WARN 150 NOP constructors found (SleighCompile)
WARN Use -n switch to list each individually (SleighCompile)
WARN 2 unnecessary extensions/truncations were converted to copies (SleighCompile)
WARN Use -u switch to list each individually (SleighCompile)
A new ARM4t_le.sla is produced. We quit and restart Ghidra, reload the GBA project, and suddenly we
have the full disassembly _and_ decompilation (you may need to press "d" on these instructions,
but now it works!):
**************************************************************
* *
* FUNCTION *
**************************************************************
undefined process_screen_string()
undefined r0:1 <RETURN>
process_screen_string XREF[10]: 08442860(c), 08442894(c),
0844291c(c), 08442950(c),
08442a0c(c), 08442abc(c),
08442afc(c),
FUN_0844897c:0844897c(c),
084489b4(c), 084489ec(c)
08000e14 04 30 bd e4 ldrt r3,[sp],#0x4
08000e18 00 b0 a0 e1 mov r11,r0
08000e1c 04 00 bd e4 ldrt r0,[sp],#0x4
08000e20 b2 20 fb e0 ldrh r2,[r11],#0x2 // Yay!
08000e24 ff c0 12 e2 ands r12,r2,#0xff
08000e28 1e ff 2f 01 bxeq lr
LAB_08000e2c XREF[1]: 08000e64(j)
08000e2c 22 94 a0 e1 mov r9,r2, lsr #0x8
08000e30 b0 40 53 e0 ldrh r4,[r3],#0x0
08000e34 3f 4b 04 e2 and r4,r4,#0xfc00
08000e38 09 40 84 e0 add r4,r4,r9
08000e3c b2 40 e3 e0 strh r4,[r3],#0x2 // Yay!!
08000e40 01 c0 5c e2 subs r12,r12,#0x1
08000e44 1e ff 2f 01 bxeq lr
08000e48 b2 20 fb e0 ldrh r2,[r11],#0x2
08000e4c ff 90 02 e2 and r9,r2,#0xff
08000e50 b0 40 53 e0 ldrh r4,[r3],#0x0
08000e54 3f 4b 04 e2 and r4,r4,#0xfc00
08000e58 09 40 84 e0 add r4,r4,r9
08000e5c b2 40 e3 e0 strh r4,[r3],#0x2 // Yay!!!
08000e60 01 c0 5c e2 subs r12,r12,#0x1
08000e64 f0 ff ff ca bgt LAB_08000e2c
08000e68 1e ff 2f e1 bx lr
-----------------[ Free decompilers for new processors! ]------------------
Disassemblers are tricky, but program analysis tools and decompilers are much
harder: years of expert developer effort, prior to Ghidra. Ghidra's truly revolutionary
contribution was to make building them much faster, within the reach of a dedicated developer's
hobby project.
Some stories of such rapid development:
https://guedou.github.io/talks/2019_BeeRump/slides.pdf -- "Implementing a New CPU Architecture for Ghidra",
@guedou (note the minimal examples of what worked)
https://swarm.ptsecurity.com/creating-a-ghidra-processor-module-in-sleigh-using-v8-bytecode-as-an-example/
-- "Creating a Ghidra processor module in SLEIGH using V8 bytecode as an example", Natalya Tlyapova
https://www.reddit.com/r/ghidra/comments/f5lk42/my_experience_writing_processor_modules/
https://habr.com/en/post/443318/ (applied to a WebAssembly challenge)
https://github.com/VGKintsugi/Ghidra-SegaSaturn-Processor
There's also a mock-up processor tutorial:
https://spinsel.dev/2020/06/17/ghidra-brainfuck-processor-1.html
(and many more)
----------------[ P-code ]----------------
Ghidra disassemblers produce not just a visual representation of the raw binary code they parse,
but also an executable operational semantics representation of the ISA, in P-code. Each raw assembly
instruction is translated into one or more---typically several---P-code instructions, which
fairly closely represent the pseudo-code of processor manuals.
At the core, P-code represents the common elementary blocks of CPU logic, such as integer
arithmetic at various widths, boolean logic operations, and control flow transfers such as
direct and indirect jumps. Raw P-code also includes CALL and RETURN opcodes that are
semantically equivalent to jumps. Unlike typical ISA calls and returns, they don't save
the return address, handle the call stack, etc., leaving all that to other explicit elementary
P-code instructions, but they preserve the programmer's or compiler's intent that those control
transfers are indeed meant to be calls or returns. These pseudo-instructions are a favorite
for Ghidra scripts and program analyses.
Basic P-code corresponding to raw disassembly (and explaining exactly what the particular
instructions do, so far as Ghidra knows from its Sleigh description of the CPU and ISA)
can be displayed by clicking on the "jenga" button and enabling the P-code field, as follows:
https://reverseengineering.stackexchange.com/questions/21297/can-ghidra-show-me-the-p-code-generated-for-an-instruction?noredirect=1
At the same time, P-code includes additional opcodes like MULTIEQUAL and USERDEFINED that
are inserted and used by advanced program analyses such as backward program path traversal
plugins (see below).
A good concise description of P-code can be found at
https://spinsel.dev/assets/2020-06-17-ghidra-brainfuck-processor-1/ghidra_docs/language_spec/html/pcoderef.html
Suggested exercise: Display, read, and understand P-code for BL, BX, ADD, LDRT, and other instructions.
----------------------[ Program analysis with P-code ]----------------------
P-code is exposed to Ghidra scripts and plugins for automating analysis of binary programs.
A very useful blogpost about using P-code for program analysis is
https://www.riverloopsecurity.com/blog/2019/05/pcode/
(accompanied by https://github.com/0xAlexei/INFILTRATE2019/blob/master/PCodeMallocDemo/MallocTrace.java)
(I mentioned it in class)
----------------------[ P-code is executable! ]----------------------
P-code is executable and can be run in an emulator. Executing P-code would be a good way
to check that the added instructions for the GBA example actually match the GBA
VisualGameBoy-m emulator's idea of what the recovered instructions actually do.
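To get a feel for what "executable" means here, the toy interpreter below runs a hand-written
list of P-code-like operations for "ldrh r2,[r11],#0x2". This is only an illustration in plain
Python: the op list was written by hand to mirror the Sleigh semantics above and is not copied
from Ghidra's output, the opcode names COPY, INT_ADD, LOAD, and INT_ZEXT are real P-code opcodes
but are heavily simplified (for instance, a real LOAD also names an address space and sizes live
in varnodes), and the run function is hypothetical.

```python
# Each op is (opcode, output name, inputs); string inputs name storage locations,
# integer inputs are constants.
OPS = [
    ("COPY",     "tmp",  ["r11"]),       # save the original base address
    ("INT_ADD",  "r11",  ["r11", 2]),    # write back base + offset
    ("LOAD",     "half", ["tmp"]),       # 2-byte read from the saved address
    ("INT_ZEXT", "r2",   ["half"]),      # zero-extend into the destination
]

def run(ops, state, mem):
    """Execute the simplified ops over a register/temporary dict and a bytes memory."""
    def val(x):
        return state[x] if isinstance(x, str) else x
    for opcode, out, ins in ops:
        a = [val(x) for x in ins]
        if opcode == "COPY":
            state[out] = a[0]
        elif opcode == "INT_ADD":
            state[out] = (a[0] + a[1]) & 0xFFFFFFFF
        elif opcode == "LOAD":
            state[out] = int.from_bytes(mem[a[0]:a[0] + 2], "little")
        elif opcode == "INT_ZEXT":
            state[out] = a[0] & 0xFFFF
        else:
            raise ValueError("unsupported opcode: " + opcode)
    return state

state = run(OPS, {"r11": 0, "r2": 0xFFFFFFFF}, bytes([0xCD, 0xAB, 0x00, 0x00]))
print(hex(state["r2"]), state["r11"])   # 0xabcd 2
```

Comparing the register state after such a run with what the GBA emulator reports for the same
instruction is exactly the kind of check suggested above; Ghidra's own emulator does this on the
real P-code.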
Some more information about P-code emulation:
https://medium.com/@cetfor/emulating-ghidras-pcode-why-how-dd736d22dfb
(Look at the Emulate Program description in the Ghidra release notes:
https://htmlpreview.github.io/?https://github.com/NationalSecurityAgency/ghidra/blob/Ghidra_10.1.2_build/Ghidra/Configurations/Public_Release/src/global/docs/WhatsNew.html:
"""
"Pure Emulation
There's a new action Emulate Program (next to the Debug Program button) to launch the
current program in Ghidra's p-code emulator. This is not a new "connector." Rather, it
starts a blank trace with the current program mapped in. The user can then step using the
usual "Emulate Step" actions in the "Threads" window. In general, this is sufficient to
run simple experiments or step through local regions of code. To modify emulated machine
state, use the "Watches" window. At the moment, no other provider can modify emulated
machine state.
This is also very useful in combination with the "P-code Stepper" window (this plugin must
be added manually via File->Configure). A language developer can, for example, assemble an
instruction that needs testing, start emulating with the cursor at that instruction, and
then step individual p-code ops in the "P-code Stepper" window.
"""
)
------
We will continue with examples of Ghidra scripts and plugins.