Functions, Types, and Structures
Disassembly is bytes-as-instructions. To turn it into something you can reason about, you annotate it with structure: function boundaries, parameter types, local variable layouts, and the C structs that the code is reading and writing. Radare2 has a complete type system for this, and using it well changes both the readability of disassembly and the quality of decompiler output.
Function commands
You met af (analyse function) in Part I. The full family:
| Command | What it does |
|---|---|
| af | analyse function at current/given address |
| af- | delete function |
| af+ addr name | hand-craft a function at addr with an explicit name |
| afn name | rename current function |
| afi | print function info (size, blocks, vars, callees) |
| afv | list local variables |
| afs | print/set function signature |
| afS | set the stack frame size of the function |
| afc | get/set calling convention |
| afcl | list available calling conventions |
| afb | list basic blocks |
| afx | list xrefs from this function |
| aft | propagate types from signature into body |
| afta | analyse types (full pass) |
| afm name | merge into named function |
A common cleanup workflow: r2 missed a function, for example because it is only reached through an indirect call. Define and annotate it manually:
[0x...]> s 0x08005000 # seek to the function start so the commands below apply to it
[0x...]> af # define
[0x...]> afn handle_packet # name it
[0x...]> afs int handle_packet(uint8_t *buf, size_t len) # signature
[0x...]> aft # propagate types into the body
Now decompiling shows handle_packet(buf, len) everywhere, with buf as uint8_t* and len as size_t.
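When the same four steps have to be repeated for many functions, it can help to generate the commands instead of typing them. The sketch below is one possible way to do that; the helper name annotate_commands is hypothetical, and it pins every command to the address with a temporary seek (@) so the result does not depend on where r2 is currently positioned.

```python
def annotate_commands(addr, name, signature):
    """Build the r2 commands that define, name, type, and propagate one function.

    Every command carries an explicit '@ addr' temporary seek, so the
    sequence works regardless of the current seek position.
    """
    at = f"@ {addr:#010x}"
    return [
        f"af {at}",               # define the function
        f"afn {name} {at}",       # name it
        f"afs {signature} {at}",  # set the signature
        f"aft {at}",              # propagate types into the body
    ]


if __name__ == "__main__":
    cmds = annotate_commands(
        0x08005000,
        "handle_packet",
        "int handle_packet(uint8_t *buf, size_t len)",
    )
    print("\n".join(cmds))
```

Redirecting the output to a file gives a script that r2 can run at startup with its -i option.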
Calling conventions
A wrong calling convention means decompilation guesses parameters wrong. R2 ships with a calling convention database; list:
[0x...]> afcl
arm32
arm64
arm32fastcall
... (architecture-dependent)
Set:
[0x...]> afc arm32 @ sym.foo
[0x...]> e anal.cc = arm32 # global default for new functions
For embedded ARM, the relevant ones are:
| Convention | When to use |
|---|---|
| arm32 | AAPCS: args in r0..r3 then stack, return in r0 |
| arm32fastcall | the few cases where the compiler used a non-standard cc |
| armcdecl | callee-cleanup variant (rare) |
| arm64 | AArch64 AAPCS: args in x0..x7 |
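To make the arm32 row concrete, here is a deliberately simplified model of where 32-bit AAPCS puts arguments. It is a sketch, not a full implementation of the standard: it assumes every argument is a single 32-bit word (no 64-bit values, doubles, structs, or variadic functions), and the function name assign_args is made up for illustration. The hard_float flag models the VFP variant, where by-value floats go to s0..s15 instead of the core registers.

```python
CORE_REGS = ["r0", "r1", "r2", "r3"]
VFP_REGS = [f"s{i}" for i in range(16)]  # s0..s15


def assign_args(kinds, hard_float=False):
    """Map argument kinds ('int', 'ptr', 'float') to their locations at function entry.

    Simplified: each argument is one 32-bit word. Arguments that do not fit
    in registers are placed on the stack in order, 4 bytes apart.
    """
    core = iter(CORE_REGS)
    vfp = iter(VFP_REGS)
    stack_offset = 0
    locations = []
    for kind in kinds:
        if kind not in ("int", "ptr", "float"):
            raise ValueError(f"unsupported argument kind: {kind}")
        # Under hard-float, by-value floats use the VFP bank; everything
        # else (and all floats under soft-float) uses the core registers.
        pool = vfp if (hard_float and kind == "float") else core
        reg = next(pool, None)
        if reg is None:
            reg = f"[sp, #{stack_offset}]"
            stack_offset += 4
        locations.append(reg)
    return locations


if __name__ == "__main__":
    # Hypothetical signature: void set_gain(int channel, float gain, int flags)
    sig = ["int", "float", "int"]
    print("soft-float:", assign_args(sig))
    print("hard-float:", assign_args(sig, hard_float=True))
```

For the hypothetical set_gain signature, the soft-float model places flags in r2, while the hard-float model places it in r1 because gain no longer consumes a core register. That shift is why a wrong float ABI assumption misplaces every parameter after the first float.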
For Cortex-M, the floating-point ABI matters: with FP-soft, all FP args go through integer registers; with FP-hard (most Cortex-M4F+), they use s0..s15. R2's arm32 default assumes soft-float; if your binary is hard-float, function signatures with float parameters will be wrong. Override per function:
[0x...]> afs int dsp_fft(float *in, float *out, int n) @ sym.dsp_fft
Local variables
R2 detects stack and register variables during analysis. List them in the current function:
[0x...]> afv
arg int arg1 @ r0
arg int arg2 @ r1
var int var_4h @ sp+0x4
var int var_8h @ sp+0x8
Rename a variable for readability:
[0x...]> afvn buf var_4h
[0x...]> afvn count var_8h
The renames flow through the whole function disassembly and through the decompiler output. This is one of the most impactful things you can do for a function you will spend more than 5 minutes on.
Set a type for a variable:
[0x...]> afvt buf "uint8_t *"
[0x...]> afvt count size_t
Then re-run decompilation (pdg) and the C output uses your types.
The type system
R2 has its own type definition language, accessible through t* commands.
| Command | Action |
|---|---|
| t | list all types |
| tk | query the types SDB |
| to | open / parse a header file |
| td | parse a string of C source |
| tos | load types from a parsed SDB database |
| t- | delete a type |
| tp | print a struct/type at an address |
| tl | link a type to an address (variable annotation) |
| tc | print all types as C definitions |
| ts | structure operations |
| tu | union operations |
| te | enum operations |
| tf | function-type operations |
| tn | manage noreturn function attributes |
Define a struct from C source:
[0x...]> "td struct usart_regs { uint32_t SR; uint32_t DR; uint32_t BRR; \
uint32_t CR1; uint32_t CR2; uint32_t CR3; uint32_t GTPR; };"
(Note the double-quoted form: without the quotes, r2 would treat each ; in the definition as a command separator.)
Or load from a header file:
[0x...]> to /usr/include/arpa/inet.h
[0x...]> to ./stm32f4_regs.h
Once a type is known, reference it:
[0x...]> tp usart_regs @ 0x40011000 # print as struct at address
[0x...]> tl usart_regs = 0x40011000 # link the type to that address
The link makes references to fields visible everywhere in the disassembly:
str r1, [r3] ; usart_regs.SR
str r2, [r3, #4] ; usart_regs.DR
Without the link you would see [r3] and [r3, #4] and have to do the offset arithmetic in your head.
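The offset arithmetic that the link saves you can be reproduced outside r2 as a cross-check. The following sketch mirrors the usart_regs definition with Python's ctypes and computes the absolute address of each field from the base address used above. The class name UsartRegs and the helper field_addresses are illustrative only, and the layout is the host's ctypes layout, which is assumed to match the target for these fixed-width fields.

```python
import ctypes


class UsartRegs(ctypes.Structure):
    # Same field order as the usart_regs struct defined above; every
    # register is a 32-bit word.
    _fields_ = [
        ("SR", ctypes.c_uint32),
        ("DR", ctypes.c_uint32),
        ("BRR", ctypes.c_uint32),
        ("CR1", ctypes.c_uint32),
        ("CR2", ctypes.c_uint32),
        ("CR3", ctypes.c_uint32),
        ("GTPR", ctypes.c_uint32),
    ]


USART_BASE = 0x40011000  # address the type is linked to in the text


def field_addresses(struct, base):
    """Return {field name: absolute address} for a ctypes structure at base."""
    return {
        name: base + getattr(struct, name).offset
        for name, _ctype in struct._fields_
    }


if __name__ == "__main__":
    for name, addr in field_addresses(UsartRegs, USART_BASE).items():
        print(f"{addr:#010x}  usart_regs.{name}")
```

A store to [r3, #4] with r3 holding the base therefore lands on DR, which is exactly the annotation r2 adds once the type is linked.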
Importing vendor headers
The big payoff: load the vendor's CMSIS / SDK headers and r2 understands the entire peripheral block.
[0x...]> to /opt/STM32CubeF4/Drivers/CMSIS/Device/ST/STM32F4xx/Include/stm32f407xx.h
[0x...]> tl USART_TypeDef = 0x40011000
[0x...]> tl GPIO_TypeDef = 0x40020000
... etc.
Now the disassembly reads as actual struct field accesses. Every register access has a name, and the decompiler output reads much closer to the original C.
Warning
Vendor headers are huge and contain a lot of preprocessor magic that r2's to parser cannot fully chew. If to fails, run the header through cpp first:
$ cpp -P -DSTM32F407xx -nostdinc -I /opt/STM32CubeF4/Drivers/CMSIS/Include \
stm32f407xx.h > stm32_flat.h
Then load the result with to stm32_flat.h. You may need to delete a few unsupported constructs (__attribute__ extensions r2 does not parse) by hand.
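Deleting __attribute__ annotations by hand gets tedious on a large flattened header, so a small filter can do it instead. This is a minimal sketch, assuming the input has already been through cpp (so comments are gone) and that no string literal contains the text __attribute__. The function name strip_attributes is hypothetical. It counts parenthesis depth rather than using a regular expression, because attribute arguments can nest, as in aligned(4).

```python
def strip_attributes(source):
    """Remove every __attribute__((...)) from C source text.

    Nested parentheses inside the attribute are handled by counting depth.
    A bare __attribute__ with no parenthesised argument is left untouched.
    """
    marker = "__attribute__"
    out = []
    i = 0
    while i < len(source):
        j = source.find(marker, i)
        if j == -1:
            out.append(source[i:])
            break
        out.append(source[i:j])
        k = j + len(marker)
        # Allow whitespace between the keyword and its argument list.
        while k < len(source) and source[k].isspace():
            k += 1
        if k < len(source) and source[k] == "(":
            depth = 0
            while k < len(source):
                if source[k] == "(":
                    depth += 1
                elif source[k] == ")":
                    depth -= 1
                    if depth == 0:
                        k += 1  # consume the closing parenthesis
                        break
                k += 1
            i = k
        else:
            out.append(marker)
            i = j + len(marker)
    return "".join(out)


if __name__ == "__main__":
    sample = "typedef struct __attribute__((packed, aligned(4))) { int a; } T;"
    print(strip_attributes(sample))
```

Applied to the text of stm32_flat.h, this leaves a file with one fewer class of constructs for the to parser to trip over; anything else it rejects still needs manual attention.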
Enums
Enums are how you make integer constants readable. Define:
[0x...]> "td enum gpio_mode { input=0, output=1, alt=2, analog=3 };"
Link a function parameter to the enum so its values render as names:
[0x...]> afvt mode "enum gpio_mode" @ sym.gpio_set_mode
Now a call site mov r1, #2; bl gpio_set_mode shows gpio_set_mode(r0, gpio_mode.alt) in the decompiler output.
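The value-to-name rendering is easy to picture with an ordinary enum. The sketch below models the same gpio_mode definition in Python and formats an immediate the way the decompiler output above does. The names GpioMode and render_mode are illustrative, and falling back to a hex literal for unknown values is an assumption of this sketch, not a statement about what r2 prints.

```python
from enum import IntEnum


class GpioMode(IntEnum):
    # Same members and values as the gpio_mode enum defined above.
    input = 0
    output = 1
    alt = 2
    analog = 3


def render_mode(value):
    """Render an integer as gpio_mode.<name>, or as hex if it is not a member."""
    try:
        return f"gpio_mode.{GpioMode(value).name}"
    except ValueError:
        return hex(value)


if __name__ == "__main__":
    # mov r1, #2 ; bl gpio_set_mode
    print(f"gpio_set_mode(r0, {render_mode(2)})")
```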
Function signatures and aft
afs sets a function's signature. aft propagates types from a function's signature into its body — when you set arg1 as uint8_t *buf, aft follows reads and writes through that pointer and types those locations too.
The right order is:
- Set the function signature: afs.
- Run aft for that function (or afta to do the whole binary).
- Decompile (pdg) and check that types flow.
afta (the all-binary type pass) is part of aaaa. It is slow on big binaries but pays for itself: types propagate transitively, so naming one root function correctly cleans up many descendants.
Structures from indirect access patterns
When you see code like:
ldr r1, [r0, #0]
ldr r2, [r0, #4]
ldrb r3, [r0, #8]
strb r3, [r0, #9]
you are looking at struct field access. R2 can sometimes infer the structure layout, but you usually do better by defining it yourself:
[0x...]> "td struct packet { uint32_t magic; uint32_t length; uint8_t flags; uint8_t kind; };"
[0x...]> afvt packet "struct packet *" @ sym.handle_packet
[0x...]> aft @ sym.handle_packet
After aft, the disassembly shows packet.magic, packet.length, packet.flags, packet.kind instead of offsets. Decompiler output is correspondingly clearer.
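Before committing to a struct definition, it is worth checking that every offset seen in the code lands on a field. The sketch below does that lookup for the packet struct using ctypes; the class name Packet and the helper field_at are hypothetical, and the host's ctypes layout is assumed to match the target for these fixed-width members.

```python
import ctypes


class Packet(ctypes.Structure):
    # Mirrors the struct packet definition above.
    _fields_ = [
        ("magic", ctypes.c_uint32),
        ("length", ctypes.c_uint32),
        ("flags", ctypes.c_uint8),
        ("kind", ctypes.c_uint8),
    ]


def field_at(struct, offset, label="packet"):
    """Return 'label.field' for the field covering offset, or None for padding."""
    for name, _ctype in struct._fields_:
        field = getattr(struct, name)
        if field.offset <= offset < field.offset + field.size:
            return f"{label}.{name}"
    return None


if __name__ == "__main__":
    # Offsets taken from the load/store instructions shown above.
    for off in (0, 4, 8, 9):
        print(f"[r0, #{off}] -> {field_at(Packet, off)}")
```

An offset that returns None, or that falls in the middle of a wider field, is a hint that the guessed layout is missing a member or has a wrong field size.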
Saving types
Types live in the project (saved by Ps). They can also be exported as C source:
[0x...]> tc # all types as C
[0x...]> tc struct packet # one type
[0x...]> tc > types.h # to a file
Commit types.h to your firmware-RE git repo. Reload in another session with to types.h. This is the cleanest way to share the structural understanding of a binary across people or sessions.