A First Look at angr
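The angr documentation is entertaining and detailed enough to work as bedtime reading, unlike the documentation of a certain other binary analysis tool...
Reference links: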
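Virtual environment
The angr developers recommend running angr in a virtual environment to avoid conflicts with other installed packages (for example, keystone conflicts with keystone-engine; keystone-engine is the one used here).
Usage examples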
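Importing the module
In an interactive session you can also import monkeyhex, which prints numeric results in hexadecimal.
Basic properties of a project
Working with basic blocks
Simulated state (SimState)
Simulation Managers
Analyses
(This explanation may not be accurate; I will verify it over time.)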
The Loader
Symbols and Relocations
Finding symbol addresses with CLE
The Symbol object has three ways of reporting its address:
- .rebased_addr is its address in the global address space. This is what is shown in the print output.
- .linked_addr is its address relative to the prelinked base of the binary. This is the address reported in, for example, readelf(1).
- .relative_addr is its address relative to the object base. This is known in the literature (particularly the Windows literature) as an RVA (relative virtual address).
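The relationship between the three addresses can be sketched with plain arithmetic. This is a simplified model, not CLE's own code: relative_addr is treated as an offset from the object base, and the other two forms add a base to it. The parameter names linked_base and mapped_base mirror attributes that CLE exposes on loaded objects; the numbers in the note below are made up for illustration.

```python
def symbol_addresses(relative_addr, linked_base, mapped_base):
    """Derive the three address forms of a symbol from its RVA."""
    return {
        # offset from the base of the object that owns the symbol (RVA)
        "relative_addr": relative_addr,
        # address as seen at the prelinked base (what readelf reports)
        "linked_addr": linked_base + relative_addr,
        # address in the global address space after the loader maps the object
        "rebased_addr": mapped_base + relative_addr,
    }
```

For a hypothetical library symbol at offset 0x89cd0, in an object linked at base 0 and mapped at 0x1000000, symbol_addresses(0x89cd0, 0x0, 0x1000000) yields a rebased_addr of 0x1089cd0, while linked_addr and relative_addr both stay at 0x89cd0.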
Loading Options
If you are loading something with angr.Project and you want to pass an option to the cle.Loader instance that Project implicitly creates, you can just pass the keyword argument directly to the Project constructor, and it will be passed on to CLE. You should look at the CLE API docs if you want to know everything that could possibly be passed in as an option, but we will go over some important and frequently used options here.
We’ve discussed auto_load_libs already - it enables or disables CLE’s attempt to automatically resolve shared library dependencies, and is on by default. Additionally, there is the opposite, except_missing_libs, which, if set to true, will cause an exception to be thrown whenever a binary has a shared library dependency that cannot be resolved.
You can pass a list of strings to force_load_libs and anything listed will be treated as an unresolved shared library dependency right out of the gate, or you can pass a list of strings to skip_libs to prevent any library of that name from being resolved as a dependency. Additionally, you can pass a list of strings (or a single string) to ld_path, which will be used as an additional search path for shared libraries, before any of the defaults: the same directory as the loaded program, the current working directory, and your system libraries.
If you want to specify some options that only apply to a specific binary object, CLE will let you do that too. The parameters main_opts and lib_opts do this by taking dictionaries of options. main_opts is a mapping from option names to option values, while lib_opts is a mapping from library name to dictionaries mapping option names to option values.
The options that you can use vary from backend to backend, but some common ones are:
- backend: which backend to use, as either a class or a name
- base_addr: a base address to use
- entry_point: an entry point to use
- arch: the name of an architecture to use
Example:
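A minimal sketch of how such per-object options can be assembled and handed to angr.Project. The backend names, the architecture name and the library name libc.so.6 are illustrative values, and the binary path is left to the caller. angr is imported lazily so that the option helper can be inspected without angr installed.

```python
def build_loader_options():
    """Keyword arguments that angr.Project forwards to cle.Loader."""
    return {
        # options that apply to the main binary only
        "main_opts": {"backend": "blob", "arch": "i386"},
        # options per library: library name -> {option name: option value}
        "lib_opts": {"libc.so.6": {"backend": "elf"}},
    }


def load_project(path, options=None):
    """Load a binary with the given loader options (requires angr)."""
    import angr  # lazy import: only needed when a binary is actually loaded

    return angr.Project(path, **(options or build_loader_options()))
```

Calling load_project with the path of a binary then passes main_opts and lib_opts straight through to CLE.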
Symbolic Function Summaries
hook
You can try:
After that, a hook can be used to modify registers and perform similar operations.
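As a sketch of the register-modifying hook mentioned above: the helper below installs a Python function as a hook through Project.hook. The helper name, the register name and the address are hypothetical; proj is assumed to be an already loaded angr.Project.

```python
def hook_set_register(proj, addr, reg_name, value, length=0):
    """Install a hook at addr that writes value into a register.

    length is the number of bytes execution jumps forward once the hook
    has finished, which allows the hooked instructions to be skipped.
    """
    def on_hit(state):
        # same effect as writing state.regs.<reg_name> = value
        setattr(state.regs, reg_name, value)

    proj.hook(addr, hook=on_hit, length=length)
    return on_hit
```

For example, hook_set_register(proj, 0x401000, "eax", 1, length=5) would replace a 5-byte call at the assumed address 0x401000 with "eax = 1".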
Symbolic examples
Bitvectors
They can be analyzed through ASTs.
Symbolic Constraints
Performing comparison operations between any two similarly-typed ASTs will yield another AST - not a bitvector, but now a symbolic boolean.
Checking conditions
Constraint Solving
Floating point numbers
This is nice, but sometimes we need to be able to work directly with the representation of the float as a bitvector. You can interpret bitvectors as floats and vice versa, with the methods raw_to_bv and raw_to_fp:
More Solving Methods
eval will give you one possible solution to an expression, but what if you want several? What if you want to ensure that the solution is unique? The solver provides you with several methods for common solving patterns:
- solver.eval(expression) will give you one possible solution to the given expression.
- solver.eval_one(expression) will give you the solution to the given expression, or throw an error if more than one solution is possible.
- solver.eval_upto(expression, n) will give you up to n solutions to the given expression, returning fewer than n if fewer than n are possible.
- solver.eval_atleast(expression, n) will give you n solutions to the given expression, throwing an error if fewer than n are possible.
- solver.eval_exact(expression, n) will give you n solutions to the given expression, throwing an error if fewer or more than n are possible.
- solver.min(expression) will give you the minimum possible solution to the given expression.
- solver.max(expression) will give you the maximum possible solution to the given expression.
Additionally, all of these methods can take the following keyword arguments:
- extra_constraints can be passed as a tuple of constraints. These constraints will be taken into account for this evaluation, but will not be added to the state.
- cast_to can be passed a data type to cast the result to. Currently, this can only be int and bytes, which will cause the method to return the corresponding representation of the underlying data. For example, state.solver.eval(state.solver.BVV(0x41424344, 32), cast_to=bytes) will return b'ABCD'.
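The following sketch wraps two of the ideas above. enumerate_solutions calls eval_upto with temporary extra_constraints on a state supplied by the caller, and concrete_as_bytes reproduces in plain Python what cast_to=bytes does for a concrete value (big-endian bytes). Both helper names are made up for this note.

```python
def enumerate_solutions(state, expr, limit, extra_constraints=()):
    """Return up to `limit` solutions of expr, sorted ascending.

    extra_constraints only apply to this query and are not added to the state.
    """
    values = state.solver.eval_upto(
        expr, limit, extra_constraints=tuple(extra_constraints)
    )
    return sorted(values)


def concrete_as_bytes(value, bits):
    """Big-endian bytes of a concrete bitvector value, as cast_to=bytes gives."""
    return value.to_bytes(bits // 8, "big")
```

concrete_as_bytes(0x41424344, 32) returns b'ABCD', matching the example in the text.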
Machine State - memory, registers, and so on
quick examples:
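From here on a simpler way of executing is used: state.step(). It performs one step of symbolic execution and returns an object of type angr.engines.successors.SimSuccessors, which provides a number of successor states that can be categorized by execution path. The attribute to focus on is .successors, a list containing all of the “normal” successors of a given step. At a branch, the list holds one successor per outcome, each carrying the branch condition (taken or not taken) as a new constraint.
(The example here presumably uses a strcmp as the constraint.)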
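A small sketch of inspecting the result of one step. It takes an existing state, steps it once and pairs every normal successor with the constraints it carries; the helper name is hypothetical.

```python
def branch_outcomes(state):
    """Step once and list each normal successor with its constraints."""
    succ = state.step()  # a SimSuccessors object
    outcomes = []
    for child in succ.successors:
        # every successor has its own solver with its own constraint list
        outcomes.append((child, list(child.solver.constraints)))
    return outcomes
```

At a conditional branch that depends on symbolic data, this would typically return two entries whose constraint lists differ in the branch condition.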
You can use state.posix.stdin.load(0, state.posix.stdin.size) to retrieve a bitvector representing all the content read from stdin so far.
State Presets
Available on project.factory:
- .blank_state() constructs a “blank slate” blank state, with most of its data left uninitialized. When accessing uninitialized data, an unconstrained symbolic value will be returned.
- .entry_state() constructs a state ready to execute at the main binary’s entry point.
- .full_init_state() constructs a state that is ready to execute through any initializers that need to be run before the main binary’s entry point, for example, shared library constructors or preinitializers. When it is finished with these it will jump to the entry point.
- .call_state() constructs a state ready to execute a given function.
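The four constructors can be selected by name, as in the sketch below. make_state is a hypothetical helper, proj is assumed to be a loaded angr.Project, and every address used with it is an assumption of the caller.

```python
PRESETS = ("blank_state", "entry_state", "full_init_state", "call_state")


def make_state(proj, preset="entry_state", *args, **kwargs):
    """Build a state with one of the project.factory state constructors."""
    if preset not in PRESETS:
        raise ValueError(f"unknown state preset: {preset}")
    constructor = getattr(proj.factory, preset)
    # positional and keyword arguments are passed through unchanged
    return constructor(*args, **kwargs)
```

For instance, make_state(proj, "blank_state", addr=0x401000) starts from an assumed address, and make_state(proj, "call_state", 0x401000, 1, 2) prepares a call to an assumed function with two integer arguments.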
Usage:
- Passing a start address: All of these constructors can take an addr argument to specify the exact address to start.
- Passing arguments: If you’re executing in an environment that can take command line arguments or an environment, you can pass a list of arguments through args and a dictionary of environment variables through env into entry_state and full_init_state. The values in these structures can be strings or bitvectors, and will be serialized into the state as the arguments and environment to the simulated execution. The default args is an empty list, so if the program you’re analyzing expects to find at least an argv[0], you should always provide that!
- Passing symbolic values: If you’d like to have argc be symbolic, you can pass a symbolic bitvector as argc to the entry_state and full_init_state constructors. Be careful, though: if you do this, you should also add a constraint to the resulting state that your value for argc cannot be larger than the number of args you passed into args.
- Passing function arguments: To use the call state, you should call it with .call_state(addr, arg1, arg2, ...), where addr is the address of the function you want to call and argN is the Nth argument to that function, either as a Python integer, string, or array, or a bitvector. If you want to have memory allocated and actually pass in a pointer to an object, you should wrap it in a PointerWrapper, i.e. angr.PointerWrapper("point to me!"). The results of this API can be a little unpredictable, but we’re working on it.
To specify the calling convention used for a function with call_state, you can pass a SimCC instance as the cc argument. We try to pick a sane default, but for special cases you will need to help angr out.
Memory operations
Bulk operations on memory addresses
Register operations
state.registers: see “Intermediate Representation - angr documentation” and “angr/archinfo: Classes with architecture-specific information useful to other projects.” (github.com)
State Options
Plugins
State Plugins
For example, the normal memory plugin simulates a flat memory space, but analyses can choose to enable the “abstract memory” plugin, which uses alternate data types for addresses to simulate free-floating memory mappings independent of address, to provide state.memory. Conversely, plugins can reduce code complexity: state.memory and state.registers are actually two different instances of the same plugin, since the registers are emulated with an address space as well.
The globals plugin
state.globals is an extremely simple plugin: it implements the interface of a standard Python dict, allowing you to store arbitrary data on a state.
The history plugin
state.history is a very important plugin storing historical data about the path a state has taken during execution. It is actually a linked list of several history nodes, each one representing a single round of execution; you can traverse this list with state.history.parent.parent etc.
To make it more convenient to work with this structure, the history also provides several efficient iterators over the history of certain values. In general, these values are stored as history.recent_NAME and the iterator over them is just history.NAME. For example, for addr in state.history.bbl_addrs: print(hex(addr)) will print out a basic block address trace for the binary, while state.history.recent_bbl_addrs is the list of basic blocks executed in the most recent step, state.history.parent.recent_bbl_addrs is the list of basic blocks executed in the previous step, etc. If you ever need to quickly obtain a flat list of these values, you can access .hardcopy, e.g. state.history.bbl_addrs.hardcopy. Keep in mind though, index-based accessing is implemented on the iterators.
Here is a brief listing of some of the values stored in the history:
- history.descriptions is a listing of string descriptions of each of the rounds of execution performed on the state.
- history.bbl_addrs is a listing of the basic block addresses executed by the state. There may be more than one per round of execution, and not all addresses may correspond to binary code - some may be addresses at which SimProcedures are hooked.
- history.jumpkinds is a listing of the disposition of each of the control flow transitions in the state’s history, as VEX enum strings.
- history.jump_guards is a listing of the conditions guarding each of the branches that the state has encountered.
- history.events is a semantic listing of “interesting events” which happened during execution, such as the presence of a symbolic jump condition, the program popping up a message box, or execution terminating with an exit code.
- history.actions is usually empty, but if you add the angr.options.refs options to the state, it will be populated with a log of all the memory, register, and temporary value accesses performed by the program.
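The history iterators described above can be wrapped in two small helpers, sketched here for a state supplied by the caller. The helper names are made up; addresses are returned in the order the iterators yield them.

```python
def block_trace(state):
    """All basic block addresses recorded in the state's history, as hex."""
    return [hex(addr) for addr in state.history.bbl_addrs]


def last_step_blocks(state):
    """Basic block addresses executed in the most recent step only."""
    return [hex(addr) for addr in state.history.recent_bbl_addrs]
```

Calling block_trace on a state taken from a finished exploration gives a quick view of the path that state followed.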
The callstack plugin
angr will track the call stack for the emulated program. On every call instruction, a frame will be added to the top of the tracked callstack, and whenever the stack pointer drops below the point where the topmost frame was called, a frame is popped. This allows angr to robustly store data local to the current emulated function.
Similar to the history, the callstack is also a linked list of nodes, but there are no provided iterators over the contents of the nodes - instead you can directly iterate over state.callstack to get the callstack frames for each of the active frames, in order from most recent to oldest. If you just want the topmost frame, this is state.callstack.
- callstack.func_addr is the address of the function currently being executed
- callstack.call_site_addr is the address of the basic block which called the current function
- callstack.stack_ptr is the value of the stack pointer from the beginning of the current function
- callstack.ret_addr is the location that the current function will return to if it returns
I/O
Copying and Merging
A state supports very fast copies, so that you can explore different possibilities:
States can also be merged together.
Simulation Managers
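Description:
Stepping
- .step(): advances by one basic block.
- .run(): executes until every state has deadended and collects all deadended states (for example, a state that reaches an exit syscall is removed from the active stash and placed in the deadended stash).
Stash Management
.move() takes the following arguments:
- from_stash
- to_stash
- filter_func (optional, default: everything)
Each stash is a list, and can be accessed as follows: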
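A sketch of moving states between stashes and then reading a stash as a list. The helper name and the target stash name "matched" are hypothetical, and simgr is assumed to be an existing simulation manager.

```python
def move_by_output(simgr, marker, from_stash="deadended", to_stash="matched"):
    """Move states whose stdout contains marker into another stash."""
    simgr.move(
        from_stash=from_stash,
        to_stash=to_stash,
        # fd 1 is stdout; the filter decides which states are moved
        filter_func=lambda state: marker in state.posix.dumps(1),
    )
    # stashes is a mapping from stash name to a list of states
    return simgr.stashes[to_stash]
```

The built-in stashes are also reachable as attributes, so simgr.active and simgr.deadended are the same lists as simgr.stashes["active"] and simgr.stashes["deadended"].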
So where is the link?
Stash types
Stash | Description |
active | This stash contains the states that will be stepped by default, unless an alternate stash is specified. |
deadended | A state goes to the deadended stash when it cannot continue the execution for some reason, including no more valid instructions, unsat state of all of its successors, or an invalid instruction pointer. |
pruned | When using LAZY_SOLVES , states are not checked for satisfiability unless absolutely necessary. When a state is found to be unsat in the presence of LAZY_SOLVES , the state hierarchy is traversed to identify when, in its history, it initially became unsat. All states that are descendants of that point (which will also be unsat, since a state cannot become un-unsat) are pruned and put in this stash. |
unconstrained | If the save_unconstrained option is provided to the SimulationManager constructor, states that are determined to be unconstrained (i.e., with the instruction pointer controlled by user data or some other source of symbolic data) are placed here. |
unsat | If the save_unsat option is provided to the SimulationManager constructor, states that are determined to be unsatisfiable (i.e., they have constraints that are contradictory, like the input having to be both “AAAA” and “BBBB” at the same time) are placed here. |
errored | If, during execution, an error is raised, then the state will be wrapped in an ErrorRecord object, which contains the state and the error it raised, and then the record will be inserted into errored . You can launch a debug shell at the site of the error with record.debug() . |
Exploration
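.explore() takes these arguments:
- find: the address of an instruction to stop at, a list of such addresses, or a function that takes a state and returns whether it meets some criterion. When a state matches, it is placed in the found stash and symbolic execution ends.
- avoid: a condition in the same format as find.
- num_find: how many matching states to find before returning (default 1; explore also returns once every state in the active stash has finished executing).
e.g.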
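A minimal sketch of a find/avoid exploration driven by program output. simgr is assumed to be an existing simulation manager, and the success and failure markers are whatever byte strings the target program prints; the helper name is hypothetical.

```python
def find_stdin_for_output(simgr, success, failure):
    """Explore until stdout contains success; return the stdin that got there."""
    simgr.explore(
        find=lambda state: success in state.posix.dumps(1),   # fd 1 is stdout
        avoid=lambda state: failure in state.posix.dumps(1),
    )
    if simgr.found:
        return simgr.found[0].posix.dumps(0)  # fd 0 is stdin
    return None
```

With a challenge that prints, say, b"Good Job." on success and b"Try again." on failure, this returns one concrete input that reaches the success message, or None if no such state was found.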
Exploration Techniques
angr ships with several pieces of canned functionality that let you customize the behavior of a simulation manager, called exploration techniques. The archetypical example of why you would want an exploration technique is to modify the pattern in which the state space of the program is explored - the default “step everything at once” strategy is effectively breadth-first search, but with an exploration technique you could implement, for example, depth-first search. However, the instrumentation power of these techniques is much more flexible than that - you can totally alter the behavior of angr’s stepping process. Writing your own exploration techniques will be covered in a later chapter.
To use an exploration technique, call simgr.use_technique(tech), where tech is an instance of an ExplorationTechnique subclass. angr’s built-in exploration techniques can be found under angr.exploration_techniques.
Here’s a quick overview of some of the built-in ones:
- DFS: Depth first search, as mentioned earlier. Keeps only one state active at once, putting the rest in the deferred stash until it deadends or errors.
- Explorer: This technique implements the .explore() functionality, allowing you to search for and avoid addresses.
- LengthLimiter: Puts a cap on the maximum length of the path a state goes through.
- LoopSeer: Uses a reasonable approximation of loop counting to discard states that appear to be going through a loop too many times, putting them in a spinning stash and pulling them out again if we run out of otherwise viable states.
- ManualMergepoint: Marks an address in the program as a merge point, so states that reach that address will be briefly held, and any other states that reach that same point within a timeout will be merged together.
- MemoryWatcher: Monitors how much memory is free/available on the system between simgr steps and stops exploration if it gets too low.
- Oppologist: The “operation apologist” is an especially fun gadget - if this technique is enabled and angr encounters an unsupported instruction, for example a bizarre and foreign floating point SIMD op, it will concretize all the inputs to that instruction and emulate the single instruction using the unicorn engine, allowing execution to continue.
- Spiller: When there are too many states active, this technique can dump some of them to disk in order to keep memory consumption low.
- Threading: Adds thread-level parallelism to the stepping process. This doesn’t help much because of Python’s global interpreter lock, but if you have a program whose analysis spends a lot of time in angr’s native-code dependencies (unicorn, z3, libvex) you can see some gains.
- Tracer: An exploration technique that causes execution to follow a dynamic trace recorded from some other source. The dynamic tracer repository has some tools to generate those traces.
- Veritesting: An implementation of a [CMU paper](https://users.ece.cmu.edu/~dbrumley/pdf/Avgerinos et al._2014_Enhancing Symbolic Execution with Veritesting.pdf) on automatically identifying useful merge points. This is so useful, you can enable it automatically with veritesting=True in the SimulationManager constructor! Note that it frequently doesn’t play nice with other techniques due to the invasive way it implements static symbolic execution.
Look at the API documentation for the SimulationManager and ExplorationTechnique classes for more information.
Simulation and Instrumentation
Attribute | Guard Condition | Instruction Pointer | Description |
successors | True (can be symbolic, but constrained to True) | Can be symbolic (but 256 solutions or less; see unconstrained_successors ). | A normal, satisfiable successor state to the state processed by the engine. The instruction pointer of this state may be symbolic (i.e., a computed jump based on user input), so the state might actually represent several potential continuations of execution going forward. |
unsat_successors | False (can be symbolic, but constrained to False). | Can be symbolic. | Unsatisfiable successors. These are successors whose guard conditions can only be false (i.e., jumps that cannot be taken, or the default branch of jumps that must be taken). |
flat_successors | True (can be symbolic, but constrained to True). | Concrete value. | As noted above, states in the successors list can have symbolic instruction pointers. This is rather confusing, as elsewhere in the code (i.e., in SimEngineVEX.process , when it’s time to step that state forward), we make assumptions that a single program state only represents the execution of a single spot in the code. To alleviate this, when we encounter states in successors with symbolic instruction pointers, we compute all possible concrete solutions (up to an arbitrary threshold of 256) for them, and make a copy of the state for each such solution. We call this process “flattening”. These flat_successors are states, each of which has a different, concrete instruction pointer. For example, if the instruction pointer of a state in successors was X+5 , where X had constraints of X > 0x800000 and X <= 0x800010 , we would flatten it into 16 different flat_successors states, one with an instruction pointer of 0x800006 , one with 0x800007 , and so on until 0x800015 . |
unconstrained_successors | True (can be symbolic, but constrained to True). | Symbolic (with more than 256 solutions). | During the flattening procedure described above, if it turns out that there are more than 256 possible solutions for the instruction pointer, we assume that the instruction pointer has been overwritten with unconstrained data (i.e., a stack overflow with user data). This assumption is not sound in general. Such states are placed in unconstrained_successors and not in successors . |
all_successors | Anything | Can be symbolic. | This is successors + unsat_successors + unconstrained_successors . |
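The flattening rule in the table can be illustrated with a toy model in plain Python. This is not angr's implementation: it only classifies a set of candidate instruction pointer values against the 256-solution threshold, and it reproduces the X+5 example from the flat_successors row.

```python
FLATTEN_LIMIT = 256  # threshold named in the table above


def flatten(ip_solutions, limit=FLATTEN_LIMIT):
    """Classify a state by the concrete solutions of its instruction pointer.

    Returns ("flat", [addresses]) when there are at most `limit` solutions,
    otherwise ("unconstrained", []).
    """
    solutions = sorted(set(ip_solutions))
    if len(solutions) > limit:
        return "unconstrained", []
    return "flat", solutions


# Instruction pointer X + 5 with 0x800000 < X <= 0x800010
kind, addresses = flatten(x + 5 for x in range(0x800001, 0x800011))
```

Here kind is "flat" and addresses holds 16 values running from 0x800006 to 0x800015, one per flat successor.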
Breakpoints
Event type | Event meaning |
mem_read | Memory is being read. |
mem_write | Memory is being written. |
address_concretization | A symbolic memory access is being resolved. |
reg_read | A register is being read. |
reg_write | A register is being written. |
tmp_read | A temp is being read. |
tmp_write | A temp is being written. |
expr | An expression is being created (i.e., a result of an arithmetic operation or a constant in the IR). |
statement | An IR statement is being translated. |
instruction | A new (native) instruction is being translated. |
irsb | A new basic block is being translated. |
constraints | New constraints are being added to the state. |
exit | A successor is being generated from execution. |
fork | A symbolic execution state has forked into multiple states. |
symbolic_variable | A new symbolic variable is being created. |
call | A call instruction is hit. |
return | A ret instruction is hit. |
simprocedure | A simprocedure (or syscall) is executed. |
dirty | A dirty IR callback is executed. |
syscall | A syscall is executed (called in addition to the simprocedure event). |
engine_process | A SimEngine is about to process some code. |
These events expose different attributes:
Event type | Attribute name | Attribute availability | Attribute meaning |
mem_read | mem_read_address | BP_BEFORE or BP_AFTER | The address at which memory is being read. |
mem_read | mem_read_expr | BP_AFTER | The expression at that address. |
mem_read | mem_read_length | BP_BEFORE or BP_AFTER | The length of the memory read. |
mem_read | mem_read_condition | BP_BEFORE or BP_AFTER | The condition of the memory read. |
mem_write | mem_write_address | BP_BEFORE or BP_AFTER | The address at which memory is being written. |
mem_write | mem_write_length | BP_BEFORE or BP_AFTER | The length of the memory write. |
mem_write | mem_write_expr | BP_BEFORE or BP_AFTER | The expression that is being written. |
mem_write | mem_write_condition | BP_BEFORE or BP_AFTER | The condition of the memory write. |
reg_read | reg_read_offset | BP_BEFORE or BP_AFTER | The offset of the register being read. |
reg_read | reg_read_length | BP_BEFORE or BP_AFTER | The length of the register read. |
reg_read | reg_read_expr | BP_AFTER | The expression in the register. |
reg_read | reg_read_condition | BP_BEFORE or BP_AFTER | The condition of the register read. |
reg_write | reg_write_offset | BP_BEFORE or BP_AFTER | The offset of the register being written. |
reg_write | reg_write_length | BP_BEFORE or BP_AFTER | The length of the register write. |
reg_write | reg_write_expr | BP_BEFORE or BP_AFTER | The expression that is being written. |
reg_write | reg_write_condition | BP_BEFORE or BP_AFTER | The condition of the register write. |
tmp_read | tmp_read_num | BP_BEFORE or BP_AFTER | The number of the temp being read. |
tmp_read | tmp_read_expr | BP_AFTER | The expression of the temp. |
tmp_write | tmp_write_num | BP_BEFORE or BP_AFTER | The number of the temp written. |
tmp_write | tmp_write_expr | BP_AFTER | The expression written to the temp. |
expr | expr | BP_BEFORE or BP_AFTER | The IR expression. |
expr | expr_result | BP_AFTER | The value (e.g. AST) which the expression was evaluated to. |
statement | statement | BP_BEFORE or BP_AFTER | The index of the IR statement (in the IR basic block). |
instruction | instruction | BP_BEFORE or BP_AFTER | The address of the native instruction. |
irsb | address | BP_BEFORE or BP_AFTER | The address of the basic block. |
constraints | added_constraints | BP_BEFORE or BP_AFTER | The list of constraint expressions being added. |
call | function_address | BP_BEFORE or BP_AFTER | The address of the function being called. |
exit | exit_target | BP_BEFORE or BP_AFTER | The expression representing the target of a SimExit. |
exit | exit_guard | BP_BEFORE or BP_AFTER | The expression representing the guard of a SimExit. |
exit | exit_jumpkind | BP_BEFORE or BP_AFTER | The expression representing the kind of SimExit. |
symbolic_variable | symbolic_name | BP_AFTER | The name of the symbolic variable being created. The solver engine might modify this name (by appending a unique ID and length). Check the symbolic_expr for the final symbolic expression. |
symbolic_variable | symbolic_size | BP_AFTER | The size of the symbolic variable being created. |
symbolic_variable | symbolic_expr | BP_AFTER | The expression representing the new symbolic variable. |
address_concretization | address_concretization_strategy | BP_BEFORE or BP_AFTER | The SimConcretizationStrategy being used to resolve the address. This can be modified by the breakpoint handler to change the strategy that will be applied. If your breakpoint handler sets this to None, this strategy will be skipped. |
address_concretization | address_concretization_action | BP_BEFORE or BP_AFTER | The SimAction object being used to record the memory action. |
address_concretization | address_concretization_memory | BP_BEFORE or BP_AFTER | The SimMemory object on which the action was taken. |
address_concretization | address_concretization_expr | BP_BEFORE or BP_AFTER | The AST representing the memory index being resolved. The breakpoint handler can modify this to affect the address being resolved. |
address_concretization | address_concretization_add_constraints | BP_BEFORE or BP_AFTER | Whether or not constraints should/will be added for this read. |
address_concretization | address_concretization_result | BP_AFTER | The list of resolved memory addresses (integers). The breakpoint handler can overwrite these to effect a different resolution result. |
syscall | syscall_name | BP_BEFORE or BP_AFTER | The name of the system call. |
simprocedure | simprocedure_name | BP_BEFORE or BP_AFTER | The name of the simprocedure. |
simprocedure | simprocedure_addr | BP_BEFORE or BP_AFTER | The address of the simprocedure. |
simprocedure | simprocedure_result | BP_AFTER | The return value of the simprocedure. You can also override it in BP_BEFORE, which will cause the actual simprocedure to be skipped and for your return value to be used instead. |
simprocedure | simprocedure | BP_BEFORE or BP_AFTER | The actual SimProcedure object. |
dirty | dirty_name | BP_BEFORE or BP_AFTER | The name of the dirty call. |
dirty | dirty_handler | BP_BEFORE | The function that will be run to handle the dirty call. You can override this. |
dirty | dirty_args | BP_BEFORE or BP_AFTER | The arguments of the dirty call. |
dirty | dirty_result | BP_AFTER | The return value of the dirty call. You can also override it in BP_BEFORE, which will cause the actual dirty call to be skipped and for your return value to be used instead. |
engine_process | sim_engine | BP_BEFORE or BP_AFTER | The SimEngine that is processing. |
engine_process | successors | BP_BEFORE or BP_AFTER | The SimSuccessors object defining the result of the engine. |
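e.g. Whenever the program state s performs a memory read, angr calls track_reads immediately after the read completes, printing the value that was read and the memory address it was read from.
A function can also be declared as the condition: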
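Both ideas, an action callback and a condition function, can be sketched as follows. track_reads follows the description in the text; reads_from and watch_reads are hypothetical helpers, and the when argument is expected to be angr.BP_BEFORE or angr.BP_AFTER supplied by the caller.

```python
def track_reads(state):
    """Action callback: report the value read and the address it came from."""
    message = (
        f"Read {state.inspect.mem_read_expr} "
        f"from {state.inspect.mem_read_address}"
    )
    print(message)
    return message  # the return value is only used for demonstration


def reads_from(address):
    """Build a condition that only fires for reads at one concrete address."""
    def condition(state):
        return state.solver.eval(state.inspect.mem_read_address) == address
    return condition


def watch_reads(state, when, address=None):
    """Register a mem_read breakpoint, optionally limited to one address."""
    condition = reads_from(address) if address is not None else None
    return state.inspect.b(
        "mem_read", when=when, action=track_reads, condition=condition
    )
```

Since mem_read_expr is only available after the read, when should be angr.BP_AFTER for track_reads to print a value.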
Caution about mem_read breakpoint
The mem_read breakpoint gets triggered anytime there are memory reads by either the executing program or the binary analysis. If you are using a breakpoint on mem_read and also using state.mem to load data from memory addresses, then know that the breakpoint will be fired as you are technically reading memory.
So if you want to load data from memory and not trigger any mem_read breakpoint you have set up, then use state.memory.load with the keyword arguments disable_actions=True and inspect=False.
This is also true for state.find and you can use the same keyword arguments to prevent mem_read breakpoints from firing.
Analyses
The idea is that all the analyses appear under project.analyses (for example, project.analyses.CFGFast()) and can be called as functions, returning analysis result instances.
Name | Description |
CFGFast | Constructs a fast Control Flow Graph of the program |
CFGEmulated | Constructs an accurate Control Flow Graph of the program |
VFG | Performs VSA on every function of the program, creating a Value Flow Graph and detecting stack variables |
DDG | Calculates a Data Dependency Graph, allowing one to determine what statements a given value depends on |
BackwardSlice | Computes a Backward Slice of a program with respect to a certain target |
Identifier | Identifies common library functions in CGC binaries |
More! | angr has quite a few analyses, most of which work! If you’d like to know how to use one, please submit an issue requesting documentation. |
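As a sketch of calling an analysis as a function, the helper below runs CFGFast on a project supplied by the caller and lists the functions it recovered. The helper name is hypothetical.

```python
def list_functions(proj):
    """Run CFGFast and return (address, name) pairs of recovered functions."""
    cfg = proj.analyses.CFGFast()
    # the function manager in the knowledge base behaves like a dict
    return sorted((addr, func.name) for addr, func in cfg.kb.functions.items())
```

On a real binary the result would include entries such as the address of main together with its name, provided symbols are available.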
Resilience
Analyses can be written to be resilient, and catch and log basically any error. These errors, depending on how they’re caught, are logged to the errors or named_errors attribute of the analysis. However, you might want to run an analysis in “fail fast” mode, so that errors are not handled. To do this, the argument fail_fast=True can be passed into the analysis constructor.
Symbolic Execution
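Why is this part still a TODO...
angr_ctf
These are the official example challenges. The documentation also lists many challenges from real CTF competitions (orz), but this first look stops here.
Environment setup
Add the environment variable:
Install the packages needed to compile 32-bit programs:
Build the executables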
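00_angr_find
Simply search for the input that produces the expected standard output.
01_angr_avoid
The main function has too many nodes.
You can see that the avoid_me function is called a huge number of times.
Here angr needs to prune a path as soon as it reaches the avoid_me function.
You can pass a function that matches every state that should be avoided.
02_angr_find_condition
Same as above.
03_angr_symbolic_registers
The same approach as above also solves it.
However, the official exploit intends for execution to start partway through the program (honestly, this does not seem very useful: it only saves some initialization time and does not optimize much):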
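04_angr_symbolic_stack
The old exploit still works... However, the official exploit expects you to simulate the stack, so here it is first:
Analysis:
The key part is here:
In practice, this means locating the arguments on the stack, using them to set up the initial state, and then running symbolic execution.
05_angr_symbolic_memory
The corresponding method for global variables.
06_angr_symbolic_dynamic_memory
This challenge mainly teaches you to simulate the heap (pointing at any memory that has not been allocated is enough).
07_angr_symbolic_file
Here is the official solution again; it simply teaches how to simulate a file.
P.S. Found the issue:
Scaffold and solution challenge 07 are not working with latest angr, because SimFile class changed.
This is working code with latest version of angr for the filesystem part:
TODO: even after applying the change from the issue, it still cannot be solved.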
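08_angr_constraints
Principle:
TODO: I tried pruning on the state of the zf flag at the address of the jnz in check, but that did not produce a result either; look into why later.
The official solution obtains the compared values manually and models the comparison itself: set the target state just before the real check, express the comparison by hand as a constraint, and let the constraint solver produce the final result. Here it is for now: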
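The approach described above can be sketched as follows. This is not the official solution itself: the helper name and all parameters are hypothetical, check_addr is assumed to be the address just before the real check, buffer_addr the address of the transformed input, expected the byte string it is compared against, and symbolic_input the bitvector that was injected as the password.

```python
def solve_before_check(simgr, check_addr, buffer_addr, expected, symbolic_input):
    """Stop before the check, then constrain the buffer to equal `expected`."""
    simgr.explore(find=check_addr)
    if not simgr.found:
        return None
    state = simgr.found[0]
    # load the (symbolic) bytes that the real check would have compared
    buffer = state.memory.load(buffer_addr, len(expected))
    target = state.solver.BVV(expected, len(expected) * 8)
    # replace the expensive byte-by-byte check with a single constraint
    state.add_constraints(buffer == target)
    return state.solver.eval(symbolic_input, cast_to=bytes)
```

If the added constraint makes the state unsatisfiable, the final eval raises an error, which indicates that the assumed addresses or the expected value are wrong.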
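09_angr_hooks
The exploit I first wrote works (remember to give password its initial value; I forgot at first).
Then let's see how the official solution does the hook.
The conclusion is that writing it this way turns the function in the middle, which angr cannot get through, into a manual check, which is a little more concise than the previous approach.
The key part is here:
Here it is directly:
10_angr_simprocedures
The basic blocks of the check part have been split into a great many pieces; judging by the source, opaque predicates and dispatcher blocks were added:
IDA's pseudocode is optimized, so it actually looks about the same as the previous two challenges.
However, the following approach does not work. I do not know why hooking this way fails, but the constraints yield no solution:
(First hook to find the address of the input, then write the check by hand; the constraints cannot be solved and the result is empty.)
Let's look at the official one:
Define a class; when the check function is reached, hook it and skip it directly.
Here is a borrowed shorthand version for easier reading:
11_angr_sim_scanf
This looks even more extreme than the previous one.
It looks like there are a lot of scanf calls.
Judging from the source program, this can be finished off with a manual check, so here is a short one anyway.
(In fact the check is not necessary either; hooking like this is enough.)
Let's see what this challenge is meant to test.
Result
It can actually be solved directly; the conclusion is that angr keeps improving (lol).
Still, this challenge is meant to show how to hook system functions, which really just means hooking a symbol.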
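12_angr_veritesting
??? Why does the hook fail again? (The code changed again here, so I wanted to try something new.)
The failing code is below:
A bit of debugging here...
The modified code is as follows:
The printed output is as follows:
In fact, after deduplication this is the correct answer. That indicates a path explosion: states fork somewhere, the number of paths in the stash doubles, and so the number of paths grows exponentially with every BFS step.
My guess was that this comes from the unconstrained input, which would have to be hooked away at the assignment, so I added that part (printing the stash length to check):
Here is the code that remains after the hook (patched in IDA as a substitute):
Before going to sleep I wondered whether the jle in the loop was forking states in the stash, but in the morning I looked at the earlier programs, which also contain loops, so that should not be the problem.
With the hook code above added, the output is roughly as follows:
The stash length did not change, yet paths still appear to fork, which is rather spooky.
Printing it further up to see:
The conclusion is that the stash length is still 9. Why 9? The whole program only has 9 basic blocks, and there are no branches left before the end address.
Starting directly after scanf with a blank_state gives the same result, so the problem is not scanf or the way environment variables are passed in.
I asked my senior r1mao and found that my understanding of stash and state was off: states are sorted into stashes of different types, so printing simulation.active gives the current number of states, which matches the number of repeated executions.
Print the structure of the stashes and states:
It actually looks like this:
So it should be used like this: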
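To make the stash/state distinction concrete, here is a sketch that counts states per stash after every step instead of counting stashes. Both helper names are hypothetical, and simgr stands for the simulation manager called simulation above.

```python
def stash_sizes(simgr):
    """Map every non-empty stash name to the number of states it holds."""
    return {name: len(states) for name, states in simgr.stashes.items() if states}


def run_and_report(simgr, max_steps=100):
    """Step until no active state is left, recording stash sizes per step."""
    history = []
    for _ in range(max_steps):
        if not simgr.active:
            break
        simgr.step()
        history.append(stash_sizes(simgr))
    return history
```

A path explosion shows up in the returned history as an "active" count that keeps doubling from one step to the next.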
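Check whether jle is to blame.
By the same reasoning, only one place is left...
Now the cyber-ghost is caught... after the hook, the machine instructions were not skipped...
OK, found it: a syntax error, and nothing reported it...
exp:
Let's still look at the official solution, even after that ordeal...
It is meant to teach you to use veritesting; passing veritesting=True is all it takes.
13_angr_static_binary
You can see that the original strcmp function has been expanded into something very complicated, which makes angr get stuck inside it; the same goes for the other library functions.
This is caused by the static linking option used at compile time.
Hooking the system functions and replacing them with the standard symbols from libc and glibc is enough.
14_angr_shared_library
Uh-oh, from this challenge on I cannot get the build to work.
The check imported from the .so:
Just look at the official exploit (it only simulates the function, but it feels like a classic technique that will be very useful later for simulating individual functions):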