.relative_addr is its address relative to the object base. This is known in the literature (particularly the Windows literature) as an RVA (relative virtual address).
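The arithmetic behind an RVA is just a subtraction. A minimal sketch in plain Python (not CLE's API; both addresses are made-up values for illustration):

```python
# Hypothetical values for illustration only.
mapped_base = 0x400000    # address at which the object was mapped
rebased_addr = 0x401136   # address of a symbol in the loaded address space

# The RVA is the offset of the address from the object's base.
relative_addr = rebased_addr - mapped_base
print(hex(relative_addr))
```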

Loading Options

If you are loading something with angr.Project and you want to pass an option to the cle.Loader instance that Project implicitly creates, you can just pass the keyword argument directly to the Project constructor, and it will be passed on to CLE. You should look at the CLE API docs if you want to know everything that could possibly be passed in as an option, but we will go over some important and frequently used options here.

We’ve discussed auto_load_libs already - it enables or disables CLE’s attempt to automatically resolve shared library dependencies, and is on by default. Additionally, there is the opposite, except_missing_libs, which, if set to true, will cause an exception to be thrown whenever a binary has a shared library dependency that cannot be resolved.

You can pass a list of strings to force_load_libs and anything listed will be treated as an unresolved shared library dependency right out of the gate, or you can pass a list of strings to skip_libs to prevent any library of that name from being resolved as a dependency. Additionally, you can pass a list of strings (or a single string) to ld_path, which will be used as an additional search path for shared libraries, before any of the defaults: the same directory as the loaded program, the current working directory, and your system libraries.

If you want to specify some options that only apply to a specific binary object, CLE will let you do that too. The parameters main_opts and lib_opts do this by taking dictionaries of options. main_opts is a mapping from option names to option values, while lib_opts is a mapping from library name to dictionaries mapping option names to option values.

The options that you can use vary from backend to backend, but some common ones are:

backend - which backend to use, as either a class or a name

base_addr - a base address to use

entry_point - an entry point to use

arch - the name of an architecture to use

Example:
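A minimal sketch of how main_opts and lib_opts fit together. The option values and the binary path passed to load_project are assumptions for illustration; the options are built in a separate function so they can be inspected without loading anything.

```python
def loader_options():
    """Build the keyword arguments that Project forwards to cle.Loader."""
    return {
        # Options for the main binary: option name -> option value.
        "main_opts": {"backend": "blob", "arch": "i386"},
        # Options per library: library name -> {option name -> option value}.
        "lib_opts": {"libc.so.6": {"backend": "elf"}},
    }


def load_project(path):
    """Load the binary at path (a hypothetical path) with the options above."""
    import angr  # imported lazily; requires angr to be installed
    return angr.Project(path, **loader_options())


opts = loader_options()
print(opts["main_opts"])
print(opts["lib_opts"])
```

Calling load_project("./some_binary") would then apply the blob backend to the main object only, while libc.so.6 is still loaded as ELF.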

Symbolic Function Summaries

angr/angr/procedures at master · angr/angr (github.com)

hook

You can try:

After that, you can use the hook to modify registers and perform other operations.

Symbolic examples

Bitvectors

These can be analyzed through ASTs.

Symbolic Constraints

Performing comparison operations between any two similarly-typed ASTs will yield another AST - not a bitvector, but now a symbolic boolean.

Testing conditions:

Constraint Solving

Floating point numbers

This is nice, but sometimes we need to be able to work directly with the representation of the float as a bitvector. You can interpret bitvectors as floats and vice versa, with the methods raw_to_bv and raw_to_fp:
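The point of raw_to_bv and raw_to_fp is that the bit pattern is preserved, not the numeric value. A plain-Python analogy using struct (this is not angr's API, only the same reinterpretation done by hand for a 64-bit double):

```python
import struct


def float_to_bits(value):
    """Reinterpret a double as its 64-bit IEEE 754 bit pattern."""
    return int.from_bytes(struct.pack(">d", value), "big")


def bits_to_float(bits):
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]


bits = float_to_bits(1.0)
print(hex(bits))            # the bit pattern, not the integer 1
print(bits_to_float(bits))  # round-trips to the original float
```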

More Solving Methods

eval will give you one possible solution to an expression, but what if you want several? What if you want to ensure that the solution is unique? The solver provides you with several methods for common solving patterns:

solver.eval(expression) will give you one possible solution to the given expression.

solver.eval_one(expression) will give you the solution to the given expression, or throw an error if more than one solution is possible.

solver.eval_upto(expression, n) will give you up to n solutions to the given expression, returning fewer than n if fewer than n are possible.

solver.eval_atleast(expression, n) will give you n solutions to the given expression, throwing an error if fewer than n are possible.

solver.eval_exact(expression, n) will give you n solutions to the given expression, throwing an error if fewer or more than n are possible.

solver.min(expression) will give you the minimum possible solution to the given expression.

solver.max(expression) will give you the maximum possible solution to the given expression.

Additionally, all of these methods can take the following keyword arguments:

extra_constraints can be passed as a tuple of constraints. These constraints will be taken into account for this evaluation, but will not be added to the state.

cast_to can be passed a data type to cast the result to. Currently, this can only be int and bytes, which will cause the method to return the corresponding representation of the underlying data. For example, state.solver.eval(state.solver.BVV(0x41424344, 32), cast_to=bytes) will return b'ABCD'.
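To keep the differences between these methods straight, here is a toy model over a solution set that is already known. ToySolver is a hypothetical stand-in, not angr's solver, and ValueError stands in for angr's own error types; only the counting rules are being illustrated.

```python
class ToySolver:
    """Toy stand-in for a solver whose full solution set is already known."""

    def __init__(self, solutions):
        self.solutions = sorted(solutions)

    def eval_upto(self, n):
        # Up to n solutions; fewer if fewer exist.
        return self.solutions[:n]

    def eval_one(self):
        # The unique solution, or an error if it is not unique.
        if len(self.solutions) != 1:
            raise ValueError("solution is not unique")
        return self.solutions[0]

    def eval_atleast(self, n):
        # Exactly n solutions, or an error if fewer than n exist.
        if len(self.solutions) < n:
            raise ValueError("fewer than n solutions")
        return self.solutions[:n]

    def eval_exact(self, n):
        # All n solutions, or an error if the count is not exactly n.
        if len(self.solutions) != n:
            raise ValueError("solution count is not exactly n")
        return self.solutions[:n]


solver = ToySolver({7, 3, 5})
print(solver.eval_upto(5))
print(solver.eval_atleast(2))
print(solver.eval_exact(3))

# cast_to=bytes corresponds to the big-endian bytes of the value.
as_bytes = (0x41424344).to_bytes(4, "big")
print(as_bytes)
```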

Machine State - memory, registers, and so on

quick examples:

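From here on we use a simpler way to execute: state.step(). It performs one step of symbolic execution and returns an object of type angr.engines.successors.SimSuccessors, which provides several successor states that can be categorized into different execution paths. The attribute to focus on is .successors, a list containing all the "normal" successors of a given step.

At a conditional branch, this list contains one state per outcome of the condition, and each state carries that outcome (condition true or condition false) as a new constraint.

(The example here presumably uses a strcmp as the constraint.)

You can use state.posix.stdin.load(0, state.posix.stdin.size) to retrieve a bitvector representing all the content read from stdin so far.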

State Presets

project.factory.

.blank_state() constructs a “blank slate” blank state, with most of its data left uninitialized. When accessing uninitialized data, an unconstrained symbolic value will be returned.

.entry_state() constructs a state ready to execute at the main binary’s entry point.

.full_init_state() constructs a state that is ready to execute through any initializers that need to be run before the main binary’s entry point, for example, shared library constructors or preinitializers. When it is finished with these it will jump to the entry point.

.call_state() constructs a state ready to execute a given function.

Usage:

Passing a start address:

All of these constructors can take an addr argument to specify the exact address to start.

Passing arguments:

If you’re executing in an environment that can take command line arguments or an environment, you can pass a list of arguments through args and a dictionary of environment variables through env into entry_state and full_init_state. The values in these structures can be strings or bitvectors, and will be serialized into the state as the arguments and environment to the simulated execution. The default args is an empty list, so if the program you’re analyzing expects to find at least an argv[0], you should always provide that!

Symbolic values can be passed in as well:

If you’d like to have argc be symbolic, you can pass a symbolic bitvector as argc to the entry_state and full_init_state constructors. Be careful, though: if you do this, you should also add a constraint to the resulting state that your value for argc cannot be larger than the number of args you passed into args.

Passing function arguments:

To use the call state, you should call it with .call_state(addr, arg1, arg2, ...), where addr is the address of the function you want to call and argN is the Nth argument to that function, either as a Python integer, string, or array, or a bitvector. If you want to have memory allocated and actually pass in a pointer to an object, you should wrap it in a PointerWrapper, i.e. angr.PointerWrapper("point to me!"). The results of this API can be a little unpredictable, but we’re working on it.

To specify the calling convention used for a function with call_state, you can pass a SimCC instance as the cc argument. We try to pick a sane default, but for special cases you will need to help angr out.

Memory operations

Bulk operations on memory addresses

Register operations

state.registers: Intermediate Representation - angr documentation

angr/archinfo: Classes with architecture-specific information useful to other projects. (github.com)

State Options

https://docs.angr.io/en/latest/appendix/options.html#list-of-state-options

Plugins

State Plugins

implement new kinds of data storage

For example, the normal memory plugin simulates a flat memory space, but analyses can choose to enable the “abstract memory” plugin, which uses alternate data types for addresses to simulate free-floating memory mappings independent of address, to provide state.memory. Conversely, plugins can reduce code complexity: state.memory and state.registers are actually two different instances of the same plugin, since the registers are emulated with an address space as well.

The globals plugin

state.globals is an extremely simple plugin: it implements the interface of a standard Python dict, allowing you to store arbitrary data on a state.

The history plugin

state.history is a very important plugin storing historical data about the path a state has taken during execution. It is actually a linked list of several history nodes, each one representing a single round of execution—you can traverse this list with state.history.parent.parent etc.

To make it more convenient to work with this structure, the history also provides several efficient iterators over the history of certain values. In general, these values are stored as history.recent_NAME and the iterator over them is just history.NAME. For example, for addr in state.history.bbl_addrs: print(hex(addr)) will print out a basic block address trace for the binary, while state.history.recent_bbl_addrs is the list of basic blocks executed in the most recent step, state.history.parent.recent_bbl_addrs is the list of basic blocks executed in the previous step, etc. If you ever need to quickly obtain a flat list of these values, you can access .hardcopy, e.g. state.history.bbl_addrs.hardcopy. Keep in mind though, index-based accessing is implemented on the iterators.
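The relationship between recent_NAME on each node and the NAME iterator over the whole chain can be modeled in a few lines. HistoryNode below is a hypothetical toy, not angr's history class, and the addresses are made up; it only shows how a linked list of per-step records flattens into one trace.

```python
class HistoryNode:
    """Toy history node: one round of execution plus a link to its parent."""

    def __init__(self, parent=None, recent_bbl_addrs=()):
        self.parent = parent
        self.recent_bbl_addrs = list(recent_bbl_addrs)

    @property
    def bbl_addrs(self):
        """Yield block addresses from the oldest node up to this one."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        for node in reversed(chain):
            yield from node.recent_bbl_addrs


root = HistoryNode(recent_bbl_addrs=[0x1000])
step1 = HistoryNode(parent=root, recent_bbl_addrs=[0x1010, 0x1020])
step2 = HistoryNode(parent=step1, recent_bbl_addrs=[0x1040])

for addr in step2.bbl_addrs:
    print(hex(addr))

trace = list(step2.bbl_addrs)  # a flat copy, similar in spirit to .hardcopy
```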

Here is a brief listing of some of the values stored in the history:

history.descriptions is a listing of string descriptions of each of the rounds of execution performed on the state.

history.bbl_addrs is a listing of the basic block addresses executed by the state. There may be more than one per round of execution, and not all addresses may correspond to binary code - some may be addresses at which SimProcedures are hooked.

history.jumpkinds is a listing of the disposition of each of the control flow transitions in the state’s history, as VEX enum strings.

history.jump_guards is a listing of the conditions guarding each of the branches that the state has encountered.

history.events is a semantic listing of “interesting events” which happened during execution, such as the presence of a symbolic jump condition, the program popping up a message box, or execution terminating with an exit code.

history.actions is usually empty, but if you add the angr.options.refs options to the state, it will be populated with a log of all the memory, register, and temporary value accesses performed by the program.

The callstack plugin

angr will track the call stack for the emulated program. On every call instruction, a frame will be added to the top of the tracked callstack, and whenever the stack pointer drops below the point where the topmost frame was called, a frame is popped. This allows angr to robustly store data local to the current emulated function.

Similar to the history, the callstack is also a linked list of nodes, but there are no provided iterators over the contents of the nodes - instead you can directly iterate over state.callstack to get the callstack frames for each of the active frames, in order from most recent to oldest. If you just want the topmost frame, this is state.callstack.

callstack.func_addr is the address of the function currently being executed

callstack.call_site_addr is the address of the basic block which called the current function

callstack.stack_ptr is the value of the stack pointer from the beginning of the current function

callstack.ret_addr is the location that the current function will return to if it returns

I/O

Working with File System, Sockets, and Pipes

Copying and Merging

A state supports very fast copies, so that you can explore different possibilities:

States can also be merged together.

Simulation Managers

Description:

Stepping

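.step(): advances execution by one basic block.

.run(): steps until every state has deadended and collects all the deadended states (for example, when a state reaches the exit syscall, it is removed from the active stash and placed in the deadended stash).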

Stash Management

.move(): takes from_stash, to_stash, and filter_func (optional; by default every state is moved).

Each stash is a list and can be accessed as follows:
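A toy model of what moving states between stashes amounts to. This is plain Python with strings standing in for states, not the SimulationManager itself; in angr the corresponding call has the shape simgr.move(from_stash='deadended', to_stash='authenticated', filter_func=...), and the stash name authenticated is just an example.

```python
def move(stashes, from_stash, to_stash, filter_func=None):
    """Move states matching filter_func (default: all) between stashes."""
    keep, moved = [], []
    for state in stashes.get(from_stash, []):
        if filter_func is None or filter_func(state):
            moved.append(state)
        else:
            keep.append(state)
    stashes[from_stash] = keep
    stashes.setdefault(to_stash, []).extend(moved)
    return stashes


# Strings stand in for states; each stash is an ordinary list.
stashes = {"active": [], "deadended": ["exit(0)", "exit(1)", "exit(0)"]}
move(stashes, "deadended", "authenticated", lambda s: s == "exit(0)")

print(stashes["authenticated"])
print(stashes["deadended"][0])  # list indexing works as on any list
```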

So where is the link?

Stash types

Stash	Description
active	This stash contains the states that will be stepped by default, unless an alternate stash is specified.
deadended	A state goes to the deadended stash when it cannot continue the execution for some reason, including no more valid instructions, unsat state of all of its successors, or an invalid instruction pointer.
pruned	When using `LAZY_SOLVES`, states are not checked for satisfiability unless absolutely necessary. When a state is found to be unsat in the presence of `LAZY_SOLVES`, the state hierarchy is traversed to identify when, in its history, it initially became unsat. All states that are descendants of that point (which will also be unsat, since a state cannot become un-unsat) are pruned and put in this stash.
unconstrained	If the `save_unconstrained` option is provided to the SimulationManager constructor, states that are determined to be unconstrained (i.e., with the instruction pointer controlled by user data or some other source of symbolic data) are placed here.
unsat	If the `save_unsat` option is provided to the SimulationManager constructor, states that are determined to be unsatisfiable (i.e., they have constraints that are contradictory, like the input having to be both “AAAA” and “BBBB” at the same time) are placed here.
errored	If, during execution, an error is raised, then the state will be wrapped in an `ErrorRecord` object, which contains the state and the error it raised, and then the record will be inserted into `errored`. You can launch a debug shell at the site of the error with `record.debug()`.

Exploration

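.explore(): the find argument can be the address of an instruction to stop at, a list of such addresses, or a function that takes a state and returns whether it meets some criterion.

When a state matches, it is placed in the found stash and symbolic execution ends. You can also declare an avoid condition, in the same format as find.

num_find sets how many matching states must be found before returning (default 1; explore also returns once every state in the active stash has been run to completion).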

eg.
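A toy breadth-first model of find, avoid, and num_find over a hand-written graph. This is not angr's implementation (the real one steps states, not node names); the graph and node names are made up, and the sketch only shows where states end up.

```python
def explore(successors, start, find, avoid=(), num_find=1):
    """Breadth-first walk that sorts nodes into stashes like .explore()."""
    stashes = {"active": [start], "found": [], "avoid": [], "deadended": []}
    while stashes["active"] and len(stashes["found"]) < num_find:
        next_active = []
        for node in stashes["active"]:
            if node in find:
                stashes["found"].append(node)      # goal reached
            elif node in avoid:
                stashes["avoid"].append(node)      # pruned, never stepped again
            elif not successors.get(node):
                stashes["deadended"].append(node)  # nothing left to execute
            else:
                next_active.extend(successors[node])
        stashes["active"] = next_active
    return stashes


# Hypothetical control flow: a check that branches to a good or a bad path.
cfg = {
    "entry": ["check"],
    "check": ["good", "bad"],
    "good": [],
    "bad": ["loop"],
    "loop": [],
}
result = explore(cfg, "entry", find={"good"}, avoid={"bad"})
print(result)
```

Because "bad" is avoided, "loop" is never visited at all, which is exactly the pruning effect that avoid is used for.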

More examples: angr examples - angr documentation

Exploration Techniques

angr ships with several pieces of canned functionality that let you customize the behavior of a simulation manager, called exploration techniques. The archetypical example of why you would want an exploration technique is to modify the pattern in which the state space of the program is explored - the default “step everything at once” strategy is effectively breadth-first search, but with an exploration technique you could implement, for example, depth-first search. However, the instrumentation power of these techniques is much more flexible than that - you can totally alter the behavior of angr’s stepping process. Writing your own exploration techniques will be covered in a later chapter.

To use an exploration technique, call simgr.use_technique(tech), where tech is an instance of an ExplorationTechnique subclass. angr’s built-in exploration techniques can be found under angr.exploration_techniques.

Here’s a quick overview of some of the built-in ones:

DFS: Depth first search, as mentioned earlier. Keeps only one state active at once, putting the rest in the deferred stash until it deadends or errors.

Explorer: This technique implements the .explore() functionality, allowing you to search for and avoid addresses.

LengthLimiter: Puts a cap on the maximum length of the path a state goes through.

LoopSeer: Uses a reasonable approximation of loop counting to discard states that appear to be going through a loop too many times, putting them in a spinning stash and pulling them out again if we run out of otherwise viable states.

ManualMergepoint: Marks an address in the program as a merge point, so states that reach that address will be briefly held, and any other states that reach that same point within a timeout will be merged together.

MemoryWatcher: Monitors how much memory is free/available on the system between simgr steps and stops exploration if it gets too low.

Oppologist: The “operation apologist” is an especially fun gadget - if this technique is enabled and angr encounters an unsupported instruction, for example a bizarre and foreign floating point SIMD op, it will concretize all the inputs to that instruction and emulate the single instruction using the unicorn engine, allowing execution to continue.

Spiller: When there are too many states active, this technique can dump some of them to disk in order to keep memory consumption low.

Threading: Adds thread-level parallelism to the stepping process. This doesn’t help much because of Python’s global interpreter lock, but if you have a program whose analysis spends a lot of time in angr’s native-code dependencies (unicorn, z3, libvex) you can see some gains.

Tracer: An exploration technique that causes execution to follow a dynamic trace recorded from some other source. The dynamic tracer repository has some tools to generate those traces.

Veritesting: An implementation of a [CMU paper](https://users.ece.cmu.edu/~dbrumley/pdf/Avgerinos et al._2014_Enhancing Symbolic Execution with Veritesting.pdf) on automatically identifying useful merge points. This is so useful, you can enable it automatically with veritesting=True in the SimulationManager constructor! Note that it frequently doesn’t play nice with other techniques due to the invasive way it implements static symbolic execution.

Look at the API documentation for the SimulationManager and ExplorationTechnique classes for more information.

Simulation and Instrumentation

Attribute	Guard Condition	Instruction Pointer	Description
`successors`	True (can be symbolic, but constrained to True)	Can be symbolic (but 256 solutions or less; see `unconstrained_successors`).	A normal, satisfiable successor state to the state processed by the engine. The instruction pointer of this state may be symbolic (i.e., a computed jump based on user input), so the state might actually represent several potential continuations of execution going forward.
`unsat_successors`	False (can be symbolic, but constrained to False).	Can be symbolic.	Unsatisfiable successors. These are successors whose guard conditions can only be false (i.e., jumps that cannot be taken, or the default branch of jumps that must be taken).
`flat_successors`	True (can be symbolic, but constrained to True).	Concrete value.	As noted above, states in the `successors` list can have symbolic instruction pointers. This is rather confusing, as elsewhere in the code (i.e., in `SimEngineVEX.process`, when it’s time to step that state forward), we make assumptions that a single program state only represents the execution of a single spot in the code. To alleviate this, when we encounter states in `successors` with symbolic instruction pointers, we compute all possible concrete solutions (up to an arbitrary threshold of 256) for them, and make a copy of the state for each such solution. We call this process “flattening”. These `flat_successors` are states, each of which has a different, concrete instruction pointer. For example, if the instruction pointer of a state in `successors` was `X+5`, where `X` had constraints of `X > 0x800000` and `X <= 0x800010`, we would flatten it into 16 different `flat_successors` states, one with an instruction pointer of `0x800006`, one with `0x800007`, and so on until `0x800015`.
`unconstrained_successors`	True (can be symbolic, but constrained to True).	Symbolic (with more than 256 solutions).	During the flattening procedure described above, if it turns out that there are more than 256 possible solutions for the instruction pointer, we assume that the instruction pointer has been overwritten with unconstrained data (i.e., a stack overflow with user data). This assumption is not sound in general. Such states are placed in `unconstrained_successors` and not in `successors`.
`all_successors`	Anything	Can be symbolic.	This is `successors + unsat_successors + unconstrained_successors`.
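The flattening example in the table can be checked by enumeration. This is plain Python arithmetic, not angr code; the threshold of 256 is taken from the table above and classify is a hypothetical helper name.

```python
THRESHOLD = 256  # from the table: more solutions than this means "unconstrained"

# X > 0x800000 and X <= 0x800010, instruction pointer is X + 5.
candidates = range(0x800000, 0x800012)
ips = [x + 5 for x in candidates if 0x800000 < x <= 0x800010]
print(len(ips), hex(ips[0]), hex(ips[-1]))


def classify(num_solutions):
    """Where a successor with this many instruction pointer solutions goes."""
    if num_solutions <= THRESHOLD:
        return "flat_successors"
    return "unconstrained_successors"


print(classify(len(ips)))
```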

Breakpoints

Event type	Event meaning
mem_read	Memory is being read.
mem_write	Memory is being written.
address_concretization	A symbolic memory access is being resolved.
reg_read	A register is being read.
reg_write	A register is being written.
tmp_read	A temp is being read.
tmp_write	A temp is being written.
expr	An expression is being created (i.e., a result of an arithmetic operation or a constant in the IR).
statement	An IR statement is being translated.
instruction	A new (native) instruction is being translated.
irsb	A new basic block is being translated.
constraints	New constraints are being added to the state.
exit	A successor is being generated from execution.
fork	A symbolic execution state has forked into multiple states.
symbolic_variable	A new symbolic variable is being created.
call	A call instruction is hit.
return	A ret instruction is hit.
simprocedure	A simprocedure (or syscall) is executed.
dirty	A dirty IR callback is executed.
syscall	A syscall is executed (called in addition to the simprocedure event).
engine_process	A SimEngine is about to process some code.

These events expose different attributes:

Event type	Attribute name	Attribute availability	Attribute meaning
mem_read	mem_read_address	BP_BEFORE or BP_AFTER	The address at which memory is being read.
mem_read	mem_read_expr	BP_AFTER	The expression at that address.
mem_read	mem_read_length	BP_BEFORE or BP_AFTER	The length of the memory read.
mem_read	mem_read_condition	BP_BEFORE or BP_AFTER	The condition of the memory read.
mem_write	mem_write_address	BP_BEFORE or BP_AFTER	The address at which memory is being written.
mem_write	mem_write_length	BP_BEFORE or BP_AFTER	The length of the memory write.
mem_write	mem_write_expr	BP_BEFORE or BP_AFTER	The expression that is being written.
mem_write	mem_write_condition	BP_BEFORE or BP_AFTER	The condition of the memory write.
reg_read	reg_read_offset	BP_BEFORE or BP_AFTER	The offset of the register being read.
reg_read	reg_read_length	BP_BEFORE or BP_AFTER	The length of the register read.
reg_read	reg_read_expr	BP_AFTER	The expression in the register.
reg_read	reg_read_condition	BP_BEFORE or BP_AFTER	The condition of the register read.
reg_write	reg_write_offset	BP_BEFORE or BP_AFTER	The offset of the register being written.
reg_write	reg_write_length	BP_BEFORE or BP_AFTER	The length of the register write.
reg_write	reg_write_expr	BP_BEFORE or BP_AFTER	The expression that is being written.
reg_write	reg_write_condition	BP_BEFORE or BP_AFTER	The condition of the register write.
tmp_read	tmp_read_num	BP_BEFORE or BP_AFTER	The number of the temp being read.
tmp_read	tmp_read_expr	BP_AFTER	The expression of the temp.
tmp_write	tmp_write_num	BP_BEFORE or BP_AFTER	The number of the temp written.
tmp_write	tmp_write_expr	BP_AFTER	The expression written to the temp.
expr	expr	BP_BEFORE or BP_AFTER	The IR expression.
expr	expr_result	BP_AFTER	The value (e.g. AST) which the expression was evaluated to.
statement	statement	BP_BEFORE or BP_AFTER	The index of the IR statement (in the IR basic block).
instruction	instruction	BP_BEFORE or BP_AFTER	The address of the native instruction.
irsb	address	BP_BEFORE or BP_AFTER	The address of the basic block.
constraints	added_constraints	BP_BEFORE or BP_AFTER	The list of constraint expressions being added.
call	function_address	BP_BEFORE or BP_AFTER	The address of the function being called.
exit	exit_target	BP_BEFORE or BP_AFTER	The expression representing the target of a SimExit.
exit	exit_guard	BP_BEFORE or BP_AFTER	The expression representing the guard of a SimExit.
exit	exit_jumpkind	BP_BEFORE or BP_AFTER	The expression representing the kind of SimExit.
symbolic_variable	symbolic_name	BP_AFTER	The name of the symbolic variable being created. The solver engine might modify this name (by appending a unique ID and length). Check the symbolic_expr for the final symbolic expression.
symbolic_variable	symbolic_size	BP_AFTER	The size of the symbolic variable being created.
symbolic_variable	symbolic_expr	BP_AFTER	The expression representing the new symbolic variable.
address_concretization	address_concretization_strategy	BP_BEFORE or BP_AFTER	The SimConcretizationStrategy being used to resolve the address. This can be modified by the breakpoint handler to change the strategy that will be applied. If your breakpoint handler sets this to None, this strategy will be skipped.
address_concretization	address_concretization_action	BP_BEFORE or BP_AFTER	The SimAction object being used to record the memory action.
address_concretization	address_concretization_memory	BP_BEFORE or BP_AFTER	The SimMemory object on which the action was taken.
address_concretization	address_concretization_expr	BP_BEFORE or BP_AFTER	The AST representing the memory index being resolved. The breakpoint handler can modify this to affect the address being resolved.
address_concretization	address_concretization_add_constraints	BP_BEFORE or BP_AFTER	Whether or not constraints should/will be added for this read.
address_concretization	address_concretization_result	BP_AFTER	The list of resolved memory addresses (integers). The breakpoint handler can overwrite these to effect a different resolution result.
syscall	syscall_name	BP_BEFORE or BP_AFTER	The name of the system call.
simprocedure	simprocedure_name	BP_BEFORE or BP_AFTER	The name of the simprocedure.
simprocedure	simprocedure_addr	BP_BEFORE or BP_AFTER	The address of the simprocedure.
simprocedure	simprocedure_result	BP_AFTER	The return value of the simprocedure. You can also override it in BP_BEFORE, which will cause the actual simprocedure to be skipped and for your return value to be used instead.
simprocedure	simprocedure	BP_BEFORE or BP_AFTER	The actual SimProcedure object.
dirty	dirty_name	BP_BEFORE or BP_AFTER	The name of the dirty call.
dirty	dirty_handler	BP_BEFORE	The function that will be run to handle the dirty call. You can override this.
dirty	dirty_args	BP_BEFORE or BP_AFTER	The arguments of the dirty call.
dirty	dirty_result	BP_AFTER	The return value of the dirty call. You can also override it in BP_BEFORE, which will cause the actual dirty call to be skipped and for your return value to be used instead.
engine_process	sim_engine	BP_BEFORE or BP_AFTER	The SimEngine that is processing.
engine_process	successors	BP_BEFORE or BP_AFTER	The SimSuccessors object defining the result of the engine.

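E.g., whenever program state s performs a memory read, angr calls track_reads immediately after the read completes, printing the value that was read and the memory address it was read from.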
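A sketch of such a handler. The attribute names come from the table above; the stand-in object at the bottom is an assumption used only so the handler can be tried without a binary, and install is a hypothetical helper name.

```python
from types import SimpleNamespace


def track_reads(state):
    """Breakpoint action: report the value read and where it was read from."""
    message = f"Read {state.inspect.mem_read_expr} from {state.inspect.mem_read_address}"
    print(message)
    return message


def install(state):
    """Register track_reads to run after every memory read on a real state."""
    import angr  # imported lazily; only needed for the BP_AFTER constant
    return state.inspect.b("mem_read", when=angr.BP_AFTER, action=track_reads)


# Dry run with a stand-in for a SimState (not a real angr object).
fake_state = SimpleNamespace(
    inspect=SimpleNamespace(mem_read_expr="<BV32 0x41>", mem_read_address="<BV64 0x1000>")
)
line = track_reads(fake_state)
```

BP_AFTER is used because, per the table, mem_read_expr is only available after the read has happened.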

Declaring a function as the condition:

Caution about `mem_read` breakpoint

The mem_read breakpoint gets triggered anytime there are memory reads by either the executing program or the binary analysis. If you are using breakpoint on mem_read and also using state.mem to load data from memory addresses, then know that the breakpoint will be fired as you are technically reading memory.

So if you want to load data from memory and not trigger any mem_read breakpoint you have had set up, then use state.memory.load with the keyword arguments disable_actions=True and inspect=False.

This is also true for state.find and you can use the same keyword arguments to prevent mem_read breakpoints from firing.
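A small wrapper makes the two keyword arguments hard to forget. quiet_load is a hypothetical helper name, and RecordingMemory is a stand-in used only to show which arguments get passed; with a real state, state.memory.load is angr's own method.

```python
from types import SimpleNamespace


def quiet_load(state, addr, size):
    """Load memory without logging actions or firing mem_read breakpoints."""
    return state.memory.load(addr, size, disable_actions=True, inspect=False)


class RecordingMemory:
    """Stand-in memory that records how load() was called."""

    def __init__(self):
        self.calls = []

    def load(self, addr, size, **kwargs):
        self.calls.append((addr, size, kwargs))
        return b"\x00" * size


fake_state = SimpleNamespace(memory=RecordingMemory())
data = quiet_load(fake_state, 0x1000, 4)
print(fake_state.memory.calls)
```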

Analyses

Writing Analyses - angr documentation

The idea is that all the analyses appear under project.analyses (for example, project.analyses.CFGFast()) and can be called as functions, returning analysis result instances.

Name	Description
CFGFast	Constructs a fast Control Flow Graph of the program
CFGEmulated	Constructs an accurate Control Flow Graph of the program
VFG	Performs VSA on every function of the program, creating a Value Flow Graph and detecting stack variables
DDG	Calculates a Data Dependency Graph, allowing one to determine what statements a given value depends on
BackwardSlice	Computes a Backward Slice of a program with respect to a certain target
Identifier	Identifies common library functions in CGC binaries
More!	angr has quite a few analyses, most of which work! If you’d like to know how to use one, please submit an issue requesting documentation.

Resilience

Analyses can be written to be resilient, and catch and log basically any error. These errors, depending on how they’re caught, are logged to the errors or named_errors attribute of the analysis. However, you might want to run an analysis in “fail fast” mode, so that errors are not handled. To do this, the argument fail_fast=True can be passed into the analysis constructor.

Symbolic Execution

Why is this section still a TODO...

Symbolic Execution - angr documentation

angr_ctf

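These are the official examples. The documentation also has many challenges from real CTF competitions (orz); this first look stops here.

Environment setup

Add the environment variable:

Packages needed to compile 32-bit programs:

Generate the executables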

00_angr_find

Simply search for the input that produces the desired standard output.

01_angr_avoid

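The main function has too many nodes.

You can see that the avoid_me function is called a great many times.

Here angr needs to prune a path as soon as it reaches avoid_me.

You can also pass a function that identifies all the states to avoid.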
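A sketch of predicate-based find and avoid. The messages "Good Job." and "Try again." and the binary path given to solve are assumptions about the challenge binary; the stand-in state at the bottom exists only so the predicates can be tried without angr.

```python
from types import SimpleNamespace


def is_successful(state):
    """find predicate: the success message has been written to stdout (fd 1)."""
    return b"Good Job." in state.posix.dumps(1)


def should_abort(state):
    """avoid predicate: the failure message has been written to stdout (fd 1)."""
    return b"Try again." in state.posix.dumps(1)


def solve(path):
    """Explore the binary at path and return the stdin that reaches success."""
    import angr  # imported lazily; requires angr to be installed
    project = angr.Project(path)
    simgr = project.factory.simgr(project.factory.entry_state())
    simgr.explore(find=is_successful, avoid=should_abort)
    if simgr.found:
        return simgr.found[0].posix.dumps(0)  # fd 0 is stdin
    return None


# Dry run of the predicates on a stand-in for a SimState.
fake_state = SimpleNamespace(
    posix=SimpleNamespace(dumps=lambda fd: b"Enter the password: Good Job.\n")
)
print(is_successful(fake_state), should_abort(fake_state))
```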

02_angr_find_condition

Same as above.

03_angr_symbolic_registers

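The same approach as above also solves this one.

However, the official exploit intends for you to solve it in segments (honestly, this does not look very useful; it only saves some initialization time and does not optimize much):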

04_angr_symbolic_stack

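The old exploit still works... but the official exploit requires simulating the stack, so here it is first:

Analysis:

The key part is here:

In effect, it locates the arguments on the stack, uses them to set up the initial state, and then runs symbolic execution.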

05_angr_symbolic_memory

The corresponding approach for global variables.

06_angr_symbolic_dynamic_memory

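~~The earliest exploit still works~~

The main point of this challenge, though, is to teach you to simulate the heap (it is enough to use memory that has not been allocated yet).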

07_angr_symbolic_file

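~~The earliest exploit still works, while the official one does not~~

Here is the official one anyway; it really just teaches how to simulate a file.

P.S. Found the issue: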

Scaffold and solution challenge 07 are not working with latest angr, because SimFile class changed.

This is working code with latest version of angr for the filesystem part:

TODO: even after applying the fix from the issue, it still does not work.

08_angr_constraints

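~~The old exploit no longer works, hooray (what)~~

How it works:

TODO: I tried pruning on the state of the zf flag at the address of the check's jnz, but that did not produce a result either; look into why later.

The official solution performs the comparison manually (it stops execution just before the real check and then sets up the comparison by hand), turns it into a constraint, and solves the constraint to get the final result. Here it is: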

09_angr_hooks

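~~It says to hook, but I forced my way through in segments anyway, lol (what)~~

The exploit I wrote first works (remember to write the initial value of password; I forgot it at the start).

Then let's see how the official solution hooks.

The conclusion: writing it this way turns the function in the middle that angr cannot get through into a manual check, which is slightly more concise than the previous challenge's approach.

The key part is here:

Here it is: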

10_angr_simprocedures

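The check's basic blocks are split into a great many pieces; judging from the source, opaque predicates and dispatcher blocks were added:

IDA's pseudocode is optimized, so it actually looks much like the previous two challenges.

However, the approach below does not work. I do not know why hooking this way fails, but the solver finds no solution:

(First hook to find the address of the input, then write the check by hand; but the constraints cannot be solved and the result is empty.)

Let's look at the official solution:

Define a class so that when the check function is reached, it is hooked and skipped.

Here is a borrowed shortened version for easier reading:

~~So why does mine not work~~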

11_angr_sim_scanf

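This looks even more extreme than the last one.

It appears to be a lot of scanf calls.

From the source, this looks solvable right away with a manual check; I will write a small one anyway.

(Actually the check is unnecessary too; hooking like this is enough.)

Let's see what this challenge is actually meant to test.

Result

It actually can be solved directly; the conclusion is that angr keeps improving~~, lol~~

Still, this challenge is meant to show how to hook a system function, which really just means hooking the symbol.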

12_angr_veritesting

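??? Why did the hook fail again? (The code changed again here, so I wanted to try something new.)

↓ The failing code

Let's do a bit of debugging here...

The modified code:

The printed output:

In fact, after deduplication this is the correct answer, which means path explosion occurred: somewhere the paths forked and the number of paths in the stash doubled, so the path count grew exponentially with each BFS step.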
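The growth rate is easy to underestimate. A back-of-the-envelope model in plain Python, assuming every checked byte introduces one independent two-way branch (the loop counts are illustrative, not measured from this binary); merged=True models the idealized case where both outcomes are merged back into one state at every join.

```python
def states_after(checks, merged=False):
    """Live states after `checks` independent two-way branches."""
    return 1 if merged else 2 ** checks


for n in (1, 8, 16, 32):
    print(n, states_after(n), states_after(n, merged=True))
```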

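My guess is that the cause is the uncertainty of the input, so the assignment needs to be hooked as well. Adding that part (and printing the stash length to check):

Here is the code that remains after hooking (shown by patching in IDA):

Before bed I wondered whether the jle in the loop was forking the states, but in the morning I looked at the earlier programs, which also loop, so that should not be the problem.

With the additional hook above, the output is roughly:

The stash length did not change, yet the paths still fork, which is a bit spooky.

Printing it above to check:

The stash count is still 9. Why 9? The whole program has only 9 basic blocks, and there are no branches left before the end address.

Initializing a blank_state directly after the scanf gives the same result, so the problem is not scanf or the environment being passed in.

I asked my senior r1mao and found that my understanding of stashes and states was off:

States are sorted into stashes of different types. Printing simulation.active gives the current number of states, which matches the number of repeated outputs.

Printing the stashes and states shows that the structure actually looks like this:

So it should be used like this:

Check whether the jle is to blame:

By the same logic, only one place is left...

Caught the cyber ghost... after hooking, the machine instructions were not skipped...

OK, found it: a syntax error, and nothing reported it...

Exploit:

Let's look at the official solution anyway, even after all that torment...

This challenge is meant to teach veritesting; setting veritesting=True is enough.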

13_angr_static_binary

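You can see that the original strcmp has been expanded into something very complex, which causes angr to get stuck inside it; the same goes for the other library functions.

This is because the binary was compiled with the static linking flag.

Hooking the library functions and replacing them with the standard libc and glibc symbols is enough.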
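A sketch of that replacement. The addresses in LIBC_HOOKS are hypothetical placeholders that must be read from the actual binary, and apply_hooks and solve are helper names made up here; the procedure table is passed in as a parameter so the hooking logic can be tried without angr, while solve shows the intended call with angr.SIM_PROCEDURES.

```python
# Hypothetical addresses: read the real ones for your binary from a disassembler.
LIBC_HOOKS = {
    0x0804ED40: ("libc", "printf"),
    0x0804ED80: ("libc", "scanf"),
    0x0804F350: ("libc", "puts"),
    0x08048D10: ("glibc", "__libc_start_main"),
}


def apply_hooks(project, procedures, hooks=LIBC_HOOKS):
    """Hook each address with a fresh instance of the named SimProcedure."""
    for addr, (library, name) in hooks.items():
        project.hook(addr, procedures[library][name]())
    return len(hooks)


def solve(path):
    """Load the statically linked binary at path and install the hooks."""
    import angr  # imported lazily; requires angr to be installed
    project = angr.Project(path)
    apply_hooks(project, angr.SIM_PROCEDURES)
    return project
```

With the hooks in place, angr runs its own summaries of these functions instead of stepping through the statically linked library code.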

14_angr_shared_library

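Uh oh, from this challenge on I cannot get it to compile.

~~Just grabbed it from GitHub... too lazy to investigate...~~

The check is imported from the .so:

Let's go straight to the official exploit (it just does a simulation, but it feels quite classic and will be very useful later for simulating functions in isolation):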

das202310

ISITDTUCTF

Last update: 2023-10-24
