[Request] lazy backend memory reader #345
Comments
It sounds very useful and easy to implement, but we do not have the manpower right now to implement it as a priority feature. Do you know enough about cle to implement it yourself? We can offer help as needed!
Sadly I don't know much about cle. I am willing to try though.
Maybe you can already do what you want! It seems to me that …
My process address space is not large enough. The dump is really big. I need to be able to "hook" the read functionality so only needed memory is read.
What file format are you doing this for?
Currently this is a minidump. However, I want that to be abstracted away so I can use it with live process memory.
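A sketch of the abstraction being described: the backend would only ever see a `read(addr, size)` callable, so whether the bytes come from a minidump or a live process stays invisible to it. Everything below is illustrative (the function names and the dump-offset lookup are assumptions, not an existing cle or minidump-library API):

```python
# Hypothetical: two interchangeable byte sources behind one read(addr, size) callable.

def make_live_process_reader(pid):
    # Linux-only illustration: /proc/<pid>/mem supports positioned reads,
    # given ptrace permission on the target process.
    mem = open(f"/proc/{pid}/mem", "rb", buffering=0)

    def read(addr, size):
        mem.seek(addr)
        return mem.read(size)

    return read

def make_minidump_reader(path):
    # Same shape for a dump: translate addr to a file offset within the
    # dump's recorded memory ranges, then seek and read. The range lookup
    # is deliberately elided here.
    dump = open(path, "rb")

    def read(addr, size):
        raise NotImplementedError("map addr to a dump offset, then read from `dump`")

    return read
```

Either callable could then be handed to the lazy backend this issue asks for.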
So it looks, from my incredibly brief interrogation, that the library we use to parse minidump files does in fact support the interface you're looking for. What you will need to do is create a … Be warned that because you are storing a file descriptor, this will leak file descriptor references (cle is entirely designed to have zero file descriptors open after loading is done; there used to be a …
In terms of live process memory, you probably want to look into symbion. I'm not a huge fan of its design, but there are a lot of people using it.
I don't want to specify segments beforehand. I want one large segment that starts at address 0 with a size of 0xffffffffffffffff, where every read will let me run custom code. It needs to be a new …
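For concreteness, a minimal sketch of that shape, assuming cle's `Clemory` can be subclassed with `load` as the read entry point (the constructor and hook point here are assumptions; the branch linked below takes its own approach):

```python
from cle.memory import Clemory

class HookedClemory(Clemory):
    """Hypothetical: a flat address space whose reads run user code."""

    def __init__(self, arch, read_cb):
        super().__init__(arch)
        self._read_cb = read_cb  # read_cb(addr, size) -> bytes

    def load(self, addr, n):
        # No backer lookup at all: every read goes straight to the callback.
        return self._read_cb(addr, n)
```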
oh. uh, I guess that's technically something we can do, though it will mess up a huge amount of the static analysis, which assumes that it can enumerate a list of mapped addresses. Let me put something together for you.
@rhelmot that will be incredible, thank you so much!
Take a look at this! https://github.com/angr/cle/compare/feat/lazy
I have not tested it, even to make sure it imports, but it should be the right framework for what you want to do.
I'm not really sure how to use this.
Also, it looks like on Line 354 in a77bcdc it should be `isinstance(backer, Clemory)` and not `type(backer) is Clemory` (see the short illustration after this comment).
When I use this code …
It does not work (I would expect the load to return a concrete value). However, it looks like even the …
I would be glad to have more assistance :)
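The distinction matters because `type(x) is C` only matches the exact class, while `isinstance` also accepts subclasses, and a lazy memory would presumably be a `Clemory` subclass. A self-contained illustration with stub classes (not cle's real ones):

```python
class Clemory:                  # stand-in for cle.memory.Clemory
    pass

class LazyClemory(Clemory):     # stand-in for a lazy subclass
    pass

backer = LazyClemory()
print(type(backer) is Clemory)      # False: exact-type check rejects subclasses
print(isinstance(backer, Clemory))  # True: isinstance accepts subclasses
```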
I was really hoping you would be able to take it from here... nonetheless, I have pushed more changes to the branch such that your example now works.
Seems like the …
It appears to enter an infinite loop in line 139 with …
Also, is there a way to make the whole 64-bit address space available (and not just the lower half)? I have tried replacing the …
There is probably a bug somewhere in the stuff I wrote. Feel free to make whatever changes you need to make it work; I am out of cycles to help with this.
You will have a very bad time getting angr to accept you mapping the entire 64-bit address space. angr needs to map some additional object files into free slots of the memory map in order to support things like call_state and simprocedures, and it will complain very loudly if it can't do that.
Hey, I worked on it a bit in my branch here.
Some thoughts: …
Are you looking to contribute this back upstream at some point?
I do want to contribute this upstream, yes.
Hey,
I want a backend class which will allow me to let the cle engine (and angr) get the needed bytes lazily. This way I will be able to have a sparse address space (similar to the minidump) without loading all of it beforehand.
I want this backend to have one parameter: a `read` function with address and size parameters. When angr or cle wants to read memory, it will read from the cached copy if one exists; otherwise the `read` function will be called for that address and a cached copy will be created. When angr (cle) wants to write somewhere, the original bytes will be read into a cached copy and then the write operation will happen. If the cache already exists, angr will just write there. (A minimal sketch of this behavior follows below.)
I have a huge minidump file, and loading all the segments at initialization causes an out-of-memory error in my Python process.
In the end I want to use the angr engine with this backend and use the common, normal functionality (like `explore` and such). Thanks in advance, and looking forward to hearing your opinion on this :)
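A minimal sketch of the caching behavior described above, in plain Python (the class name, `load`/`store` entry points, and page granularity are all assumptions for illustration, not cle's API):

```python
class LazyCachedMemory:
    """Read-through cache over a read(addr, size) callback, page by page."""

    PAGE_SIZE = 0x1000

    def __init__(self, read_cb):
        self._read_cb = read_cb  # read_cb(addr, size) -> bytes
        self._pages = {}         # page base address -> bytearray

    def _page(self, base):
        # Fetch and cache a page on first touch.
        if base not in self._pages:
            self._pages[base] = bytearray(self._read_cb(base, self.PAGE_SIZE))
        return self._pages[base]

    def load(self, addr, n):
        out = bytearray()
        while n > 0:
            base = addr & ~(self.PAGE_SIZE - 1)
            off = addr - base
            chunk = min(n, self.PAGE_SIZE - off)
            out += self._page(base)[off:off + chunk]
            addr += chunk
            n -= chunk
        return bytes(out)

    def store(self, addr, data):
        # Writes pull the original bytes into the cache first (via _page),
        # then land in the cached copy, as described above.
        while data:
            base = addr & ~(self.PAGE_SIZE - 1)
            off = addr - base
            chunk = min(len(data), self.PAGE_SIZE - off)
            self._page(base)[off:off + chunk] = data[:chunk]
            addr += chunk
            data = data[chunk:]
```

With the live-process reader sketched earlier, `LazyCachedMemory(make_live_process_reader(pid))` would pull each 4 KiB page from the target only on first access.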