2015-10-16 04:13 PM
Good afternoon all!
I apologize if this topic has been beaten to death previously, but I am trying to figure out how to use SA's REST API to extract files from collected network traffic (attachments in unencrypted email, files downloaded over HTTP, etc.). So far I haven't found any documentation on the URL that actually extracts the files.
My current method goes along these lines:
1) Issue the query "http://x.x.x.x:50105/sdk?msg=query&id1=0&id2=0&size=10&query=select+ip.src,ip.dst,attachment+where+time=%27"+time1+"%27-%27"+time2+"%27+AND+attachment+exists&force-content-type=application/json". Here I have some Python code that automatically fills in the correct time parameters based on how far back I'm looking. Because I pass '0' for id1 and id2, I pull their values from the results to get the real starting id1/id2 that I should be using.
2) Then I iterate through all of the id values, querying "http://x.x.x.x:50105/sdk?msg=query&id1={legitId1}&id2={legitId2}&size=100&query=select+ip.src,ip.dst,attachment+where+time=%27"+time1+"%27-%27"+time2+"%27+AND+attachment+exists&force-content-type=application/json". This pulls 100 results at a time, from which I can grab some info about each session that has an attachment.
3) Here I'd *like* to take all of the session IDs that had attachments associated with them in the above queries and use them to pull the files from those sessions, but I'm not sure what syntax I'd need to do that.
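For anyone following along, steps 1 and 2 can be sketched in Python roughly as below. This only rebuilds the URLs quoted above; the function name `build_query_url` and its arguments are made up for illustration, the host is the placeholder from the post, and the format of the time strings is whatever your appliance expects (the post does not say). It does not parse the JSON response, since the response layout isn't shown here.

```python
from urllib.parse import urlencode

# Placeholder host/port copied from the post; replace with your own appliance.
BASE_URL = "http://x.x.x.x:50105/sdk"


def build_query_url(time1, time2, id1=0, id2=0, size=10, base_url=BASE_URL):
    """Build the msg=query URL from steps 1 and 2 (hypothetical helper)."""
    where = f"time='{time1}'-'{time2}' AND attachment exists"
    params = {
        "msg": "query",
        "id1": id1,
        "id2": id2,
        "size": size,
        "query": f"select ip.src,ip.dst,attachment where {where}",
        "force-content-type": "application/json",
    }
    # safe=",=/" leaves those characters unescaped so the result has the same
    # shape as the hand-built URL; spaces become "+" and quotes become %27.
    return base_url + "?" + urlencode(params, safe=",=/")


if __name__ == "__main__":
    # Step 1: probe with id1=id2=0. Step 2: re-issue with real ids and size=100.
    print(build_query_url("A", "B"))
    print(build_query_url("A", "B", id1=1, id2=100, size=100))
```

Letting `urlencode` do the escaping avoids hand-concatenating `%27` and `+` and keeps working if the time strings contain spaces or colons.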
Of course, if I seem to be going about this the wrong way, I am completely open to suggestions on a better way to do this.
Thank you all for your help!
kyle
2015-10-19 02:11 PM
As an aside, I was able to find the answer after some deeper digging (namely at https://community.emc.com/ideas/3564#comment-19368, now a dead link). For anyone else's benefit, below are the URLs that ended up working for me:
Use that to get the session IDs of up to 100 sessions that have a pdf extension associated with them, then grab one of the session IDs and throw it in:
2015-10-19 12:50 PM
Okay, so I *think* I might be getting closer, but still not exactly successful (that I know of). What I can do now is take a query, like:
to get a list of session IDs that should have a pdf file associated with the session. Then I'll take one of the session IDs (2022647 in this case) and plug it into the following REST query, trying to run:
http://x.x.x.x:50105/sdk?msg=content&session=2022647&render=files&base64=0&includeFileTypes=pdf
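If you have a whole list of session IDs, the same URL can be generated per session with something like the sketch below. The helper name `build_content_url` is hypothetical, and the parameters are simply the ones in the URL above; nothing here confirms that this particular combination returns the file itself.

```python
from urllib.parse import urlencode

# Placeholder host/port copied from the post; replace with your own appliance.
BASE_URL = "http://x.x.x.x:50105/sdk"


def build_content_url(session_id, file_type="pdf", base_url=BASE_URL):
    """Rebuild the msg=content URL from the post for one session (hypothetical helper)."""
    params = {
        "msg": "content",
        "session": session_id,
        "render": "files",
        "base64": 0,
        "includeFileTypes": file_type,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Example session ID taken from the post.
    for sid in [2022647]:
        print(build_content_url(sid))
```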
The output from this is a large base64 blob that looks something like:
<?xml version="1.0" encoding="utf-8"?>
<response flags="1073807361">
<blob encoding="base64">GoYgbggAAAAEAAAAAAAAAAAAAADw3B4AAAAAAAEAAAAIAAAAMAAAAAAAAAABAAAAAAAAAAQCXQUA
AAAALQJdBQAAAADn/wAHAAAAAOf/AAcAAAAAJCUAAEwqAAAXAAAAZQCJAPDcHgAAAAAAAgAAAAgA
AABSCAAAAAAAACoAAAAAAAAAc2Vzc2lvbmlkAAAAAAAAAAAAAADw3B...
{snip for brevity}
...BAAAhiQAAAAAAAAAAA==</blob>
However, I can't seem to base64-decode the blob. I also deliberately broke a couple of queries to see what would happen when something went wrong (requesting a session ID that didn't have a pdf in it, and passing an invalid 'render' type), and both times I got back a base64-encoded blob like the one above.
Am I going down the right path with this? Can anyone suggest how to go about actually getting the files using REST? Or is the file somehow inside the base64 blob that I need to extract via another means?
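One way to inspect a response like this is to pull the text out of the `<blob>` element and decode only that, rather than feeding the whole XML document to a base64 tool. The sketch below assumes the response has the shape shown above (a `<response>` root with a single `<blob encoding="base64">` child); `decode_blob` and `find_pdf_offset` are made-up helper names, and looking for the `%PDF-` header is just a heuristic for spotting an embedded PDF, not a documented way of extracting files.

```python
import base64
import xml.etree.ElementTree as ET


def decode_blob(xml_bytes):
    """Return the decoded bytes of the <blob> element in an SDK XML response."""
    root = ET.fromstring(xml_bytes)
    blob = root.find("blob")
    if blob is None or not blob.text:
        raise ValueError("no <blob> element with content in response")
    # b64decode ignores the line breaks inside the wrapped base64 text.
    return base64.b64decode(blob.text)


def find_pdf_offset(data):
    """Return the offset of the first '%PDF-' header in data, or -1 if absent."""
    return data.find(b"%PDF-")


if __name__ == "__main__":
    # Synthetic response for demonstration only (not real appliance output).
    payload = b"\x00\x01sessionid\x00"
    sample = (b'<?xml version="1.0" encoding="utf-8"?>\n'
              b'<response flags="1073807361">\n<blob encoding="base64">'
              + base64.encodebytes(payload) + b"</blob>\n</response>")
    data = decode_blob(sample)
    print(len(data), find_pdf_offset(data))
```

For what it's worth, the fragment `c2Vzc2lvbmlk` visible in the sample output above decodes to the ASCII string `sessionid`, which suggests the blob is a structured binary response that carries field names rather than the raw file on its own.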
Thank you for any help that can be provided.