2012-12-10 08:57 PM
Been fighting Gremlins with "408 Request Timeout" when accessing the NetWitness REST API from Python 2.7. Recently had an epiphany that explains the variability of the observed behavior, but doesn't make sense. Hoping for some insight/ideas.
The python code first runs a query to gather the SessionIDs for activity matching the Search terms. Then depending on options selected, will iterate through the Sessionid list to gather all of the related Metadata, NWD Session files, PCAPs, and/or File Artifacts.
An example log:
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Maximum returned session threshold has been overridden, new threshold: 10000
(a) 2012-Dec-10 01:39:11 [main] Output format field xml set to: 1
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Output format field pcap set to: 0
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Output format field nwd set to: 1
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Output format field artifacts set to: 1
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Searchmeta field set to: alias.host
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Searchbool field set to: =
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Searchdata field set to: 'badguys.com'
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Meta field set to: action,alias.host,alias.ip,ip.src,ip.dst
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Start date field set to: 2012-Sep-08
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] End date field set to: 2012-Dec-09
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] Start time field set to: 00:00:00
(a) 2012-Dec-10 01:39:11 [main:request 121210000024] End time field set to: 23:59:59
2012-Dec-10 01:39:11 [main:request 121210000024] New request started
(F) 2012-Dec-10 01:39:11 [runqueries:request 121210000024] alias.host%20=%20'badguys.com'Search Result
2012-Dec-10 01:39:11 [sessionsquery:request 121210000024] Sessionid query started /sdk?msg=query&query=select+sessionid+where+alias.host%20=%20'badguys.com'%26%26time=%272012-Sep-08%2000:00:00%27-%272012-Dec-09%2023:59:59%27&size=1000&id1=0&id2=0
2012-Dec-10 01:39:13 [sessionsquery:request 121210000024] Sessionid query completed /sdk?msg=query&query=select+sessionid+where+alias.host%20=%20'badguys.com'%26%26time=%272012-Sep-08%2000:00:00%27-%272012-Dec-09%2023:59:59%27&size=1000&id1=0&id2=0
2012-Dec-10 01:39:13 [xmlquery:request 121210000024] XML metadata selection query started /sdk?msg=query&query=select+sessionid,filetype,extension,filename,action,alias.host,alias.ip,ip.src,ip.dst+where+alias.host%20=%20'badguys.com'%26%26time=%272012-Sep-08%2000:00:00%27-%272012-Dec-09%2023:59:59%27&size=1000&id1=0&id2=0
2012-Dec-10 01:39:16 [xmlquery:request 121210000024] XML metadata selection query completed
2012-Dec-10 01:39:16 [artifactsquery:request 121210000024] File artifacts query started
2012-Dec-10 01:39:16 [artifactsquery:request 121210000024] File artifact query completed
2012-Dec-10 01:39:16 [zipresults:request 121210000024] Zipping temp folder
2012-Dec-10 01:39:16 [zipresults:request 121210000024] Finished zipping temp folder
2012-Dec-10 01:39:16 [zipresults:request 121210000024] Request completed
2012-Dec-10 01:39:16 [zipresults:request 121210000024] Request processing totaled 5.2389998436 seconds
While each option presents it's own "Request Timeout " challenges, this post focuses on apparent Metadata field selection sensitivity.
Problem Summary
408 Request Timeouts appear randomly, without any obvious relation to:
The first Eureka was when things started working very reliably after inadvertently removing 'time' from one of the standard Selection Clauses:
From: Select = 'time,action,alias.host,alias.ip,ip.src,ip.dst'
To: Select = 'action,alias.host,alias.ip,ip.src,ip.dst'
Initial testing revealed that other metadata selection terms have similar impacts ( and possibly the combination and/or ordering of terms?).
Including Session Time in our XML Metadata Extract processes is critical to almost every Use Case, trumped only by reliabilioty.
Any insights, thoughts, stratgeies, debugging tips, etc. are sincerely appreciated. Also interested in any general discussions, techniques, code samples, etc on dealing with NW REST API Session Timeouts in generall, in Use Cases where queries can return very large data sets, and/or when applying complex search terms to extended periods containing high volumes of traffic (Python 2.7/3.2 context very helpful)
2012-12-11 08:23 AM
Try adding an &expiry=0 to the end of your requests.
Hope that helps!
Thank you,
Rui
2012-12-11 08:23 AM
Try adding an &expiry=0 to the end of your requests.
Hope that helps!
Thank you,
Rui
2012-12-11 09:52 AM
Ironicallly I found the other related post revealing the magical '&expiry=0' both here and in the NW Splunk release notes late last night. Prior frustrating searches on NW REST API/SDK, python, etc. only revealed 10,000 useful posts on how to detect/handle the Request timeoue 408 error (not actually '/solve it!'
I've been testing against my problem sets (even daring to request *all* of the metadata I actually wanted!) and so far the otherworldly encantation of '&expiry=0'!!!! has excised all my Request Timeout Gremlins. Thanks Rui and all who have posted.
Solved!
Followup:
The solution reveals another mystery: previously failed requests would take at least 30 Seconds to return the 408 error (implying it was taking that long to return the data?). After applying the "expir ye gremlins!!', many of these same requests now complete successfully in seconds?
Can anyone elaborate on the use of this parameter in terms of how the NW REST API treats these requsts differently and potential impacts to Netwitness services, brokers, concentrators, decoders, etc? I'm hiitng Brokers with 1,000s of automated requests/hour. Understand the basic concept of changing the default timeout.
Also any musings on 'why?' selecting time metadata caused anomalous behavior or why the same request completes quicker when adding &expiry=0?
2012-12-11 11:05 AM
Great!
I will leave the "Why?" to those with a bit more background knowledge or insight into the inner workings, I believe there are a few around 🙂
Thank you,
Rui
2016-11-20 09:16 PM
This is a typical content query statement for me where
/sdk/content?session=<session id>&maxSize=0&render=pcap&expiry=0
Thanks.