I have learned a significant amount about queries and aggregations in Elasticsearch. For example, below is a query that counts the number of records in a date range.
from elasticsearch_dsl import Search

def GetCountRecords(client, from_date, to_date, query = None):
    """
    Get the number of records (documents) from a date range
    """
    # Filter on the @timestamp field and ask only for the hit count
    s = Search(using=client, index='gracc-osg-*') \
        .filter('range', **{'@timestamp': {'from': from_date, 'to': to_date}}) \
        .params(search_type="count")
    response = s.execute()
    return response.hits.total
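For reference, the kind of request this sends can be sketched as a plain dictionary. This is only an illustration, assuming an Elasticsearch version that accepts a bool filter with gte/lte bounds and a size of 0 for count-only searches. The helper name build_count_body and the example dates are hypothetical and not part of elasticsearch_dsl or the original query.

```python
import json

def build_count_body(from_date, to_date):
    """Build a range-filtered request body as a plain dict (hypothetical helper)."""
    return {
        # size 0 asks for no documents back, only the total hit count
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    # inclusive bounds on the record timestamp
                    {"range": {"@timestamp": {"gte": from_date, "lte": to_date}}}
                ]
            }
        },
    }

# Example dates are made up for illustration
body = build_count_body("2016-06-01", "2016-06-08")
print(json.dumps(body, indent=2))
```

Printing the body this way is a convenient check that the date bounds ended up where expected before running the search against the cluster.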
The other query I designed aggregates the number of records per probe. It is meant to help us understand differences in specific probes' reporting behavior.
from elasticsearch_dsl import Search, A

# Create the search and aggregations (A); es is an existing client
s = Search(using=es, index='gracc-osg-*')
a = A('terms', field='ProbeName', size=0)
# Bucket records into two day ranges, then by probe name within each range
s.aggs.bucket('day_range', 'range', field='@timestamp',
              ranges = [
                  {'from': 'now-1d', 'to': 'now'},
                  {'from': 'now-2d', 'to': 'now-1d'}
              ]) \
    .bucket('probenames', a)
response = s.execute()
First, we define a terms aggregation on the ProbeName field and store it in a. Next, we create a bucket called day_range, which is of type range. It aggregates over two ranges: the last 24 hours and the 24 hours before that. We then attach the ProbeName aggregation a defined above. In return, for each of the ranges we get, for each probe, the number of records that exist for that probe.
This nested aggregation is a powerful feature that will be used in the summarization of the records.
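To make the shape of that result concrete, here is a minimal sketch of walking the nested buckets. It assumes the aggregation part of the response has already been converted to a plain dictionary with the usual buckets / key / doc_count layout. The function name records_per_probe, the range keys, the probe names, and the counts are all made up for illustration.

```python
def records_per_probe(aggregations):
    """Map each probe name to its record count in every day_range bucket."""
    counts = {}
    for day_bucket in aggregations["day_range"]["buckets"]:
        range_key = day_bucket["key"]
        # Each range bucket carries its own nested 'probenames' aggregation
        for probe_bucket in day_bucket["probenames"]["buckets"]:
            probe = probe_bucket["key"]
            counts.setdefault(probe, {})[range_key] = probe_bucket["doc_count"]
    return counts

# Hypothetical response fragment; keys, probe names and counts are invented
sample = {
    "day_range": {
        "buckets": [
            {"key": "previous-24h", "doc_count": 150,
             "probenames": {"buckets": [
                 {"key": "probe-a", "doc_count": 100},
                 {"key": "probe-b", "doc_count": 50},
             ]}},
            {"key": "last-24h", "doc_count": 90,
             "probenames": {"buckets": [
                 {"key": "probe-a", "doc_count": 90},
             ]}},
        ]
    }
}

result = records_per_probe(sample)
for probe, per_range in sorted(result.items()):
    print(probe, per_range)
```

In this invented sample, probe-b has records in the earlier range but none in the last 24 hours, which is the sort of difference in reporting behavior the query is meant to surface.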