My documents are stored in the format sysstat.host.statistics.timestamp[].cpu-load-all.cpu[].usr, where timestamp is a 30-element array and cpu is a 1-64 element array.
If I grab the timestamp field,
timestampCursor = HOST_USAGE.find(
    {'sysstat.host.nodename': host},
    {'sysstat.host.statistics.timestamp': 1})
how can I then access sysstat.host.statistics.timestamp[*].cpu-load-all.cpu[0].usr cleanly? Do I have to index each array myself, with a separate iteration over each array field?
Yes. With a plain find(), you have to index into each array yourself, which means a nested loop over each array field:
for doc in timestampCursor:
    sysstat = doc['sysstat']
    for ts in sysstat['host']['statistics']['timestamp']:
        for cpu in ts['cpu-load-all']['cpu']:
            usr = cpu['usr']
            # Now, sum or average the 'usr' values, or whatever
            # you intend to do.
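If only one CPU entry is needed, as in the cpu[0] case from the question, the inner loop can be replaced by a direct index. The following is a minimal sketch, not a definitive implementation: usr_for_cpu and sample_doc are hypothetical names, and the sample values are made up purely to show the document shape.

```python
def usr_for_cpu(doc, cpu_index=0):
    """Return the 'usr' value of one cpu entry for every timestamp in doc."""
    timestamps = doc['sysstat']['host']['statistics']['timestamp']
    return [ts['cpu-load-all']['cpu'][cpu_index]['usr'] for ts in timestamps]


# Made-up sample document: 2 timestamps, 2 cpu entries each.
sample_doc = {'sysstat': {'host': {
    'nodename': 'example-host',
    'statistics': {'timestamp': [
        {'cpu-load-all': {'cpu': [{'usr': 1.5}, {'usr': 2.5}]}},
        {'cpu-load-all': {'cpu': [{'usr': 3.0}, {'usr': 4.0}]}},
    ]},
}}}

cpu0_usr = usr_for_cpu(sample_doc)   # one value per timestamp, for cpu[0]
```

The same function can be called on each document yielded by timestampCursor. It raises IndexError if a document has fewer cpu entries than cpu_index + 1.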
Alternatively, to aggregate the data server-side, you can use $unwind together with $group and an accumulator such as $sum or $avg in the MongoDB Aggregation Framework.
HOST_USAGE.aggregate([{
    # Select the documents for the host of interest.
    '$match': {'sysstat.host.nodename': host}
}, {
    # Rename the field for brevity.
    '$project': {'ts': '$sysstat.host.statistics.timestamp'}
}, {
    '$unwind': '$ts'
}, {
    '$unwind': '$ts.cpu-load-all.cpu'
}, {
    # A constant _id puts every unwound element into a single group.
    '$group': {
        '_id': 0,
        'all-usr': {'$sum': '$ts.cpu-load-all.cpu.usr'}
    }
}])
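To switch between accumulators such as $sum and $avg without rewriting the stages, the pipeline can be built by a small function. This is only a sketch under stated assumptions: build_usr_pipeline and the 'usr' output field are hypothetical names, and the stages simply mirror the ones shown above.

```python
def build_usr_pipeline(host, accumulator='$sum'):
    """Build aggregation stages for one host.

    accumulator is a $group accumulator name, e.g. '$sum' or '$avg'.
    """
    return [
        {'$match': {'sysstat.host.nodename': host}},
        # Rename the field for brevity.
        {'$project': {'ts': '$sysstat.host.statistics.timestamp'}},
        {'$unwind': '$ts'},
        {'$unwind': '$ts.cpu-load-all.cpu'},
        # A constant _id collapses everything into a single result document.
        {'$group': {
            '_id': 0,
            'usr': {accumulator: '$ts.cpu-load-all.cpu.usr'},
        }},
    ]


pipeline = build_usr_pipeline('example-host', '$avg')
# With a live collection: results = HOST_USAGE.aggregate(pipeline)
```

Building the stage list separately from the aggregate() call also makes it easy to print or inspect the pipeline before sending it to the server.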