What I need to do is either update the counters inside the single nested object, or insert the entire doc containing the array with initialized counters.
Example of doc:
{
"_id":{"$oid":"61f020f40e2f03f0b93b69f1"},
"v":"xxxxxxxxxxxxxxxx",
"list":[{"p":{"$oid":"61f020f40e2f03f0b93b69f0"},"c":1,"hom":0,"het":1}]
}
Every doc is guaranteed to have a "list" array containing exactly one nested object, and the pair (v, list.p) is unique.
So, starting from a previous find, for each element of the cursor I call the upsert function in JS, which looks up the doc and then either inserts it or updates it, depending on whether it was found.
function upsert(doc){
    _hom = doc.field1.field2 == "hom" ? 1 : 0;
    _het = doc.field1.field2 == "het" ? 1 : 0;
    filter = {"v": doc.anId, "list.p" : "61f020f40e2f03f0b93b69f0"};
    project = {"list.c":1, "list.hom":1, "list.het":1, "_id":0};
    res = db.collection.find(filter, project);
    if(!res.hasNext()){
        doc = {"v": doc.anId, "list" : [{"p" : "61f020f40e2f03f0b93b69f0", "c" : 1, "hom" : _hom, "het" : _het}]}
        db.collection.insertOne(doc)
    } else {
        list = res.next().list[0];
        _c = list.c + 1;
        _hom += list.hom;
        _het += list.het;
        update = {$set : {"list" : [{"p" : "61f020f40e2f03f0b93b69f0", "c" : _c, "hom" : _hom, "het" : _het}]}}
        db.collection.updateOne(filter, update)
    }
}
So imagine something like:
db.anotherCollection.find({...},{...}).forEach(doc => upsert(doc))
It does work, but it's pretty slow. Is there any other way to do this? I read online that upserting an array like this is not possible, because you need to push or pull based on a query, etc. But I need to update the existing counters of a nested object if the doc is found, or insert an entire doc if it is not.
Mongo 4.4.6
This can't really be done in a single db call. You can, however, simplify the code and reduce the average number of db calls made in total.
Specifically, instead of find -> update we can just execute an update. If the update is "successful" we can return; otherwise we insert. With your current approach, whenever the list already exists you run a redundant find query that is not needed.
Here is the code:
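function upsert(doc){
    const _hom = doc.field1.field2 == "hom" ? 1 : 0;
    const _het = doc.field1.field2 == "het" ? 1 : 0;
    const filter = {"v": doc.anId, "list.p" : "61f020f40e2f03f0b93b69f0"};
    // "$" refers to the array element matched by "list.p" in the filter;
    // incrementing hom/het by 0 leaves that counter unchanged
    const update = {$inc : {"list.$.c": 1, "list.$.hom": _hom, "list.$.het": _het}};
    const updateRes = db.collection.updateOne(filter, update);
    // nothing matched, so the doc does not exist yet: insert it with init'd counters
    if(updateRes.matchedCount == 0){
        const newDoc = {"v": doc.anId, "list" : [{"p" : "61f020f40e2f03f0b93b69f0", "c" : 1, "hom" : _hom, "het" : _het}]};
        db.collection.insertOne(newDoc);
    }
}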
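If you want to check the arguments without touching the database, the part that builds them can be pulled out into a pure helper. This is only a minimal sketch: the name buildCounterOps and passing p as a parameter are my assumptions, not something from your code. Note that it increments hom and het alongside c, which is what your original find -> update version does.

```javascript
// Hypothetical helper (name and signature are assumptions for illustration).
// It only builds the arguments for updateOne / insertOne; it makes no db call.
function buildCounterOps(doc, p) {
  const hom = doc.field1.field2 === "hom" ? 1 : 0;
  const het = doc.field1.field2 === "het" ? 1 : 0;
  return {
    // matches the doc whose single list element has this p
    filter: { "v": doc.anId, "list.p": p },
    // "$" targets the list element matched by the filter
    update: { $inc: { "list.$.c": 1, "list.$.hom": hom, "list.$.het": het } },
    // used only when the update matched nothing
    insertDoc: { "v": doc.anId, "list": [{ "p": p, "c": 1, "hom": hom, "het": het }] }
  };
}

const ops = buildCounterOps(
  { anId: "xxxxxxxxxxxxxxxx", field1: { field2: "het" } },
  "61f020f40e2f03f0b93b69f0"
);
console.log(JSON.stringify(ops.update));
// prints {"$inc":{"list.$.c":1,"list.$.hom":0,"list.$.het":1}}
```

Inside upsert you would then pass ops.filter and ops.update to db.collection.updateOne, and ops.insertDoc to db.collection.insertOne when matchedCount is 0.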
How much performance improves depends on your "hit" rate. I would assume that for most processes it is high, in which case the gain will be significant. The performance for "misses" is not affected, since an update that matches no document "acts" the same as a find.
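To put a rough number on it, here is a small sketch of the expected db calls per processed doc. It assumes every call costs about the same, which is a simplification, and the function name expectedCalls is just for illustration.

```javascript
// Expected db calls per processed doc, given the hit rate
// (fraction of docs whose (v, list.p) pair already exists).
// Assumption: each call is counted as one unit, regardless of its type.
function expectedCalls(hitRate) {
  return {
    // find, then either updateOne or insertOne: always 2 calls
    findFirst: 2,
    // hit: updateOne only (1 call); miss: updateOne + insertOne (2 calls)
    updateFirst: 2 - hitRate
  };
}

console.log(expectedCalls(0.75));
// prints { findFirst: 2, updateFirst: 1.25 }
```

So at a 75% hit rate you go from 2 calls per doc to 1.25 on average, and at a 0% hit rate nothing changes.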