This is a follow up to my earlier question JavaScript arrays: how do I compact excess length?
I just discovered that all javascript engines I have (Chromium, Node.js, Firefox) are extremely inefficient with sparse arrays! Example:
[,1,2,3,,,,,9,,,].find((x,i,a) => console.log(x,i))
It turns out that all the holes are also hit by this search. But in a real world use case of large sparse arrays this is extremely inefficient!
One might argue that formally one should require that find would return undefined for missing values instead of skipping them and therefore never being able to return them. But that seems to be such a rarely useful property of these iterating functions and instead the spec should say that holes in sparse arrays are to be skipped.
It seems to come down to the iterating with the in
vs. the of
operator.
So might there be a way of redefining or modifying the iteration of these iterating functions to be based only on indexes actually there? Especially would like to find first index and last index in array, and also next index given an index (that may or may not be the index of a hole).
The length property on an Array takes the last element's index and adds one. So if you have an array with holes between index 0 through 100, and an element at index 101, the length will return 101, as it's the last index + 1.
The above happens regardless of how many elements are in the Array.
This is a documented feature. If you want to ignore empty spaces, just check if the item at whatever index is falsy, and then don't do the logic. You're more than welcome to modify prototypes to do this, at your own peril of course.
The answer is: yes, find, findIndex, findLast, findLastIndex are indeed not efficient on large sparse arrays, but no, that isn't true for all the iterating array functions.
In fact, forEach, filter, map, and reduce, even some and every, do not call the callback function on the holes in the sparse arrays, which I tested in Chromium, Firefox, and Node.JS. Therefore if the intention of using findLastIndex is to replace naive use of the array length property which is unreliable (and therefore generally useless if often practically useful when ignoring many sparse array scenarios) then the following can be done:
(function(arr) {
try {
return arr.reduceRight(
function(r,x,i,a) {
console.log(r,x,i,a);
throw i;
}, -1);
} catch(r) {
if(typeof r == 'number')
return r;
else
throw r;
}
})([,,,3,,,6,,,9,,,])
I don't know if there is a gentler way of breaking out of such an iterator function loop other than throwing the result value, but it works and it cuts short what would otherwise continue after the result has already been found.
In case of find(Index) from the beginning (instead of the end), where a sparse array could also cause millions of iterations before even the first element is found, the same can be achieved with forEach:
let testArray = [];
testArray[20000000] = "first";
testArray[40000000] = "last";
(function(arr) {
try {
arr.forEach(
function(x,i,a) {
console.log(x,i,a);
throw i;
});
return -1;
} catch(r) {
if(typeof r == "number")
return r;
else
throw r;
}
})(testArray)
As you run this with different index values (creating different size initial sparseness) you notice that (again on all three, Chromium, Firefox, and Node.JS) the implementation is inefficient as it seems to internally iterate instead of maintaining a direct pointer to the first actual index, last actual index, and potentially even next-pointers to skip over large holes. But this is OK, those are optimizations that can be added, at least the specification doesn't require that the call-back is invoked on the holes of a sparse array as it seems to be the case with find (which I think is a bad idea).
The following goes into my own general toolbox from now on:
Array.prototype.findActualLastIndex = function(f) {
try {
return this.reduceRight(
function(r, x, i, a) {
if(f(x, i, a))
throw i;
else
return r;
}, -1);
} catch(r) {
if(typeof r == 'number')
return r;
else
throw r;
}
}
Array.prototype.findActualIndex = function(f) {
try {
return this.reduce(
function(r, x, i, a) {
if(f(x, i, a))
throw i;
else
return r;
}, -1);
} catch(r) {
if(typeof r == 'number')
return r;
else
throw r;
}
}
Hope this can help others. It's not running optimally with huge spare regions at the ends, but it is the best we can get out of the present version of javascript engines.