Here is the problem: I am given an integer array whose elements are all distinct, say

```
int[] data = {21, 34, 12, 88, 54, 73};
```

Now I would like to check whether a subarray (i.e. a range of indices) contains a number that lies in a given value range. In other words, I want a function `check(int a, int b, int l, int r)`, where `a` and `b` give the index range of the array and `l` and `r` give the value range.

For the array above, `check(0, 2, 20, 50)` should return `true`, since indices `0` to `2` hold `21, 34, 12`, and two of those numbers, `21` and `34`, are in the range `20` to `50`.

Another example: `check(2, 3, 20, 80)` should return `false`, since neither of `12, 88` is in the range `20` to `80`.

I'm thinking about using a Segment Tree, since, as far as I know, RMQ (range minimum query) can be solved with one, so I suspect a Segment Tree would also work on this problem. However, every `"get" function` of a Segment Tree I have seen returns a `"single"` aggregate value (perhaps not the best word), so I would like to know what the nodes of the Segment Tree should hold. Is there any algorithm that can answer each query in `O(log(n))` while the `"build" time` is not `O(n^2)`, where `n` is the size of the array?
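For reference, one common answer to "what should the nodes hold" is a so-called merge sort tree: each segment-tree node stores the sorted values of its segment, built by merging its children. Build is `O(n log n)` and each query is `O(log^2 n)` (not quite the `O(log n)` asked for). A sketch, with the class name `MergeSortTree` being my own choice:

```java
import java.util.Arrays;

class MergeSortTree {
    final int n;
    final int[][] tree; // tree[node] = sorted values of that node's segment

    MergeSortTree(int[] data) {
        n = data.length;
        tree = new int[4 * n][];
        build(1, 0, n - 1, data);
    }

    void build(int node, int lo, int hi, int[] data) {
        if (lo == hi) { tree[node] = new int[]{data[lo]}; return; }
        int mid = (lo + hi) / 2;
        build(2 * node, lo, mid, data);
        build(2 * node + 1, mid + 1, hi, data);
        // merge the children's sorted arrays
        int[] a = tree[2 * node], b = tree[2 * node + 1];
        int[] m = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) m[k++] = a[i] <= b[j] ? a[i++] : b[j++];
        while (i < a.length) m[k++] = a[i++];
        while (j < b.length) m[k++] = b[j++];
        tree[node] = m;
    }

    // first index in sorted array s whose value is > x
    static int upper(int[] s, int x) {
        int lo = 0, hi = s.length;
        while (lo < hi) { int mid = (lo + hi) >>> 1; if (s[mid] <= x) lo = mid + 1; else hi = mid; }
        return lo;
    }

    // count of values in [l, r] within sorted array s, via two binary searches
    static int count(int[] s, int l, int r) { return upper(s, r) - upper(s, l - 1); }

    boolean check(int a, int b, int l, int r) { return query(1, 0, n - 1, a, b, l, r) > 0; }

    int query(int node, int lo, int hi, int a, int b, int l, int r) {
        if (b < lo || hi < a) return 0;                       // no overlap
        if (a <= lo && hi <= b) return count(tree[node], l, r); // full overlap
        int mid = (lo + hi) / 2;
        return query(2 * node, lo, mid, a, b, l, r)
             + query(2 * node + 1, mid + 1, hi, a, b, l, r);
    }
}
```

A range query decomposes into `O(log n)` nodes, and each node costs a binary search, hence `O(log^2 n)` per query.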

Note: Using Segment Tree is just my own thought, any other approach is appreciated.

Santiago Trujillo

`O(N)` is easy:

```
import java.util.Arrays;

public static boolean check(int[] data, int a, int b, int l, int r) {
    // scan the slice [a, b] and test each element against [l, r]
    return Arrays.stream(data, a, b + 1).anyMatch(n -> n >= l && n <= r);
}
```

I suspect that any more big-O efficient approach would spend enough time building the needed data structure that it's not worth the effort unless you're doing a *lot* of lookups on a huge dataset. Even then, maybe a parallel version of the above might be good enough.
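The parallel version mentioned above is a one-method-call change on the same stream (a sketch; the wrapper class name is mine, and parallelism only pays off on large slices):

```java
import java.util.Arrays;

public class ParallelCheck {
    public static boolean check(int[] data, int a, int b, int l, int r) {
        // same linear scan, but split across the common fork-join pool
        return Arrays.stream(data, a, b + 1)
                     .parallel()
                     .anyMatch(n -> n >= l && n <= r);
    }
}
```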

Santiago Trujillo

It's a bit exotic, but a persistent red-black tree, or a persistent variant of any other self-balancing tree, would work.

A persistent data structure allows one to (time- and space-)efficiently take "snapshots" of the structure at different times, and then query those snapshots later, receiving results based on the structure's state as of the snapshot time. For this use case, the particular query we would want is a count of the contained elements within a given range (which can be performed in `O(log n)` if each node is annotated with the number of its descendants).

In this case, you would start with an empty structure, and at time `i`, insert `data[i]` and then store a snapshot as `snapshot[i]`. Then, `check(a,b,l,r)` would be implemented as `return snapshot[b].countInRange(l,r) > snapshot[a-1].countInRange(l,r)`, treating `snapshot[-1]` as the empty structure so that the element at index `a` itself is counted. That is, if there are more elements in the target range as of time `b` than there were just before time `a`, then some element in the target range must have been added between `a` and `b` and thus satisfies your constraints.

If optimally implemented, the precomputation would take time `O(n log n)` and space `O(n)`, and queries would take time `O(log n)`.
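A minimal sketch of this idea, using an unbalanced persistent BST for brevity (a real implementation would use a persistent red-black or other balanced tree to guarantee `O(log n)` operations; the class and method names are my own):

```java
public class PersistentCheck {
    // Immutable BST node; insert copies O(depth) nodes and shares the rest.
    static final class Node {
        final int key, size;
        final Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key; this.left = left; this.right = right;
            this.size = 1 + size(left) + size(right);
        }
    }
    static int size(Node n) { return n == null ? 0 : n.size; }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key, null, null);
        if (key < n.key) return new Node(n.key, insert(n.left, key), n.right);
        return new Node(n.key, n.left, insert(n.right, key));
    }

    // Rank query: how many keys are <= x? O(depth) using the size annotation.
    static int countLessEq(Node n, int x) {
        if (n == null) return 0;
        if (n.key <= x) return size(n.left) + 1 + countLessEq(n.right, x);
        return countLessEq(n.left, x);
    }
    static int countInRange(Node n, int l, int r) {
        return countLessEq(n, r) - countLessEq(n, l - 1);
    }

    // snapshot[i] = tree containing data[0..i-1]; snapshot[0] is empty,
    // so snapshot[a] plays the role of snapshot[a-1] in the text above.
    final Node[] snapshot;

    PersistentCheck(int[] data) {
        snapshot = new Node[data.length + 1];
        for (int i = 0; i < data.length; i++)
            snapshot[i + 1] = insert(snapshot[i], data[i]);
    }

    // true iff some data[i] with a <= i <= b lies in [l, r]
    boolean check(int a, int b, int l, int r) {
        return countInRange(snapshot[b + 1], l, r) > countInRange(snapshot[a], l, r);
    }

    public static void main(String[] args) {
        int[] data = {21, 34, 12, 88, 54, 73};
        PersistentCheck pc = new PersistentCheck(data);
        System.out.println(pc.check(0, 2, 20, 50)); // true
        System.out.println(pc.check(2, 3, 20, 80)); // false
    }
}
```

Each `insert` allocates only the nodes along one root-to-leaf path, which is what keeps the total space `O(n log n)` at worst (and `O(n)` with a more careful persistent structure, as noted above).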

If you were willing to relax the `O(log n)` requirement for queries, a simpler and potentially more practical approach would be a 2-dimensional k-d tree. Simply insert each `data[i]` as the point `(i, data[i])`, and then do a range search for `a <= x <= b, l <= y <= r` (inclusive, to match the problem's range convention). This gives you a query time of `O(sqrt(n))`, which is not as efficient, but a lot easier to code up (or to find existing code for).
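A sketch of that approach, with inclusive ranges (the class name `KdTreeCheck` and the simple median-split build are my own choices):

```java
import java.util.Arrays;
import java.util.Comparator;

class KdTreeCheck {
    // Node of a 2-D k-d tree over points (index, value), alternating split axis.
    static final class Node {
        final int x, y; final boolean byX; final Node left, right;
        Node(int x, int y, boolean byX, Node left, Node right) {
            this.x = x; this.y = y; this.byX = byX; this.left = left; this.right = right;
        }
    }

    final Node root;

    KdTreeCheck(int[] data) {
        int[][] pts = new int[data.length][];
        for (int i = 0; i < data.length; i++) pts[i] = new int[]{i, data[i]};
        root = build(pts, 0, pts.length, true);
    }

    // Median split on the current axis; left subtree holds coordinates <= median.
    static Node build(int[][] pts, int lo, int hi, boolean byX) {
        if (lo >= hi) return null;
        Arrays.sort(pts, lo, hi, Comparator.comparingInt((int[] p) -> byX ? p[0] : p[1]));
        int mid = (lo + hi) / 2;
        return new Node(pts[mid][0], pts[mid][1], byX,
                build(pts, lo, mid, !byX),
                build(pts, mid + 1, hi, !byX));
    }

    // Is there any point with a <= x <= b and l <= y <= r?
    boolean check(int a, int b, int l, int r) { return any(root, a, b, l, r); }

    static boolean any(Node n, int a, int b, int l, int r) {
        if (n == null) return false;
        if (a <= n.x && n.x <= b && l <= n.y && n.y <= r) return true;
        int c = n.byX ? n.x : n.y;                   // split coordinate
        int qlo = n.byX ? a : l, qhi = n.byX ? b : r; // query range on that axis
        // left subtree coordinates are <= c, right subtree coordinates are >= c
        return (qlo <= c && any(n.left, a, b, l, r))
            || (qhi >= c && any(n.right, a, b, l, r));
    }
}
```

The early return on the first hit makes the existence check at least as fast as the full `O(sqrt(n))` range search bound.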

Santiago Trujillo