Get the number of integers between each given range

Issue

The task:

Write a function that receives 3 arrays and returns an array. The
first array contains n integers, their values range between 0 and
10^9. "numbers". The second array is a low-range array, which contains
the lower end of a range, it contains q integers. "low". The third
array is a high-range array, which contains the higher end of a range,
it contains q integers. "high".

The function should return an array that contains the number of
integers in the first array, that fall in its range, given by the
low-range and high-range arrays.

In the returned array, at index i, there should be the number of
integers in "numbers" which are bigger or equal to low[i] and smaller or equal to high[i].

Examples:
count_range([12,13,14,15,17],[14],[14]) should return [1]
count_range([12,13,14,15,17],[14,15],[14,18]) should return [1,2]
count_range([12,13,14,15,17],[12],[17]) should return [5]

This is how I solved the question but I feel like there might be a more efficient way to solve seeing as the arrays could be long and the numbers could be really big.

I’d be glad to get some insights or tests that challenge this could to help me think in a better direction.

def binarySearch(data, val):
    highIndex = len(data) - 1
    lowIndex = 0
    while highIndex > lowIndex:
        index = math.ceil((highIndex + lowIndex) / 2)
        sub = data[index]
        if sub > val:
            if highIndex == index:
                return sorted([highIndex, lowIndex])
            highIndex = index
        else:
            if lowIndex == index:
                return sorted([highIndex, lowIndex])
            lowIndex = index
    return sorted([highIndex, lowIndex])


def count_range(numbers, low, high):
    numbers.sort()
    result = []
    range_dict = {}
    for i in range(len(numbers)):
        if numbers[i] not in range_dict:
            range_dict[numbers[i]] = i
    for i in range(len(low)):
        low_r = low[i]
        high_r = high[i]
        if low_r not in range_dict:
            range_dict[low_r] = binarySearch(numbers, low_r)[0]
        low_index = range_dict.get(low_r)
        if high_r not in range_dict:
            range_dict[high_r] = binarySearch(numbers, high_r)[0]
        high_index = range_dict.get(high_r)
        while high_index+1 < len(numbers) and numbers[high_index + 1] == numbers[high_index]:
            high_index += 1
        if low_r in numbers or low_r < numbers[0]:
            low_index -= 1
        result.append(high_index - low_index)
    print(result)
    return result

Solution

Using Python’s bisect:

from bisect import bisect_left, bisect_right

def count_range(numbers, low, high):
    numbers.sort()
    return [bisect_right(numbers, hi) - bisect_left(numbers, lo)
            for hi, lo in zip(high, low)]

In my solution, the sorting actually took about 90% of my time, then handling the queries only took about 10% of my time.

Benchmark with 100,000 numbers and 1000 ranges (Try it online!):

  71.6 ms  count_range_Sara
  24.1 ms  count_range_Kelly
6303.6 ms  count_range_Nin17

  70.5 ms  count_range_Sara
  23.1 ms  count_range_Kelly
6225.0 ms  count_range_Nin17

  64.5 ms  count_range_Sara
  23.1 ms  count_range_Kelly
6306.7 ms  count_range_Nin17

(Note: Sara’s solution I measured is the original slightly buggy one, which I included despite the buggyness as the question is about efficiency. And Nin17’s solution that I measured is their original one from their now deleted answer, not the NumPy one.)

Answered By – Kelly Bundy

Answer Checked By – Mildred Charles (AngularFixing Admin)

Leave a Reply

Your email address will not be published.