Skip to content

Latest commit

 

History

History
153 lines (120 loc) · 5.27 KB

fastest-sorting-integers.md

File metadata and controls

153 lines (120 loc) · 5.27 KB

Problem

Sort a given a huge array of bounded, non-duplicate, positive integers. Extend the problem to include negative integers. What are the different ways to reduce memory complexity of the problem.

Notes for clarity: The reason we are calling the array bounded is for the reason of allocation of memory in the first example. Second, ignoring negative integers is only to simplify the first code sample for understanding.

Updates since publishing:

  • Fixed bug in boolean array size as pointed by ploggingdev - thanks!
  • Added notes on similarity with BucketSort
  • Updated post title to indicate the domain to be bounded
  • Thanks to edi9999 for pointing O(N) constraint during the getNextSetBit() operation. Refer to issue #4 for more details.

Solution

Most of the interview candidates that I have talked to about this problem come up with the quick answer as MergeSort or divide-and-conquer. The cost of sorting being O(N * Log(N)) - which in this case is not the fastest sorting time.

The fastest time to sort an integer array is O(X), where X is the maximum value of integer in the array. For cases where N approaches X, that is, the array is so huge that the number of elements in the array equal the difference in min and max values, the complexity approaches O(N)

Let me explain how.

  • Construct a boolean array of length N
  • For every integer n in array, mark the boolean at index n in array as true
  • Once the array iteration is complete, just iterate over the boolean array again to print all the indices that have the value set as true
public void sortBoundedIntegers(int[] array) {
  /// the regular checks first
  if(array == null) {
    return;
  }

  if(array.length == 0) {
    return;
  }

  // build the boolean result array
  // Integer.MAX_INT makes sure that all integer values can be accommodated
  // this is very memory in-efficient and thus one should ideally use
  // a sparse bit-array for space savings
  boolean[] result = new boolean[Integer.MAX_INT];

  // iterate
  for(int index = 0; index < array.length; index++) {
    int num = array[index];
    result[num] = true;
  }

  // print the result
  for(int index = 0; index < result.length; index++) {
    if(result[index]) {
      System.out.println(index);
    }
  }
}

Note: The above solution using a boolean array is not the optimized solution and is meant only for clarity of algorithm. The ideal solution would be to use a sparse bit-array. Please go over Issue #4 for some discussion.

Additional constraints and optimizations available

  • A boolean occupies one-byte of memory. Thus, switching to a bit-vector (also called as bit-array) will reduce the memory consumption by a factor of 8. Check code sample 2.

  • In case the integers are also negative, another bit-array can be used to check for negatives, and then both iterated one-after-another to produce result. Check code sample 2.

  • To further reduce the memory consumption, one can make use of sparsed-bit-arrays. This can lead to huge drop in memory consumption if the integers are spaced apart a lot. Check code sample 2. Refer to brettwooldridge/SparseBitSet for one such implementation.

  • In case the integer array contains duplicates, use a small short array than the boolean array to hold the number of times an integer has been seen, thus still sorting in O(X) time.

  • If we use sparse bit-arrays on the above code example, then the part of the problem becomes similar to BucketSort where the buckets/bins are created to reduce memory space of the problem. However, the individual bin-sorting is based on bits in the bit-array.

Code Sample 2

public void sortBoundedIntegers(int[] array) {
  // run regular checks

  final int len = array.length;

  // considering SparsedBitSet is an implementation available
  BitSet negative = new SparsedBitSet();
  BitSet positive = new SparsedBitSet();

  // sort
  for(int index = 0; index < len; index++) {
    int num = array[index];
    if(num < 0) {
      num = 0 - num; // convert to positive
      negative.setBit(num, true);
    } else {
      positive.setBit(num, true);
    }
  }

  // output
  int index = -1;
  do {
    index = negative.getNextSetBit(index);
    if(index < 0) {
      break;
    }

    System.out.println(0 - index);
  } while(true);

  index = -1;
  do {
    index = positive.getNextSetBit(index);
    if(index < 0) {
      break;
    }

    System.out.println(index);
  } while(true);
}

Additional Reading