Learning Journal

2/17/2024

Top K Frequent Elements - Binary Heap

function topKFrequent(nums: number[], k: number): number[] {
    const hashTB:Record<string, number> = {};

    for(let i = 0; i < nums.length; i++) {
        if(hashTB[nums[i]] === undefined) {
            hashTB[nums[i]] = 1;
        }
        else 
            hashTB[nums[i]]++;
    }

    const maxHeap = new MaxHeap(hashTB);

    const result:number[] = [];

    for(let i = 0; i < k; i++){
        result.push(maxHeap.pop());
    }

    return result;
};


class MaxHeap {
    private arr: [number, number][] = [];
    readonly NUM:number = 0;
    readonly FREQUENCY:number = 1;

    constructor(hashTB:Record<string,number>) {
        const keys = Object.keys(hashTB);
        
        for(let i = 0; i < keys.length; i++) {
            this.push([Number(keys[i]), hashTB[keys[i]]])
        }

    }

    private swap(i1:number, i2:number) {
        const temp = this.arr[i1];
        this.arr[i1] = this.arr[i2];
        this.arr[i2] = temp;
    }

    private getParentI(i:number) {
        return Math.floor((i - 1)/2);
    }

    private getLeftChildI(i:number) {
        return 2*i + 1;
    }

    private getRightChildI(i:number) {
        return 2*i + 2;
    }

    push(item:[number,number]){
        this.arr.push(item);

        let curI = this.arr.length - 1, parentI = this.getParentI(curI);

        while(curI > 0 && this.arr[curI][this.FREQUENCY] > this.arr[parentI][this.FREQUENCY]){
            this.swap(curI, parentI);        
            curI = parentI;
            parentI = this.getParentI(curI);
        }
    }


    pop():number {
        if(this.arr.length === 0) throw new Error("The heap is empty");

        const retVal = this.arr[0][this.NUM];
        // Non-null assertion: the emptiness check above guarantees pop() returns a pair.
        const last = this.arr.pop()!;

        if(this.arr.length === 0) return retVal;

        this.arr[0] = last;
        let curI = 0;

        let leftChildI = this.getLeftChildI(curI), rightChildI = this.getRightChildI(curI);

        while((leftChildI < this.arr.length && this.arr[curI][this.FREQUENCY] < this.arr[leftChildI][this.FREQUENCY])
            || (rightChildI < this.arr.length && this.arr[curI][this.FREQUENCY] < this.arr[rightChildI][this.FREQUENCY])
        ) {
            const largerChildI = rightChildI < this.arr.length && this.arr[rightChildI][this.FREQUENCY] > this.arr[leftChildI][this.FREQUENCY] ?
                                        rightChildI : leftChildI;

            this.swap(curI, largerChildI);

            curI = largerChildI;
            leftChildI = this.getLeftChildI(curI);
            rightChildI = this.getRightChildI(curI);    
        }


        return retVal;       
    }
}

THE PROBLEM

My solution uses the MaxHeap above. It goes over the array once to build a hash table that has each number as a key and its frequency as the value. Then it adds every key-value pair to the MaxHeap and pops k times, adding each popped number to the result array.

It can be faster in practice than the heap approach suggested on LeetCode, although the worst-case bounds do not show that. The suggested solution also uses a heap, but a min heap: once the heap holds k pairs, every further add is followed by a pop, so the heap always contains the k most frequent elements seen so far (the (k + 1)th is popped out immediately). It adds n times and pops n - k times. Because the heap never holds more than k + 1 pairs, each add or pop costs O(log k), not O(log n), so the total time is O(n log k).
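The bounded min-heap idea described above can be sketched as follows. This is a minimal illustration, not LeetCode's reference code: the names `MinHeap` and `topKFrequentMinHeap` are hypothetical, and the heap stores [num, frequency] pairs like the MaxHeap earlier in this entry.

```typescript
// Hypothetical helper: a min-heap of [num, frequency] pairs ordered by frequency.
class MinHeap {
    private arr: [number, number][] = [];

    get size(): number {
        return this.arr.length;
    }

    push(item: [number, number]): void {
        this.arr.push(item);
        let cur = this.arr.length - 1;
        // Sift up while the new item is less frequent than its parent.
        while (cur > 0) {
            const parent = Math.floor((cur - 1) / 2);
            if (this.arr[cur][1] >= this.arr[parent][1]) break;
            [this.arr[cur], this.arr[parent]] = [this.arr[parent], this.arr[cur]];
            cur = parent;
        }
    }

    pop(): [number, number] {
        if (this.arr.length === 0) throw new Error("The heap is empty");
        const top = this.arr[0];
        const last = this.arr.pop()!;
        if (this.arr.length > 0) {
            this.arr[0] = last;
            let cur = 0;
            // Sift down: swap with the less frequent child until order is restored.
            while (true) {
                const left = 2 * cur + 1;
                const right = 2 * cur + 2;
                let smallest = cur;
                if (left < this.arr.length && this.arr[left][1] < this.arr[smallest][1]) smallest = left;
                if (right < this.arr.length && this.arr[right][1] < this.arr[smallest][1]) smallest = right;
                if (smallest === cur) break;
                [this.arr[cur], this.arr[smallest]] = [this.arr[smallest], this.arr[cur]];
                cur = smallest;
            }
        }
        return top;
    }
}

function topKFrequentMinHeap(nums: number[], k: number): number[] {
    const hashTB: Record<string, number> = {};
    for (const n of nums) hashTB[n] = (hashTB[n] || 0) + 1;

    const heap = new MinHeap();
    for (const key of Object.keys(hashTB)) {
        heap.push([Number(key), hashTB[key]]);
        // Evict the least frequent pair as soon as the heap outgrows k.
        if (heap.size > k) heap.pop();
    }

    const result: number[] = [];
    while (heap.size > 0) result.push(heap.pop()[0]);
    return result; // least frequent of the top k comes first
}
```

Since the heap is capped at k + 1 pairs, every sift touches at most about log k levels, which is where the O(n log k) bound comes from.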

My algorithm needs O(n log n) in the worst case to add every pair in the constructor and O(k log n) to pop the results, where n is the number of distinct values. The popping phase is cheap when k is much smaller than n.
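One known way to lower the build cost, which the code above does not use, is bottom-up heapify: place all pairs in the array first and sift down each non-leaf node, which builds the heap in O(n) instead of pushing one pair at a time. The sketch below assumes the same [num, frequency] layout; `Pair`, `siftDown`, `buildMaxHeap`, `popMax`, and `topKFrequentHeapify` are hypothetical names for illustration.

```typescript
type Pair = [number, number]; // [num, frequency]

// Restore the max-heap property for the subtree rooted at `start`.
function siftDown(arr: Pair[], start: number): void {
    let cur = start;
    while (true) {
        const left = 2 * cur + 1;
        const right = 2 * cur + 2;
        let largest = cur;
        if (left < arr.length && arr[left][1] > arr[largest][1]) largest = left;
        if (right < arr.length && arr[right][1] > arr[largest][1]) largest = right;
        if (largest === cur) return;
        [arr[cur], arr[largest]] = [arr[largest], arr[cur]];
        cur = largest;
    }
}

// Bottom-up heapify: sift down every non-leaf node, starting from the last one.
function buildMaxHeap(pairs: Pair[]): Pair[] {
    const arr = pairs.slice();
    for (let i = Math.floor(arr.length / 2) - 1; i >= 0; i--) {
        siftDown(arr, i);
    }
    return arr;
}

// Remove and return the most frequent pair.
function popMax(heap: Pair[]): Pair {
    if (heap.length === 0) throw new Error("The heap is empty");
    const top = heap[0];
    const last = heap.pop()!;
    if (heap.length > 0) {
        heap[0] = last;
        siftDown(heap, 0);
    }
    return top;
}

function topKFrequentHeapify(nums: number[], k: number): number[] {
    const hashTB: Record<string, number> = {};
    for (const n of nums) hashTB[n] = (hashTB[n] || 0) + 1;

    const pairs: Pair[] = Object.keys(hashTB).map(
        (key): Pair => [Number(key), hashTB[key]]
    );
    const heap = buildMaxHeap(pairs);

    const result: number[] = [];
    for (let i = 0; i < k; i++) result.push(popMax(heap)[0]);
    return result; // most frequent first
}
```

With this build step the total becomes O(n + k log n), so the max-heap approach benefits from a small k in the bound itself and not only in the popping phase.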

However, it does not improve the big O: the build phase alone is already O(n log n) in the worst case. LeetCode also suggests QuickSelect, which I will try next.

Revised by ChatGPT

My solution employs a MaxHeap to efficiently find the top K frequent elements in an array. The process begins by iterating through the array to construct a hash table, where each key represents a unique number and its corresponding value denotes the frequency of that number. Subsequently, all key-value pairs are added to the MaxHeap.

The primary difference between my solution and the one suggested on LeetCode lies in the choice of heap. While the LeetCode solution utilizes a MinHeap to maintain the k most frequent elements, my approach uses a MaxHeap. In the LeetCode solution, each insertion into the MinHeap is followed by a removal if the heap size exceeds k, ensuring that only the k most frequent elements are retained. This approach results in n insertions and n - k removals. Because the heap never holds more than k + 1 elements, each operation takes O(log k) time, leading to a total time complexity of O(n log k).

In contrast, my solution involves adding all elements to the MaxHeap in the constructor, which takes O(n log n) time in the worst case, and then extracting the top k elements, which takes O(k log n) time. The extraction phase is fast when k is significantly smaller than n. However, the overall worst-case time complexity remains O(n log n), which is no better than the MinHeap approach.

Next, I plan to explore the QuickSelect algorithm as an alternative method, which has the potential to offer improved performance.
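As a starting point for that next step, here is a rough sketch of QuickSelect applied to the distinct values, ordered by descending frequency. It assumes a Lomuto partition with a random pivot, which is only one of several possible variants; `topKFrequentQuickSelect` is a hypothetical name. QuickSelect runs in O(n) time on average but O(n^2) in the worst case.

```typescript
function topKFrequentQuickSelect(nums: number[], k: number): number[] {
    const hashTB: Record<string, number> = {};
    for (const n of nums) hashTB[n] = (hashTB[n] || 0) + 1;

    const unique: number[] = Object.keys(hashTB).map(Number);
    const freq = (x: number): number => hashTB[x];

    const swap = (i: number, j: number): void => {
        const temp = unique[i];
        unique[i] = unique[j];
        unique[j] = temp;
    };

    // Lomuto partition in descending frequency order:
    // everything left of the returned index is strictly more frequent than the pivot.
    const partition = (lo: number, hi: number): number => {
        const pivotI = lo + Math.floor(Math.random() * (hi - lo + 1));
        const pivotFreq = freq(unique[pivotI]);
        swap(pivotI, hi);
        let store = lo;
        for (let i = lo; i < hi; i++) {
            if (freq(unique[i]) > pivotFreq) {
                swap(i, store);
                store++;
            }
        }
        swap(store, hi);
        return store;
    };

    // Narrow the range until index k - 1 holds the k-th most frequent value.
    const target = k - 1;
    let lo = 0;
    let hi = unique.length - 1;
    while (lo < hi) {
        const p = partition(lo, hi);
        if (p === target) break;
        if (p < target) lo = p + 1;
        else hi = p - 1;
    }

    return unique.slice(0, k); // the top k, in no particular order
}
```

Unlike the heap versions, this only partitions the array around the k-th position and never fully orders it, which is why the average cost is linear.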