Heaps

Single Threaded CPU Problem Solution In C++/Java/Python/Javascript

Problem Description

You are given n tasks labeled from 0 to n - 1, represented by a 2D integer array tasks, where tasks[i] = [enqueueTimeᵢ, processingTimeᵢ] means that the iᵗʰ task will be available to process at enqueueTimeᵢ and will take processingTimeᵢ to finish processing.

You have a single-threaded CPU that can process at most one task at a time and will act in the following way:

If the CPU is idle and there are no available tasks to process, the CPU remains idle.
If the CPU is idle and there are available tasks, the CPU will choose the one with the shortest processing time. If multiple tasks have the same shortest processing time, it will choose the task with the smallest index.
Once a task is started, the CPU will process the entire task without stopping.
The CPU can finish a task and then start a new one instantly.

Return the order in which the CPU will process the tasks.

What does a thread mean in a CPU?

In the context of a CPU, a thread is essentially the smallest sequence of programmed instructions that can be managed independently by the operating system. Think of it as a "mini-task" or a light-weight process. Modern CPUs can run multiple threads at the same time (using techniques like multi-threading or hyper-threading), allowing for parallel execution of tasks, which improves the overall efficiency and responsiveness of your computer.

Example

Following are examples for the Single Threaded CPU problem.

Input: tasks = [[1,2],[2,4],[3,2],[4,1]]
Output: [0,2,3,1]
Explanation: The events go as follows:
- At time = 1, task 0 is available to process. Available tasks = {0}.
- Also at time = 1, the idle CPU starts processing task 0. Available tasks = {}.
- At time = 2, task 1 is available to process. Available tasks = {1}.
- At time = 3, task 2 is available to process. Available tasks = {1, 2}.
- Also at time = 3, the CPU finishes task 0 and starts processing task 2 as it is the shortest. Available tasks = {1}.
- At time = 4, task 3 is available to process. Available tasks = {1, 3}.
- At time = 5, the CPU finishes task 2 and starts processing task 3 as it is the shortest. Available tasks = {1}.
- At time = 6, the CPU finishes task 3 and starts processing task 1. Available tasks = {}.
- At time = 10, the CPU finishes task 1 and becomes idle.

Input: tasks = [[7,10],[7,12],[7,5],[7,4],[7,2]]
Output: [4,3,2,0,1]
Explanation: The events go as follows:
- At time = 7, all the tasks become available. Available tasks = {0,1,2,3,4}.
- Also at time = 7, the idle CPU starts processing task 4. Available tasks = {0,1,2,3}.
- At time = 9, the CPU finishes task 4 and starts processing task 3. Available tasks = {0,1,2}.
- At time = 13, the CPU finishes task 3 and starts processing task 2. Available tasks = {0,1}.
- At time = 18, the CPU finishes task 2 and starts processing task 0. Available tasks = {1}.
- At time = 28, the CPU finishes task 0 and starts processing task 1. Available tasks = {}.
- At time = 40, the CPU finishes task 1 and becomes idle.

Constraints:

  • tasks.length == n
  • 1 <= n <= 10^5
  • 1 <= enqueueTimeᵢ, processingTimeᵢ <= 10^9

The goal is to determine the sequence in which the CPU processes all tasks. One might consider scanning all available tasks at each step to find the shortest one, but a more efficient, dynamic approach is needed to minimize computational overhead.

Optimal Approach

Intuition

Real World Example

The Single Threaded CPU problem can be related to a real-world scenario in the following way. Imagine you're running a busy coffee shop where customers place orders at different times, and each order takes a certain amount of time to prepare. To keep everything organized, you assign a ticket number to every order as soon as it arrives, ensuring that you never lose track of which order is which. This is similar to indexing the tasks in the problem so that each one retains its identity.

Next, you arrange the orders based on their arrival times, just like sorting the tasks by their enqueue times. As the clock ticks, you add every order that has arrived into a waiting area: a list of orders that are ready to be processed. However, instead of serving the orders strictly in the order they came in, you choose the order that can be prepared the fastest. With a large pile of orders in the system, finding the fastest one by scanning the whole pile each time quickly becomes a hectic task as the number of orders grows. What you need instead is a way for the fastest order to surface instantly: as new orders arrive you add them, and at any moment you can immediately pick the quickest one. In other words, you need a data structure that handles this efficiently, and the heap is one data structure designed for exactly this kind of task.

What is Heap?

A heap is a specialized tree-based data structure often used to implement priority queues. Heaps are usually implemented as complete binary trees (every level is fully filled except possibly the last, and all nodes are as far left as possible), which allows them to be efficiently stored in an array. With a heap, you can quickly access the minimum or maximum element in constant time, and insertion or deletion operations take logarithmic time. This makes heaps extremely useful in algorithms that require repeated retrieval of the "best" element, like scheduling tasks or sorting data with heapsort.
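Since Python appears later in this article, here is a minimal sketch of these operations using the standard-library `heapq` module, which maintains a min-heap over a plain list:

```python
import heapq

# heapq keeps the binary min-heap invariant over an ordinary list.
nums = [5, 1, 8, 3]
heapq.heapify(nums)             # build a heap in O(n)

heapq.heappush(nums, 2)         # insertion: O(log n)
smallest = heapq.heappop(nums)  # extract-min: O(log n)

print(smallest)   # 1
print(nums[0])    # peek at the new minimum in O(1): 2
```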

What is min-heap?
A min-heap is a specialized binary tree-based data structure where the smallest element is always at the root. This means that for any given node in the tree, its value is less than or equal to the values of its children. Because of this property, you can quickly access the minimum element at any time. Additionally, a min-heap is usually implemented as a complete binary tree, which means all levels of the tree are fully filled except possibly the last level, and all nodes in the last level are as far left as possible.

What is max-heap?
A max-heap is a complete binary tree data structure where every parent node has a value greater than or equal to its children, ensuring that the largest element is always at the root. This property makes it easy to quickly access the maximum element. Like a min-heap, a max-heap is typically implemented using an array, which efficiently represents the complete binary tree.
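Python's `heapq` only provides a min-heap; a common trick to get max-heap behavior is to store negated values, so the largest element surfaces at the root:

```python
import heapq

# Negate every value: the largest original value becomes the
# smallest stored value, so the min-heap root is the maximum.
nums = [5, 1, 8, 3]
max_heap = [-x for x in nums]
heapq.heapify(max_heap)

largest = -heapq.heappop(max_heap)  # extract-max
print(largest)       # 8
print(-max_heap[0])  # peek at the next largest: 5
```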

General Operations on Heap:

  • Insertion:
    Inserting an element takes O(log n) time because after placing the new element at the end of the array, you "bubble up" (or "sift up") to restore the heap property.
  • Removal (Extract-Min/Extract-Max):
    Removing the root element (the minimum in a min-heap or maximum in a max-heap) also takes O(log n) time since after removal, you "bubble down" (or "sift down") the last element placed at the root to reestablish the heap property.
  • Peek (Find-Min/Find-Max):
    Retrieving the root element is an O(1) operation because it is always at the front of the array.
  • Building a Heap:
    Building a heap from an unsorted array can be done in O(n) time using a bottom-up heapify process.
  • Space Complexity:
    The heap uses O(n) space to store n elements.
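The operations above can be exercised directly with `heapq`; in particular, note the difference between building a heap with a single `heapify` call (O(n)) and with repeated pushes (O(n log n) in total):

```python
import heapq
import random

data = random.sample(range(1000), 100)

h1 = list(data)
heapq.heapify(h1)          # bottom-up heapify: O(n) overall

h2 = []
for x in data:
    heapq.heappush(h2, x)  # n pushes at O(log n) each: O(n log n)

# Both produce valid heaps with the overall minimum at the root.
print(h1[0] == h2[0] == min(data))  # True
```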

There will be moments when no orders are ready because no customer has arrived yet. During these times, rather than doing nothing inefficiently, you simply wait until the next order is placed. This waiting period is managed in the algorithm by advancing the current time to match the arrival time of the next task when the waiting list is empty.

As you work on one order, new orders might come in and join the waiting area. This efficient system of adding, waiting, and processing continues until every order is completed, ensuring that your coffee shop runs smoothly and customers are served as quickly as possible.

Why do we add an index to each task before sorting?

We add an index to each task to preserve its original identity. When the tasks are sorted by enqueue time (and processing time, if needed), the index helps keep track of which task is which. This is especially important for tie-breaking when two tasks have the same processing time—by using the original index, we ensure that the task order remains consistent with the problem's requirements.
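This indexing-and-sorting step can be sketched in a few lines, using the first example's input:

```python
tasks = [[1, 2], [2, 4], [3, 2], [4, 1]]

# Attach each task's original index before sorting, so that after
# the sort we can still report which task is which.
indexed = [(enq, proc, i) for i, (enq, proc) in enumerate(tasks)]

# Sort by enqueue time; the stored index preserves each task's identity.
indexed.sort(key=lambda t: t[0])
print(indexed)  # [(1, 2, 0), (2, 4, 1), (3, 2, 2), (4, 1, 3)]
```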

How does the algorithm handle periods when the CPU is idle?

If there are no tasks available at the current time (i.e., the heap is empty) but there are still upcoming tasks, the algorithm advances the current time to the enqueue time of the next task. This simulates the CPU waiting without wasting any processing time and ensures that all tasks are eventually processed as soon as they become available.
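The idle-time handling can be isolated as a small fragment (the state below is illustrative: task 0 has already been processed and the heap is empty):

```python
# Sorted tasks as (enqueueTime, processingTime, index); task 0 is done.
tasks = [(1, 2, 0), (10, 1, 1)]
heap = []      # no task is available right now
i, t = 1, 3    # next unseen task is tasks[1]; CPU finished at time 3

# Idle CPU: jump the clock straight to the next enqueue time
# instead of advancing it tick by tick.
if not heap and i < len(tasks):
    t = tasks[i][0]

print(t)  # 10
```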

Approach

The Single Threaded CPU algorithm includes the following steps:

Step 1: Index the Tasks
For every task in the input, append its original index. This ensures that after sorting, each task can still be uniquely identified. In the code, this is done by pushing the index into each task's vector.

Step 2: Sort the Tasks
Sort the tasks based on their enqueue time. This way, you can efficiently determine which tasks become available as time progresses. The sorted order helps in processing tasks in the order of their arrival.

Step 3: Initialize the heap
Set up a min-heap (heap) to manage tasks based on their processing time, using the original index for tie-breaking if necessary. This data structure helps in always selecting the task that can be completed fastest from the available ones.

Step 4: Add Tasks to the Queue
Begin with the current time set to the enqueue time of the first task. Then, while iterating through the sorted tasks, add every task whose enqueue time is less than or equal to the current time into the heap. This simulates the scenario where tasks become available for processing over time.

Step 5: Handle Idle CPU Time
If the heap is empty but there are still tasks that haven't been processed, it means the CPU is idle. In this case, update the current time to the enqueue time of the next task. This ensures that the CPU waits appropriately until new tasks arrive.

Step 6: Process the Tasks
When the heap is not empty, extract the task with the smallest processing time from the queue. Append the original index of this task to the answer list, and update the current time by adding the processing time of the task. This step mimics the CPU executing a task without interruption.

Step 7: Continue Until All Tasks Are Processed
Repeat Steps 4 to 6 until all tasks have been pushed into the queue and processed. Finally, return the answer list containing the order in which the tasks were executed.

Dry Run

Following is a dry run of the Single Threaded CPU algorithm.

Step 1: Start with tasks = [[1,2], [2,4], [3,2], [4,1]] and append each task’s original index, resulting in [1,2,0], [2,4,1], [3,2,2], [4,1,3].


Step 2: Sort the tasks by their enqueue time. In this example, they remain in the order: [1,2,0], [2,4,1], [3,2,2], [4,1,3].


Step 3: Initialize variables with i = 0, set the current time t = 1 (the enqueue time of the first task), create an empty heap, and an empty answer list.


Step 4: At t = 1, since tasks[0][0] = 1 is ≤ t, push task 0 (processing time 2, index 0) into the heap and increment i to 1.


Step 5: The queue is not empty, so pop the task with the smallest processing time (task 0). Append its index 0 to the answer list and update t = 1 + 2 = 3.


Step 6: With t = 3 and i = 1, check the next tasks. Since tasks[1][0] = 2 is ≤ 3, push task 1 (processing time 4, index 1) into the queue and increment i to 2. Also, tasks[2][0] = 3 is ≤ 3, so push task 2 (processing time 2, index 2) into the queue and increment i to 3.


Step 7: The queue now contains tasks 1 and 2. Pop the task with the smallest processing time (task 2). Append its index 2 to the answer list and update t = 3 + 2 = 5.


Step 8: With t = 5 and i = 3, check the remaining task. Since tasks[3][0] = 4 is ≤ 5, push task 3 (processing time 1, index 3) into the queue and increment i to 4.


Step 9: The queue now has tasks 1 and 3. Pop the task with the smallest processing time (task 3). Append its index 3 to the answer list and update t = 5 + 1 = 6.


Step 10: With all tasks added (i = 4), the queue still contains task 1. Pop task 1, append its index 1 to the answer list, and update t = 6 + 4 = 10.


Step 11: The queue is now empty and all tasks have been processed. The final execution order is [0, 2, 3, 1].

Code for All Languages
Single Threaded CPU solution in C++: Optimal
class Solution {
public:
    vector<int> getOrder(vector<vector<int>>& tasks) {
        int n = tasks.size();
        // Step 1: Append original index to each task to preserve its identity
        for (int i = 0; i < n; i++) {
            tasks[i].push_back(i); // tasks[i] becomes [enqueueTime, processingTime, index]
        }
        
        // Step 2: Sort tasks by enqueue time
        sort(tasks.begin(), tasks.end(), [](const vector<int>& a, const vector<int>& b) {
            return a[0] < b[0];
        });
        
        // Min-heap to store tasks as pairs: {processingTime, index}
        // This ensures that we always process the task with the shortest processing time
        priority_queue<pair<int, int>, vector<pair<int, int>>, greater<pair<int, int>>> pq;
        
        vector<int> ans;       // To store the order of task processing
        int i = 0;             // Index to iterate over the tasks array
        long long t = tasks[0][0]; // Current time, starting at the first task's enqueue time
        
        // Step 3: Process tasks until all tasks are handled
        while (i < n || !pq.empty()) {
            // Add all tasks that have arrived by current time t into the heap
            while (i < n && tasks[i][0] <= t) {
                pq.push({tasks[i][1], tasks[i][2]}); // {processingTime, index}
                i++;
            }
            
            // Step 4: If no tasks are available, jump time to the next task's enqueue time
            if (pq.empty()) {
                t = tasks[i][0];
            } else {
                // Step 5: Process the next available task with the smallest processing time
                auto [procTime, idx] = pq.top();
                pq.pop();
                ans.push_back(idx); // Record the original index of the processed task
                t += procTime;      // Update current time by adding the task's processing time
            }
        }
        
        // Step 6: Return the final execution order
        return ans;
    }
};

Single Threaded CPU solution in Java: Optimal
class Solution {
    public int[] getOrder(int[][] tasks) {
        int n = tasks.length;
        // Step 1: Create a new array to include the original index with each task.
        // Each task is represented as [enqueueTime, processingTime, index]
        int[][] newTasks = new int[n][3];
        for (int i = 0; i < n; i++) {
            newTasks[i][0] = tasks[i][0];  // enqueue time
            newTasks[i][1] = tasks[i][1];  // processing time
            newTasks[i][2] = i;           // original index
        }
        
        // Step 2: Sort the tasks by their enqueue time.
        Arrays.sort(newTasks, (a, b) -> Integer.compare(a[0], b[0]));
        
        // Step 3: Initialize a (min-heap) that sorts by processing time.
        // If two tasks have the same processing time, sort by their original index.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> {
            if (a[1] == b[1]) {
                return Integer.compare(a[2], b[2]);
            }
            return Integer.compare(a[1], b[1]);
        });
        
        // List to store the order in which tasks are executed.
        List<Integer> result = new ArrayList<>();
        long currentTime = newTasks[0][0];  // Start at the enqueue time of the first task
        int i = 0;  // Pointer to traverse through the sorted tasks
        
        // Step 4: Process tasks until all tasks have been executed.
        while (i < n || !pq.isEmpty()) {
            // Add all tasks that have arrived by the current time into the heap.
            while (i < n && newTasks[i][0] <= currentTime) {
                pq.offer(newTasks[i]);
                i++;
            }
            
            // Step 5: If no tasks are available in the queue, jump current time to the next task's enqueue time.
            if (pq.isEmpty()) {
                currentTime = newTasks[i][0];
            } else {
                // Process the next available task with the smallest processing time.
                int[] task = pq.poll();
                result.add(task[2]);         // Append the task's original index to the result
                currentTime += task[1];      // Update current time by adding the processing time
            }
        }
        
        // Step 6: Convert the result list to an array and return it.
        int[] order = new int[result.size()];
        for (int j = 0; j < result.size(); j++) {
            order[j] = result.get(j);
        }
        return order;
    }
}

Single Threaded CPU solution in Python: Optimal
import heapq
from typing import List

class Solution:
    def getOrder(self, tasks: List[List[int]]) -> List[int]:
        # Step 1: Append each task's original index so we can identify it later.
        # Each task becomes a tuple: (enqueueTime, processingTime, index)
        tasks_with_index = [(task[0], task[1], i) for i, task in enumerate(tasks)]
        
        # Step 2: Sort tasks by their enqueue time.
        tasks_with_index.sort(key=lambda x: x[0])
        
        # Step 3: Initialize a min-heap to select tasks by processing time.
        # The heap stores tuples: (processingTime, index)
        heap = []
        result = []  # This will store the final order of task indices.
        time = tasks_with_index[0][0]  # Start at the enqueue time of the first task.
        i, n = 0, len(tasks)
        
        # Step 4: Process tasks until all tasks have been handled.
        while i < n or heap:
            # Add all tasks that are available by the current time into the heap.
            while i < n and tasks_with_index[i][0] <= time:
                enqueue, processing, index = tasks_with_index[i]
                heapq.heappush(heap, (processing, index))
                i += 1
                
            # Step 5: If no tasks are available, jump the time to the next task's enqueue time.
            if not heap:
                time = tasks_with_index[i][0]
            else:
                # Pop the task with the smallest processing time (and smallest index if tied).
                processing, index = heapq.heappop(heap)
                result.append(index)
                time += processing  # Update the current time by adding the processing time.
                
        # Step 6: Return the final order of task indices.
        return result
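As a quick sanity check, a compact, self-contained restatement of the same algorithm (with an illustrative function name, `get_order`) reproduces both examples from the problem statement:

```python
import heapq
from typing import List

def get_order(tasks: List[List[int]]) -> List[int]:
    # Compact restatement of the solution above, for a quick check.
    order = sorted((e, p, i) for i, (e, p) in enumerate(tasks))
    heap, ans = [], []
    i, t, n = 0, order[0][0], len(order)
    while i < n or heap:
        while i < n and order[i][0] <= t:
            heapq.heappush(heap, (order[i][1], order[i][2]))
            i += 1
        if not heap:
            t = order[i][0]      # idle: jump to the next arrival
        else:
            p, idx = heapq.heappop(heap)
            ans.append(idx)
            t += p
    return ans

print(get_order([[1, 2], [2, 4], [3, 2], [4, 1]]))            # [0, 2, 3, 1]
print(get_order([[7, 10], [7, 12], [7, 5], [7, 4], [7, 2]]))  # [4, 3, 2, 0, 1]
```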

Single Threaded CPU solution in Javascript: Optimal
/**
 * @param {number[][]} tasks
 * @return {number[]}
 */
var getOrder = function(tasks) {
    const n = tasks.length;
    // Step 1: Append each task's original index to preserve its identity.
    // Each task becomes [enqueueTime, processingTime, index].
    const tasksWithIndex = tasks.map((task, index) => [task[0], task[1], index]);
    
    // Step 2: Sort the tasks by their enqueue time.
    tasksWithIndex.sort((a, b) => a[0] - b[0]);
    
    // Step 3: Define a MinHeap class to use as a heap.
    // The heap will store pairs [processingTime, index] and compare by processingTime first, then index.
    class MinHeap {
        constructor() {
            this.heap = [];
        }
        
        size() {
            return this.heap.length;
        }
        
        push(val) {
            this.heap.push(val);
            this._siftUp();
        }
        
        pop() {
            if (this.size() === 0) return null;
            this._swap(0, this.size() - 1);
            const popped = this.heap.pop();
            this._siftDown();
            return popped;
        }
        
        _siftUp() {
            let index = this.size() - 1;
            while (index > 0) {
                const parent = Math.floor((index - 1) / 2);
                if (this._compare(this.heap[index], this.heap[parent]) < 0) {
                    this._swap(index, parent);
                    index = parent;
                } else {
                    break;
                }
            }
        }
        
        _siftDown() {
            let index = 0;
            const length = this.size();
            while (true) {
                let left = 2 * index + 1;
                let right = 2 * index + 2;
                let smallest = index;
                
                if (left < length && this._compare(this.heap[left], this.heap[smallest]) < 0) {
                    smallest = left;
                }
                if (right < length && this._compare(this.heap[right], this.heap[smallest]) < 0) {
                    smallest = right;
                }
                if (smallest !== index) {
                    this._swap(index, smallest);
                    index = smallest;
                } else {
                    break;
                }
            }
        }
        
        _compare(a, b) {
            // Compare by processing time; if equal, compare by original index.
            if (a[0] !== b[0]) return a[0] - b[0];
            return a[1] - b[1];
        }
        
        _swap(i, j) {
            [this.heap[i], this.heap[j]] = [this.heap[j], this.heap[i]];
        }
    }
    
    // Step 4: Initialize the min-heap, result array, and time pointer.
    const heap = new MinHeap();
    const result = [];
    let time = tasksWithIndex[0][0]; // Start at the first task's enqueue time.
    let i = 0;
    
    // Step 5: Process tasks until all tasks have been handled.
    while (i < n || heap.size() > 0) {
        // Add all tasks available by current time into the heap.
        while (i < n && tasksWithIndex[i][0] <= time) {
            const [enqueue, processing, index] = tasksWithIndex[i];
            // Push [processingTime, originalIndex] into the heap.
            heap.push([processing, index]);
            i++;
        }
        
        // If no tasks are available, jump time to the next task's enqueue time.
        if (heap.size() === 0) {
            time = tasksWithIndex[i][0];
        } else {
            // Process the next task: the one with the smallest processing time.
            const [processing, index] = heap.pop();
            result.push(index);
            time += processing; // Update current time.
        }
    }
    
    // Step 6: Return the final order of task indices.
    return result;
};

Single Threaded CPU complexity analysis: Time Complexity O(n log n).

The overall time complexity is O(n log n).

  • Indexing tasks takes O(n).
  • Sorting the tasks takes O(n log n).
  • Each task is pushed and popped from the heap once, with each heap operation costing O(log n), resulting in O(n log n) overall.

Single Threaded CPU complexity analysis: Space Complexity O(n).

Auxiliary Space Complexity: O(n)
The extra space used by the algorithm (beyond the input) includes:

  • An array (or list) to store the tasks with their original indices, which requires O(n) space.
  • A (min-heap) that, in the worst case, can hold all n tasks, resulting in O(n) space.

Thus, the auxiliary space complexity is O(n).

Overall Space Complexity: O(n)
When including the input, the overall space used is the sum of the input space and the auxiliary space. Since the input itself is O(n) and the extra space is O(n), the overall space complexity remains O(n).

Learning Tip

Now that we have successfully tackled this problem, let's try these similar problems.

Similar Problems

In "Parallel Courses III," you're given several courses, each with a specified duration, along with a list of prerequisite relationships. The goal is to determine the minimum time required to complete all courses when you can take multiple courses concurrently as long as their prerequisites are satisfied. The problem essentially requires you to compute the longest time path in a directed acyclic graph representing the courses and their dependencies.

A second related problem asks you to determine the minimum total time a computer needs to be turned on to complete a set of tasks. Each task has a specific duration it must run (which can be non-continuous) and must be executed within a given time window [start, end]. Since the computer can run an unlimited number of tasks concurrently, the challenge is to schedule the tasks optimally, ensuring the computer is only on when necessary to fulfill all required durations.