Skip to main content

Prefix Sum

Maximum Population Year

Problem Description

You are given a 2D integer array logs where each 
logs[i] = [birth i, death i] indicates the birth and death years of the ith person.
The population of some year x is the number of people alive during that year. The ith person is counted in year x's population if x is in the inclusive range [birth i, death i- 1]. Note that the person is not counted in the year that they die.
Return the earliest year with the maximum population.

Explanation

The task is to determine the earliest year with the maximum population based on a list of birth and death years.

  • Each person is represented as [birth, death], meaning they are alive from birth to death - 1 (inclusive).
  • The year death is not included in the population count.

WE need to find the year with the highest number of people alive. If multiple years have the same population, we have to return the earliest year.

Examples

Input: logs = [[1993,1999],[2000,2010]]
Output: 1993
Explanation: The maximum population is 1, and 1993 is the earliest year with this population.

Input: logs = [[1950,1961],[1960,1971],[1970,1981]]
Output: 1960
Explanation:
The maximum population is 2, and it had happened in years 1960 and 1970.
The earlier year between them is 1960.

Constraints

  • 1 <= logs.length <= 100
  • 1950 <= birth i < death i <= 2050

The solution for the problem statement can be thought of as checking the birth and death years of each individual in the logs array one by one and finalise the year which has the highest population. Let's see how we can perform this simulation.

Brute Force Approach

Intuition

The first intuition that came to our mind is , that imagine we maintain empty baskets numbered from 1950 till 2050 (as given in the constraints) with a 0 number of balls in each basket. When we check the birth and death of each individual, we will simply keep on adding 1 ball from the birth year till death year-1, representing the individual was alive between these years. After we check the details of all the individuals, we can just iteratively check which basket has the highest number of balls representing that that particular year has the highest population. If more than 1 basket has the same score we will choose the one which appeared earliest.

Approach

Step1: Initialize Array and Variables:

  • Create an integer array vis[] of size 101, initialized to 0 (to track population for each year from 1950 to 2050).
  • Initialize a variable max to 0 to store the maximum population value.

Step2: Iterate Through Logs: For each pair [birth, death] in the logs array:

  • Iterate through years from birth to death - 1
    • Increment vis[year] by 1 for each year the person is alive.
    • Update max as the maximum of max and vis[year].

Step3: Find the Earliest Year with Maximum Population: Initialize a variable year to 1950(as the answer will lie above 1950).

  • Iterate through vis[] from index 1950 to 2050:
    • If vis[i] == max: Set year = i and break out of the loop.

Step 4: Return the Result: Return year as the result.

Dry-Run

logs[][] = [[2000, 2010], [2005, 2015], [2010, 2020]]
Output= 2005
Explanation: The earliest year with max = 2 is 2005.

Step 1: Initialize Variables

  • vis[]: Array of size 2051 initialized with 0.
  • max: Initialized to 0.
  • year: Initialized to 1950.

Step 2: First Loop - Process Each Log
First Entry: [2000, 2010]

  • birth = 2000, death = 2010.
  • Update vis[] for years 2000 to 2009 (inclusive):
    • vis[2000]++ → vis[2000] = 1
    • vis[2001]++ → vis[2001] = 1
    • ...
    • vis[2009]++ → vis[2009] = 1
  • Update max during this process:
    • For all updated years, max = 1.

Second Entry: [2005, 2015]

  • birth = 2005, death = 2015.
  • Update vis[] for years 2005 to 2014 (inclusive):
    • vis[2005]++ → vis[2005] = 2
    • vis[2006]++ → vis[2006] = 2
    • ...
    • vis[2014]++ → vis[2014] = 1
  • Update max during this process:
    • For years 2005 to 2009, max = 2.

Third Entry: [2010, 2020]

  • birth = 2010, death = 2020.
  • Update vis[] for years 2010 to 2019 (inclusive):
    • vis[2010]++ → vis[2010] = 2
    • vis[2011]++ → vis[2011] = 2
    • ...
    • vis[2019]++ → vis[2019] = 1
  • Update max during this process:
    • max remains 2.

Step 3: Second Loop - Find Earliest Year with max

  • Traverse vis[] from 1950 to 2050:
    • Check each year:
      • At 2005: vis[2005] == max (2) → Set year = 2005 and break.

Final Output: Return year = 2005.

Brute Force in all Languages
1. C++ Try on Compiler   
class Solution {
public:
    int maximumPopulation(vector<vector<int>>& logs) {
        // Create an array to track the population for years 1950 to 2050
        int vis[101] = {0}; // Array size is 101 for years 1950 to 2050

        // Initialize the variable to track the maximum population
        int max_population = 0;

        // Loop through each log entry
        for (auto& q : logs) {
            // Extract the birth and death years
            int birth = q[0];
            int death = q[1];

            // Increment the population for each year the person is alive
            for (int j = birth; j < death; j++) {
                vis[j - 1950]++; // Map year to index
                // Update the maximum population
                max_population = max(max_population, vis[j - 1950]);
            }
        }

        // Initialize the variable to store the earliest year with max population
        int year = 1950;

        // Find the earliest year with the maximum population
        for (int i = 1950; i <= 2050; i++) {
            if (vis[i - 1950] == max_population) {
                year = i;
                break;
            }
        }

        // Return the result
        return year;
    }
};

2. Java Try on Compiler   
class Solution {
    public int maximumPopulation(int[][] logs) {
        // Create an array to track the population for years 1950 to 2050 (101 years)
        int[] vis = new int[101]; // Array size is 101 for years 1950 to 2050

        // Initialize the variable to track the maximum population
        int max = 0;

        // Loop through each log entry
        for (int[] q : logs) {
            // Extract the birth and death years
            int birth = q[0];
            int death = q[1];

            // Increment the population for each year the person is alive
            for (int j = birth; j < death; j++) {
                vis[j - 1950]++; // Map year to index (1950 -> 0, 1951 -> 1, etc.)
                // Update the maximum population
                max = Math.max(max, vis[j - 1950]);
            }
        }

        // Initialize the variable to store the earliest year with max population
        int year = 1950;

        // Find the earliest year with the maximum population
        for (int i = 1950; i <= 2050; i++) {
            if (vis[i - 1950] == max) {
                year = i;
                break;
            }
        }

        // Return the result
        return year;
    }
}

3. Python Try on Compiler   
class Solution:
    def maximumPopulation(self, logs: List[List[int]]) -> int:
        # Create a list to track the population for years 1950 to 2050 (101 years)
        vis = [0] * 101

        # Initialize the variable to track the maximum population
        max_population = 0

        # Loop through each log entry
        for log in logs:
            # Extract the birth and death years
            birth, death = log

            # Increment the population for each year the person is alive
            for year in range(birth, death):
                vis[year - 1950] += 1  # Adjust index to start from 0 for 1950
                # Update the maximum population
                max_population = max(max_population, vis[year - 1950])

        # Initialize the variable to store the earliest year with max population
        year_with_max_population = 1950

        # Find the earliest year with the maximum population
        for year in range(1950, 2051):
            if vis[year - 1950] == max_population:
                year_with_max_population = year
                break

        # Return the result
        return year_with_max_population

4. JavaScript Try on Compiler   
var maximumPopulation = function(logs) {
    // Create an array to track the population for years 1950 to 2050
        let vis = new Array(101).fill(0); // Array size is 101 for years 1950 to 2050

        // Increment the population for each year the person is alive
        for (let log of logs) {
            let birth = log[0];
            let death = log[1];
            for (let year = birth; year < death; year++) {
                vis[year - 1950]++; // Map year to index (1950 -> 0, 1951 -> 1, etc.)
            }
        }

        // Find the earliest year with the maximum population
        let maxPopulation = 0;
        let year = 1950;

        for (let i = 0; i < 101; i++) {
            if (vis[i] > maxPopulation) {
                maxPopulation = vis[i];
                year = 1950 + i; // Map index back to year
            }
        }

        return year;
};

Time Complexity: O(n^2)

Processing Logs: For each log, we loop from birth to death - 1. This results in O(s), where s is the total years spanned by all logs. In the worst case, S=N×100, so the complexity is O(n).

Finding Maximum Year: Iterates through vis[], taking O(2051), effectively O(1).
Overall Time Complexity: O(n).

Space Complexity: O(n)

Auxiliary Space Complexity: refers to the additional or extra space used by an algorithm, excluding the space required to store the input data.

  • The vis[] array requires O(n) space. No other significant data structures are used.

So, the overall Auxiliary Space Complexity: O(n).

Total Space Complexity:
To input elements to the logs array takes O(n) space.
Auxiliary space complexity: O(n)

The Overall Total Space Complexity is O(n) + O(n) = O(2n) = O(n)


The brute force solution checks each year for every person’s lifespan. This is fine for small inputs but becomes inefficient when the input size grows. But don't you think we are performing the same operation i.e. visiting each year and increasing the count over and over again, which may lead to a bigger runtime? We have to figure out what else can we do !!

Optimal Approach

Intuition

In the Brute Force approach, we were iterating over each query and updating the vis[ ] array which seems to be in-efficient.
Instead of counting the number of people alive for every year, can we track how the population changes at specific points ?

When someone is born population increases by 1.
When someone dies (death-1) population decreases after this year.

So instead of processing every year for every person’s lifespan, we can simply focus on these change points (birth and death). This will reduce our unnecessary computations.

If we track how the population changes year by year, we can calculate the population of any year by summing all the changes that occurred before and during that particular year. This means the population of a year is a cumulative sum of changes.

For example: If 5 people are born in 2000, and 2 die in 2001, the cumulative effect by 2001 is:
Population in 2000 = +5.
Population in 2001 = +5 (carryover) - 2 = +3.

Did we just observe that the population of the year 2001 was dependent upon the previous year i.e. 2000? What intuition are we getting from this observation?
Exactly !! Its prefix sum approach since we carry forward the previous information.

So, we will maintain an array vis[] of size 100 and initialize it with 0.
After this, are going to do is we will iterate through the logs[][] information and extract the birth and the death year for each individual we will mark +1 to the birth year and -1 to the death year.

For example , birth year is 2005 and the death year is 2040.
We will mark
vis[2005] = +1 and vis[2040] = -1.
This means that 1 individual was alive from 2005 till 2039 and died on the year 2040. This means that the population was constant i.e. 1 from 2005 to 2039.

Now, imagine we filled the vis array with +1 and -1 for each individual in the nums array. Is there a way to tell which year has the most population?
Hence, we will maintain a running sum from the start of the vis[] array till the end of vis[] array. Whenever , we encounter a +1, this means a individual was born and will add it to the running sum. When we encounter a -1 we will subtract from the running sum.

For example:
Person A has birth year = 1990 and death year = 2020
Person B has birth year = 2005 and death year= 2050
Person C has birth year = 1980 and death year = 2030
Using this info we will mark the vis[] array i.e +1 in vis[birth Year] and -1 in vis[death year] for the three persons.
Once, we are done marking, we will iterate from 1950 (as given in constraints) till 2050.
First we will come across 1980 with value +1 and will add it to runningSum= 0+1 = 1
Next, we will come across 1990 with value +1 and will add it to runningSum= 1+1=2 .
What does this mean?? This meant that in the year 1990 , 2 people were alive.
next, we will come across 2005 and add 1 to running sum and runningSum=3.

Now , when we encounter 2019, we will encounter a -1 which signifies that on the year 2019 , a individual died which resulted in a decrease in the population. So we subtract -1 from runningSum. Likewise, we will follow the algorithm and update the maxYear.

We maintained a runningSum which carry forwarded the previous information to the current ones..

So, the prefix sum will help us calculate the population at each year by simply adding the previous year’s population to the change for the current year.
Prefix sum is like "sweeping through" the timeline, step by step, to calculate population for each year based on previous years.

And since we sweep through the timeline year by year, updating the running totals as we encounter changes at each year and keep track of the population at each point, we call this as the Line Sweep Algorithm.

Adding all together.
Prefix Sum is how we calculate the population for each year during the sweep. Line Sweep is the framework: moving through the timeline and processing changes incrementally.

In Line sweep, we initialize the start interval i.e. birth as 1 and the end interval +1 i.e. death year by -1. We will do the same thing exactly by iterating the logs[][].

What is Line Sweep Algorithm ?

The Line Sweep Algorithm is a computational technique used to process intervals or events that occur over a continuous timeline or 2D space. It involves "sweeping" a vertical or horizontal line across the timeline or space, processing events in a specific order (usually sorted by time or position).

How it works ?

Identify Key Events: Each interval contributes two key events: a start (when the interval begins) and an end (when the interval ends). These events are marked with specific values (e.g., +1 for a start and -1 for an end).

Sweep Through Events
A line sweeps through the events, processing them sequentially. As the sweep progresses, a running total is updated based on the events.

During the sweep, you can calculate or track metrics such as the maximum overlap, total coverage, or active intervals.

Example Dry-Run:
You are given a list of intervals [(1,3),(2,5),(4,6)]. Count the maximum number of overlapping intervals at any point.

Dry Run:
Define Events:For each interval (start,end), create two events:

      • (start,+1): An interval starts.
      • (end,−1): An interval ends.

Events: [(1,+1),(3,−1),(2,+1),(5,−1),(4,+1),(6,−1)]

Sort Events:

    • Sort by time. If times are equal, process ending events first (to handle overlapping properly).
    • Sorted Events: [(1,+1),(2,+1),(3,−1),(4,+1),(5,−1),(6,−1)]

Process Events:

    • Use a counter to track active intervals.
    • Update the counter at each event and track the maximum.

Dry Run:

    • Start with active = 0, max_overlap = 0.
    • Process events:

1. Event (1,+1): active = 1, max_overlap = 1.

2.Event (2,+1): active = 2, max_overlap = 2.

3.Event (3,−1): active = 1, max_overlap = 2.

4.Event (4,+1): active = 2, max_overlap = 2.

5.Event (5,−1): active = 1, max_overlap = 2.

6.Event (6,−1): active = 0, max_overlap = 2.

Result: The maximum overlap is 2.

Advantages of this algorithm

It is Efficient as it reduces the need for nested iterations or brute force, typically achieving a linear time complexity, where n is the number of events.
It works well in large data-sets.

After, we are done with the initialization, we compute the population for each year by adding the population change at the current year to the population of the previous year using the formula vis[i] = vis[i] + vis[i - 1].
The formula sweeps the previous population values all together and brings it forward.
While we calculate the prefix sum compare the current year’s population with the maximum population seen so far, if the current year’s population is greater, update the maximum population and record the year. If there’s a tie (same maximum population in multiple years), the earlier year will remain the result.

Approach

Step 1: Initialize an Array for Population Changes

  • Create an array vis to store population changes for all years within the given range (e.g., 1950 to 2050).
  • Initialize all elements of vis to 0.

Step 2: Mark Population Changes

  • For each log in the input data:
    • Increment the value at the birth year index by +1 to indicate a population increase.
    • Decrement the value at the death year index by -1 to indicate a population decrease.

Step 4: Calculate Population Using Prefix Sum

  • Start from the first year (e.g., 1950) and iteratively compute the running total population for each year.
    • For each year: Add the value of the previous year to the current year's value to calculate the population for the current year.

Step 5: Find the Year with Maximum Population

  • Keep track of the maximum population encountered so far and the corresponding year.
  • If a year has a higher population than the current maximum, update the maximum population and record the year.
  • If multiple years have the same maximum population, retain the earlier year as the result.

Return the Year with Maximum Population: After processing all years, return the year with the highest population.

Dry-Run

Input: logs = [[2000, 2010], [2005, 2015], [2010, 2020]]
Output: 2005
Explanation: The earliest year with max = 2 is 2005.

Step 1: Initialize Array for Population Changes: Create an array vis of size 2051 (for years 1950 to 2050), initialized to 0.

Step 2: Mark Population Changes: For each interval in logs:
[2000, 2010]:

  • Increment vis[2000] by 1 → vis[2000] = 1
  • Decrement vis[2010] by 1 → vis[2010] = -1

[2005, 2015]:

  • Increment vis[2005] by 1 → vis[2005] = 1
  • Decrement vis[2015] by 1 → vis[2015] = -1

[2010, 2020]:

  • Increment vis[2010] by 1 → vis[2010] = 0 (since vis[2010] was previously -1)
  • Decrement vis[2020] by 1 → vis[2020] = -1

Step 3: Compute Population Using Prefix Sum: Iterate from 1950 to 2050 and compute the population for each year:

  • vis[i] = vis[i] + vis[i - 1] (cumulative sum).

After processing:

  • vis[2000] = 1
  • vis[2001] to vis[2004] = 1 (carrying over from 2000)
  • vis[2005] = 2 (1 from previous + 1 from interval [2005, 2015])
  • vis[2006] to vis[2009] = 2 (carrying over from 2005)
  • vis[2010] = 2 (net change from interval [2010, 2020])
  • vis[2011] to vis[2014] = 2
  • vis[2015] = 1
  • vis[2016] to vis[2019] = 1
  • vis[2020] = 0

Step 4: Find the Year with Maximum Population

Initialize:

  • max_population = vis[1950]
  • max_year = 1950

Traverse vis from 1950 to 2050:

  • If vis[i] > max_population:
      • Update max_population to vis[i].
      • Update max_year to i.

        If vis[i] == max_population and i is earlier than the current max_year, keep the earlier year.

Results after processing:

    • max_population = 2 at years 2005, 2006, ..., 2014.
    • The earliest year with maximum population is 2005.

Final Output: Return 2005.

Optimal Code in all Languages
1. C++ Try on Compiler   
class Solution {
public:
    int maximumPopulation(vector<vector<int>>& logs) {
        
        // Create an array to track population changes for years 1950-2050
        int vis[101] = {0}; // Array size is 101 for years 1950 to 2050

        // Record the population changes for each log
        for (auto& log : logs) {
            // Increment population for the birth year
            vis[log[0] - 1950] += 1;

            // Decrement population for the death year
            vis[log[1] - 1950] -= 1;
        }

        // Initialize variables to track the maximum population and its corresponding year
        int maxNum = vis[0], maxYear = 1950;

        // Traverse the years and compute prefix sums to determine population
        for (int i = 1; i < 101; i++) {
            // Update the population of the current year using prefix sum
            vis[i] += vis[i - 1];

            // Check if the current year's population is greater than the max population
            if (vis[i] > maxNum) {
                // Update the maximum population
                maxNum = vis[i];

                // Update the year corresponding to the maximum population
                maxYear = 1950 + i; // Map index back to the year
            }
        }

        // Return the year with the maximum population
        return maxYear;
    }
};

2. Java Try on Compiler   
class Solution {
    public int maximumPopulation(int[][] logs) {
        // Create an array to track population changes for years 1950-2050
        int[] vis = new int[101];  // Size changed to 101 (from 1950 to 2050, inclusive)

        // Record the population changes for each log
        for (int[] log : logs) {
            // Increment population for the birth year
            vis[log[0] - 1950] += 1;
            
            // Decrement population for the death year
            vis[log[1] - 1950] -= 1;
        }
        
        // Initialize variables to track the maximum population and its corresponding year
        int maxNum = vis[0], maxYear = 1950;
        
        // Traverse the years and compute prefix sums to determine population
        for (int i = 1; i < vis.length; i++) {
            // Update the population of the current year using prefix sum
            vis[i] += vis[i - 1];
            
            // Check if the current year's population is greater than the max population
            if (vis[i] > maxNum) {
                // Update the maximum population
                maxNum = vis[i];
                
                // Update the year corresponding to the maximum population
                maxYear = 1950 + i;
            }
        }
        
        // Return the year with the maximum population
        return maxYear;
    }
}

3. Python Try on Compiler   
class Solution:
    def maximumPopulation(self, logs: List[List[int]]) -> int:
        
        # Create an array to track population changes for years 1950-2050 (size 101)
        vis = [0] * 101
        
        # Record the population changes for each log
        for log in logs:
            # Increment population for the birth year
            vis[log[0] - 1950] += 1
            
            # Decrement population for the death year
            vis[log[1] - 1950] -= 1
        
        # Initialize variables to track the maximum population and its corresponding year
        max_num = vis[0]
        max_year = 1950
        
        # Traverse the years and compute prefix sums to determine population
        for i in range(1, len(vis)):
            # Update the population of the current year using prefix sum
            vis[i] += vis[i - 1]
            
            # Check if the current year's population is greater than the max population
            if vis[i] > max_num:
                # Update the maximum population
                max_num = vis[i]
                
                # Update the year corresponding to the maximum population
                max_year = 1950 + i
        
        # Return the year with the maximum population
        return max_year

4. JavaScript Try on Compiler   
var maximumPopulation = function(logs) {
     // Create an array to track population changes for years 1950-2050 (size 101)
        let vis = new Array(101).fill(0);
        
        // Record the population changes for each log
        for (let log of logs) {
            // Increment population for the birth year
            vis[log[0] - 1950] += 1;
            
            // Decrement population for the death year
            vis[log[1] - 1950] -= 1;
        }
        
        // Initialize variables to track the maximum population and its corresponding year
        let maxNum = vis[0], maxYear = 1950;
        
        // Traverse the years and compute prefix sums to determine population
        for (let i = 1; i < vis.length; i++) {
            // Update the population of the current year using prefix sum
            vis[i] += vis[i - 1];
            
            // Check if the current year's population is greater than the max population
            if (vis[i] > maxNum) {
                // Update the maximum population
                maxNum = vis[i];
                
                // Update the year corresponding to the maximum population
                maxYear = 1950 + i;
            }
        }
        
        // Return the year with the maximum population
        return maxYear;
};

Time Complexity: O(n^2)

Mark Population Changes:

  • Iterating through the logs array and updating the vis array takes O(n) time, where n is the number of logs.
  • Each update operation is constant time (O(1)).

Compute Population Using Prefix Sum:

  • Iterating through the years (from 1950 to 2050, a fixed range) takes O(100) time.

Overall Time Complexity:

  • The total time complexity is O(n + 100).
  • Since 100 is a constant, the time complexity simplifies to O(n).

Space Complexity: O(n)

Auxiliary Space Complexity: refers to the additional or extra space used by an algorithm, excluding the space required to store the input data.

Population Array (vis): An array of size 2051 is used to store population changes and cumulative sums. This requires O(1) additional space since the size is constant regardless of input size.

Other Variables: A few integer variables are used (maxNum, maxYear, etc.), which contribute O(1) space.

Overall Auxiliary Space Complexity: The auxiliary space complexity is O(1).

 Total Space Complexity:
Input Size:The input array logs requires O(n) space.
Auxiliary Space: O(1)

Overall Total Space Complexity: Total space complexity = O(n) + O(1) = O(n).


Learning Tip

You are given a 0-indexed 2D integer array nums representing the coordinates of the cars parking on a number line. For any index i, nums[i] = [starti, endi] where starti is the starting point of the ith car and endi is the ending point of the ith car.

Return the number of integer points on the line that are covered with any part of a car.

There is a car with capacity empty seats. The vehicle only drives east (i.e., it cannot turn around and drive west).

You are given the integer capacity and an array trips where trips[i] = [numPassengersi, fromi, toi] indicates that the ith trip has numPassengersi passengers and the locations to pick them up and drop them off are fromi and toi respectively. The locations are given as the number of kilometers due east from the car's initial location.

Return true if it is possible to pick up and drop off all passengers for all the given trips, or false otherwise.

💡
Showcase your skills by joining LearnYard Technologies FZ-LLC as a Technical Content Writer. Apply now and inspire the next generation of learners—fill out the form: https://forms.gle/CGDsbGkcapSfvXKM8