Solving a Leetcode problem daily — Day 6 | Longest Common Prefix

Solving a Leetcode problem daily — Day 6 | Longest Common Prefix

Conquering LeetCode: Finding the Longest Common Prefix

This blog post tackles a classic LeetCode challenge: given an array of strings, write a function that finds the longest string which is a prefix (beginning) of all the strings in the array. We’ll delve into the problem statement, code implementation, step-by-step explanation, real-world applications, and conclude with some key takeaways.

Here is the Leetcode problem link —

https://leetcode.com/explore/learn/card/array-and-string/203/introduction-to-string/1162/

Problem Statement

LeetCode presents a challenge where you’re given an array of strings (strs). The task is to write a function that returns the longest common prefix shared by all strings in the array.

Example:

Input: strs = ["flower", "flow", "flight"]
Output: "fl"
Explanation:: Here, "fl" is the longest prefix that appears at the beginning 
of all three strings.
Input: strs = ["dog","racecar","car"]
Output: ""
Explanation: There is no common prefix among the input strings.

Constraints:

  • The number of strings (strs.length) can range from 1 to 200.

  • The length of each individual string (strs[i].length) can range from 0 to 200.

  • All characters in the strings are lowercase English letters.

Code Implementation

Here’s the C++ code that solves this challenge:

class Solution {
public:
  string longestCommonPrefix(vector<string>& strs) {
      if (strs.size() == 1) {
        return strs[0]; 
      }

      string pref = "";
      int currIdx = 0;

      while (1) {
        char ch;
        for (int i = 0; i < strs.size(); i++) {
          string s = strs[i];

          if (currIdx >= s.length()) { 
            return pref;
          }

          if (i == 0) { 
            ch = s[currIdx];
          }

          if (s[currIdx] != ch) { 
            return pref;
          }
        }
        pref += strs[0][currIdx++]; 
      }
      return pref;
  }
};

Breakdown of the Code

This code defines a class Solution with a function longestCommonPrefix that takes the array of strings (strs) as input and returns the longest common prefix.

  1. Edge Case: The code first checks if there’s only one string (strs.size() == 1) in the input array. In that case, the single string itself is the longest common prefix, so it's returning that(return strs[0]).

  2. Initialization:

  • pref (empty string) stores the growing common prefix.

  • currIdx (integer) keeps track of the current index (position) within the strings being compared.

3. Looping and Comparison:

  • An infinite while(1) loop continues until the common prefix is found or there's a mismatch.

Inside the loop:

  • A character variable ch is declared to hold the character to be compared across all strings.

  • A for loop iterates through each string (strs[i]) in the array.

  • If currIdx reaches the end of the current string (currIdx >= s.length()), it means there's no more prefix to compare, so the current pref is returned.

  • In the first iteration (i == 0), the character at currIdx from the first string (strs[0]) is assigned to ch as the reference point for comparison.

  • If the character at currIdx in the current string (s[currIdx]) doesn't match the reference character ch, it indicates a mismatch, and the loop terminates, returning the current pref.

4. Building the Prefix:

If all characters at currIdx across all strings match, the character from strs[0] at currIdx is appended to the pref string and simultaneously the currIdx is also incremented to move to the next character position for the next iteration. (pref += strs[0][currIdx++])

5. Return Value:

The loop eventually breaks when a mismatch is found between the strings character at currIdx or we reached the end of any of the string. The final pref string containing the longest common prefix is returned.

Lets verify the code with a dry run of a testcase in the next section

Code Dry Run with a Test Case

Let’s use the provided example:

Input: strs = ["flower", "flow", "flight"]

Step 1: Edge Case Check (Skipped)

There are more than one string, so the code proceeds to the main logic.

Step 2: Initialization

  • pref = "" (empty string)

  • currIdx = 0

Step 3: Loop Iteration 1

Character Comparison:

  • ch is assigned the first character from the first string (ch = 'f')

  • All three strings have ‘f’ at index 0, so the loop continues.

Building the Prefix:

  • ‘f’ is appended to pref: pref = "f"

  • currIdx is incremented to 1 (currIdx = 1)

Step 4: Loop Iteration 2

Character Comparison:

  • All three strings have ‘l’ at index 1, so the loop continues.

Building the Prefix:

  • ‘l’ is appended to pref: pref = "fl"

  • currIdx is incremented to 2 (currIdx = 2)

Step 5: Loop Termination (Mismatch)

Character Comparison:

  • The third string (“flight”) has ‘i’ at index 2, while the first two strings have ‘o’. There’s a mismatch.

The loop terminates at this point.

Result:

The final value of pref is "fl", which is the longest common prefix for all three strings. The function returns "fl".

Real-World Applications

Finding the longest common prefix has various applications in real-world scenarios:

  • File Compression: In data compression techniques like Lempel-Ziv coding, the longest common prefix between repeated strings can be identified and replaced with a reference, reducing file size.

  • Autocompletion: Search engines and text editors often use the longest common prefix to suggest completions for partially typed words, improving user experience.

  • Natural Language Processing (NLP): NLP tasks like text classification or stemming (reducing words to their base forms) can utilize common prefixes to group related words or identify word stems.

Conclusion

This blog post explored the LeetCode challenge of finding the longest common prefix using a C++ code. We broke down the problem statement, presented the code with detailed step-by-step explanations, performed a dry run with a test case and discussed real-world applications.

References

  • File Compression (Lempel-Ziv Coding):https://ethw.org/Milestones:Lempel-Ziv_Data_Compression_Algorithm,_1977 by Wikipedia: This article provides an overview of Lempel-Ziv (LZ) coding, a family of data compression algorithms. It mentions that LZ77, a specific variant, uses the concept of finding repeating phrases and replacing them with references to a dictionary, effectively using longest common prefixes for compression.

  • Autocompletion:https://searchengineland.com/how-google-autocomplete-works-390257 by Google Research: This blog post discusses various techniques used in search query autocomplete. It mentions that a prefix tree data structure can be employed to efficiently identify prefixes of words stored in the system, enabling suggestions based on the common prefix typed by the user.

  • Natural Language Processing (NLP):https://en.wikipedia.org/wiki/Stemming by Wikipedia: This article describes stemming algorithms, a technique in NLP used to reduce words to their base forms. Some stemming algorithms, like the Porter Stemmer, rely on identifying common prefixes among words to achieve stemming.

  • The diagrams in the Dry run section are created using Canva.

  • Cover photo: by Chris Ried on Unsplash

If you liked this blog, do applaude by clicking on the clap hands icon and drop down any feedback you have in the comments.

You can follow me medium.com/@subhradeep_saha and check out my other posts as well where I explained the solutions of few other interesting Leetcode problems: