Dynamic Programming Series #2: Longest Palindromic Substring

·

0 min read

As promised, in this chapter of this series, we shall tackle a problem together using dynamic programming. The problem we are solving is as follows:


Given a string s, find the longest palindromic substring in s. You may assume that the maximum length of s is 1000.

Example 1: Input: "babad" Output: "bab" Note: "aba" is also a valid answer.

Example 2: Input: "cbbd" Output: "bb"


Recalling to the first chapter of the series, do you think we should use a top-down or a bottom-up approach of DP to solve this problem? Take a moment to think of which approach you would use and why.

My Approach and Why

For me, whenever I see a problem which looks like I can solve using DP, I can't help but to first look at the brute force solution. The brute force solution is obviously very inefficient and it involves going through all the possible starting and ending positions for a substring, and then checking if it is a palindrome. If we look at Example 1's input: "babad", using zero-indexing this gives us (0, 0), (0, 1), (0, 2), ... , (0, n-1) as possible starting and ending position for the first letter, (1, 1), (1, 2), ... , (1, n-1) for the second letter and so on, where n is the length of the string input (so in this case n = 5). This means a total of n + (n-1) + ... + 1 = n(n+1)/2 different possible substrings for the string "babad", and because verifying each substring would take O(n) time, the total time complexity amounts to O(n^3). So what did I take away from this brute force analysis? We need to verify each and every substring possible. This means a bottom-up approach would work best.

Now that we have chosen which approach we will be using, let's establish some base cases.

Case 0 (n = 0): It is vacuously true that an empty string is a palindrome

Case 1 (n = 1): Since a letter by itself is a palindrome, all substrings of length 1 is a palindrome

Case 2 (n = 2): Substrings of length 2 can only be a palindrome if the first and last letters are the same

Case 3 (n = 3): Substrings of length 3 are palindromes if the first and last letters are the same, the middle letter doesn't matter because we know that a single letter by itself is a palindrome. Hmm... this seems familiar. It seems like we can determine case 3 using the results of case 1 and case 2.

Case 4 (n = 4): Substrings of length 4 are palindromes if the first and last letters are the same, and the two remaining letters make up a palindrome.

By now, you should begin to see some structure in what makes up a palindrome. It seems like for words to be palindromes, the first and last letter has to be the same and the string excluding the first and last letter must also be a palindrome.

Capture.PNG

It essentially has the following recursive definition:

def isPalindrome(string):
    if len(string) == 1 or len(string) == 0:
        return True  # string is a palindrome
    else if string[0] == string[len(string) - 1]:
        # string is a palindrome only if the string excluding the first and last letter is also a palindrome
        # Note: slicing here works such that the index len(string) - 1 is not included
        return isPalindrome(string[1: len(string) - 1])
    else:
        return False  # string is not a palindrome

Now, it's clear that we have essentially three base cases: case 0, case 1, and case 2. Since we are using the bottom-up (tabulation) approach, we have to now tabulate initial values of these base cases and use those values to compute our desired solution for our given input.

So how should we go about in doing this? How do we go about representing all the possible starting and ending positions for a substring and storing the solutions for it? What would be a good data structure for this? Well, it seems like a matrix (i.e., 2D array) would work just fine.

We can define i's to be the starting indices and the j's to be the ending indices of a substring such that the diagonal of the matrix must be True and anything below this diagonal must be False because we are not required to consider substrings in reverse order. Notice that our matrix will have a size of n * n, where n is the length of our input string.

Capture1.PNG

Implementation in Python

So now that we know what to do, let's move this idea to Python. First, we have a main function that takes in an input string and let's initialize the a 2D array to have all False values like so:

def main(string):
    # A matrix of booleans, storing the results of whether the substring at index i to j is a palindrome
    palindrome_table = [[False for i in range(len(string))] for j in range(len(string))]

Now, we want to set the diagonal to be filled with all True values.

for k in range(len(string)):
    palindrome_table[k][k] = True

At this point, we have computed our values for our first and second base case: case 0 and case 1! Now, for the third base case, we need to check for each substrings of length 2 if they are a palindrome and store their results.

for k in range(len(string) - 1):
    if string[k] == string[k+1]:
        palindrome_table[k][k+1] = True

Looking good. Now, we'll just have to continue building the matrix while working our way up from the base case.

# For substring of lengths k, where k > 2
    for k in range(2, len(string)):
        for i in range(len(string) - k):
            j = i + k
            if palindrome_table[i + 1][j - 1] and string[i] == string[j]:
                palindrome_table[i][j] = True

Ok, so we've completed our matrix of values, now what? How do we find the longest palindromic substring? Well, it's simply a matter of tracking the starting and ending index of our longest substring by tweaking our existing code.

Capture2.PNG

And whew, we're done! That was definitely challenging but we did it! In the next post of the series, I'll continue with solving more problems with dynamic programming. Don't worry if you didn't get this one. I found this one to be quite difficult too. We'll do a simpler one next time.

P.S. For those of you who are wondering, this problem was taken from LeetCode here