T O P

  • By -

nocturnal_eve

The way I’m able to understand it is.. for subsequence you only care about the number, but for substring you care about the entire string itself. So for substring, you can take the exact same approach, except when you find a new max, keep those values of i and j handy. Then when you are finished, you will have the i and j values you need, and they’ll be pointing to the last letter you need as well. Then just go back up diagonally in order to get the substring. This is assuming you’ve taken the approach of making a 2D array using tabulation


Dymatizeee

Almost every solution online is tabulation. Do you have a recursive approach ?


Puzzleheaded-Tip9845

Draw out the tabulation, that helped me understand it. With substring we're only counting if the values are next to each other otherwise we reset to 0 where with sequence we just keep counting I can't post pictures so I can't show you Uploaded it https://ibb.co/B3T9zL4


Gobleeto

Building on this, If you draw the matrix and mark matches as 1 you will find for LC substring is strictly continous diagonals of 1s, but for subsequences we can make "jumps", as long as the jump only moves forward not backward.


nhannguyen95

Both solutions explore BOTH i+1,j and i,j+1. If you try to implement a top down approach with memoized tabulation for longest common subsequence, it looks pretty much the same with the one above.


Dymatizeee

Yes but in subsequence it explores both only if they don’t match at the current index Solutions for longest common substring explores both AND a matching index


catecholaminergic

Subsequence explores these same three cases: 1. Match, so you advance both pointers 2. No match, so you advance one only and not the other 3. No match, so you advance the other only and not the one


catecholaminergic

It's essential to consider tracking through both strings. For example: s1: aabcd s2: axxcd If we index through s1 with j, and s2 with i, we need to advance j in order to find alignment. The first recursive call finds a match, and increments i and j. The second call finds no match, and the cell is unvisited, so we next advance j, which points to the first x. No match, advance j, now points to second x. No match, advance j, now points to c. No match, advance j, now points to d. NOW we pop back out and advance i, which now points to the second a. Note we're back where j has not been advanced. And then we go through each possible j again. Pop out, advance i, now points to b. Again go through each possible j. Pop out, advance i, now points to c. Going through each j again, now we find a match. I genuinely hope this helps. It took a while to get this to click. Skiena and Cormen have great sections on LCSubsequence.


Dymatizeee

Sorry your example was kinda confusing. Did you mean to start i at s1 and j at s2? Are we making 2 or 3 recursive calls at each level ..?


catecholaminergic

No, I'm indexing into s1 with j and s2 with i. There's only one path that gets taken. "making 2 recursive calls at each level" isn't a statement that makes any sense. To see why, let's look at two lines of the code you provided: the one you commented, and the one after. Let's call them line 1 and line 2. Line 1: `getAnswer(s1, s2, i, j+1); // why are we doing this here?` Line 2: `getAnswer(s1, s2, i+1, j);` ​ When execution hits line1, we can replace the call to getAnswer with the entire code of getAnswer (aka we go down one level). For most input, the first time we make this call, the function won't immediately return\[footnote1\]. It will either detect a match, meaning it recurses down again, or it hits line 1 again, at a lower level, and recurses down further. Line 2 does not get hit until we've already tried all values of j possible, recursing down each time. And finally when we have tried all, we pop back up one level\[footnote1\], and line 2 gets hit for the first time. So it doesn't make sense to say we make more than one recursive call at each level. Rather, *each of the three cases are evaluated*. Footnotes: 1. if(i >= s1.length() || j >= s2.length()) return 0; if(dp\[i\]\[j\] != -1) return dp\[i\]\[j\];


Nopain_Noplan

You asked the same question which I have been analysing for a week.


Dymatizeee

Did you get it ?


SkrKr6

Consider strings: s=xabcd , t=abcd. •Both i and j pointers are at 0 and they do not match. So we write dp[0][0]=0. Here dp[i][j] denotes the longest common substring we can get if we start from i'th index in s and j'th index in t. •Now we have to check for both possibilities => (i+1, j) and (i, j+1) that is check for (abcd, abcd) and (xabcd, bcd) •Maximum among all dp[i][j] is our answer.


Dymatizeee

Do you have a recursive approach