Count the strings that are subsequence of the given string

Improve Article

Save Article

Like Article

Improve Article

Save Article

Given a string S and an array arr[] of words, the task is to return the number of words from the array which is a subsequence of S.

Examples:

Input: S = “programming”, arr[] = {“prom”, “amin”, “proj”}
Output: 2
Explanation: “prom” and “amin” are subsequence of S while “proj” is not)

Input: S = “geeksforgeeks”, arr[] = {“gfg”, “geek”, “geekofgeeks”, “for”}
Output: 3 
Explanation:” gfg”, “geek” and “for” are subsequences of S while “geekofgeeks” is not.

Naive Approach:  The basic way to solve the problem is as follows:

The idea is to check all strings in the words array arr[] which are subsequences of S by recursion.

Below is the implementation of the above approach: 

C++

// C++ code for the above approach:

#include <bits/stdc++.h>
using namespace std;

bool isSubsequence(string& str1, int m, string& str2, int n)
{
    if (m == 0)
        return true;
    if (n == 0)
        return false;

    // If last characters of two strings
    // are matching
    if (str1[m - 1] == str2[n - 1])
        return isSubsequence(str1, m - 1, str2, n - 1);

    // If last characters are not matching
    return isSubsequence(str1, m, str2, n - 1);
}

// Function to count number of words that
// are subsequence of given string S
int countSubsequenceWords(string s, vector<string>& arr)
{

    int n = arr.size();
    int m = s.length();
    int res = 0;
    for (int i = 0; i < n; i++) {
        if (isSubsequence(arr[i], arr[i].size(), s, m)) {
            res++;
        }
    }
    return res;
}

// Driver Code
int main()
{
    string S = "geeksforgeeks";
    vector<string> arr
        = { "geek", "for", "geekofgeeks", "gfg" };

    // Funtion call
    cout << countSubsequenceWords(S, arr) << "\n";
    return 0;
}

Time Complexity: O(m*n) 
Auxiliary Space: O(m) for recursion stack space

Efficient Approach: The above approach can be optimized based on the following idea:

  • Map the index of characters of the given string to the respective characters array.
  • Initialize the ans with the size of arr.
  • Iterate over all the words in arr one by one.
  • Iterate over each character.
  • Find strictly greater index than prevIndex in dict.
  • If the strictly greater element is not found, it means the current word is not a subsequence of the given string, so decrease res by 1.
  • Else update prevIndex.
  • After iterating over all the words, return ans.

Below is the implementation of the above approach:

C++

// C++ code for the above approach:

#include <bits/stdc++.h>
using namespace std;

// Function to count number of words that
// are subsequence of given string S
int countSubsequenceWords(string s, vector<string>& arr)
{
    unordered_map<char, vector<int> > dict;

    // Mapping index of characters of given
    // string to respective characters
    for (int i = 0; i < s.length(); i++) {
        dict[s[i]].push_back(i);
    }

    // Initializing res with size of arr
    int res = arr.size();

    for (auto word : arr) {

        // Index where last character
        // is found
        int prevIndex = -1;
        for (int j = 0; j < word.size(); j++) {

            // Searching for strictly
            // greater element than prev
            // using binary search
            auto x = upper_bound(dict[word[j]].begin(),
                                 dict[word[j]].end(),
                                 prevIndex);

            // If strictly greater index
            // not found, the word cannot
            // be subsequence of string s
            if (x == dict[word[j]].end()) {
                res--;
                break;
            }

            // Else, update the prevIndex
            else {
                prevIndex = *x;
            }
        }
    }
    return res;
}

// Driver Code
int main()
{
    string S = "geeksforgeeks";
    vector<string> arr
        = { "geek", "for", "geekofgeeks", "gfg" };

    // Function call
    cout << countSubsequenceWords(S, arr) << "\n";

    return 0;
}

Time Complexity: O( m * s * log(n) ), where m is the length of the given string, s is the max length of the word of arr and n is the length of arr
Auxiliary Space: O(n)

Like this post? Please share to your friends:
Leave a Reply

;-) :| :x :twisted: :smile: :shock: :sad: :roll: :razz: :oops: :o :mrgreen: :lol: :idea: :grin: :evil: :cry: :cool: :arrow: :???: :?: :!: