Status
Not open for further replies.
1 - 2 of 2 Posts

#### Sphinx

·
##### Registered
Joined
·
606 Posts
Discussion Starter · ·
sorry guys i posted this in web development by accident... heres my problem:

Ok, I have a vector "v2" containing a bunch of strings.

I have another vector "v1" that contains keywords. I go through this vector v1's values using a simple for loop. On each iteration, I want to perform a binary search through vector "v2" to see how many times that word exists in v2.

My problem is doesnt binary search only return the first position it finds the word in? However there are going to be many instances of each word from "v1" in "v2", and I need to count how many times each word in "v1" occurs in "v2".

I have done this succesfully using nested for loop (put another for loop through v2 inside the for loop through v1, but I will get points off as this is not as efficient as binary search).

Both vectors are sorted.

help! thanks!

#### AGCurry

·
##### Registered
Joined
·
431 Posts
So, for each keyword (v1) you need to find a count of that word's occurrences in ALL of v2?

Given your existing data structures, the only way to do this is by an exhaustive search through ALL of v2 for each keyword - NOT with a binary search.

For a small amount of data, the nested-loop approach would be okay. But
were I programming this problem for a large volume of data, I would parse all of the strings in v2 and store the words in a binary tree, which would look like:

struct b_tree
{
char *word ;
int count ;
struct b_tree *left, *right ;
} ;

OR, an array of

struct
{ char *word ;
int count ;
} ;

Then a binary search would be appropriate.

In each case, *word can point to the first occurrence of that word in v2.

The b-tree approach would make it easier to store the words, while the array approach would make it simpler to search using the library function bsearch().

1 - 2 of 2 Posts
Status
Not open for further replies.