When your question is answered use !solved to mark the question as resolved.
Remember to ask specific questions, provide necessary details, and reduce your question to its simplest form. For tips on how to ask a good question run !howto ask.
Program not counting word occurrence correctly, just incrementing, why?
!f
@river bay's code (missing deletion permissions), requested by @twin drum:
Hi, I am trying to count the occurrence of words in a string. It should take in a string, count the word, and say how many times each word is input. My string is delimited by the whitespace and then added to a vector.
my approach was to start the vector at 0 outside of a for loop. then, i would loop through the size of the vector. then, if vector[0] equals vector[i], increment a counter. then at the end of the loop vector[0] would increment to vector[1]. I thought this would work, but it just increments every time from 1 to the end (1 to 10 for example). can someone look at my code and help me understand why it is just incrementing, and not adding the numbers correctly? i'll post it all below
```cpp
int count = 0;
int inc_value = 0;
for (string word; ps >> word;)
    vs.push_back(word);
vs[inc_value];
while (inc_value < vs.size()) {
    for (int i = 0; i < vs.size(); ++i) {
        if (vs[inc_value] == vs[i]) {
            count++;
        }
        else {
            count = 0;
        }
        cout << vs[inc_value] << " " << count << '\n';
        inc_value++;
    }
}
```
the output is:
hello 1
this 2
is 3
a 4
test 5
that's the output, ok, but for what input? what is ps
you're incrementing inc_value in the inner/second for loop, so by the time that for loop completes the first time, your program exits, the while loop may as well not exist
the input is just "hello this is a test". ps is something else entirely for checking whitespace and taking in files etc. basically the ps is just user input, sorry for that confusion
in light of this, inc_value always equals i
so yes, count is always incremented
ahhh okay, that was what i was afraid of was happening. my plan was for the whole for loop to go over inc_value[0], then once the for loop is done with that it would go over inc_value[1], but that is definitely not happening. how would you recommend I go about changing that? or should i approach it entirely different ?
do you have restrictions? like is this for school where you're not allowed to use various things?
because, with no restrictions, I would recommend using https://en.cppreference.com/w/cpp/algorithm/count
or slightly better https://en.cppreference.com/w/cpp/algorithm/ranges/count
ranges is C++20 tho
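For reference, a minimal sketch of the `std::count` suggestion, which works in C++11. The helper name `count_word` is hypothetical, not something from the thread:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns how many elements of `words` compare
// equal to `target`, using std::count from <algorithm>.
int count_word(const std::vector<std::string>& words, const std::string& target) {
    return static_cast<int>(std::count(words.begin(), words.end(), target));
}
```

Calling `count_word(vs, "hello")` on a vector holding `hello world hello` should give 2.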
yea right now im on c++11 i think, and I can only use basic stuff unfortunately yeah
then move your inc_value++ outside the for loop, but still inside the while loop
i was gonna put it in a map and do it that way, but i have to stick with code and act like i've never coded before
and the cout too
outside the for loop, but inside the while loop.... got it. ill give it a go and see what's going on. thank you so far
although, technically this probably isn't 100% what you need
hello world hello would end up giving a count of 4
because you'll "count the occurrences of hello" twice
and there are two occurrences, so two times two is four (last I checked, in non-complex Euclidean geometry at least)
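A rough sketch of what the loops look like once the increment and the reporting move out of the inner loop. The function name `count_each_position` is hypothetical, and starting `count` fresh on every outer pass is an assumption that goes beyond what has been said so far. It collects the rows instead of printing them so the duplicate reporting is easy to see:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: for every position in `vs`, count how often the
// word at that position occurs in the whole vector. Repeated words are
// reported once per position, which is the double-reporting problem.
std::vector<std::pair<std::string, int>> count_each_position(
        const std::vector<std::string>& vs) {
    std::vector<std::pair<std::string, int>> report;
    std::size_t inc_value = 0;
    while (inc_value < vs.size()) {
        int count = 0;  // assumption: fresh count for each outer word
        for (std::size_t i = 0; i < vs.size(); ++i) {
            if (vs[inc_value] == vs[i]) {
                ++count;
            }
        }
        // report and advance once per outer pass, outside the inner loop
        report.push_back(std::make_pair(vs[inc_value], count));
        ++inc_value;
    }
    return report;
}
```

For `hello world hello` this yields three rows: `hello 2`, `world 1`, `hello 2`. The two `hello` rows add up to 4.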
so really, in fact, without std::count yes a map is probably the appropriate tool here
without a map, you'll need to keep track of not only the count of a word, but if you have seen that word already, you need to keep track of that, so you don't recount the word a second time when that word occurs later
or really maybe just a std::set
but you could use a std::vector to store the words you have already "seen"
it's just a bit more cumbersome to answer the question "have I already seen this word?" by that method
it appears you're at least allowed to use std::vector so... 🤷
alright well i took the cout and the inc_value outside of the for loop but still in the while loop, and it's giving me this output
hello 1
world 0
hello 1
well, your logic is also still a bit off in several ways
when you see "world" when you're handling "hello" you will reset the count of "hello" to zero because hello != world
but I assume that's not what you want
yeah definitely not haha. i didn't think of it that way until i saw the hello world hello example. i'm thinking your idea of a vector would work well. but wouldn't the problem of counting the occurrence still be around? there would be multiple of the same word in a vector and they would be out of order and all over in the vector
no
you search the vector for the current word
if the word is in the "seen" vector, you simply do nothing and continue to the next word
otherwise you add the current word to the list of "seen" words
using a std::set would do this for you basically
without std::set you'll need to search your "seen" words yourself
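A minimal sketch of the "search the seen words yourself" approach, using only `std::vector` and plain loops. Both function names (`already_seen`, `unique_words`) are hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: linear search answering "have I already seen this word?"
bool already_seen(const std::vector<std::string>& seen, const std::string& word) {
    for (std::size_t i = 0; i < seen.size(); ++i) {
        if (seen[i] == word) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: returns each distinct word once, in first-seen order.
std::vector<std::string> unique_words(const std::vector<std::string>& vs) {
    std::vector<std::string> seen;
    for (std::size_t i = 0; i < vs.size(); ++i) {
        if (!already_seen(seen, vs[i])) {
            seen.push_back(vs[i]);  // first time this word appears
        }
        // otherwise do nothing and continue to the next word
    }
    return seen;
}
```

The linear search makes each lookup proportional to the number of words seen so far, which is the cumbersome part that a set would take over.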
ah okay. well in that case i'll bend the rules and use set haha.
I mean, you actually still need to consult the set
to see if the word is yet "seen" or not
it's just a bit simpler to do that with a std::set than by hand with a vector
okay. i like that idea. so generally, i can use set to see if the current word has already been seen. and if it has i increment a counter? and if it hasnt been seen I increment the counter to 1?
also, on a different note, are you licensed? if not that's piracy and

licensed?
mmm, not quite, but close enough for now, go try it first and then come back
yes, Winrar is not free
oh lmao i am still within the 30 day trial period 
but sure thing i'll give it a go
conceptually, from a high level perspective, you need to answer the question "have I counted this word already"
std::set or std::map could both be reasonable tools for doing this
probably better to use std::unordered_set or std::unordered_map tho
also, on a more related but also different note, do you know how to use a debugger?
@river bay Has your question been resolved? If so, run !solved :)
though depending on exactly how you use a map, you almost may not really need to do this at all, but if maps are forbidden then you must write this logic yourself
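One way a map can make the "seen" bookkeeping unnecessary, sketched under the assumption that maps are allowed. The name `count_words_map` is hypothetical:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: the map key is the word, the value is its count.
// operator[] creates a missing entry with value 0, so ++ works for both
// first and repeated sightings without a separate "seen" check.
std::map<std::string, int> count_words_map(const std::vector<std::string>& vs) {
    std::map<std::string, int> counts;
    for (const std::string& word : vs) {
        ++counts[word];
    }
    return counts;
}
```

Iterating over the returned `std::map` visits the words in sorted order. `std::unordered_map` supports the same `++counts[word]` pattern without the ordering.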
no i dont, i use sublime text to code and powershell to run the program haha. but i think i do sadly need to write a fair deal of the logic if i wasn't allowed to use maps
ok
I like Sublime Text, I think ST is one of the best editors available
but I would likely recommend you use Visual Studio (the IDE) instead
I rather doubt ST has a debugger? but idk, it's been years since I used ST
even if it does, you will need to study and learn how to use it
Visual Studio does have a debugger, and you shouldn't need much study to use it
set a breakpoint, click the green arrow, and you're basically done
oh interesting. it seems like a debugger would come in handy lol. after i finish this assignment i'll have to look into it.
i ended up getting a set working instead of a vector. and when i do cout, it doesn't output duplicates. so that would definitely help with outputting a certain amount. so now i am just stumped on actually counting the occurrence
im kinda thinking like i was when i had the vector. maybe i can sort the set, then compare the current with the next? if it is the same, then count++. if not, i start the counter back to 1. then i can sort it by most frequent?
slightly better example
note that Visual Studio (the IDE) is not the same as VSCode
sadly, MS appears intent on confusing people into believing they are the same, but they are not
VS is a full IDE, it comes with a compiler, a linker, the standard library and a debugger
VSCode is mostly just a good "plain text editor" similar to ST in that way
though VSCode does actually have a debugger, but then you'll need to learn VSCode, and you will probably have a much easier time if you just use actual VS instead, actual VS is free
make sure this checkbox ☝️ is checked when you install VS, and that should be all you need to get started
create a blank console app C++ project ☝️ and you're off to the races
oh wow okay awesome, i'll definitely be sure to look into this when I get this done. thanks for all the pointers with vs
yeah i am lol
ok, then all is well, VS only really works on Windows, but if you are on Windows, then VS is widely considered one of the best IDEs on any OS
I prefer the "plain text editor" approach too, like Sublime Text (or in my case
) but realistically life will probably be much simpler, faster, easier for you if you just use VS the IDE for now
okay, sounds great! thank you :)
haha i do have a question though... it might be kind of a silly one
only one way to find out
so i'm noticing that the set outputs the words in alphabetical order. this seemed fine when my idea was to compare the current word to the next word, but in the set they are not automatically alphabetized! i looked online to see if there was a non convoluted way to sort it before outputting it, but i can't find one. do you have any ideas? i'll give you my new code i wrote for clarity
```cpp
set<string> words;
for (string word; ps>>word;)
    words.insert(word);
int i = 0;
for (string str : words) {
    if (str[i] == str[i + 1]) {
        count ++;
    }
    else {
        count = 1;
    }
    cout << str << " " << count << '\n';
    i++;
}
```
I see
you would still use the original code you had
the suggestion is not to replace vector with set
the suggestion is to also add some code to track the answer to the question "have I counted this word yet?"
a set is a natural and reasonably performant way to do that
if you don't need order (which you don't, you only need "have I seen this?"), then std::unordered_set is likely to be somewhat more performant than std::set because constant amortized time complexity
and in theory, de-facto O(1) time lookups
ah man, i thought i was getting somewhere with this one lol. so i go back to the vector thing and add a set. i loop through the vector, and add each word to the set as i go along. then, if the word already exists in set, i just increment a counter? is that seeming somewhat correct?
in practice this never meets reality because cache misses
sounds quite close, yes
have a try and show us what you come up with
okay ill give it a go lol
and no rebalancing
and O(log n) is still technically worse than O(1)
quite good in its own right, but still quite a lot worse than constant time
ah, i am sad to admit that what i thought would work did not in fact
i'll show you what i did and explain my thoughts
```cpp
set<string> words;
vector<string> vs;
for (word; ps>>word;)
    vs.push_back(word); // read words
vs[inc_value];
while (inc_value < vs.size()) {
    for (int i=0; i<vs.size(); ++i) {
        //adding each word to the set
        words.insert(vs[i]);
        //if the vector is already in the set
        if (words.find(vs[inc_value]) != words.end()) {
            //increment counter
            count++;
        }
        else {
            //otherwise reset it
            count = 1;
        }
        //outputting
        cout << vs[inc_value] << " " << count << '\n';
        inc_value++;
    }
}
```
so im having it loop through the size of the words typed just like before
and each loop the word gets added to the set
then, if the set finds the current iteration of the string, it will increment the counter, otherwise reset it with 1
then, it outputs those results
what happens is this still:
hello 1
world 2
hello 3
im not sure why it keeps incrementing. i guess it could be because inc_value is set to 0, and the for loop starts at zero. so it makes me wonder if i need to sort the vector
but maybe that wouldn't fix it, im not sure anymore haha im at a loss
well, if the word is already in the set, you should just skip the inner for loop entirely
it means you have "already counted" this word before
the inner for loop counts occurrences
if you have already counted occurrences of X, you don't want to count X a second time, or your count will be double the actual/real count
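Putting the advice together, a sketch of one possible shape: keep the vector, use the set only to decide whether to skip the inner counting loop. The function name `count_unique` is hypothetical, and returning pairs instead of printing is an assumption made to keep the example checkable:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: one (word, count) row per distinct word,
// in the order the words first appear in `vs`.
std::vector<std::pair<std::string, int>> count_unique(
        const std::vector<std::string>& vs) {
    std::set<std::string> seen;
    std::vector<std::pair<std::string, int>> result;
    for (std::size_t inc_value = 0; inc_value < vs.size(); ++inc_value) {
        // set::insert returns a pair; .second is false if the word was
        // already present, meaning it has been counted before.
        if (!seen.insert(vs[inc_value]).second) {
            continue;  // skip the inner loop entirely
        }
        int count = 0;  // fresh count for each new word
        for (std::size_t i = 0; i < vs.size(); ++i) {
            if (vs[i] == vs[inc_value]) {
                ++count;
            }
        }
        result.push_back(std::make_pair(vs[inc_value], count));
    }
    return result;
}
```

For `hello world hello` this gives two rows, `hello 2` and `world 1`. The count never resets on a mismatch; it only starts at 0 when a new word begins.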
@river bay's code (missing deletion permissions), requested by @twin drum:
```cpp
set<string> words;
vector<string> vs;
for (word; ps >> word;)
    vs.push_back(word); // read words
vs[inc_value];
while (inc_value < vs.size()) {
    for (int i = 0; i < vs.size(); ++i) {
        // adding each word to the set
        words.insert(vs[i]);
        // if the vector is already in the set
        if (words.find(vs[inc_value]) != words.end()) {
            // increment counter
            count++;
        }
        else {
            // otherwise reset it
            count = 1;
        }
        // outputting
        cout << vs[inc_value] << " " << count << '\n';
        inc_value++;
    }
}
```
ah, so looking if a word is already in a set should not even be in the for loop? interesting okay. then what would be in the for loop if not that?
checking for and counting matches
which you are doing, and that's fine
your count reset is still (probably) wrong though
;compile
```cpp
#include <iostream>
#include <vector>
#include <string>
using namespace std;
int main() {
    vector<string> words{"apple", "banana", "apple"};
    int inc_val = 0;
    while (inc_val < 3) {
        for (int i = 0; i < 3; i++) {
            cout << words[inc_val] << " " << words[i] << endl;
        }
        inc_val++;
    }
}
```
```
apple apple
apple banana
apple apple
banana apple
banana banana
banana apple
apple apple
apple banana
apple apple
```
consider for a moment what your loops are actually doing when you put them together like this ☝️
ohhh wow okay, yeah i see that my while loop and for loop are causing me issues
I think maybe you are causing you issues
if you don't want to use a debugger, then cout all the things so you can see what's really going on, step by step
delete the extraneous couts once you get it working
okay, i'll go ahead and do that. but first im gonna take a little break, lol im starting to get a little annoyed with the issues im causing myself and im getting sleepy but i'll come back to it after a little bit. thank you for the help so far. it has definitely been helpful so far figuring out what's wrong and why it is wrong
rest is good, I dare suggest required. good luck.
This question thread is being automatically closed. If your question is not answered feel free to bump the post or re-ask. Take a look at !howto ask for tips on improving your question.