EASY: problem with strtok parsing a line | Together C & C++ | Page 1

worldly patioBOT Jan 27, 2023, 8:09 PM

#

When your question is answered use !solved to mark the question as resolved.

Remember to ask specific questions, provide necessary details, and reduce your question to its simplest form. For tips on how to ask a good question run !howto ask.

wraith salmon Jan 27, 2023, 8:26 PM

#

from what I can see on the docs, there's only one delimiter and that's what strtok searches for.

#

so you'd need to either split up the strtok calls, or just write out an alternate version of strtok yourself.

brazen ravine Jan 27, 2023, 8:28 PM

#

you can use multiple delimiters in strtok
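
For reference, a minimal sketch of what that looks like. The second argument of strtok is a set of delimiter characters, not one separator string, and every call after the first passes NULL to continue with the same buffer. The helper name `split_words`, its parameters, and the delimiter set are assumptions made up for illustration; the thread never shows the original code.

```c
#include <stddef.h>
#include <string.h>

/* Splits `line` in place on any of the listed delimiter characters and
 * stores up to `max` token pointers in `out`. Returns the token count.
 * `split_words` and its parameters are hypothetical names. */
size_t split_words(char *line, char *out[], size_t max) {
    const char *delims = " \t\n,;()";   /* each character here is a delimiter */
    size_t n = 0;

    /* The first call passes the buffer; later calls pass NULL to continue. */
    for (char *tok = strtok(line, delims); tok != NULL && n < max;
         tok = strtok(NULL, delims)) {
        out[n++] = tok;
    }
    return n;
}
```

Called on a writable buffer holding "int x, y;\tfloat z\n", this should yield the five tokens int, x, y, float and z. The buffer must be writable because strtok overwrites delimiters with '\0'.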

wraith salmon Jan 27, 2023, 8:29 PM

#

what are you putting into stdin?

#

sorry, it only showed me examples of it using one

brazen ravine Jan 27, 2023, 8:31 PM

#

its a file

wraith salmon Jan 27, 2023, 8:32 PM

#

oh yeah it only gets one token per line

brazen ravine Jan 27, 2023, 8:32 PM

#

yea

wraith salmon Jan 27, 2023, 8:32 PM

#

because you put NULL into the input str at the bottom of the second while loop

#

shouldn't you put something like strlen(token) + 1 + line into the first argument of the second strtok()?

brazen ravine Jan 27, 2023, 8:33 PM

#

ill check

#

didn't work

#

NULL gets the next token i thought

wraith salmon Jan 27, 2023, 8:38 PM

#

wdym lol NULL = no string

#

it doesn't internally store your string iirc

#

oh nvm it does

#

wtf

brazen ravine Jan 27, 2023, 8:39 PM

#

yea

#

every source online uses NULL

#

i think im placing it in the wrong spot

wraith salmon Jan 27, 2023, 8:40 PM

#

no it's in the right place

#

it could be that it mallocs the token so that the token you're modifying is actually not really the original string

brazen ravine Jan 27, 2023, 8:43 PM

#

wat

wraith salmon Jan 27, 2023, 8:43 PM

#

when you're lowercasing the token string

#

it isn't modifying the line string
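
A small sketch to check that guess: strtok does not allocate anything. It writes '\0' over the delimiter that ends each token and returns a pointer into the original buffer, so lowercasing a token does change the line. The helper `lower_first_token` is a hypothetical name used only for this illustration.

```c
#include <ctype.h>
#include <string.h>

/* Lowercases the first token of `line` in place and returns a pointer to it,
 * or NULL if the line has no token. Hypothetical helper for illustration. */
char *lower_first_token(char *line) {
    char *tok = strtok(line, " \t\n");
    if (tok == NULL) return NULL;
    for (char *p = tok; *p; p++)
        *p = (char)tolower((unsigned char)*p);
    return tok;  /* points into `line`, not into a separate copy */
}
```

With the buffer "HELLO World\n" the returned pointer equals the buffer itself, and the buffer now reads "hello" followed by a '\0' where the space was. That inserted '\0' is also why printing the line after tokenizing shows only the first word.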

#

also i would really just split that into a separate function

brazen ravine Jan 27, 2023, 8:44 PM

#

ok

wraith salmon Jan 27, 2023, 8:51 PM

#

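if it helps at all, this would print out every word-like thing:
```c
#include <ctype.h>
#include <stdio.h>

/* A word character is a letter, digit, or underscore. */
static int is_word_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

int main(void) {
    char str[] = "the fox is very fast";
    int start = -1; /* index of the current word's first character, or -1 */

    for (int i = 0; str[i]; i++) {
        if (is_word_char(str[i])) {
            if (start < 0) start = i;   /* a new word begins here */
        } else if (start >= 0) {
            /* the word ended at i - 1; print only that slice */
            printf("%.*s\n", i - start, str + start);
            start = -1;
        }
    }
    if (start >= 0) printf("%s\n", str + start); /* word running to the end */
}
```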

#

in a line

#

if you want to use something like this, i think it would do the same thing?

brazen ravine Jan 27, 2023, 8:54 PM

#

would this work with an array that im using for my keywords i need to search the file for

wraith salmon Jan 27, 2023, 8:55 PM

#

umm do you want me to try to adapt it to your program?

brazen ravine Jan 27, 2023, 8:55 PM

#

nah

wraith salmon Jan 27, 2023, 8:55 PM

#

but yeah it very much would

#

do you get how it works? it just keeps up a slice and then it sees if there's an unfinished word on the second to last line

brazen ravine Jan 27, 2023, 8:57 PM

#

ill try to see if i can use some of this

wraith salmon Jan 27, 2023, 8:57 PM

#

im pretty sure this is faster than using strtok or whatever too, and completely threadsafe

brazen ravine Jan 27, 2023, 8:58 PM

#

an else if strcmp with the keyword array would work right

wraith salmon Jan 27, 2023, 8:58 PM

#

put it inside the else ye

#

and at the end
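
A rough sketch of that comparison step. The word found by the scanner is a slice of the line and is not NUL-terminated, so plain strcmp on it would read past the end of the word. Comparing the length first and then using strncmp avoids that. The keyword table and the name `is_keyword` are assumptions; the real keyword array is never shown in the thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword table; the thread never shows the real one. */
static const char *const keywords[] = { "int", "char", "while", "return" };

/* Returns 1 if the slice word[0..len) equals one of the keywords exactly. */
int is_keyword(const char *word, size_t len) {
    for (size_t k = 0; k < sizeof keywords / sizeof keywords[0]; k++) {
        /* compare lengths first so "int" does not match the start of "interval" */
        if (strlen(keywords[k]) == len && strncmp(word, keywords[k], len) == 0)
            return 1;
    }
    return 0;
}
```

Inside the scanner this would be called as `is_keyword(line + start, word_length)` in both places a word is printed: in the else branch and after the loop.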

brazen ravine Jan 27, 2023, 9:07 PM

#

when it prints to stdout some of the words are printed twice & some of them moved onto new lines

wraith salmon Jan 27, 2023, 9:07 PM

#

can you post the code?

brazen ravine Jan 27, 2023, 9:10 PM

#

  while (fgets(line, sizeof(line), stdin))
   {
      int start = -1;
      int end = 0;
      for (int i = 0; line[i]; i++)
      {
         if (start < 0 && isalnum(line[i]) || line[i] == '_')
            start = i;
         else if (isalnum(line[i]) || line[i] == '_')
            end = i;
         else
         {
            printf("%.*s\n", end - start + 1, line + start);
            start = -1;
         }
      }
      if (start > -1)
      {
         printf("%s\n", line + start);
      }
   }

real torrent Jan 27, 2023, 9:10 PM

#

The issue is with the printf("%s\n", line); statement at the end of the while loop. It is printing the original line variable which only contains the first word (the one that was tokenized and potentially modified) and not the entire line with all the modified words. To fix this, you can replace the line variable with a new variable, such as output, and concatenate each modified token to it before printing. Then at the end of each iteration of the while loop, you should reset the output variable to an empty string, so that the next line is printed correctly.

brazen ravine Jan 27, 2023, 9:11 PM

#

real torrent The issue is with the printf("%s\n", line); statement at the end of the while lo...

i will try that thanks

craggy pivot Jan 27, 2023, 9:11 PM

#

try this

while (fgets(line, sizeof(line), stdin))
   {
      for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
      {
         if (strstr(line, keywords[i]) != NULL)
         {
            for (j = 0; j < strlen(line); j++)
            {
               line[j] = tolower(line[j]);
            }
            break;
         }
      }
      printf("%s", line);
   }

wraith salmon Jan 27, 2023, 9:13 PM

#

craggy pivot try this ``` while (fgets(line, sizeof(line), stdin)) { for (i = 0; i <...

doesn't that only replace a keyword once?

craggy pivot Jan 27, 2023, 9:14 PM

#

Here is an updated version of the code that will replace multiple keywords at once:

while (fgets(line, sizeof(line), stdin))
   {
      for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
      {
         char *keyword = keywords[i];
         int keyword_len = strlen(keyword);
         char *found = strstr(line, keyword);
         while (found != NULL)
         {
            for (j = 0; j < keyword_len; j++)
            {
               found[j] = tolower(found[j]);
            }
            found = strstr(found + keyword_len, keyword);
         }
      }
      printf("%s", line);
   }

#

In this version, we use strstr() to find each keyword in the line and convert the matched characters to lowercase with tolower(). The while loop repeats the search so that every occurrence of the keyword in the line is lowercased, not just the first.

wraith salmon Jan 27, 2023, 9:15 PM

#

also, i don't know if that code is very performant

#

since you're looping through every keyword every line

#

you could just remove the while fgets though

brazen ravine Jan 27, 2023, 9:18 PM

#

craggy pivot Here is an updated version of the code that will replace multiple keywords at on...

if the keyword is present in another word it still lowercases it. i.e if the keyword is "sand" and a word is sandwich, it still gets lowercased. could it be replaced with strcmp

wraith salmon Jan 27, 2023, 9:18 PM

#

yeah that's also another downside

wraith salmon Jan 27, 2023, 9:19 PM

#

brazen ravine ```c while (fgets(line, sizeof(line), stdin)) { int start = -1; ...

so for this

#

ok lemme fix it

craggy pivot Jan 27, 2023, 9:20 PM

#

!format

worldly patioBOT Jan 27, 2023, 9:20 PM

#

In this version, I have added a check before converting the keyword to lowercase. The check verifies that the keyword is a standalone word and not part of another word: the match must either start at the beginning of the line or follow a non-letter, and the character right after it must not be a letter. If these conditions are satisfied, we convert the keyword to lowercase.
This avoids partial-word matches. Note that it still locates candidates with strstr rather than strcmp.


while (fgets(line, sizeof(line), stdin)) {
  for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++) {
    char* keyword = keywords[i];
    int keyword_len = strlen(keyword);
    char* found = line;
    /* visit every occurrence of this keyword in the line */
    while ((found = strstr(found, keyword)) != NULL) {
      int start = found - line;
      int end = start + keyword_len;
      /* lowercase only when no letter touches the match on either side */
      if ((start == 0 || !isalpha((unsigned char)line[start - 1])) &&
          !isalpha((unsigned char)line[end])) {
        for (j = 0; j < keyword_len; j++) {
          found[j] = tolower((unsigned char)found[j]);
        }
      }
      found += keyword_len;
    }
  }
  printf("%s", line);
}

whiteh4cker
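
For anyone who wants to try the whole-word check in isolation, here is a self-contained sketch of the same idea as a function. It is not the code from the thread: the name `lower_whole_word` is hypothetical, and it treats digits and underscores as word characters too, which is an assumption borrowed from the slice scanner earlier in the thread (the snippet above only checks isalpha).

```c
#include <ctype.h>
#include <string.h>

/* A word character here is a letter, digit, or underscore (an assumption;
 * the snippet in the thread only checks isalpha). */
static int is_word_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* Lowercases every standalone occurrence of `keyword` in `line`, in place.
 * Returns how many occurrences were treated as whole words. */
int lower_whole_word(char *line, const char *keyword) {
    size_t len = strlen(keyword);
    int hits = 0;
    if (len == 0) return 0;             /* avoid looping forever on "" */

    for (char *found = strstr(line, keyword); found != NULL;
         found = strstr(found + len, keyword)) {
        int left_ok  = (found == line) || !is_word_char(found[-1]);
        int right_ok = !is_word_char(found[len]);  /* '\0' is not a word char */
        if (left_ok && right_ok) {
            for (size_t j = 0; j < len; j++)
                found[j] = (char)tolower((unsigned char)found[j]);
            hits++;
        }
    }
    return hits;
}
```

With the keyword "SAND" and the line "SAND SANDWICH my_SAND SAND.\n", only the first and last occurrences should be lowercased, since the other two touch a word character on one side.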

wraith salmon Jan 27, 2023, 9:21 PM

#


  char* cur;
  while ((cur = fgets(line, sizeof(line), stdin)))
  {
      int start = -1; /* index where the current word began, or -1 if not inside a word */
      for (int i = 0; cur[i]; i++)
      {
         if (isalnum((unsigned char)cur[i]) || cur[i] == '_')
         {
            if (start < 0)
               start = i;
         }
         else if (start >= 0)
         {
            /* word ended at i - 1; print only when a word was actually open,
               otherwise repeated separators reprint stale text */
            printf("%.*s\n", i - start, cur + start);
            start = -1;
         }
      }
      if (start > -1)
      {
         printf("%s\n", cur + start);
      }
  }

#

@brazen ravine

brazen ravine Jan 27, 2023, 9:24 PM

#

worldly patio In this version, I have added a check before converting the keyword to lowercase...

this worked

#

thanks for your help @wraith salmon @craggy pivot @real torrent

craggy pivot Jan 27, 2023, 9:24 PM

#

you are welcome

wraith salmon Jan 27, 2023, 9:25 PM

#

gl bro

brazen ravine Jan 27, 2023, 9:25 PM

#

!solved

worldly patioBOT Jan 27, 2023, 9:25 PM

#

Thank you and let us know if you have any more questions!
