#Comfy String Parsing.

junior coyote
#

and i want to execute a function corresponding to certain strings (in an optimized way):

bluetooth => functionA()
boo       => functionB()
boots     => functionC()
coffee    => functionA()

so here is my function:

#include <cstring>

void parse_and_execute(const char* input) {
  if (input[0] == 'b') {
    if (input[1] == 'l') {
      // "bl" + "uetooth" => "bluetooth"
      if (std::strcmp(input + 2, "uetooth") == 0) {
        functionA();
      } else {
        defaultFunction();
      }
    } else if (input[1] == 'o' && input[2] == 'o') {
      if (input[3] == '\0') {
        // exactly "boo"
        functionB();
      } else if (std::strcmp(input + 3, "ts") == 0) {
        // "boo" + "ts" => "boots"
        functionC();
      } else {
        defaultFunction();
      }
    } else {
      defaultFunction();
    }
  } else if (input[0] == 'c') {
    // "c" + "offee" => "coffee"
    if (std::strcmp(input + 1, "offee") == 0) {
      functionA();
    } else {
      defaultFunction();
    }
  } else {
    defaultFunction();
  }
}

but it's not very comfortable to write. i want to lay out which string corresponds to which function in a flat manner.

one idea i had that DID NOT work was this:

#include <cstdint>
#include <string_view>

constexpr uint32_t hashString(std::string_view str) {
  uint32_t hash = 2166136261u;
  for (char c : str) {
    hash ^= static_cast<uint32_t>(c);
    hash *= 16777619u;
  }
  return hash;
}

void execute(char* input) {
  const uint32_t hash = hashString(input);

  switch (hash) {
    case hashString("bluetooth"):
      functionA();
      break;
    case hashString("boo"):
      functionB();
      break;
    case hashString("boots"):
      functionC();
      break;
    case hashString("coffee"):
      functionA();
      break;
    default:
      defaultFunction();
      break;
  }
}

i thought i could get the hashes at compile time and the input at runtime, but for this to work, input needs to be constexpr, which doesn't work: i need it to be available at runtime

jolly crescent
#
```cpp
#include <functional>
#include <iostream>
#include <map>
#include <string>

// functions to run
void A() { std::cout << "A" << std::endl; }
void B() { std::cout << "B" << std::endl; }
void C() { std::cout << "C" << std::endl; }
void D() { std::cout << "D" << std::endl; }

// commands to functions
std::map<std::string, std::function<void()>> string_to_command = {
    {"bluetooth", A},
    {"boo", B},
    {"boots", C},
    {"coffee", D}
};

// look the command up with find(); operator[] would insert an empty
// entry into the map for every unknown command
void run_command(const std::string& command) {
    auto it = string_to_command.find(command);
    if (it != string_to_command.end())
        it->second();
    else
        std::cout << "Command not found." << std::endl;
}

int main() {
    run_command("boo");
    run_command("coffee");
    run_command("command that doesn't make sense");
    return 0;
}
```
junior coyote
#

i don't think it's worth computing a hash map just for 1-time parse checking

jolly crescent
#

Do you have a buffer?

junior coyote
#

hash map is computed during runtime

#

buffer?

jolly crescent
#

like an input buffer?

junior coyote
#

std::string_view?

jolly crescent
#

is that the char *

junior coyote
#

its stored in char* in the example in post

#

std::string_view and char* are basically the same thing

jolly crescent
#

yeah, but the buffer could be longer than the string

junior coyote
#

std::string_view is the same as const char* + size of string + some helper functions

jolly crescent
#

yeah, that is not char * at all

#

just contains something like it
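
To make the difference in this exchange concrete, here is a minimal sketch of a `std::string_view` covering only part of a longer buffer. The helper names `token_from_buffer` and `is_keyword` are made up for illustration and do not come from the thread.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: view the first `length` characters of a larger buffer.
// A std::string_view stores a pointer plus a length, so the viewed text does
// not have to be followed by a '\0' terminator.
std::string_view token_from_buffer(const char* buffer, std::size_t length) {
    return std::string_view(buffer, length);
}

// Hypothetical helper: compare a token against a keyword by length and content.
bool is_keyword(std::string_view token, std::string_view keyword) {
    return token == keyword;
}
```

With a buffer such as `"boots and more"`, `token_from_buffer(buffer, 5)` compares equal to `"boots"`, whereas `std::strcmp` on the raw pointer would keep reading up to the terminator and report a mismatch.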

junior coyote
#

ye

jolly crescent
#

could you tell me more precisely what you are aiming for?

round violet
#

why did the switch not work?

junior coyote
#

parsing string in optimized way

jolly crescent
#

okay, so is it an input buffer that is zeroed? or are they exact length strings?

junior coyote
#

sorry, what?

#

wdym zeroed?

jolly crescent
#

what is 'input' exactly?

#

is it like a string with extra room

round violet
#

how optimized does this have to be if there's going to be only 4 cases?

jolly crescent
#

or is it just the data

junior coyote
#

ok im writing a compiler, think of input as source code which we need to parse

jolly crescent
#

also, why are we even optimising this?

junior coyote
#

cuz why not

jolly crescent
#

right...

#

okay, I'll write something else

#

it will be ugly

junior coyote
#

im pretty sure, if im not insane

round violet
#

i'm naive but i think if you're only comparing a couple cases it's not that big of a deal

junior coyote
#

im only showing simple example

#

there's a lot more tokens i need to compare

jolly crescent
#

Just don't bother with the other ways

#

trust me

#
```cpp
#include <functional>
#include <iostream>
#include <map>
#include <string>

// commands to functions
std::map<std::string, std::function<void()>> string_to_command = {
    {"bluetooth",   []() { std::cout << "A" << std::endl; } },
    {"boo",         []() { std::cout << "B" << std::endl; } },
    {"boots",       []() { std::cout << "C" << std::endl; } },
    {"coffee",      []() { std::cout << "D" << std::endl; } }
};

// check that the command exists before running it; find() does not
// insert an empty entry for unknown commands the way operator[] does
void run_command(const std::string& command) {
    auto it = string_to_command.find(command);
    if (it != string_to_command.end())
        it->second();
    else
        std::cout << "Command not found." << std::endl;
}

int main() {
    run_command("boo");
    run_command("coffee");
    run_command("command that doesn't make sense");
    return 0;
}
```
#

What you are trying to do is called premature optimisation

#

it can often make your code run more slowly too

#

or maybe it slows you down by making the project take WAY longer

#

and every command you add is only going to make it worse with the other techniques

#

the loss in performance from mapping strings to things is so minuscule that it would almost never make sense to do it another way

#

each string has finite length, and I reckon your commands are rather short

#

go for what is easier to maintain first

#

optimise LAST

#

if it is even necessary to do so

quartz wadi
#

It does work as is
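
A minimal sketch of why the switch from the first post can work with runtime input: a `constexpr` function may also be called at run time, so only the `case` labels need to be compile-time constants. `dispatch` and its character labels are stand-ins invented here for `functionA`/`functionB`/`functionC`. The extra string comparison in each case is an addition, not part of the original post; it guards against two different strings sharing a hash.

```cpp
#include <cstdint>
#include <string_view>

// Same hash as in the first post (FNV-1a style). Being constexpr, it can be
// evaluated at compile time for the case labels and at run time for the input.
constexpr std::uint32_t hashString(std::string_view str) {
    std::uint32_t hash = 2166136261u;
    for (char c : str) {
        hash ^= static_cast<std::uint32_t>(c);
        hash *= 16777619u;
    }
    return hash;
}

// Hypothetical stand-in for the real handlers: returns a label, '?' for default.
// Each case re-checks the text because a matching hash alone is not proof.
char dispatch(std::string_view input) {
    switch (hashString(input)) {
        case hashString("bluetooth"): return input == "bluetooth" ? 'A' : '?';
        case hashString("boo"):       return input == "boo" ? 'B' : '?';
        case hashString("boots"):     return input == "boots" ? 'C' : '?';
        case hashString("coffee"):    return input == "coffee" ? 'A' : '?';
        default:                      return '?';
    }
}
```

If two keywords ever hashed to the same value, the compiler would reject the duplicate `case` labels, so a collision inside the keyword set shows up at build time rather than at run time.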

junior coyote
#

If you're still interested: i just realized that yesterday i accidentally re-invented std::unordered_map without realizing it.
My constexpr approach and your std::unordered_map approach both work in a similar way and have O(1) lookup (not std::map tho, it has O(n) lookup).
So some statements i made, like

"i don't think its worth computing hash map just for 1-time parse checking"

are stupid, cause that's literally the same thing i did lol.

Also the hashmap solution would be better for performance with a large number of tokens compared to an if-else chain, which im pretty sure is O(n)

and about that "premature optimisation" thing: it's for a hobby project, so i don't think it matters.

jolly crescent
#

O(1) and O(n) are basically the same when n is tiny. In fact, even O(n^2) is close to O(1) for tiny n. It's best to fix these things later, when it makes sense to do so. Of course, if you 100% know order doesn't matter, then use an unordered map. It probably won't make much of a dent with so few commands at this stage, though. I looked up std::map, and all its core operations have O(log(n)) performance. Unordered map averages constant time for its core operations.
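
For reference, a minimal sketch of the same command table with `std::unordered_map` swapped in for `std::map`. The `last_run` variable is a hypothetical addition so the effect of a handler can be observed, and `run_command` returns a `bool` here instead of printing.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical: records which handler ran last, purely for illustration.
char last_run = '\0';

// Hash-based table: lookups are constant time on average, but the keys are
// stored in no particular order.
const std::unordered_map<std::string, std::function<void()>> string_to_command = {
    {"bluetooth", [] { last_run = 'A'; }},
    {"boo",       [] { last_run = 'B'; }},
    {"boots",     [] { last_run = 'C'; }},
    {"coffee",    [] { last_run = 'D'; }},
};

// Returns true if the command was found and run, false otherwise.
bool run_command(const std::string& command) {
    auto it = string_to_command.find(command);
    if (it == string_to_command.end()) {
        return false;
    }
    it->second();
    return true;
}
```

Only the container type changes between the two versions, so switching later, once measurements justify it, stays cheap.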

One justification for using map over unordered map would be if you have a 'help' command, or a 'match closest' command that resolves 'q' to 'quit'. These would benefit from the keys being in order.
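
A minimal sketch of the 'match closest' idea, assuming a sorted `std::map` of command names. The `commands` table (including the `help` and `quit` entries) and the `match_closest` function are hypothetical names chosen for this example.

```cpp
#include <map>
#include <string>

// Hypothetical command table; the mapped value is only a label here.
const std::map<std::string, char> commands = {
    {"bluetooth", 'A'}, {"boo", 'B'}, {"boots", 'C'},
    {"coffee", 'D'},    {"help", 'H'}, {"quit", 'Q'},
};

// Returns the first command, in sorted order, that starts with `prefix`,
// or an empty string if no command does.
std::string match_closest(const std::string& prefix) {
    // lower_bound finds the first key that is not less than `prefix`;
    // because keys are sorted, any key starting with `prefix` begins there.
    auto it = commands.lower_bound(prefix);
    if (it != commands.end() &&
        it->first.compare(0, prefix.size(), prefix) == 0) {
        return it->first;
    }
    return {};
}
```

Here `match_closest("q")` yields `"quit"` and `match_closest("bo")` yields `"boo"`. An unordered map has no equivalent of `lower_bound`, so the same lookup would need a scan over every key.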

If-else shouldn't be O(n). I reckon a well-written one could get you O(log(n)). The compiler can also make its own optimisations. The problem here is that big-O rarely makes sense when the data is uneven and the algorithm relies on it being even to behave the way the big-O indicates.
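
One way to get O(log(n)) comparisons without hand-writing nested branches is a binary search over a sorted table. This is a swapped-in technique rather than anything posted in the thread, and the `Keyword` struct, `keywords` table, and `lookup` function are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <string_view>

// Hypothetical table entry: keyword text plus a label for its handler.
struct Keyword {
    std::string_view text;
    char label;
};

// Flat table, one line per keyword. It must stay sorted by `text`,
// otherwise the binary search below gives wrong answers.
constexpr std::array<Keyword, 4> keywords{{
    {"bluetooth", 'A'},
    {"boo", 'B'},
    {"boots", 'C'},
    {"coffee", 'A'},
}};

// Binary search over the sorted table; returns '?' when nothing matches.
char lookup(std::string_view input) {
    auto it = std::lower_bound(
        keywords.begin(), keywords.end(), input,
        [](const Keyword& entry, std::string_view value) {
            return entry.text < value;
        });
    if (it != keywords.end() && it->text == input) {
        return it->label;
    }
    return '?';
}
```

The table also gives the flat string-to-function layout asked for in the first post, at the cost of keeping the entries sorted by hand.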

Optimisation involves maths and testing. Unless your hobby project focuses on those two things, then it's best to do those things after writing your core functionality. It might not even be whether it's map or unordered map that is the problem, or maybe that unordered map comes back around and bites you later. Just keep it as flexible as you can before committing to any optimisations.

junior coyote
#

!solved
