#How to do string pattern matching (regex)?

cedar creek
#

I'm trying to read each line from a particular string and then match them against a pattern, but I keep getting the error shown in the screenshot. Here's the code:

decipherMap :: proc(mapFile: string) -> int {
  lines : []string = strings.split_lines(mapFile)
  regex :string =
    "(one" +
    "|two" +
    "|three" +
    "|four" +
    "|five" +
    "|six" +
    "|seven" +
    "|eight" +
    "|nine" +
    "|\\d)"

  matches : ^match.Match

  for line in lines {
    match.gmatch(line, regex, matches)
    fmt.println(line)
  }

  return 0
}

Does anyone know how I could make this work? I have no idea how to turn line into a pointer, nor how to instantiate a variable of type ^[32]Match so I can use the gmatch procedure. Any help is appreciated.

shut coyote
#

Not 100% familiar with what exactly the regex is doing tbh but the following code should at least get rid of the errors

decipherMap :: proc(mapFile: string) -> int {
  lines : []string = strings.split_lines(mapFile)
  regex :string =
    "(one" +
    "|two" +
    "|three" +
    "|four" +
    "|five" +
    "|six" +
    "|seven" +
    "|eight" +
    "|nine" +
    "|\\d)"

  matches : [match.MAX_CAPTURES]match.Match
  for &line in lines {
    match.gmatch(&line, regex, &matches)
    fmt.println(line)
  }

  return 0
}
#

the &line basically allows the line variable to be addressable

#

instead of ^[32]match.Match I just declared a stack-allocated array, [match.MAX_CAPTURES]match.Match, and then gave the function the address of matches

cedar creek
#

The regex in question simply looks for any instances of the words "one", "two", "three" etc and digits (1, 2, 3) in a given string

shut coyote
#

yeah

shut coyote
#

I imagine that you can look through the source code

#

and figure out what's happening

#

the match seems to have a start and end byte

#

might be able to index into the line

#

and figure something out

#

from there

cedar creek
#

Yeah, I'm very new to that but that's what I've been doing. Unfortunately the text/match library seems to have been added recently, so documentation is a bit scarce

shut coyote
#

👍 np

marsh pivot
#

It does not support | afaik so that's why your example doesn't work I think
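
Since alternation with | is not available, one way around it is to skip patterns for the spelled-out words and scan the line by hand. The following is only a minimal sketch, not the library's method: WORDS, digit_at and the sample input are hypothetical names made up for illustration, and the only core procedures it relies on are strings.has_prefix and fmt.println.

```odin
package main

import "core:fmt"
import "core:strings"

// Hypothetical lookup table: the word at index i stands for the value i + 1.
WORDS := [9]string{"one", "two", "three", "four", "five", "six", "seven", "eight", "nine"}

// digit_at reports the digit that starts at byte offset i of line, if any.
// It accepts both ASCII digits and the spelled-out words in WORDS.
digit_at :: proc(line: string, i: int) -> (value: int, ok: bool) {
  c := line[i]
  if c >= '0' && c <= '9' {
    return int(c - '0'), true
  }
  for word, idx in WORDS {
    if strings.has_prefix(line[i:], word) {
      return idx + 1, true
    }
  }
  return 0, false
}

main :: proc() {
  line := "two1nine" // sample input, made up for this sketch
  first, last := -1, -1

  // Check every byte offset, so overlapping words such as "eightwo"
  // are both found.
  for i in 0..<len(line) {
    if v, ok := digit_at(line, i); ok {
      if first < 0 {
        first = v
      }
      last = v
    }
  }

  fmt.println(first, last)
}
```

Checking every offset instead of jumping past each hit is what the C++ version further down does by keeping the last character of a matched word.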

cedar creek
#

Are you familiar with the match library? I still don't know how to fetch the substrings that have matched, if any :(

marsh pivot
#

I've used it a few times, what are you trying?

cedar creek
#

Well it's rather simple really, I wrote a similar piece of code in C++ (the solution is probably not optimal, but ignore that), that checks for matches in a string and then stores the first and last matched groups into two variables:

  while (getline(mapFile, inputLine)) {
    regex_search(inputLine, strMatch, digitsRg);
    string firstM = strMatch.str();
    string lastM;

    for (smatch sm; regex_search(inputLine, sm, digitsRg);) {
      lastM = sm.str();
      inputLine = regex_match(sm.str(), regex("\\d"))
        ? sm.suffix()
        : sm.str().back() + sm.suffix().str();
    }

    int value = stoi(translateStr(firstM) + translateStr(lastM));

    sum += value;
  }

I noticed that the match.gmatch method takes an argument of type Match as the third argument, so I'm assuming that this type would work similarly to smatch in C++? With smatch I can use the method .str() to extract the string value of that submatch (assuming that there's any), but I don't know what the equivalent in Odin would be, here's my Odin code again:

decipherMap :: proc(mapFile: string) -> int {
  lines : []string = strings.split_lines(mapFile)
  regex :string = "%d"

  matches : ^[match.MAX_CAPTURES]match.Match

  for &line in lines {
    //matcher := match.matcher_init(line, regex)
    //fmt.println(matcher.captures_length)

    something, b := match.gmatch(&line, regex, matches)
    fmt.println("check this out: ", something)
  }

  return 0
}

Any ideas of how to do a similar operation in Odin?

#

You can ignore most of the code. Basically I'm trying to understand how Odin stores these substring matches and how I can access them, or check whether there are any matches at all

marsh pivot
#

this would iterate over each match on the lines:

    decipherMap :: proc(mapFile: string) -> int {
      lines : []string = strings.split_lines(mapFile)
      regex :string = "%d"

      matches: [match.MAX_CAPTURES]match.Match

      for &line in lines {
        //matcher := match.matcher_init(line, regex)
        //fmt.println(matcher.captures_length)

        for something in match.gmatch(&line, regex, &matches) {
            fmt.println("check this out: ", something)
        }
      }

      return 0
    }
#

so every digit on the line will get printed here
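
Building on that loop, keeping the first and last match of a line (what the C++ snippet does with firstM and lastM) could look roughly like this. It is a sketch under assumptions: first_and_last and the sample input are hypothetical, and it assumes gmatch behaves as in the code above, expects zeroed captures on its first call, and may advance the string it is pointed at.

```odin
package main

import "core:fmt"
import "core:text/match"

// first_and_last returns the first and last substring of line matched by
// pattern; ok is false when nothing matched. (Hypothetical helper.)
first_and_last :: proc(line: string, pattern: string) -> (first, last: string, ok: bool) {
  // Fresh, zeroed captures for every line (assumption: gmatch expects
  // this on its first call).
  captures: [match.MAX_CAPTURES]match.Match

  // gmatch takes a pointer to the string, so hand it a local copy of the
  // string header (assumption: it may advance it while iterating).
  haystack := line

  for m in match.gmatch(&haystack, pattern, &captures) {
    if !ok {
      first = m
      ok = true
    }
    last = m
  }
  return
}

main :: proc() {
  // "%d" is the digit class used in the code above.
  first, last, ok := first_and_last("a1b2c3", "%d")
  if ok {
    fmt.println("first:", first, "last:", last)
  } else {
    fmt.println("no matches")
  }
}
```

The ok result doubles as the "were there any matches at all" check asked about earlier.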

cedar creek
#

Interesting, that's pretty cool, alright I think I get it now xDD

#

really, thank you very much sir, I couldn't make sense of it from the docs or even the code

marsh pivot
#

No problem! I have been meaning to add some docs to it but haven't gotten around to it yet

#

It is very unclear so it is not your fault at all

cedar creek
#

Well it seems to have been added recently so that's completely fair, if I was a more competent programmer I'd love to help with documentation and development but I'm not there yet 🥲

marsh pivot
#

Is 2 years recent :p?

cedar creek
#

Oh... well that depends xDDD

#

Time is relative