#Custom Syntax parsing


outer valley
#

Hello there..
Any idea how to do this? I'm trying to write my own parser for non-quoted expressions in the @{} JSON format.
Regex: (\w+):\s*([^ ,]+)
The problem is any function that contains a `,`: the Matcher splits the function call at that comma too. Any idea for a regex that can handle it?
Input:
{A: false, int: {_i}, json: {_json}, loc: location(10, 20, 30), mame: l(1,2)}

        public void putAll(final String jsonString) {
            int length = jsonString.length();
            if (length > 1) {
                String input = "{A: false, int: {_i}, json: {_json}, loc: location(10, 20, 30), mame: l(1,2)}";
                Map<String, String> map = new HashMap<>();
                Pattern p = Pattern.compile("(\\w+):\\s*([^,]+)");
                Matcher m = p.matcher(input);
                while (m.find()) {
                    map.put(m.group(1), m.group(2));
                }
                System.out.println(map);
            }
        }

Output: {A=false, loc=location(10, json={_json}, int={_i}, mame=l(1}


thorny badge
#

send the string that u would like to be accepted @outer valley

#

and also send ur pattern

outer valley
#

{C: json from text: "{'A': false}", A: false, another: "AAAA AAAA AAA", int: {_i}, json: {_json}, loc: location(10, 20, 30), mame: l(1,2)}
The value can be anything; I'm making a parser for any expression.
So it can also be an expression, e.g.

json from text "{A: false}"
size of {elements::*}
etc.
package cz.coffee.skjson.parser;

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public abstract class StringJsonParser {

    static final String SPECIAL_REPLACER = "◙";

    public static final String SKEX_INPUT = "{C: json from text: \"{'A': false}\", A: false, another: \"AAAA AAAA AAA\", int: {_i}, json: {_json}, loc: location(10, 20, 30), mame: l(1,2)}";

    public static class JsonExpressionMap {
        public void putAll(final String jsonString) {
            Map<String, String> map = parseInput(jsonString);
            for (Map.Entry<String, String> entry : map.entrySet()) {
                System.out.println(entry.getKey() + " : " + entry.getValue());
            }
        }
        Map<String, String> parseInput(String input) {
            input = input.replaceAll("\\s(?=([^\"]*\"[^\"]*\")*[^\"]*$)", "");
            System.out.println(input);
            Map<String, String> map = new HashMap<>();
            input = input.replaceAll("(?<=\\w)\\s*:\\s*(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", SPECIAL_REPLACER);
            System.out.println(input);

            Pattern p = Pattern.compile("(\\w+)◙\\s*([\\w{}(),\"\\s]+)([,}])");
            Matcher m = p.matcher(input);

            while(m.find()) {
                map.put(m.group(1).trim(), m.group(2).trim());
            }

            return map;
        }
    }

    public static final class TEST {
        public static void main(String[] args) {
            JsonExpressionMap map = new JsonExpressionMap();
            System.out.println(SKEX_INPUT);
            map.putAll(SKEX_INPUT);
        }
    }
}
thorny badge
#

can u send

#

the pattern ur using here

outer valley
#

I'm already using three regexes, as you can see in the Java snippet

#

(\w+):\s*([\w{}()',\"\s]+)([,}])
otherwise the same one with the special character (◙) instead of :

thorny badge
#

((\W+): ([\w{}()_]))

#

now send

#

whats below

#

send this

thorny badge
#

dude

#

can u please

#

for the love of god

#

write this down and send it

outer valley
#

I did everything you asked. What do you mean by "down"? The lower part is just for the test string

thorny badge
#

XD

#

Dude

thorny badge
#

here right

#

type down here what is in the screenshot

gusty pelican
#

@outer valley i didn't quite understand your question. What syntax are you trying to parse, regular json ?

gusty pelican
#

so it's your own syntax. There is no documentation for the syntax you are using ?

outer valley
#

At the moment, no.

#

I dunno if you know Skript Java Library.

#

the expressions came from it

outer valley
#

These expressions, and {_a} for example, are a sort of variable in the language. Expressions are similar, but instead of storing a value they are executed in the background.
But that doesn't matter; in the end I only need the key:value pairs from the input string.

gusty pelican
#

and key-value pairs are always separated by a comma, and the key is separated from the value by a colon, right?

thorny badge
#

send me yours

#

the one you typed there

#

not mine

#

mine was an attempt to copy yours

gusty pelican
#

do you want to match the entire key-value pair in one group or in multiple groups ?

gusty pelican
#

that's not an answer to the question

outer valley
#

oh sorry, i wanna use two groups, group 1 = key, group 2 = whole value

outer valley
#

so many steps but that kinda works

package cz.coffee.skjson.parser;

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public abstract class StringJsonParser {

    static final String SPECIAL_REPLACER = "◙";

    public static final String SKEX_INPUT = "{C: json from text \"{'A': false}\", A: false, another: \"AAAA AAAA AAA\", int: {_i}, json: {_json}, loc: location(10, 20, 30), mame: l(1,2)}";

    public static class JsonExpressionMap {
        public void putAll(final String jsonString) {
            Map<String, String> map = parseInput(jsonString);
            for (Map.Entry<String, String> entry : map.entrySet()) {
                System.out.println(entry.getKey() + " : " + entry.getValue());
            }
        }
        Map<String, String> parseInput(String input) {
            //input = input.replaceAll("\\s(?=([^\"]*\"[^\"]*\")*[^\"]*$)", "");
            input = replaceSpacesInBraces(input);
            System.out.println(input);
            Map<String, String> map = new HashMap<>();
            input = input.replaceAll("(?<=\\w)\\s*:\\s*(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", SPECIAL_REPLACER);
            System.out.println(input);

            Pattern p = Pattern.compile("(\\w+)◙\\s*([\\w{}()':,.\"\\s]+)([,}])");
            Matcher m = p.matcher(input);

            System.out.println("\n");
            while(m.find()) {
                map.put(m.group(1).trim(), m.group(2).trim());
            }

            return map;
        }

        String replaceSpacesInBraces(String input) {
            String regex = "\\(([^)]*)\\)";
            Pattern pattern = Pattern.compile(regex);
            Matcher matcher = pattern.matcher(input);
            StringBuilder result = new StringBuilder();
            while (matcher.find()) {
                String match = matcher.group();
                String replacement = match.replaceAll("\\s", "");
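                // quoteReplacement keeps any '$' or '\' in the match from being read as a group reference
                matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));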
            }
            matcher.appendTail(result);
            return result.toString();
        }
    }

    public static final class TEST {
        public static void main(String[] args) {
            JsonExpressionMap map = new JsonExpressionMap();
            System.out.println(SKEX_INPUT);
            map.putAll(SKEX_INPUT);
        }
    }
}
thorny badge
#

@outer valley

outer valley
#

Let me try. Did you also try the test string without spaces?

thorny badge
#

without spaces

outer valley
#

yes remove the last comma

#

on the end

#

T: json from test "{'A': 'C'}",A: false, int: {_i}, json: {_json}, loc: location(10, 20, 30),mame:l(1,2),
expression will break the syntax

#

but i guess i can use SPECIAL MARK instead of :

#

before parsing

thorny badge
#

u can add the last comma in java

#

return string.endsWith(",") ? string : string + ",";

outer valley
#

but the last comma will throw an exception

#

I'm using a third-party library called SkriptLang

thorny badge
#

add it in the end then remove it after it passes the matches test

outer valley
#

You are kinda right

#

did you also try the expressions?

thorny badge
#

no you try them

outer valley
#

can i replace : with anything else? for e.g. ◙

thorny badge
#

try it

outer valley
#

Thank ya @thorny badge can you kinda explain what that does? And how that works

outer valley
# thorny badge try it

Found an issue: when things are quoted, it will not match: { "C": "json from text \"{'A': false}\"", "A": false, "another": "AAAA AAAA AAA", "int": "{_i}", "json": "{_json}", "loc": "location(10, 20, 30)", "mame": "l(1,2)"}

thorny badge
#

why would things be quoted

outer valley
#

("[^"]+"|\w+):(?: ?)+([\w{}():',\"\s]+)([,}])(?: ?)+ Should it be this?

west ice
#

Sounds like a situation that regexes are too limited to handle.
Generally speaking, actions called "parsing" are normally too subtle for regexes.

west ice
#

Programming

outer valley
#

@thorny badge

#

when I use an array as a value, it breaks

#

and what could I use for parsing? @west ice

thorny badge
#

[ and ] are special characters

#

you need to use \ before them

#

to use them
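To illustrate the escaping advice above: in a Java string literal the backslash itself has to be doubled, so a literal `[` is written `\\[`. The sketch below is only an illustration; the class name, method name and sample input are made up for this example and are not part of the poster's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BracketEscape {
    // Literal '[' , then anything except brackets, then literal ']'.
    // Inside the character class the brackets are escaped as well.
    static final Pattern ARRAY = Pattern.compile("\\[([^\\[\\]]*)\\]");

    /** Returns the text between the first pair of square brackets, or null if there is none. */
    static String firstArrayBody(String input) {
        Matcher m = ARRAY.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstArrayBody("products: [1, 2, 3]"));
    }
}
```

Note that this pattern still cannot handle arrays nested inside arrays, which is the limitation discussed further down in the thread.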

outer valley
#

okey

west ice
#

In your case, I'd say it feels like it's beyond what regexes can do but below the need for javacc

outer valley
#

so what should I use?

outer valley
#

I'm asking what I should use for that

west ice
#

String.length()
String.charAt()

outer valley
#

i did something like that

west ice
#

If you're getting your thing to work, then great (beside how horrible these regexes are, that's your problem not mine)
If you're not, then I have trouble figuring out what you want, but I suspect you want something that can't be done with regexes.
That is all.

outer valley
#

Okay, but how am I supposed to do that without regex? by traversing a char array?

west ice
#

Yes. Obviously.
Or javacc if your situation is really crafty

outer valley
#

what do you mean by javacc?

outer valley
#

if you mean java compiler i guess that is overkill for that

west ice
#

No, it's a library to write compilers in Java. It's meant to mean java compiler compiler, but you can make a compiler of anything (such as JSON expressions wink wink)

outer valley
#

i tried that

options {
  IGNORE_CASE = true; // Ignore letter case
}

PARSER_BEGIN(MyParser)
import java.util.*;

public class MyParser {
  public static void main(String[] args) throws ParseException {
    MyParser parser = new MyParser(System.in);
    Map<String, Object> result = parser.parse();
    System.out.println(result);
  }
}
PARSER_END(MyParser)

TOKEN : {
  <IDENTIFIER: (["a"-"z","A"-"Z"])+>
}

TOKEN : {
  <INTEGER: (["0"-"9"])+>
}

TOKEN : {
  <LEFT_BRACE: "{">
}

TOKEN : {
  <RIGHT_BRACE: "}">
}

TOKEN : {
  <COLON: "⁞">
}

void parse() :
{
  Map<String, Object> result = new HashMap<>();
}
{
  <LEFT_BRACE>
  KeyValue(result)
  <RIGHT_BRACE>
  {
    return result;
  }
}

void KeyValue(Map<String, Object> map) :
{
  Token key;
  Object value;
}
{
  key = <IDENTIFIER>
  <COLON>
  value = Value()
  {
    map.put(key.image, value);
  }
}

Object Value() :
{
  Token t;
}
{
  t = <INTEGER>
  {
    return t.image; // The value is an integer
  }
  | t = <IDENTIFIER>
  {
    return t.image; // The value is a string
  }
  | <LEFT_BRACE>
  KeyValueList()
  <RIGHT_BRACE>
  {
    return $KeyValueList.value; // The value is a map
  }
}

Map<String, Object> KeyValueList() :
{
  Map<String, Object> map = new HashMap<>();
  String key;
  Object value;
}
{
  (key = <IDENTIFIER>
  <COLON>
  value = Value()
  {
    map.put(key, value);
  })+
  {
    return map;
  }
}

void ErrorMessage() : {}
{
  <DEFAULT> {throw new ParseException("Syntax error: " + token.image);}
}

and then I ran the javacc command, but that doesn't work

#

if you mean that

west ice
#

I have no intention to try and check whether you used it correctly, but I have the impression you're trying to parse something that requires a stack or counter, and regexes can't do that. Between regexes and javacc, there is nothing but plain old coding.
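The "stack or counter" mentioned above can be as small as one integer. Below is a minimal sketch, assuming balanced brackets, double-quoted strings with backslash escapes, and keys that contain no colon. `TopLevelSplitter`, `split` and `parse` are hypothetical names, not part of Skript or SkJson. It splits only at commas that sit outside quotes and at nesting depth zero, then cuts each pair at its first colon.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TopLevelSplitter {
    /** Splits text at every sep that is outside double quotes and outside (), {} and []. */
    static List<String> split(String text, char sep) {
        List<String> parts = new ArrayList<>();
        int depth = 0;             // how many brackets are currently open
        boolean inQuotes = false;
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inQuotes) {
                if (c == '\\') i++;                  // skip the escaped character
                else if (c == '"') inQuotes = false;
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == '(' || c == '{' || c == '[') {
                depth++;
            } else if (c == ')' || c == '}' || c == ']') {
                depth--;
            } else if (c == sep && depth == 0) {
                parts.add(text.substring(start, i).trim());
                start = i + 1;
            }
        }
        String last = text.substring(start).trim();
        if (!last.isEmpty()) parts.add(last);
        return parts;
    }

    /** Turns "{key: value, ...}" into key -> raw value text (group 1 and group 2 in regex terms). */
    static Map<String, String> parse(String input) {
        String body = input.trim();
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        Map<String, String> map = new LinkedHashMap<>();
        for (String pair : split(body, ',')) {
            int colon = pair.indexOf(':');
            if (colon < 0) continue;                 // not a key: value pair, ignore it
            map.put(pair.substring(0, colon).trim(), pair.substring(colon + 1).trim());
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(parse(
            "{A: false, int: {_i}, json: {_json}, loc: location(10, 20, 30), mame: l(1,2)}"));
    }
}
```

With the input from the first message, `loc` keeps its whole `location(10, 20, 30)` value because the commas inside the parentheses are seen at depth 1, not 0.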

outer valley
#

so using String.length(), and charAt() and input.toCharArray();

west ice
#

You can take the char array directly if you prefer. I never get why nonbeginners mention that

outer valley
#

what do you mean, nonbeginners? and what they're referring to

west ice
#

I mean aside when using a thirdparty that requires it, there is no benefit to converting a String into a char array, as both are nothing but a sequence of characters
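As a small illustration of that point, the two loops below visit exactly the same characters; the only difference is that `toCharArray()` first makes a copy of the String's contents. The class and method names are made up for this example.

```java
final class CharAccess {
    /** Counts commas by indexing the String directly. */
    static int countWithCharAt(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ',') n++;
        }
        return n;
    }

    /** Counts commas after copying the String into a char array. */
    static int countWithArray(String s) {
        int n = 0;
        for (char c : s.toCharArray()) {
            if (c == ',') n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String input = "loc: location(10, 20, 30), mame: l(1,2)";
        System.out.println(countWithCharAt(input) == countWithArray(input));
    }
}
```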

outer valley
#

So I shouldn't use char but letters?

west ice
#

I have the impression you're not even reading. I said both Strings and char arrays are the same. How would that involve a difference between char and letter?

outer valley
#

I'm an idiot. I translated it wrong. Okay, so you think I should go through the whole string and define the conditions that control the patterns?

west ice
#

I suppose "define the conditions that control the patterns" may be a way to describe writing a parser algorithm, yes

outer valley
#

can you just give me an idea of what that might look like because I've never worked with parsing.

west ice
#

Well, that's the point. Learn. Start with something easy

#

It is way too big to just throw in a quick & easy example

outer valley
#

something like that?

package cz.coffee.skjson.parser;

import java.util.*;
import java.util.regex.*;

public class Main {
    public static void main(String[] args) {
        String input = "{userId⁞ {i}, any⁞ (10+10), products⁞ [{id⁞ {i}, quantity⁞ 1}, {id⁞ {b}, quantity⁞ 2}, {id⁞ 55, quantity⁞ 3}]}";
        Map<String, Object> parsedData = parseInput(input);
        System.out.println(parsedData);
    }

    public static Map<String, Object> parseInput(String input) {
        Map<String, Object> parsedData = new HashMap<>();
        Deque<Map<String, Object>> stack = new ArrayDeque<>();
        stack.push(parsedData);

        StringBuilder keyBuilder = new StringBuilder();
        StringBuilder valueBuilder = new StringBuilder();
        boolean readingKey = true;
        boolean readingValue = false;

        for (char c : input.toCharArray()) {
            if (c == '⁞') {
                readingKey = false;
                readingValue = true;
            } else if (c == ',') {
                readingKey = true;
                readingValue = false;
            } else if (c == '{') {
                Map<String, Object> newMap = new HashMap<>();
                assert stack.peek() != null;
                stack.peek().put(keyBuilder.toString(), newMap);
                // push/pop work on the head of the deque, which is what peek() reads;
                // add() and removeLast() work on the tail and left peek() stuck on the root map
                stack.push(newMap);
                keyBuilder.setLength(0);
            } else if (c == '}') {
                stack.pop();
            } else {
                if (readingKey) {
                    keyBuilder.append(c);
                } else {
                    valueBuilder.append(c);
                }
            }

            if (!valueBuilder.isEmpty() && !readingValue) {
                String key = keyBuilder.toString();
                String value = valueBuilder.toString().trim();

                assert stack.peek() != null;
                System.out.println(value);
                if (value.matches("\\{(.+)}")) {
                    stack.peek().put(key, 123); // Replace 123 with the appropriate integer value
                } else if (value.matches("\\d+")) {
                    stack.peek().put(key, Integer.parseInt(value));
                } else {
                    stack.peek().put(key, value);
                }

                keyBuilder.setLength(0);
                valueBuilder.setLength(0);
            }
        }

        return parsedData;
    }
}
west ice
#

That looks like it at least. There are many ways to go about it, notably with a mutable state object
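One possible reading of "a mutable state object" is a small parser class whose only state is the current read position, with one method per construct. The sketch below is an assumption about what that could look like, not the approach anyone in the thread actually wrote. `CursorParser` and its methods are hypothetical names. It uses `:` as the separator instead of `⁞`, keeps every leaf value as raw text, and uses one small bounded regex only to tell a nested object like `{id: 1}` from a variable like `{_i}`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class CursorParser {
    // "{word:" with a single colon starts a nested object; "{_i}" or "{list::*}" do not.
    private static final Pattern OBJECT_START = Pattern.compile("\\{\\s*\\w+\\s*:(?!:)");
    private final String src;
    private int pos = 0; // mutable state: index of the next unread character

    private CursorParser(String src) { this.src = src; }

    /** Parses "{key: value, ...}" into nested maps, lists and raw value strings. */
    static Map<String, Object> parse(String input) {
        return new CursorParser(input.trim()).object();
    }

    private Map<String, Object> object() {
        Map<String, Object> map = new LinkedHashMap<>();
        expect('{');
        skipSpaces();
        while (peek() != '}') {
            String key = readUntil(':').trim();
            expect(':');
            map.put(key, value());
            endOfItem('}');
        }
        expect('}');
        return map;
    }

    private List<Object> array() {
        List<Object> list = new ArrayList<>();
        expect('[');
        skipSpaces();
        while (peek() != ']') {
            list.add(value());
            endOfItem(']');
        }
        expect(']');
        return list;
    }

    /** After an item: consume a separating comma, or require the closing character. */
    private void endOfItem(char close) {
        skipSpaces();
        if (peek() == ',') { pos++; skipSpaces(); }
        else if (peek() != close) throw error("',' or '" + close + "'");
    }

    private Object value() {
        skipSpaces();
        if (peek() == '[') return array();
        boolean nested = OBJECT_START.matcher(src).region(pos, src.length()).lookingAt();
        return nested ? object() : raw();
    }

    /** Reads raw text up to the next ',' or closing bracket that is neither nested nor quoted. */
    private String raw() {
        int start = pos, depth = 0;
        boolean inQuotes = false;
        while (pos < src.length()) {
            char c = src.charAt(pos);
            if (inQuotes) {
                if (c == '\\' && pos + 1 < src.length()) pos++; // skip the escaped character
                else if (c == '"') inQuotes = false;
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == '(' || c == '{' || c == '[') {
                depth++;
            } else if (c == ')' || c == '}' || c == ']') {
                if (depth == 0) break; // belongs to the enclosing object or array
                depth--;
            } else if (c == ',' && depth == 0) {
                break;
            }
            pos++;
        }
        return src.substring(start, pos).trim();
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }
    private void skipSpaces() { while (Character.isWhitespace(peek())) pos++; }
    private void expect(char c) { if (peek() != c) throw error("'" + c + "'"); pos++; }
    private String readUntil(char stop) {
        int start = pos;
        while (pos < src.length() && src.charAt(pos) != stop) pos++;
        return src.substring(start, pos);
    }
    private IllegalArgumentException error(String expected) {
        return new IllegalArgumentException("expected " + expected + " at index " + pos);
    }
}
```

Calling `CursorParser.parse("{userId: {i}, any: (10+10), products: [{id: {i}, quantity: 1}]}")` would give a map whose `products` entry is a list of maps, while `{i}` and `(10+10)` stay as raw strings for a later evaluation step.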

outer valley
#

Could you then give the most basic and primitive example

#

Or some link where I could read something about the flow at least.

west ice
#

no, there is no such thing. And there would still be many ways to go about it and I don't want to impose my choice

#

If I knew of resources I would give one to you, but I learnt various ways myself, and the one resource I have is a textbook

outer valley
#

I'm not interested in imposing. I'm interested in what you would use.

#

Ahh, I don't have that one :/

west ice
#

yeah right. Anyway there is no such thing as most basic so I'd need to check your example and devote a couple hours to make a simple example. I might do that at some time but I just can't volunteer that much time unprepared

outer valley
#

I totally get it, can you at least give me the name of the textbook and I'll see if I can get it on the web

west ice
#

It's written in French and not available online, so even if I could find back its name it wouldn't be much help

outer valley
#

I see

wanton wadiBOT
#

💤 Post marked as dormant

This post has been inactive for over 300 minutes, thus, it has been archived.
If your question was not answered yet, feel free to re-open this post or create a new one.
In case your post is not getting any attention, you can try to use /help ping.
Warning: abusing this will result in moderative actions taken against you.

west ice
#

I suspect there may be ways to make things clearer and simpler. But that looks like a parsing algorithm, yes

outer valley
#

@gusty pelican How's that goin ?

outer valley
#

@gusty pelican hello there

gusty pelican
#

I sent you a dm

gusty pelican
#

I can't read that

outer valley
#

Accept fr