#Week 36 — What is a regex and how can it be used in Java?

glossy tuskBOT
#
Question of the Week #36

What is a regex and how can it be used in Java?

mortal thistle
#

accepted answer by saturn5vfive. (1111137361359818803):

#

A regex, also known as a regular expression, is a string of text, also called a pattern, that is used to match parts of a larger piece of text easily and efficiently.

A regex uses special control characters to define rules that describe a specific, yet potentially unlimited, set of strings that the pattern can match.

In Java, a regex can be created using a Pattern object from the java.util.regex package as follows:

Pattern re = Pattern.compile("import .*;$");

The pattern above would probably match typical Java import statements written on a single line.

We can then check whether a given string matches the pattern using code similar to:

re.matcher(INPUT).matches()

This method returns a boolean that states whether the entire INPUT matches the regex pattern defined in "re".
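
Putting the two snippets above together, a minimal runnable sketch might look like the following. The class name `ImportCheck` and the helper `isImport` are made up for illustration, and the trailing `$` from the answer is left out here because matches() already requires the whole input to match.

```java
import java.util.regex.Pattern;

class ImportCheck {
    // Compile once and reuse; compiled Pattern objects are immutable.
    private static final Pattern IMPORT = Pattern.compile("import .*;");

    // Hypothetical helper: true if the whole line looks like an import statement.
    static boolean isImport(String line) {
        // matches() only succeeds if the entire input fits the pattern.
        return IMPORT.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(isImport("import java.util.List;"));   // true
        System.out.println(isImport("int x = 5;"));               // false
        System.out.println(isImport("  import java.util.List;")); // false: leading spaces are not part of the pattern
    }
}
```

The pattern is deliberately loose: it accepts anything between `import ` and the final `;`, so it is a rough filter rather than a real parser for import statements.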

#

best answer by 0x150 (1067125865785344031):

#

RegEx ("Regular Expressions") is a small sub-language available inside most programming languages, allowing the programmer to easily parse a text format and extract information from it. It can be used to (partially) validate and parse email addresses, for example.
A regex in general looks something like this: `i am (\d{1,2}) years old\.`. This regex will match the string "i am [any 1 or 2 digit number] years old.", and extract the age into its own group.
In Java, the regex and age extraction would look like this:
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Backslashes must be doubled inside a Java string literal.
Pattern regex = Pattern.compile("i am (\\d{1,2}) years old\\.", Pattern.CASE_INSENSITIVE);
String input = "I am 19 years old.";
Matcher matcher = regex.matcher(input);
if (!matcher.find()) System.out.println("Invalid text format");
else System.out.printf("You are %s years old", matcher.group(1) /* group 0 is always the entire matched text, custom groups start at 1 */);
```

#

The Matcher seen here is the class responsible for matching a Pattern (a compiled regex) against a given String. Here it is used to find() a match and then to extract the first custom group from that match. matches() can also be used to check whether the regex matches the entire string, but if you just want to use regex to extract information from a string, find() would probably work better for that. You can also call find() multiple times to search the entire string for multiple occurrences of the pattern, like so:
```java
Pattern regex = Pattern.compile("i am (\\d{1,2}) years old\\.", Pattern.CASE_INSENSITIVE);
String input = "I am 19 years old. I am 50 years old.i am 12 years old. jfskdfjsdkljfls I am 99 years old.";
Matcher matcher = regex.matcher(input);
int count = 0;
// Each call to find() continues searching after the previous match.
while (matcher.find()) {
    System.out.printf("Person %d is %s years old.%n", ++count, matcher.group(1));
}
```

That code will print this:

Person 1 is 19 years old.
Person 2 is 50 years old.
Person 3 is 12 years old.
Person 4 is 99 years old.

You may have noticed that the string can be slightly malformed: find() simply skips over any text that does not match and moves on to the next match, if there is one. This is especially useful if you want to use regex to find a certain pattern in a string, like in this example. Knowing this, you could use regex to extract the href attribute from an `<a>` HTML tag: `<a.*?href *= *"(.*?)".*?>`
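
As a rough sketch of how that href regex could be used from Java, the following collects the href value of every match. The class name `HrefExtractor`, the helper `extractHrefs`, and the sample HTML are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HrefExtractor {
    // The regex from the answer, with the double quotes escaped for a Java string literal.
    private static final Pattern LINK = Pattern.compile("<a.*?href *= *\"(.*?)\".*?>");

    // Hypothetical helper: returns the href value of every <a ...> tag found in the input.
    static List<String> extractHrefs(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher matcher = LINK.matcher(html);
        while (matcher.find()) {
            hrefs.add(matcher.group(1)); // group 1 is the text between the quotes
        }
        return hrefs;
    }

    public static void main(String[] args) {
        String html = "<p>See <a href=\"https://example.com\">this</a> and "
                + "<a class=\"x\" href = \"/docs\">that</a>.</p>";
        System.out.println(extractHrefs(html)); // [https://example.com, /docs]
    }
}
```

This works for simple, single-line tags like the ones above. Since `.` does not match line breaks by default and `<a` also matches the start of tags such as `<abbr>`, a real HTML parser is usually the safer choice for arbitrary pages.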
#

accepted answer by theoneandonlylark (139385988672782336) :

#

Regex means regular expression, and it can be used in Java to search or match strings.

#

best answer by dan1st (358291050957111296):

#

Regexes (short for regular expressions) are a powerful way to find and match patterns in strings.
One can specify what a pattern can look like and then search for the pattern in a specific string or match the whole string against the pattern.
For example, the regex [A-Za-z]+ [A-Za-z]+ matches every string that consists of one or more letters (A to Z, lowercase or uppercase), followed by a single space and another sequence of one or more letters.
Java allows matching a String against a regex using the String#matches method:

System.out.println("Hello World".matches("[A-Za-z]+ [A-Za-z]+"));//true
System.out.println("Hi there".matches("[A-Za-z]+ [A-Za-z]+"));//true
System.out.println("Hello_World".matches("[A-Za-z]+ [A-Za-z]+"));//false
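
Note that String#matches only returns true when the entire string fits the pattern. To check whether the pattern occurs anywhere inside a string, Matcher#find can be used instead. A small sketch of the difference, where the class and method names are made up for illustration:

```java
import java.util.regex.Pattern;

class MatchVsFind {
    private static final Pattern TWO_WORDS = Pattern.compile("[A-Za-z]+ [A-Za-z]+");

    // True only if the whole input consists of two words separated by one space.
    static boolean isExactlyTwoWords(String input) {
        return TWO_WORDS.matcher(input).matches();
    }

    // True if two space-separated words occur anywhere in the input.
    static boolean containsTwoWords(String input) {
        return TWO_WORDS.matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "Hello World!";
        System.out.println(isExactlyTwoWords(input)); // false: the '!' is not covered by the pattern
        System.out.println(containsTwoWords(input));  // true: "Hello World" is found inside the input
    }
}
```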
#

Java also allows finding patterns inside Strings with a regex.
For example, it would be possible to use the regex <@[0-9]+> for detecting Discord (user) mentions as they start with <@ followed by the user ID and >.
It is then possible to extract the user ID by introducing a group for it, which is done by surrounding the part matching the user ID with parentheses: <@([0-9]+)>

Pattern pattern = Pattern.compile("<@([0-9]+)>");//the pattern to match

String text = "Hi, I am <@358291050957111296> and <@743072402702860358> is a bot here";//text containing two Discord user mentions

Matcher matcher = pattern.matcher(text);//create a Matcher for searching through the String
while(matcher.find()){//as long as there are matches left
  String mention = matcher.group();//get the whole matched text
  String userId = matcher.group(1);//get the group specified within () in the regex
  System.out.println("found '"+mention+"' mentioning user with ID "+userId);
}

This code yields the following output:

found '<@358291050957111296>' mentioning user with ID 358291050957111296
found '<@743072402702860358>' mentioning user with ID 743072402702860358
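
For reference, here is a self-contained variant of the snippet above that collects the IDs into a list instead of printing them. The class name `MentionExtractor`, the method `extractUserIds`, and the short IDs in `main` are placeholders, not taken from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MentionExtractor {
    // Same pattern as above: "<@", one or more digits captured as group 1, then ">".
    private static final Pattern MENTION = Pattern.compile("<@([0-9]+)>");

    // Hypothetical helper: returns the user ID of every mention found in the text.
    static List<String> extractUserIds(String text) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = MENTION.matcher(text);
        while (matcher.find()) {
            ids.add(matcher.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        // "@333" lacks the angle brackets and "<@abc>" has no digits, so both are skipped.
        System.out.println(extractUserIds("Hi <@111> and <@222>, but not @333 or <@abc>")); // [111, 222]
    }
}
```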