#I need help with my pattern matching and regex assignments

wise ore
#

what kind of link ?

#

and you are trying to extract all URLs from the given html, correct ?

#

what is currently not working as expected ?

#

I mean, this regex is kind of stupid, but maybe it works?
http[s]?:\/\/[.A-Za-z\/]+
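
For reference, a minimal sketch of how the pattern above could be applied in Java. The class name `UrlExtractor`, the method `extract`, and the sample HTML are assumptions made for illustration, not part of the assignment. The forward slashes are left unescaped here because `/` has no special meaning in `java.util.regex`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlExtractor {
    // Same pattern as in the message above, written without escaping '/'.
    static final Pattern URL = Pattern.compile("http[s]?://[.A-Za-z/]+");

    // Collects every substring of the input that the pattern matches.
    static List<String> extract(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(html);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not the HTML from the assignment.
        String html = "<a href=\"https://example.com/docs/index.html\">docs</a>";
        System.out.println(extract(html)); // prints [https://example.com/docs/index.html]
    }
}
```

Note that the character class contains no digits, so a URL such as `http://example.com/page2` would be cut off before the `2`.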

#

and you are sure that urls are only present in quotes ?

#

can you share the html

#

so it will always end in html ?

#

I came up with this:
http[s]?:\/\/[.A-Za-z\/?=#-]+|[A-Za-z]+\.html

#

Did you see that I edited my message ?

#

So you need all the http:// links, but not the bare file names like index.html and so on.

#

then you can just remove the OR part

#

It should still be covered by this pattern:
http[s]?:\/\/[.A-Za-z\/?=#-]+

#

and something.txt would be valid too.

#

wait but I thought you just said that you don't want to match those

#

so "index.html" should be valid too ?

#

oh ok.

#

In this case you are correct, and we do need the OR.
http[s]?:\/\/[.A-Za-z\/?=-]+|[A-Za-z]+\.html

#

This just removes the # from the character class, so the match stops before any #fragment.
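
A rough Java sketch of the two-branch pattern, to show where a match stops once `#` is no longer in the character class. The class name `LinkPattern`, the method `findAll`, and the sample HTML are made up for illustration. The second branch is written here as `[A-Za-z]+\.html`: a range like `A-z` would also admit `[`, `\`, `]`, `^`, `_` and the backtick, and an unescaped `.` matches any character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkPattern {
    // Branch 1: absolute http/https URLs, with '#' not allowed.
    // Branch 2: bare file names ending in ".html".
    static final Pattern LINK =
            Pattern.compile("http[s]?://[.A-Za-z/?=-]+|[A-Za-z]+\\.html");

    // Returns all matches in the order they appear in the input.
    static List<String> findAll(String html) {
        List<String> out = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical input: one absolute URL with a fragment, one bare file name.
        String html = "<a href=\"https://example.com/a?b=c#top\">x</a> "
                    + "<a href=\"index.html\">y</a>";
        System.out.println(findAll(html)); // prints [https://example.com/a?b=c, index.html]
    }
}
```

The URL is matched only up to the `#`, and `index.html` is picked up by the second branch.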

#

you forgot the escape characters.

#

the // isn't matched

#

they do...

#

It might work fine on the website, but it definitely won't work in Java.
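
One likely reason, stated here as an assumption because the thread does not spell it out: a pattern copied from regex101 contains single backslashes, and inside a Java string literal a sequence like `"\/"` or `"\."` is an illegal escape and does not compile. Each regex backslash has to be doubled, or dropped where it is not needed. The sketch below uses a hypothetical class `EscapeDemo` to compare the variants.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Every regex backslash is doubled so the Java compiler accepts the literal.
    static final Pattern ESCAPED = Pattern.compile(
            "http[s]?:\\/\\/[.A-Za-z\\/?=-]+|[A-Za-z]+\\.html");

    // '/' is not special in java.util.regex, so those backslashes can be dropped.
    static final Pattern PLAIN = Pattern.compile(
            "http[s]?://[.A-Za-z/?=-]+|[A-Za-z]+\\.html");

    // With an unescaped dot, any single character is accepted before "html".
    static final Pattern UNESCAPED_DOT = Pattern.compile("[A-Za-z]+.html");

    public static void main(String[] args) {
        String url = "https://example.com/a?b=c"; // hypothetical test input
        System.out.println(ESCAPED.matcher(url).matches());                // true
        System.out.println(PLAIN.matcher(url).matches());                  // true
        System.out.println(ESCAPED.matcher("indexXhtml").matches());       // false
        System.out.println(UNESCAPED_DOT.matcher("indexXhtml").matches()); // true
    }
}
```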

#

could you send me the exact input you currently have in regex101 ?

#

I meant the text you are testing the regex with, as a string rather than a screenshot, so I can copy it and try it with my current pattern.

#

sure but I just need your testcase

#

oh alright in that case

#

please dm it to me
