#User text validator

crude aspen
broken dagger
#

What kind of text are you filtering?

crude aspen
hollow anchor
#

Hmm, I'm not sure if OpenAI services are the best thing you could use for something like this

#

there are dedicated language-processing libraries that let you do this much more easily than with an AI model (and for free)

#
>>> from textblob import TextBlob
>>> txt = """Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages."""
>>> blob = TextBlob(txt)
>>> print(blob.noun_phrases)
#

would return

[u'natural language processing', 'nlp', u'computer science', u'artificial intelligence', u'computational linguistics']

#

Would this be along the lines of what you want?

#

and then you can use GPT3 afterwards to filter for just accusative nouns maybe

crude aspen
# hollow anchor Would this be along the lines of what you want?

These systems don't work well for Polish, especially when the user doesn't use diacritics or writes in slang. I'm pretty sure the texts used to train these models come from books, not directly from the internet.
Is it really impossible to filter this kind of text in an "allowlist" manner? It sounds like a doable task.

hollow anchor
#

Oh I see, you're working with Polish

#

No, it's definitely not impossible

#

it will just take some careful prompt engineering

#

To make it reliable and fault tolerant

crude aspen
#

So how can I do it? In the OpenAI docs I found only classification with declared types, without any sort of "any" type. The task should be simple: I give OpenAI hundreds of texts that contain accusative nouns, and later, if there is no such noun in the requested text, it answers "false", otherwise "true".

hollow anchor
#

The only real way to do this is with prompt engineering: you'll have to write a prompt that instructs GPT on how to "think" about the sentences it is given

#

You'll have to fit the sentences into the 4096-token context limit too
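
Roughly what I mean, as a minimal sketch rather than a tested solution. It builds a few-shot prompt from labelled examples, stops adding examples once a rough token budget is used up, and parses a "true"/"false" reply. The function names, the prompt wording, the sample sentences and the 3-characters-per-token guess are all my own assumptions, and the actual call to the model is left out on purpose. For real use you'd count tokens with a proper tokenizer.

```python
from typing import List, Tuple


def estimate_tokens(text: str, chars_per_token: float = 3.0) -> int:
    """Very rough token estimate (assumed heuristic, not a real tokenizer)."""
    return int(len(text) / chars_per_token) + 1


def build_prompt(
    examples: List[Tuple[str, bool]],
    query: str,
    max_tokens: int = 4096,
    reserve: int = 16,
) -> str:
    """Build a few-shot prompt, keeping only the examples that fit the budget.

    `reserve` leaves room for the model's short answer.
    """
    header = (
        "Answer 'true' if the text contains a noun in the accusative case, "
        "otherwise answer 'false'.\n\n"
    )
    footer = f"Text: {query}\nAnswer:"
    budget = max_tokens - reserve - estimate_tokens(header) - estimate_tokens(footer)

    parts = []
    for text, label in examples:
        block = f"Text: {text}\nAnswer: {'true' if label else 'false'}\n\n"
        cost = estimate_tokens(block)
        if cost > budget:
            break  # no room left, drop the remaining examples
        parts.append(block)
        budget -= cost
    return header + "".join(parts) + footer


def parse_answer(raw: str) -> bool:
    """Turn the model's reply into a bool; raise if it is neither answer."""
    cleaned = raw.strip().lower()
    if cleaned.startswith("true"):
        return True
    if cleaned.startswith("false"):
        return False
    raise ValueError(f"unexpected answer: {raw!r}")


# Made-up sample sentences, only to show the shape of the input.
examples = [("Widzę kota", True), ("Pada deszcz", False)]
prompt = build_prompt(examples, "Kupiłem książkę")
```

You'd send `prompt` to whatever model you use and pass the reply through `parse_answer`. Raising on anything that isn't "true" or "false" is deliberate, so a rambling reply doesn't silently count as a match.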

crude aspen
hollow anchor
#

Not too expensive nowadays with ChatGPT