#Open Source Automated Sentiment Analysis

113 messages · Page 1 of 1 (latest)

queen schooner
rocky ferry
#

what do you mean by tuning gpt model?

queen schooner
#

prompt engineering to maximize sentiment analysis output, input refinement to maximize data availability

#

some of that is code refinement, some of that is prompt work within working code

queen schooner
#

Some questions come to mind for others who end up reading:
What other API keys can individual users access for free or cheap? (this is not SAAS)
Can the code be updated to call for gpt-4 instead? - A branched product is desired here
Can the code be rewritten to handle other inputs from polygon? - A set of branched products is desired here

My core issue: I'm not actually a Python coder. This was built in gpt-4 code interpreter, which is at its contextual limit and is now making fidelity errors when working with the code. Therefore, Python expertise is needed for code review and improvement from here.

rocky ferry
#

did anyone try gpt-4?

queen schooner
#

I have asked for gpt-4 API access. the gpt_sentiment_analysis code block could easily be rewritten to call gpt-4

#

I was wondering what was taking so long in processing - turns out it pulled 6750 news articles to go through on the tickers I selected

#

so like 30-50 for each of the prominent ones

swift gorge
#

What API arguments (copy your API call line of code) to obtain sentiment analysis? What is the spectrum of sentiments that can be output by the API? Binary? Happy/Sad?
Explain: "this was built in gpt-4 code interpreter". Did GPT write the python code? Explain "it is at it's contextual limit.". Is your initial prompt already maxing out the context token limit? Does sentiment analysis require tokens as response? I thought sentiment analysis was derived from values in hidden state neurons, not output tokens. Explain "It is now making fidelity errors"

queen schooner
#

So the code is available on github if you want to review it in detail.
Polygon.io API key pulls down news by ticker and some financial data as well by ticker
Regarding the code interpreter - Yes it wrote the code, I compiled it. When it ran and produced errors, gpt-4 troubleshot those errors. When the code worked we evaluated the initial databases and made improvements. It's currently on its 10th build

Regarding at the context limit, I am referring to the gpt-4 code interpreter. It has a context limit for working with code, the program is now at that context limit, which makes editing using gpt-4 code interpreter subject to fidelity issues and loss of things like error processing code.

#

sentiment analysis is done with gpt-3.5-turbo and with vader sentiment analysis. Both respond with a Good, Bad, Unknown, Neutral type system to account for errors. Yes gpt 4 wrote the code, I had to work to compile and we did troubleshooting together. Yes Sentiment analysis requires tokens back as a response because you want some sort of justification for it when you go back and review.

#

The fidelity errors I'm referring to have to do with gpt-4 code interpreter's context limit. The code barely fits, when it's trying to work on blocks it frequently makes fidelity errors, dropping pieces of code previously written even though the reference code is right there in the context

#

Therefore I've hit diminishing returns and cannot effectively improve the code without either learning a lot about python myself or engaging other coders in alignment with the open source project

swift gorge
#

The way that I use ChatGPT to write Python code that might get long is:

First iteration: I write a paragraph explaining what I want the program to do. I get a first draft from ChatGPT, analyze it to see if it has the right types of functions, and copy the variable names it uses. I run the script and let GPT debug the errors.

Second iteration: If variable names are not descriptive, I make them descriptive with multiple words and underscores. I add the descriptive variable names and the function names into the original paragraph I started with and get another version, which usually repeats those function names and variable names. When it starts to get too large and it can't complete, or it starts changing things after "continue", I take a closer look at the functions to see whether they're too large or don't do what I want. I ask ChatGPT to "write a function" that does what I want, specifying the defined descriptive variable names and function names. So you can improve one function at a time as long as you control the inputs and the outputs. You can also make GPT add comments to help determine what the code does. These comments, when correct, may be selectively copied back into your original prompt paragraph.

Third iteration: If you want to start from scratch, you can start over with a new paragraph that is complete with descriptive variable names and functions based on the comments. This modular approach allows me to obtain final code that is longer than the token limit of ChatGPT.

queen schooner
#

I'll keep that in mind, though at this point, because the code is functional, improvements are either cosmetic (adjusting print outputs) or model tuning (adjusting prompts). I won't touch the code again without the input of an expert in Python.

#

it's at the 350 mark, 150 more companies to go, then I'll be able to publish the first database and report

swift gorge
#

I never programmed in Python before March 2023. ChatGPT taught me how to program in Python through its responses. I have written Python scripts that call the OpenAI API for text and DALL-E. I can't write Python from scratch, but using GPT (free) I can generate Python that works. I guess I have worked around "fidelity errors" using my modular approach, including specified descriptive variable names and descriptive function names. The cause of these fidelity errors between iterations seems to be: 1) GPT drops pieces of prior code to focus on new parts of code, forgetting older code lines. The old code atrophies. 2) GPT does not like to repeat prior working code and prefers to write new improved code, using different libraries for example. In such cases the code blocks do not interoperate. This can be mitigated by learning from prior iterations which libraries, variable names, functions, and code snippets to specify in a new prompt. By specifying them you might still get new code, but it will more likely work or interoperate with prior blocks of code.

queen schooner
#

that does appear to be a correct analysis of the situation

swift gorge
#

Your Python is larger than what I produced using the free version of ChatGPT, but the modular approach can still be used. Your OpenAI API code is:

try:
    prompt = f"As an analyst, assess the sentiment of the following information: {text}. Would you categorize it as 'Good', 'Bad', or 'Unknown' in the context of {ticker}?"
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are an analyst whose task is to assess the sentiment of financial news."},
            {"role": "user", "content": prompt},
        ],
        max_tokens=300)

    full_response = response['choices'][0]['message']['content'].strip()
    sentiment = full_response.split('\n')[0].strip().upper()

    if sentiment not in ['Good', 'Bad', 'Neutral']:
        sentiment = 'UNKNOWN'
#

I infer from your filter text "if sentiment not in ['Good', 'Bad', 'Neutral']: sentiment = 'UNKNOWN'" that the prompts sometimes fail to produce the magic words. If this is a problem, maybe try prompt engineering as follows: put the three magic words at the end of your prompt (FOLLOWING the {ticker} text). If this is a minor problem, then modify the "if sentiment not in ['Good', 'Bad', 'Neutral']:" branch to REPEAT the request using the modified prompt structure. Then add a second "if sentiment not in ['Good', 'Bad', 'Neutral']" check to detect whether the modified prompt produced a better result. You could extend this to measure the performance of N different prompt structures and put the best-performing prompt structure in the top position.
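A note on the quoted filter: because the reply is upper-cased before the comparison, a check against `['Good', 'Bad', 'Neutral']` can never match, so every reply would fall through to 'UNKNOWN'. The prompt also offers 'Unknown' while the filter checks for 'Neutral'. A minimal sketch of more tolerant parsing follows; `parse_sentiment` and `LABEL_PATTERN` are hypothetical names, not taken from the repository, and the keyword search is crude (a reply such as "not good" would still be read as GOOD).

```python
import re

# Hypothetical helper, not from the repository: labels are compared in
# upper case only, so the case mismatch in the quoted filter cannot occur.
LABEL_PATTERN = re.compile(r"\b(GOOD|BAD|NEUTRAL)\b")


def parse_sentiment(full_response: str) -> str:
    """Return the first label found anywhere in the reply, else 'UNKNOWN'."""
    # Search the whole reply instead of assuming the label sits alone
    # on the first line.
    match = LABEL_PATTERN.search(full_response.upper())
    return match.group(1) if match else "UNKNOWN"
```

With this, a reply like "I would categorize it as 'Good'." is read as GOOD, while a reply with none of the three words still ends up as UNKNOWN.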

#

So you could ask ChatGPT to "write a function named ____ calling the OpenAI API (specify parameters) and conditionally-sequentially using each of the following three prompt structures... If the response to the first prompt fails to include (3 words), then perform the second ... Add a counter to each prompt version to count the number of detections of the word in each response." Needs copy/paste for clarification, but this would generally result in a working code block to paste into the whole script.
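The fallback idea described above could be sketched roughly like this. It is only an illustration under assumptions: `classify_with_fallback`, `ask_model`, and `hit_counts` are made-up names, and `ask_model` stands in for whatever function actually sends a prompt to the API and returns the reply text.

```python
import re
from collections import Counter
from typing import Callable, Sequence

LABEL_PATTERN = re.compile(r"\b(GOOD|BAD|NEUTRAL)\b")


def classify_with_fallback(
    text: str,
    ticker: str,
    prompt_templates: Sequence[str],
    ask_model: Callable[[str], str],
    hit_counts: Counter,
) -> str:
    """Try each prompt template in order until a reply contains a label."""
    for index, template in enumerate(prompt_templates):
        # ask_model is a placeholder for the real API call (assumption).
        reply = ask_model(template.format(text=text, ticker=ticker))
        match = LABEL_PATTERN.search(reply.upper())
        if match:
            hit_counts[index] += 1  # this template produced a usable label
            return match.group(1)
    return "UNKNOWN"  # no template produced one of the three words
```

After a run, `hit_counts.most_common()` shows which prompt structure produced a label most often, which is one way to decide which structure to put in the top position.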

#

As a PS. Some smaller GPT models are capable of performing "sentiment analysis" based on hidden states at a single digital neuron. This suggests you may be able to perform suitable sentiment analysis on small text samples using a local GPT model. (Also, I don't know how vader works.)

queen schooner
#

It crashed due to a major polygon timeout and then I discovered it wasn't writing to the database as it cleared each ticker. I'm troubleshooting now

#

I may not have time to crack on this until the weekend, if you do rebuild the code do post that as a fork for testing

#

fortunately I put in a restart script so it's picking up where it left off, but it's unclear why it didn't write to the db as it cleared each ticker

#

all that data looks like it's stored fortunately, to be continued

queen schooner
#

While it's chunking on wrapping up the data I determined to break the code down into smaller chunks so that each one can be independently improved without having to recompile. This should also speed up the operation overall

steady kestrel
queen schooner
#

The prompting definitely could be improved, that falls under improvement of analysis and model which for me came after first functional pull of data which is 85% done and should wrap up today on 6500 articles for the S&P500

#

part of the next cycle will be to break the program down into 3 discrete functions: one for pulling news (we will expand inputs), one for pulling financial data (we will expand inputs), and one for sentiment analysis and scoring (we will improve those areas)

swift gorge
#

Maybe, instead of Chat GPT you should be primarily using: "Integrated OpenAI Advanced Tweet Classifier
This API uses natural language processing techniques to analyze text data such as messages or tweets to determine if they have positive, negative, or neutral sentiments."

queen schooner
#

Alternate builds should be generated

#

We can compare output data in gpt-4 code interpreter to see quality of it

swift gorge
#

This article discusses 3 alternatives for sentiment analysis: https://www.datacamp.com/tutorial/text-analytics-beginners-nltk
"Which platform is largely used for sentiment analysis using NLP, difficulty level easy?
The Natural Language Toolkit (NLTK) is a popular open-source library for natural language processing (NLP) in Python. It provides an easy-to-use interface for a wide range of tasks, including tokenization, stemming, lemmatization, parsing, and sentiment analysis."
(NLTK Sentiment Analysis Tutorial for Beginners, DataCamp)

NLTK sentiment analysis using Python. Follow our step-by-step tutorial to learn how to mine and analyze text. Use Python's natural language toolkit and develop your own sentiment analysis today!

queen schooner
#

ooh good research man

#

NLTK is used already for vader sentiment

#

yeah this type of sentiment analysis has a TON of applications.. finance just seems like the best one for the public 🙂

swift gorge
#

So, each time you run the paid OpenAI API, also run a couple of other alternatives: for evaluation of the alternatives, and for a weighted combination of results (confidence level). Maybe experience will show that one or more alternatives provides reliable results, obviating the paid OpenAI API and freeing up GPU compute for other users.
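One possible reading of "weighted combination of results (confidence level)" is a weighted vote across methods. The sketch below assumes each method has already produced a label; `combine_labels`, the method names, and the weights are all hypothetical and would need tuning against real data.

```python
from collections import defaultdict
from typing import Dict, Tuple


def combine_labels(
    labels: Dict[str, str], weights: Dict[str, float]
) -> Tuple[str, float]:
    """Weighted vote over per-method labels; returns (label, confidence)."""
    if not labels:
        return "UNKNOWN", 0.0
    totals: Dict[str, float] = defaultdict(float)
    for method, label in labels.items():
        # Methods without an explicit weight count as 1.0 (assumption).
        totals[label] += weights.get(method, 1.0)
    best = max(totals, key=totals.get)
    # Confidence is the winning label's share of the total weight.
    return best, totals[best] / sum(totals.values())
```

For example, with GPT weighted 2.0 and VADER and a third method weighted 1.0 each, two GOOD votes (GPT, VADER) against one BAD vote give GOOD with a confidence of 0.75.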

#

If you or a user has a local computer with an NVIDIA GPU, then you may be able to offload sentiment analysis to a local GPT model. This implies a Python routine to detect user hardware and select the local GPU for a local GPT model (or a truncated portion of a local GPT model).
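A hardware-detection routine along those lines might look like the sketch below. It assumes PyTorch is the framework that would run the local model, which the thread does not specify; `select_device` is a hypothetical name. If PyTorch is not installed it simply falls back to the CPU.

```python
def select_device() -> str:
    """Return 'cuda' when PyTorch reports a usable NVIDIA GPU, else 'cpu'."""
    try:
        # Optional dependency; PyTorch is an assumption, not named in the thread.
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Local model would run on: {select_device()}")
```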

queen schooner
#

Given that stacks of GPUs are coming on market it may be possible to build a rack of those for some real computational AI power

#

an offline model not calling gpt at all would be.. advantageous

#

truly free of outside influence and their guardrails

swift gorge
#

Quantized Integer16 pretrained GPT open source models exist. You should experiment with these to see if a local CPU is sufficient for sentiment analysis, avoiding the need for any GPUs.

queen schooner
#

I'll take a look, and if you determine an alternate build can be functionally made go for it 🙂

swift gorge
#

Given the following studies I presume that decomposing an open source GPT such as quantized GPT2 can provide adequate sentiment analysis with zero output tokens, based on finding the neurons that activate in a manner correlated to the desired sentiment of interest: OpenAI
https://openai.com › research › uns...
Unsupervised sentiment neuron
Apr 6, 2017 — Digging in, we realized there actually existed a single “sentiment neuron” that's highly predictive of the sentiment value.
Sentiment fixed to positive: Sentiment fixed to negative
GitHub Pages
https://rakeshchada.github.io › Senti...
The unreasonable effectiveness of one neuron
May 12, 2017 — The authors discovered that a single neuron captured the sentiment of an entire piece of text."


queen schooner
#

That sounds cheaper to run and likely more effective for the initial screening

#

however the research paper did not include prompts or code so implementation may be beyond me, I'll have gpt review this to see what can be done

swift gorge
#

I think you need access to hidden state activations at every level and every token to discover the neuron that would be a best indicator of text sentiment. OpenAI seems to have done this with their closed models to produce their specialized sentiment analysis API ( why not try that one now?)

queen schooner
#

The prompt I started with is a replica of the research prompt as this first is meant to replicate that research

#

I can't try anything until it wraps up the current run because I neglected to have it write to db midstream, it has to complete this cycle for me to get the data

#

63 tickers to go, maybe tonight

#

we can implement gpt-2 locally

#

gpt-4 doesn't think it will do as good a job

#

but the api calls to 3.5 have been really bad all week, an offline model version is needed

swift gorge
#

I have tried to run GPT-2 simple (quantized) on my laptop. I ran out of RAM. My child has a gaming desktop with an NVIDIA GPU at a different location. I am trying to set up a VPN or something to remotely access his PC to work with the GPT-2 117M model, which I downloaded. I am interested in decomposing the GPT-2 model to identify sentiment neurons... But I don't currently have enough hardware to perform the investigation.

queen schooner
#

if it's that limiting then we stick with the API call and the error logic

#

it needs to be able to run on the average computer for maximum uptake

swift gorge
#

I only had 1.5 GB of available RAM in laptop. It needs more like 4GB or 8 or 16GB. I can upgrade RAM to that. You can give GPT2 SIMPLE a try on your hardware.

queen schooner
#

looks easy enough to implement as an alternate version. I'm in the middle of breaking the function down and have the news grabber complete, next up is the sentiment analysis section. I'll build two versions of that, one calling 3.5 turbo and one that uses an installed version of gpt2. Finally the third function does the rest of the math pulldown from polygon and the final scoring

swift gorge
#

@swift gorge good approach, but I pointed out that the code should simultaneously seek a sentiment determination from ChatGPT and also the other methods, for comparison and evaluation purposes and for confidence metrics. The multiple determinations can be saved to an Excel worksheet for post-analysis and for computing the reliability of each method of sentiment analysis (not multiple alternative versions that become hard to compare and require separate runs).
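As a rough illustration of running every method on the same article and saving the determinations side by side: the sketch below writes a CSV file (which Excel can open) rather than a native worksheet, and `score_articles` and the analyzer names are hypothetical. Each analyzer is just a function that takes the article text and returns a label.

```python
import csv
from typing import Callable, Dict, Iterable, Tuple


def score_articles(
    articles: Iterable[Tuple[str, str]],
    analyzers: Dict[str, Callable[[str], str]],
    out_path: str,
) -> None:
    """Run every analyzer on every (ticker, text) pair; one CSV row per article."""
    fieldnames = ["ticker", "text"] + list(analyzers)
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for ticker, text in articles:
            row = {"ticker": ticker, "text": text}
            for name, analyze in analyzers.items():
                try:
                    row[name] = analyze(text)
                except Exception:
                    # A failed method should not discard the other results.
                    row[name] = "UNKNOWN"
            writer.writerow(row)
```

Because every method gets its own column on the same row, agreement rates between methods can be computed afterwards without re-running anything.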

queen schooner
#

Yes multiple inputs of analysis are always best, and over time we'll continue to improve that aspect

#

The three scripts are now discrete so they can be improved and run independently

#

From there it should be easy to generate an alternate assessment program that uses gpt2

#

in both cases it will still also use vader, and any other tools we can plug in

queen schooner
#

Code has been rebuilt

steady kestrel
queen schooner
#

I did a bunch of research and playing in the sandbox last night, I am going to try a run using the isolated sentiment analysis script against the same news, but using the model text-curie-001 which apparently is geared specifically for sentiment analysis

#

Then I'll use gpt-4 code interpreter to compare the two datasets which should only differ by the gpt sentiment analysis

#

Another thought: if the curie model is less used, it should run into far fewer retries due to overloaded-API errors

#

we also need to add a periodic dump to database function so that you don't lose the whole analysis if it crashes midway
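A periodic dump could be as simple as committing after each ticker. The sketch below assumes SQLite and a made-up `sentiment` table; the thread does not say which database or schema the project uses, and all function names here are hypothetical.

```python
import sqlite3
from typing import Iterable, Set, Tuple


def open_results_db(path: str) -> sqlite3.Connection:
    """Open (or create) the results database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sentiment ("
        "ticker TEXT, article_id TEXT, label TEXT, "
        "PRIMARY KEY (ticker, article_id))"
    )
    return conn


def save_ticker_results(
    conn: sqlite3.Connection, ticker: str, results: Iterable[Tuple[str, str]]
) -> None:
    """Write one ticker's (article_id, label) pairs and commit immediately."""
    with conn:  # commits on success, rolls back if an error is raised
        conn.executemany(
            "INSERT OR REPLACE INTO sentiment VALUES (?, ?, ?)",
            [(ticker, article_id, label) for article_id, label in results],
        )


def finished_tickers(conn: sqlite3.Connection) -> Set[str]:
    """Tickers already stored, so a restart can skip them."""
    return {row[0] for row in conn.execute("SELECT DISTINCT ticker FROM sentiment")}
```

Calling `save_ticker_results` as each ticker clears means a crash loses at most the ticker in progress, and `finished_tickers` gives a restart script the list to skip.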

swift gorge
#

User
Quote = "You dance so well. Will you be my dance partner?" Sentiment =

ChatGPT
The sentiment of the quote "You dance so well. Will you be my dance partner?" appears to be positive or complimentary. It expresses admiration for the person's dancing skills and extends an invitation to be their dance partner.

User
Enter Sentiment-Analysis mode. Quote = "You dance poorly. Will you be my dance partner?" Sentiment =

ChatGPT
In sentiment analysis mode, the quote "You dance poorly. Will you be my dance partner?" is analyzed for sentiment.

Sentiment: Negative

Explanation: The quote contains a negative sentiment as it states that the person dances poorly. However, it also includes a somewhat contradictory request for the person to be their dance partner. Overall, the negative sentiment outweighs the potential positive aspect of the invitation, resulting in a negative sentiment classification.

User
Enter nonverbose Sentiment-Analysis mode. Quote = "You dance adequately. Will you be my dance partner someday?" Sentiment =

ChatGPT
In nonverbose sentiment analysis mode, the quote "You dance adequately. Will you be my dance partner someday?" is analyzed for sentiment.

Sentiment: Neutral

#

It seems that with basic prompt engineering you can cause even chatGPT to be less verbose. And ChatGPT already knows what sentiment-analysis is. So the amount of tokens in a prompt/response for sentiment can be reduced accordingly.

#

Enter single-word-response Sentiment-Analysis mode. Quote = "You dance adequately. Will you be my dance partner someday?" Sentiment =

ChatGPT
Neutral
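To carry the single-word prompt over to the API, a request could be assembled roughly as follows. This only builds the arguments and does not call anything; `build_single_word_request` is a hypothetical name, and the `max_tokens` and `temperature` values are assumptions, not tested settings.

```python
def build_single_word_request(quote: str, model: str = "gpt-3.5-turbo") -> dict:
    """Build chat-completion arguments for a one-word sentiment reply."""
    prompt = (
        "Enter single-word-response Sentiment-Analysis mode. "
        f'Quote = "{quote}" Sentiment ='
    )
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 5,   # assumption: enough for one word, far below 300
        "temperature": 0,  # assumption: favors repeatable labels
    }
```

The resulting dict could be passed to the same `openai.ChatCompletion.create(...)` call quoted earlier in the thread, for example as `openai.ChatCompletion.create(**request)`.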

queen schooner
#

interesting

#

The code that was provided has some improvements I'm trying to understand, have to rebuild the main function accordingly

#

Initial testing with text-curie-001 is better than gpt-3.5-turbo on sentiment response - very interesting

swift gorge
#

In nonverbose sentiment analysis mode, the quote "Plug Power sucks my hydrogen hose dry. Will it ever give me power and heat at my home?" is analyzed for sentiment.

Sentiment: Negative

Explanation: The quote expresses frustration or dissatisfaction with Plug Power, indicating that it "sucks" the hydrogen hose dry. Additionally, there is a skeptical question about whether it will provide power and heat at home. The overall sentiment is negative due to the negative tone and doubt expressed in the statement.

User
Enter single-word-response Sentiment-Analysis mode. Quote = "General Electric brings investment dreams to an end. Will it ever give me a dividend?" Sentiment =

ChatGPT
Negative

queen schooner
#

Right now thanks to the core improvements I'm rebuilding the main function, glad you're working on improving sentiment

#

it's going to take a team effort to get this thing to a really optimal state, otherwise its progress is limited by how fast I can learn to code, which is not optimal

queen schooner
#

One thing that has troubled me on review of the first run.. 80% of the API calls to 3.5 turbo failed. The server has too much traffic to use it reliably. We really need to look at offline models to replace that API call not only to constrain cost for users but for efficacy

queen schooner
#

Here are resources to explore next:

Based on your needs, here are some open-source language models that could be suitable for sentiment analysis:

spaCy: This is a high-ranking sentiment analysis package on GitHub. It is built mainly in Python, and its repository uses 6 programming languages.

Pattern: Another high-ranking sentiment analysis package on GitHub that is mainly in Python. This package provides data collection options via web scrapers or integrating APIs and applying sentiment analysis on collected data as a comprehensive solution.

MeaningCloud: This is a low-code or no-code tool that provides sentiment analysis. It offers a free tier that supports API integration, which may help automate your text analysis process.

Social Searcher: This tool specializes in social media sentiment analysis and offers real-time searches for free. Its dashboard view is particularly helpful to compare different platforms and have a clear understanding of the overall sentiment of a specific keyword.

Hugging Face's Transformers: This is a library that supports a wide variety of natural language processing tasks, including sentiment analysis. It offers a simple API for downloading and using pre-trained models. You can download models like BERT and use them for sentiment analysis tasks.

VADER (Valence Aware Dictionary and sEntiment Reasoner): This is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains. (VADER is already integrated)

TextBlob: This is a popular Python library for processing textual data. It provides a simple API for diving into common natural language processing tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, and more.

swift gorge
# queen schooner One thing that has troubled me on review of the first run.. 80% of the API calls...

OpenAI has insufficient GPUs to service the demand for API calls during peak hours. This sentiment analysis of past comments does not require real-time API calls. So, write a Python script to launch the GPT inference calls for past data only in off-peak hours. This will reduce the GPU overload problems for everyone and reduce your API call failures/drops. Also, use Python's try/except to detect "429" and "overloaded" errors, shut down the GPT inference, and wait 1 hour to retry.
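The try/except-and-wait suggestion could be sketched as below. Matching on the error text is an assumption made to keep the example independent of any library version; `call_with_retry` is a hypothetical name, and `call` stands in for the function that performs one API request.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 5,
    wait_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` after a pause when the error looks like 429/overloaded."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as error:
            message = str(error).lower()
            # Assumption: the overload condition is visible in the message text.
            retryable = "429" in message or "overloaded" in message
            if not retryable or attempt == max_attempts:
                raise  # unrelated error, or out of attempts
            sleep(wait_seconds)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists only so the wait can be replaced in testing. A one-hour pause per failure is what the message above proposes; a shorter wait that grows with each attempt may suit a large batch better.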

queen schooner
#

It ran for 4 straight days on 6500 news articles.. that kind of process won't work

#

I am willing to do API calls on final scoring analysis but to do that crawling on gpt-3.5-turbo is.. not great

queen schooner
#

it ran into 80% errors.. at all times of day. The server simply cannot handle the traffic atm

#

maybe text-curie-001 will be less busy

swift gorge
#

Can you install a logger that timestamps the error messages and the token count? To detect times and circumstances of failures?

queen schooner
#

I did, it's how I'm aware of that information

steady kestrel
queen schooner
#

well, free to use if in development. One of the big attractions to polygon is their monthly fee to regular users. Need to find free and alternate options for users

#

there is even a free key if they're willing to tolerate the rate limitation

sage lichen
swift gorge
#

I saw in the API forum that there is a website that measures and charts 500-token response times for the OpenAI API. Also, add error logging with a timestamp and token count to your script and evaluate that way.
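Error logging with a timestamp and token count can be done with the standard `logging` module. In the sketch below, `make_api_logger`, `log_api_error`, and the log line layout are hypothetical, and the caller is assumed to supply the prompt token count however it is measured.

```python
import logging


def make_api_logger(log_path: str) -> logging.Logger:
    """Create a file logger whose lines start with a timestamp."""
    logger = logging.getLogger("sentiment_api")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate lines if called twice
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def log_api_error(
    logger: logging.Logger, ticker: str, prompt_tokens: int, error: Exception
) -> None:
    """Record which ticker failed, how large the prompt was, and why."""
    logger.error(
        "ticker=%s prompt_tokens=%d error=%s", ticker, prompt_tokens, error
    )
```

Grouping the resulting lines by hour and by token count afterwards would show whether failures cluster at peak times or on long prompts.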

#

"How to get a faster response time?
Choose a model with a faster response time (text-davinci-003 is usually fast and comparable in output quality to gpt-3.5-turbo, but is 10x more expensive)
Try again outside of peak hours
Reduce your max_tokens parameter
Use the Google Docs integration if you want to generate longer content such as blog posts"

queen schooner
#

I think the majority of the analysis can be done by existing python programs, and then we can push a final analysis of fewer data points to gpt which should help both issues

queen schooner
#

well I've hit a wall folks, gpt-4 code interpreter has been degraded resulting in an inability to compile functional code.

swift gorge
#

Wow. That is disturbing news. Try chatGPT (free) for writing individual Python Functions. Give description of its purpose. Specify the libraries, variable names, inputs, outputs.

queen schooner
#

Things did improve I'm troubleshooting minor details now

#

just hit my 3 hour limit I'll be back when I can compile more. It seems like there is an issue passing the keys to polygon

queen schooner
#

Some hallucination, but generally the code interpreter is working better now

queen schooner
#

Really close to a functional model again. Working on the RSI and MACD pulls and other related troubleshooting; once it's functional I'll update the git