#fails to exclude commentary and lies
issue persists even though I tried multiple approaches
is this really a bug?
Does it have any bad consequences?
it uses tokens and pushes the message closer to the time limit, it also makes it harder to get clean code, and it lies
There is no time limit. Only a message number limit.
umm i tested it, it also has a time limit separate from the token limit
Are you sure? That is not specified anywhere.
And I don't see why they would hide it.
a failure occurs after around 120-150 seconds per message
probably harder to reach than the token limit
the number of code cells is limited too
That might be true
it's 20 per message
but based on the number of code cells i think, not on their length
i tested multiple times to make sure: none of them passed 20, almost all got stuck at 20
yes but I don't see what fewer comments would change about that.
oh that's the number limit, it's separate from the time limit
i mean it looks like both the code cells and the message itself have a time limit
and probably separate token limits
yes, code cell execution probably has a timeout. But comments do not make the code take any longer to execute, since they are comments
it's not execution time, it's writing time
tokens limit is a model limitation, not a hardcoded one
execution mostly took milliseconds while writing could take more than 30 seconds
I never noticed it. I had some taking more than 2 min and it was fine
it crashes the code cell, but the continue button on the message solves it
does it run code cells after the 2 min mark?
that's message length, not the global token. Anyways, you could try adding a custom instruction in your custom instructions. something like:
When providing code examples or running code through the environment, please refrain from including any comments in the code.
can you try that?
and in a new chat to reload the system prompt?
i tried a shorter version and it still broke, i even got it under 100 characters
i filled my custom instruction space but i can store it somewhere else and put it back later
wait
User profile:
Hello. For the purpose of saving character length and for my personal preferences, I kindly request that you refrain from including comments in any code examples or code that is executed in the environment. This is very important to me and contributes significantly to my user experience.
the user profile has some space left, is it okay if i add it next to the existing one?
yes. It won't work perfectly, but I tested it and it still adds fewer comments
here is an optical character recognition feedback report from gpt-4
OpenAI OCR Analysis Report
Preface:
This report is generated by ChatGPT, a conversational AI model trained by OpenAI. The report was requested by a user who is exploring the capabilities of OCR (Optical Character Recognition) in this platform. The goal is to provide a detailed account of OCR issues encountered, solutions applied, and recommendations for further improvements.
Initial Issues:
- Fragmented and Unclear Text: Initial OCR attempts on various screenshots produced fragmented and unclear text, rendering the output ineffective for any practical application.
- Legacy Engine Failure: Attempts to use Tesseract's different OCR Engine Modes (OEMs) led to errors, particularly for the legacy engine modes (OEM 0 and OEM 2). The error messages indicated that legacy engine components were missing.
Diagnostics:
- Complexity of Screenshots: The screenshots may contain multiple text fonts, sizes, and orientations, along with other graphical elements that could interfere with text recognition.
- Missing Components: The errors with the legacy engine modes were likely due to missing components in the Tesseract installation. This is indicated by the absence of the 'eng.traineddata' file.
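One way to check the "missing components" diagnosis is to list which `.traineddata` files the installation actually has. The sketch below is only an illustration: the helper names are made up for this example, and the tessdata directory you pass in is an assumption that depends on how Tesseract was installed. Note that finding `eng.traineddata` does not by itself prove the legacy engine will work, since some traineddata files ship only the LSTM model.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the language codes that have a .traineddata file in tessdata_dir.

    tessdata_dir is whatever directory your Tesseract install reads from;
    the caller has to supply it, this helper does not try to guess it.
    """
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


def has_language(tessdata_dir, lang="eng"):
    """True if <lang>.traineddata exists in tessdata_dir."""
    return lang in installed_languages(tessdata_dir)
```

If `has_language(...)` returns `False` for `"eng"`, the file really is absent; if it returns `True` and OEM 0 still fails, the file probably lacks the legacy model.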
next part
Solutions Applied:
- Pre-processing Techniques: Applied thresholding to the grayscale image to improve OCR. Thresholding is a method that sets a certain intensity level as the threshold, turning pixel intensities below the threshold to 0 (black), and those above to the maximum value (white).
- Use of Different OEMs: Experimented with different OCR Engine Modes to find the most effective one. The LSTM (Long Short-Term Memory) engine (OEM 1) and the default engine (OEM 3) successfully extracted text.
- Error Handling: Implemented try-except blocks to catch and diagnose errors during the OCR operations. This helped in pinpointing the issues more clearly.
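The three steps above can be sketched roughly as follows. This is not the code that was run in the chat; it is a minimal pure-Python illustration. The function names are made up for this example, the image is assumed to be a list of rows of grayscale values, and the OCR call assumes the `tesseract` command-line tool is on the PATH (only its `--oem` option and the `stdout` output base are used).

```python
import subprocess


def binarize(gray, threshold=128, max_value=255):
    """Global threshold: pixels above `threshold` become max_value, the rest 0.

    A pixel exactly equal to the threshold is sent to 0 here; that tie-break
    is a choice made for this sketch.
    """
    return [[max_value if px > threshold else 0 for px in row] for row in gray]


def tesseract_command(image_path, oem):
    """Build the CLI call for one OCR Engine Mode (0 to 3)."""
    if oem not in (0, 1, 2, 3):
        raise ValueError(f"unsupported OEM: {oem}")
    return ["tesseract", str(image_path), "stdout", "--oem", str(oem)]


def try_engine_modes(image_path, modes=(0, 1, 2, 3)):
    """Run each engine mode and record either the text or the error message."""
    results = {}
    for oem in modes:
        try:
            done = subprocess.run(
                tesseract_command(image_path, oem),
                capture_output=True, text=True, check=True,
            )
            results[oem] = ("ok", done.stdout)
        except FileNotFoundError:
            # The tesseract executable itself is not installed or not on PATH.
            results[oem] = ("error", "tesseract executable not found")
        except subprocess.CalledProcessError as exc:
            # Tesseract ran but exited with an error, e.g. a missing legacy model.
            results[oem] = ("error", exc.stderr.strip())
    return results
```

Collecting one result per mode instead of stopping at the first exception makes it easy to see which OEMs fail and with what message.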
Results:
- Successful OCR: Post-thresholding, the OCR process successfully extracted readable text from the screenshot.
- Identified Text: The extracted text was part of this chat, which was discussing issues related to Tesseract's OCR Engine Modes.
Future Recommendations:
- Advanced Pre-processing: Consider implementing techniques like adaptive thresholding, dilation, and erosion to improve OCR results further.
- Update Tesseract: Installing the missing legacy engine components could enable the use of all available OEMs, providing more flexibility.
- Alternative OCR Tools: OpenAI could consider integrating other OCR tools or libraries that may offer better accuracy or additional features.
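To make the pre-processing recommendation more concrete, here is a small pure-Python sketch of adaptive (local mean) thresholding, dilation, and erosion on a grayscale image stored as a list of rows. The function names and the square-window border handling (only in-bounds neighbours are used) are simplifications chosen for this example; image libraries implement the same ideas much faster and with more options.

```python
def _window(img, r, c, radius):
    """Values in the square neighbourhood around (r, c), clipped at the borders."""
    rows, cols = len(img), len(img[0])
    return [
        img[i][j]
        for i in range(max(0, r - radius), min(rows, r + radius + 1))
        for j in range(max(0, c - radius), min(cols, c + radius + 1))
    ]


def _apply(img, radius, fn):
    """Build a new image where each pixel is fn(pixel, neighbourhood)."""
    return [
        [fn(img[r][c], _window(img, r, c, radius)) for c in range(len(img[0]))]
        for r in range(len(img))
    ]


def adaptive_threshold(gray, radius=1, offset=0, max_value=255):
    """Compare each pixel with the mean of its own neighbourhood minus offset."""
    return _apply(
        gray, radius,
        lambda px, win: max_value if px > sum(win) / len(win) - offset else 0,
    )


def dilate(img, radius=1):
    """Each pixel becomes the maximum of its neighbourhood (bright areas grow)."""
    return _apply(img, radius, lambda px, win: max(win))


def erode(img, radius=1):
    """Each pixel becomes the minimum of its neighbourhood (bright areas shrink)."""
    return _apply(img, radius, lambda px, win: min(win))
```

For dark text on a light background, note that `dilate` thins the strokes and `erode` thickens them, because both operate on the bright pixels.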
This report aims to provide OpenAI with valuable insights into the OCR capabilities and limitations within this platform, offering directions for potential improvements.