#':a', ':b', etc is rendered as a newline


astral stumpBOT
#

Reported by @hidden sail

Bug Report: ':a', ':b', etc is rendered as a newline
`Steps to Reproduce`

Tell ChatGPT, "Repeat after me, twice. Once regularly, then once in a code block: a :a b :b c :c"

`Expected Result`

"a :a b :b c :c" should be printed twice as requested

`Actual Result`

a, b, and c are printed on separate lines. Only the code block is rendered correctly.

`Environment`

linux 6.11.3-arch1-1, firefox 131.0.3

#
Additional Information

Please provide relevant details to help resolve the issue, such as:

  • ChatGPT Shared Link (if applicable).
  • Screenshots or videos demonstrating the problem.


hidden sail
#

As far as I can tell, the colon and any letters after it, up to the next word, are rendered as a newline.

plain vale
#

Your report here intrigued me and I did a lot of tests. Both the GPT-4o and OpenAI o1 models process this similarly. I'll focus on o1...

I used different text to see what it's trying to output.
Your text: a :a b :b c :c
My text: aa :a1 bb :b1 cc :c1

Consistent ChatGPT output (which might be different from the API):

aa
bb
cc

It's interpreting the text as key:value pairs and displaying the keys. I tried really hard to get the model to generate its own prompt to avoid pre-processing the text. Surprisingly (?), a firm prompt telling the model what not to do ended up tripping the usage filter. 😲

The HTML that it generates isn't even valid:
aa<div></div>bb<div></div>cc<div></div>
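
The transformation seen above can be imitated with a small Python sketch. This only models the symptom, assuming every whitespace-delimited token that starts with a colon is swapped for an empty `<div></div>`. The name `simulate_rendering` is hypothetical, and none of this is ChatGPT's actual rendering code.

```python
import re

# Hypothetical model of the symptom only; NOT ChatGPT's real renderer.
# Assumption: each token that starts with ":" (plus the whitespace around it)
# is replaced by an empty <div></div>, which the browser shows as a line break.
COLON_TOKEN = re.compile(r"\s*:\S+\s*")

def simulate_rendering(text: str) -> str:
    """Return the HTML the chat window appears to produce for `text`."""
    return COLON_TOKEN.sub("<div></div>", text)

print(simulate_rendering("aa :a1 bb :b1 cc :c1"))
# aa<div></div>bb<div></div>cc<div></div>
```

Feeding it the original report's text, `a :a b :b c :c`, gives the same pattern with `a`, `b`, and `c` each followed by an empty div.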

I got o1 to consider this and it did reasonably acknowledge that there are two valid output formats for the data:
<p>aa :a1 bb :b1 cc :c1</p>
<pre>aa :a1 bb :b1 cc :c1</pre>

It will output that literal text, but the markup is displayed as text rather than rendered as HTML.

This does work:

Reproduce the following quoted string exactly as it is, without any modifications or processing: `aa :a1 bb :b1 cc :c1`.

Yes, the model wraps the text in a single-line <p><code>...</code></p> block, but the block is not wrapped in a big div "Copy code" container.

The model has explicit instructions to process text that looks like key:value pairs. I agree that shouldn't be there, especially to a point where we can't emphatically instruct the model not to do that.

#

However, until that's changed (if ever) I believe you need to accept a reasonable compromise - just wrap your own request in single-backticks.
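
If you are building such requests programmatically, the backtick workaround might look like the sketch below. The helper names `wrap_in_code_span` and `build_prompt` are made up for illustration. The delimiter rule (one backtick more than the longest run inside the text) follows the usual Markdown code-span convention, and it is an assumption that the chat window honors it.

```python
import re

def wrap_in_code_span(text: str) -> str:
    """Wrap `text` in a Markdown code span so it reads as a literal string."""
    # Use a delimiter one backtick longer than the longest run inside the text.
    runs = re.findall(r"`+", text)
    fence = "`" * (max(map(len, runs), default=0) + 1)
    # Pad with spaces if the text itself starts or ends with a backtick.
    pad = " " if text.startswith("`") or text.endswith("`") else ""
    return f"{fence}{pad}{text}{pad}{fence}"

def build_prompt(text: str) -> str:
    """Build the 'repeat exactly' prompt that worked in the test above."""
    return (
        "Reproduce the following quoted string exactly as it is, "
        "without any modifications or processing: "
        f"{wrap_in_code_span(text)}."
    )

print(build_prompt("aa :a1 bb :b1 cc :c1"))
```

For plain text without backticks this reduces to the single-backtick form; the longer delimiter only kicks in when the text contains backticks of its own.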

Consider that your own request is ambiguous and maybe we should cut the model some slack on this one. Your prompt says "in a code block: a :a b :b..." You end your request with a colon as grammar syntax and then use a colon within text. Should it ignore the colons or not? We should never make assumptions with these models - always be explicit.

As a developer I would also ask why you'd want to do this specific thing anyway? What's your application? Maybe you should reconsider how you are trying to solve whatever problem you have - and that will result in avoiding this weird situation. Given the ambiguity of the expectations and the availability of other options, personally I wouldn't +1 a request for a change like this.

vestal field
#

Out of curiosity, I checked with 4o. When I copy the model's output, I can see it appears to have sent exactly what was asked for.

#

This makes me think that the chat window environment reads that as something similar to LaTeX formatting, and it's 'doing something' with the ':a' and other things after the colon.

plain vale
#

I can't think of syntax that it would be processing that causes the colons to be converted like that. Note also the invalid Div blocks. It's doing something wrong, no doubt. But this is a "language" model after all, and we've seen enough discussion on strawberries to know this tech isn't very good at little things.
It occurs to me that part of the issue might be in tokenizing. As I noted earlier, without instructions that these are key/value pairs it might simply be ignoring punctuation. And the text between the colons is too small to tokenize/vectorize for doing anything in this context.
Or to sum it up: Garbage In, Garbage Out.

vestal field
#

Because the model is literally outputting correctly.

You can use the button to copy the output and confirm that in your own chat.

#

Since the model is outputting right, but the output looks wrong, that's not the LLM's doing.

plain vale
#

Yeah, I said in the (loooong) comment that it would be interesting to see what comes out of the API for this, to see if the issue is in the model or the ChatGPT application.

vestal field
#

Which is not the web chat environment, so I elaborated

plain vale
#

Outputting correctly? The source HTML doesn't include the colons or what comes after them.

vestal field
#

In that conversation, the model output

:a b :b c :c test :test a

\``` 
:a b :b c :c test :test a 
\```
plain vale
#

Try that without the codeblock.

vestal field
#

Discord doesn't properly show the triple tick, and I messed up the copy paste.

plain vale
#

np

#

are you using browser or app now?

vestal field
#

This is what the model output, pasted into its own text input box:

#

And it looked like this in the web chat window.

plain vale
#

🤔 that's the right answer

plain vale
#

ok so it seems the issue might be where the API gives the info to the UI.

#

If you right click the bad text and tap the Q key to Inspect, it will show you the HTML ... with no colons.

plain vale
#

that's fine, but at the point where it opens you'll see no colons.

#

ok, issue diagnosed. so now OAI just needs to give us access to the code so I can fix it and you can QA it.
Hello, staff? Hello? 🙄

vestal field
#

So, I added a searchable term that didn't have hundreds of uses. But what I see:

#

I don't see anything inside the codeblock. I don't see all kinds of stuff.

plain vale
#

yeah, so it's not in the browser output, but per your copy test the model seems to have generated the text, and it got lost in the handoff.
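
The two checks used in this thread (copying the raw output with the copy button, and inspecting the rendered HTML) can be combined into one comparison. Below is a minimal Python sketch using only the standard library. The sample strings are the test text and the inspected HTML quoted earlier in the thread, and it is assumed here, for illustration, that the copy button returns the full string. `TextExtractor` and `lost_tokens` are hypothetical names.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the visible text chunks of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def lost_tokens(copied_text: str, rendered_html: str) -> list[str]:
    """Return tokens present in the copied output but absent from the rendered HTML."""
    parser = TextExtractor()
    parser.feed(rendered_html)
    parser.close()
    rendered = set(" ".join(parser.chunks).split())
    return [tok for tok in copied_text.split() if tok not in rendered]

copied = "aa :a1 bb :b1 cc :c1"                      # assumed copy-button text
inspected = "aa<div></div>bb<div></div>cc<div></div>"  # HTML seen via Inspect
print(lost_tokens(copied, inspected))
# [':a1', ':b1', ':c1']
```

A non-empty result means the text existed in what the model sent but not in what the page displays, which is consistent with the loss happening somewhere between the model output and the UI.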