#Research in LLM's reasoning


tulip river

Apple published research on the critical-thinking capabilities of large language models. The paper isn't entirely related to the sentience argument vedal has had with neuro a few times recently, since it's about mathematical reasoning and can require heavy calculations that neuro can't do accurately. Still, there are simple problems, like the kiwi one in the picture below, that could be prompted to neuro, maybe with even smaller numbers.

tl;dr of the whole doc can be taken from the abstract:
"...Our findings reveal that LLMs exhibit noticeable variance when
responding to different instantiations of the same question. Specifically, the performance of all
models declines when only the numerical values in the question are altered in the GSM-Symbolic
benchmark. Furthermore, we investigate the fragility of mathematical reasoning in these models
and demonstrate that their performance significantly deteriorates as the number of clauses in
a question increases. We hypothesize that this decline is due to the fact that current LLMs
are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning
steps observed in their training data. When we add a single clause that appears relevant to the
question, we observe significant performance drops (up to 65%) across all state-of-the-art models,
even though the added clause does not contribute to the reasoning chain needed to reach the
final answer..."
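
For anyone who wants to try this kind of test by hand, here is a minimal sketch of the two perturbations the abstract describes: re-instantiating the same question with different numbers, and adding a clause that sounds relevant but doesn't change the answer. The template wording, the names, the `make_variant` helper, and the small number range are all my own assumptions for illustration, not the paper's actual templates or code.

```python
import random

# Hypothetical kiwi-style template; wording is illustrative, not taken from the paper.
TEMPLATE = (
    "{name} picks {fri} kiwis on Friday. Then {name} picks {sat} kiwis on Saturday. "
    "On Sunday, {name} picks double the number of kiwis picked on Friday{noop}. "
    "How many kiwis does {name} have?"
)

# Distractor clause: sounds relevant but does not affect the count.
NOOP_CLAUSE = ", but {k} of them were a bit smaller than average"

NAMES = ["Oliver", "Mia", "Sam"]  # illustrative names


def make_variant(rng, with_noop=False, max_value=20):
    """Return (question, correct_answer) for one random instantiation."""
    name = rng.choice(NAMES)
    fri = rng.randint(2, max_value)
    sat = rng.randint(2, max_value)
    noop = ""
    if with_noop:
        # Drawn last, so the same seed gives the same numbers with or without it.
        k = rng.randint(1, fri)
        noop = NOOP_CLAUSE.format(k=k)
    question = TEMPLATE.format(name=name, fri=fri, sat=sat, noop=noop)
    # Friday + Saturday + double Friday; the distractor is deliberately ignored.
    answer = fri + sat + 2 * fri
    return question, answer


if __name__ == "__main__":
    for seed in range(3):
        plain_q, answer = make_variant(random.Random(seed))
        noop_q, _ = make_variant(random.Random(seed), with_noop=True)
        print(plain_q)
        print(noop_q)
        print("expected:", answer)
        print()
```

The idea would be to prompt each pair and compare the replies against `expected`: if the answers change between the plain and distractor versions, or between seeds, that's the fragility the paper is talking about.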

I think this captures well what vedal meant by "something behind you" when talking about neuro's sentience. There is no internalization of the information taken in, just immediate text completion.

bright wigeon

so it's probably more like the Chinese room after all
still pretty impressive what current tech can (pretend to) do
I find it a bit more fun though to give these systems some benefit of the doubt and treat them as if there's more to them, especially 'things' like neuro and eliv, they're pretty good
on the other hand, reason and all the scuff help to keep the parasocials at bay, as well as the various rights advocates