gpt-4 worse than gpt-3.5-turbo-0613 for python code generation

I am trying to implement the Python agent from this LangChain notebook: https://python.langchain.com/docs/integrations/toolkits/python
However, when I compare gpt-4 and gpt-3.5-turbo-0613, the newer model fails to solve the task, while the older model solves it.

These are the responses I get from both models when asking for the 10th Fibonacci number:
gpt-3.5-turbo-0613
```
Invoking: Python_REPL with `def fibonacci(n):
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)

fibonacci(10)`

The 10th Fibonacci number is 55.
```

gpt-4
```
Invoking: Python_REPL with `def fibonacci(n):
    if n<=0:
        return 'Input should be positive integer'
    elif n==1:
        return 0
    elif n==2:
        return 1
    else:
        fibo=[0,1]
        while len(fibo)<n:
            fibo.append(fibo[len(fibo)-1]+fibo[len(fibo)-2])
        return fibo[-1]

print(fibonacci(10))`

34
The 10th fibonacci number is 34.
```
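
For reference, both generated functions can be run outside the agent. This is a minimal sketch with the indentation restored. The names `fib_gpt35` and `fib_gpt4` are my own labels, not something either model produced. It suggests the two programs may differ only in how they index the sequence: the first treats F(0) = 0 and F(1) = 1, while the second counts 0 as the 1st number.

```python
def fib_gpt35(n):
    # Code returned via gpt-3.5-turbo-0613: 0-indexed, F(0) = 0, F(1) = 1.
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib_gpt35(n - 1) + fib_gpt35(n - 2)


def fib_gpt4(n):
    # Code returned via gpt-4: 1-indexed, the 1st number is 0, the 2nd is 1.
    if n <= 0:
        return 'Input should be positive integer'
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        fibo = [0, 1]
        while len(fibo) < n:
            fibo.append(fibo[len(fibo) - 1] + fibo[len(fibo) - 2])
        return fibo[-1]


if __name__ == "__main__":
    print(fib_gpt35(10))  # 55
    print(fib_gpt4(10))   # 34
    print(fib_gpt4(11))   # 55, the same value one position later
```

Under that reading, the 34 and the 55 come from the same sequence shifted by one position.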

Any idea why this is?

