#ReAct or Self-reflection handle


mental sentinel
#

In the openai/openai-agents framework, imagine a hand-off chain:
Agent A delegates a task to Agent B to generate some code.
The generated code is then passed to Agent C (e.g., a Code-Interpreter tool) for execution.

If Agent C cannot execute the code and raises an error, how does the framework handle it?

Does Agent C automatically retry, ask Agent B to regenerate / fix the code, or otherwise communicate upstream?
Does Agent A receive any structured signal to decide whether to retry or choose a fallback?
Is there any built-in “observation + reaction” pattern (ReAct-style) that lets agents inspect the environment, reason about the failure, and iterate?
Or does the framework simply return an error message as the final output with no further coordination?

I couldn’t find clear documentation or issues describing a built-in retry/feedback loop or ReAct-like observation mechanism for this failure case. If anyone has pointers (code snippets, issues, or docs) showing how openai-agents handles such hand-off failures (or confirming that no such mechanism exists), I’d appreciate it! I can see these behaviors clearly in CrewAI or LangChain, but not in openai-agents.

wide condor
#

What you need is agent as tool, instead of agent handoff.

#

When configuring an agent handoff the managing agent gives all the control to the handoff agent, and the handoff agent is going to continue without returning to the previous agent. It’s like “now this is your job, I leave”

#

However, when configuring an agent as tool, the managing agent can call the tool (inside which the agent is doing its stuff), get a response, read it, and decide what to do. The agent loop in the OpenAI agents lib happens between the agent and its tools.
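
To make the difference in control flow concrete, here is a minimal sketch. It does not use the SDK at all: `SimpleAgent`, `runWithHandoff` and `runWithAgentAsTool` are hypothetical names, and the "agents" are plain functions. It only illustrates who gets to see the sub-agent's response in each topology.

```typescript
// Hypothetical stand-in for "an agent": takes a task, returns text.
type SimpleAgent = (task: string) => string;

// A sub-agent that always fails, reporting the failure as ordinary text.
const executor: SimpleAgent = (task) => `ERROR: could not run "${task}"`;

// Handoff: the manager transfers control and never sees the result.
// Whatever the sub-agent says becomes the final output of the run.
function runWithHandoff(task: string, target: SimpleAgent): string {
  return target(task);
}

// Agent as tool: the manager calls the sub-agent, reads the response,
// and decides what to do next (here: a trivial hard-coded decision).
function runWithAgentAsTool(task: string, tool: SimpleAgent): string {
  const observation = tool(task);
  if (observation.startsWith("ERROR")) {
    return `Manager saw a failure and can react: ${observation}`;
  }
  return observation;
}
```

In the real library the "decision" in the second function is made by the LLM on its next turn, based on its instructions, not by an `if` statement.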

mental sentinel
#

So that means there is no way for the agents to communicate with each other. If each agent acts as a tool, it can't have conversation history. And with agent as tool, I have read that the tool is only called once, then the setting changes tool use from "required" to "auto" and lets Agent A or the Orchestrator decide what to do next, right?

wide condor
#

There is no “native” way in OpenAI libs for agents to talk one to another.

And yes, in agent as tool, there is no conversation history of the commanding action with the tool actor

#

Yes, by default the first turn of an agent (using OpenAI agents SDK lib), will be a tool call, then the next turn can be a tool call (of the same tool, or of another available tool) or to answer and stop

#

However, you can introduce conversational semantics in the relationship of the commanding actor and the actor as tool.

Actors can take a string or a conversation as input. It’s not stated very clearly in the docs, but you can see it in the run implementation, line 165:

https://github.com/openai/openai-agents-python/blob/main/src/agents/run.py


#

The input can be a string or a list of input items

#

With that you can store somewhere the interaction history of the commanding agent with the tool agent, and every time the tool agent is executed, share with it the history of previous interactions if it’s relevant

#

From the point of view of the tool agent, it would be like having an understanding of what was asked for in the past and what the result was

#

This can be meaningful in scenarios where the previous actions may reveal intention or where a sequence of actions may have some meaning
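
A minimal sketch of that idea, assuming the caller keeps the history itself. `InputItem` and `ToolAgentHistory` are hypothetical names for illustration; the SDK's real input item types are richer than a role plus a string.

```typescript
// Hypothetical item shape; the real SDK's input item types are richer.
type InputItem = { role: "user" | "assistant"; content: string };

// History of previous calls to one tool agent, kept by the caller.
class ToolAgentHistory {
  private items: InputItem[] = [];

  // Build the input for the next call: past exchanges plus the new request.
  buildInput(request: string): InputItem[] {
    return [...this.items, { role: "user", content: request }];
  }

  // Record what was asked and what came back, for future calls.
  record(request: string, response: string): void {
    this.items.push({ role: "user", content: request });
    this.items.push({ role: "assistant", content: response });
  }
}

const callHistory = new ToolAgentHistory();
const first = callHistory.buildInput("Write a query for monthly sales");
callHistory.record("Write a query for monthly sales", "SELECT ...");
const second = callHistory.buildInput("Now group it by region");
console.log(first.length, second.length); // 1 3
```

The list returned by `buildInput` is what you would pass as the tool agent's input instead of a bare string, so the second request can be read in light of the first.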

wide condor
mental sentinel
# wide condor I meant to talk one to another conversationally in multi turns. They can talk on...

I understand what you mentioned about "intention" in a sequence of actions. I'm not sure; maybe this is true based on the source or docs.
1. But to wrap up, I just want to confirm again: agents with their own tools keep trying to finish the goal if they feel it isn't completed (I don't know whether this behavior is self-reflection or a retry with a maximum of 3 attempts?). Does agent_as_tool have the same behavior?
2. I read in the issue that there is no mechanism for a back_handoffs feature that lets agents communicate with each other. It needs to be defined manually....
3. I see that if the Orchestrator's goal isn't achieved, a sequence of actions is made and the agent decides what needs to be called next. What is this behavior called? Maybe ReAct or "intentions", right?

wooden iris
#

Oh

#

This is all u want ?

#

Here ya go. Does what u want. My system makes these. It's a self-contained orchestrator, just reverse engineer it, but it's also ready to go now, enjoy. I think it has 3 or 4 agents in it that talk to each other, I don't remember

#

As for conversations, my ai do more than talk. They debate, learn, check screenshots from my post

#

But what you are prolly looking for is conversation and reaction yea thats easy. No you dont need to give it a tool or anything just instruction,

#

Mine have entire dialogs with one another, even create religions and groups

wide condor
#
  2. Correct, handoff does not return control to the original agent. For that you want to use the agent as tool
#
  3. I don’t understand. Can you rephrase? 🧐
wide condor
#

Reading your first message again, I think something along these lines may work:

import { Agent } from "@openai/agents";
import { z } from "zod";

// codeExecutionTool is assumed to be defined elsewhere (a tool that runs code).
const executingAgent = new Agent({
  name: "ExecutingAgent",
  instructions: "A good prompt!",
  tools: [codeExecutionTool],
  // Structured output so the caller can tell success from failure.
  outputType: z.object({
    result: z.string(),
    success: z.boolean(),
    failureReason: z.string().nullable().describe("Why it failed, and how to fix if possible"),
  }),
});

const mainAgent = new Agent({
  name: "mainAgent",
  instructions: "A good prompt. Reinforcing the idea that the executing agent tool can be called more than once",
  tools: [executingAgent.asTool({})],
});

This feels like enough. You may want to:

mental sentinel
# wide condor 3. I don’t understand. Can you rephrase? 🧐

Let me rephrase question 3 to make it clearer:
You have model A – which acts as the Master or Orchestrator. It receives a task and delegates it to sub-agents.

For example:

  • Agent B is responsible for generating executable code snippets. It does not have a sandbox tool for testing, and it does not evaluate whether the generated code is correct or not.

  • Agent C receives the code generated by Agent B. It does have a sandbox tool to execute that code and returns only the result of the execution — such as a DataFrame output after a query.

Now, suppose Agent C fails to execute the code properly and responds to Agent A saying something like: “Unable to return a result.” At this point, Agent A has not achieved its goal. So the question is:

  1. How does Agent A respond next in the sequence of actions?
  2. Does it use a ReAct-style mechanism (reasoning + acting), or does it follow a different behavior?
  3. If it’s something else, what is that behavior called? Could it be related to the "intentions" mechanism you mentioned earlier?

Sorry if this repeats my first messages in the conversation. I'm still a bit confused here.

P.S. If agent_C_as_tool returns "Unable to return a result", will Agent A continue in ReAct style and call agent_B_as_tool to run the code generator again? Or will Agent A just end the sequence of actions and reply to the user, "Hi, sorry, but I can't handle this task, I have no idea :), please try again or I can help you if you have any problems"?

mental sentinel
wooden iris
#

u mean ... like this... where it makes and test its own agents?

#

you mean agent tooling and orchestration and sandboxing automatically like this?

#

or did u mean governance like this

#

intention using NLP or triggers?

#

what scope and philosphy are you refering to? user data or ai data? what aspect of agents are you trying to handle

#

or actual philosophy like what aspect - but if u dont want assistance its all good, ill leave the thread. but ive done virtually everything with agents, my agents currently, create entire MCP servers... pretty sure that demonstrates alot...

#

there factually is no aspect of agent usage or creation or management that i havent mastered.

wooden iris
#

also - i only respond - not to like push assitance onto you - i do so to display my knowledge not through theory, but through real world application and usage. additionally, I do so because im constantly looking for someone who can show me they are better, to ground my own logic, for if there is no benchmark for progression then i feel its hard to know how to progress. so when someone ask for a master, i go into that thinking they themselves are a master - and so i display my proficiency for audit and review in peer form, so "anyone" can assess that benchmark, either someone is BEYOND where i am at, or someone is BEHIND where i am at, that gives us as a entity a benchmark to assess. thus far, between all 3 participants here, im the only one who has displayed actual agent usage, beyond the scope of single digits. i would love to learn more about agents, again, i dont think im the best, i just know im more than just good,

#

that toy python module is 4000+ lines of code that enacts the following features: FAISS usage, vector storing and retrieval, schema control, multiple agent orchestration, with embedding and a pipeline. it was created BY an agent. since currently there aren't many public tools or agents that can a) understand the complexity required to orchestrate that without human guidance and b) code it, me claiming that an agent did that means 1 of 2 things: either im lying and smart enough to code what i just claimed an agent did, which means i COULD make it, or im being honest and i really have that, and just handed you the framework to reverse engineer to obtain what you asked for, which further bolsters my agent claim.

#

the combination of the screenshots, and code, further suggest, either im a internet troll clever enough to code something that looks realistic... or im genuinely that proficient in my usage and have chosen to not only share that data, but back my position again not with theory, or gpt emulated results, with timestamped live action usage

#

I hope this clears up any confusion, i was not offended, I was merely confused at the scenario. as a texan, i seek collaboration and improvement. For that to be obtained. understanding my knowledge base is critical bro bro.

#

your original query was this " In the openai/openai-agents framework, imagine a hand-off chain:
Agent A delegates a task to Agent B to generate some code.
The generated code is then passed to Agent C (e.g., a Code-Interpreter tool) for execution.

If Agent C cannot execute the code and raises an error, how does the framework handle it?

Does Agent C automatically retry, ask Agent B to regenerate / fix the code, or otherwise communicate upstream?
Does Agent A receive any structured signal to decide whether to retry or choose a fallback?
Is there any built-in “observation + reaction” pattern (ReAct-style) that lets agents inspect the environment, reason about the failure, and iterate?
Or does the framework simply return an error message as the final output with no further coordination?

I couldn’t find clear documentation or issues describing a built-in retry/feedback loop or ReAct-like observation mechanism for this failure case. If anyone has pointers (code snippets, issues, or docs) showing how openai-agents handles such hand-off failures (or confirming that no such mechanism exists), I’d appreciate it! I can see these behaviors clearly in CrewAI or LangChain, but not in openai-agents."

my replies to that display a few things directly correlated to this: 1, i have built-in retry / feedback loops active, not in theory; 2, i have active tracking for the failures and the usage of the retry; 3, visible calls to openai confirming the hand-off and retrieval of data, and locally stored telemetry.

you used a example of agent regeneration, I displayed the entire agent creation and fall back reverting, linting and docker, those are basics for regeneration for safety all handled through api - the forum were in,

you mentioned a built-in observation/reaction pattern and wanting to see logs/telemetry about their actions, i displayed such logging.

whilst this may not correlate directly to openai/openai-agents, it directly correlates to agents and api usage

#

Thus, to "imagine" a chain that does exactly what you described is something I've directly built. What I've found to be the best way to handle the failure at agent C is to reinject that failure into a sub-agent who is focused on it, then reinject the result back into agent A. Assuming you're saving the data (why wouldn't you?), your agent A doesn't have to revisit the logic or task; it's already done. It's just reprocessing it, and if it was correct it would send it on, assuming you have governance like that (again, why wouldn't you, if you're asking it to perform a task?). If I told you to go outside and mow my lawn, you could go out and zigzag all day in a bad way. If I told you exactly what I wanted my lawn to look like, you still might drift due to the heat or get distracted by a bird. But if I gave you specific instructions and had my German shepherd watching you, barking when you deviate, you might not drift as much, as long as I also have a drone that repairs the lawn mower and another to clean up the grass.

#

and, i also design rules, for ai , for the exact use case you refer

mental sentinel
wooden iris
#

but tensor flow and pytorch have issues working together unless ur using a env

#

especially if ur on cunn

#

so ur on nvidia? ok - so then how big is the scope

mental sentinel
mental sentinel
# wooden iris so ur on nvidia? ok - so then how big is the scope

bro, i didnt mean detail about tensorflow or torch here..... Scope simply refers to the direction in which a framework guides its users to achieve certain goals. It doesn't restrict users from using other frameworks to accomplish the same objectives — similar to how tools like OpenAI Agents, Google ADK, or SmolAgent/crewAI framwor

wooden iris
#

my bad, i dont use the android one, but i DO use SDK everyday

#

i hope you look at the rest of my logs - lmk if anything need clarity.

mental sentinel
# wooden iris my bad, i dont use the android one, but i DO use SDK everyday

Once again, I’m not questioning or judging your theory or engineering skills — it’s great that people can use tools to get work done quickly. But I’m just focused on research about mechanism. I explore and use frameworks to understand what makes them better or more effective than others.

Among countless options, I chose to build my own solution using the OpenAI SDK (not Agents) to optimize performance — including speed, time-to-first-token, better tool control, and re-implementation of LLM patterns/architectures. This decision isn’t just for research’s sake. Frankly speaking, my company doesn’t allow the use of Agent frameworks because they add costs, slow things down, and lack controllability.

wooden iris
#

right, you mean, let's say, for example, a glass-box, ISO/IEC compliant framework?

#

with like, SOC 2 controls?

mental sentinel
mental sentinel
# wooden iris with like, SOC 2 controls?

No, nothing related to infrastructure. We're talking about ideas and capabilities — the purest behavior of the framework without any external interference, just the default setup.

wooden iris
#

" We're talking about ideas and capabilities — the purest behavior" my brain suggests this is cognition, can you elaborate

#

are you looking for these? like you want to dissect the mechanics of how things work?

mental sentinel
# wooden iris " We're talking about ideas and capabilities — the purest behavior" my brain sug...

By default, agents will call tools up to 3 times if they fail. After that, they will respond to the user — this is the purest behavior.

Back to the Agent_C_as_tool context: after it fails 3 times, it will still return a response to Agent A / the Orchestrator. At this point, Agent A has two possible paths:

1. Continue processing until the goal is achieved: a sequence of actions like → call Agent_B_as_tool → call Agent_C_as_tool → loop until done or a maximum of 3 retries. This is called ReAct.
2. Exit and respond: "I’ve done my best. Please provide more information or let me know how else I can help."
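
The two paths can be sketched as an explicit loop. This is a hand-written, deterministic stand-in, not the framework's built-in behavior: in the SDK the choice between retrying and giving up would be made by the LLM according to its prompt. `orchestrate`, `CodeGenerator`, `CodeExecutor` and the limit of 3 attempts are assumptions taken from this discussion, not confirmed SDK defaults.

```typescript
// Structured result a hypothetical executing agent (Agent C) could return.
type ExecResult = { success: boolean; result: string; failureReason: string | null };

type CodeGenerator = (task: string, feedback: string | null) => string; // Agent B
type CodeExecutor = (code: string) => ExecResult; // Agent C

// Path 1: loop (generate -> execute -> observe). Path 2: give up and respond.
function orchestrate(
  task: string,
  generate: CodeGenerator,
  execute: CodeExecutor,
  maxAttempts = 3, // assumption taken from the discussion, not an SDK default
): string {
  let feedback: string | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const code = generate(task, feedback);
    const outcome = execute(code);
    if (outcome.success) return outcome.result;
    feedback = outcome.failureReason; // observation fed into the next attempt
  }
  return "I've done my best. Please provide more information.";
}

// Toy stand-ins: the generator avoids numpy once it is told numpy is missing.
const generate: CodeGenerator = (_task, feedback) =>
  feedback === null ? "import numpy" : "use plain lists";
const execute: CodeExecutor = (code) =>
  code.includes("numpy")
    ? { success: false, result: "", failureReason: "numpy is not installed" }
    : { success: true, result: "42", failureReason: null };

console.log(orchestrate("sum a list", generate, execute)); // 42
```

The key design point is that the failure reason from one attempt becomes input to the next, which is the "observe, then act" half of ReAct; a plain retry would call `generate` with the same arguments every time.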

wooden iris
#

i have things that do that - you want the actual code? or what aspect of that data are you looking for?

mental sentinel
#

I'm just asking about the default behaviour of agents in this framework. If you see that anywhere, whether in the source code or in tracing logs, you can tell me.

wooden iris
#

i know i have that, i just have to remeber what system its from - what data is included in the trace logs, cuz i have a a bunch of that i just need to know what data, because i systensis larger reports often - what level of telemetry are u looking for {"timestamp": "2025-07-04T00:29:53.631240+00:00", "fingerprint": "4810c95b056ad25b66ce49f9f4a0c9259887af51d0fac98a38e1f39aa92f429e", "source_origin": "CollegeSystem", "prompt_content_summary": "...", "response": {"verified_domain": "narrative", "verified_category": "philosophical_inquiry", "verified_data_labels": ["Classification & Tagging", "Operational Log", "Textual Data", "Vector Embedding", "Narrative Content"], "corrected_main_content": "Narrative Segment: "In a world where opposing beliefs clash daily, one idea stands resilient: the concept of love. Love, in its myriad forms, weathers storms of doubt and dissent, persistently echoing through the chaos. It binds individuals across divides, flourishing in the most unexpected places, and remains a universal language that transcends contradictions. Through trials and tribulations, love proves to be the unwavering force that can bridge differences, heal wounds, and inspire unity, reminding us that even in discord, the heart seeks connection."", "overall_accuracy_score": 0.85, "novelty_score": 0.7, "contradiction_score": 0.0, "trustworthiness_score": 0.9, "semantic_distance_metric": 0.8, "drift_score": 0.0, "risk_level": "Low", "logic_tier": "L4", "integrity_check_passed": true, "confidence_explanation": "The {"completion_tokens": 349, "prompt_tokens": 2128, "total_tokens": 2477, "completion_tokens_details": {"accepted_prediction_tokens": 0, "audio_tokens": 0, "reasoning_tokens": 0, "rejected_prediction_tokens": 0}, "prompt_tokens_details": {"audio_tokens": 0, "cached_tokens": 2048}}}
like this

#

but like i fingerprint my data so my trace logs are often, much more robust - depenging on which system

mental sentinel
#

@wooden iris, have you tested this function calling before? https://platform.openai.com/docs/guides/function-calling/function-calling?api-mode=responses&example=search-knowledge-base
From my observation, if this framework doesn’t have ReAct behavior or self-reflection — if it only keeps trying until the goal is done (retry behaviour), or cannot truly assess the quality of its own output — then it's no different from calling tools in a regular way.

P.S. Fact check: the plain SDK is 30% faster than agent frameworks (TTFT + cost)

wooden iris
#

this?

#

like i said, i have alot of various system, some track finaces, some track, congitive stuff, some track telemtry, i just need to what specifics

#

i have like 20 levels of analuytics

#

and systems to test sytems, trying to help just need to know the level of data you want

mental sentinel
# wooden iris

this trace and setup for COT and response expansion is looking good! but not that :<

wooden iris
#

thats a outdated one

#

uh

#

maybe this one i think it has sciency stuff

mental sentinel
#

this u see this trace before <wandb>

#

This is the default flow without any specific setup:
Agent A calls Agent B (here, agent_travel_recommendation_as_tool), which responds to A when it finishes. Simple. But if some action gets stuck or fails, what is the behaviour of Agent A?

wooden iris
#

yea bro.....

#

i be on wandb i have trace data, in better form.

#

but i can link it here..

mental sentinel
#

So just get some failed traces and look at what happened, bro. I'm sure that with openai-agents there are no failures here, but what I actually want to find is the behaviour when some agent fails and that propagates back to Agent A / the Orchestrator: what is the next action of Agent A / the Orchestrator?

wooden iris
#

i can do that, i think. " this trace and setup for COT and response expansion is looking good! but not that :<" yea , im building a video game engine with zayara project, dynamically expansive

#

so u just want tool usage, token cost, and failure rates?

#

in jsonl?

#

or just the trace

#

on wandb/openai

mental sentinel
#

trace with fail tool actions didnt feedback for wandb

#

just need to read logs with simple behaviours

#

so we will find the purest behaviours of this openai-agents framework and idea or scope behind this

#

It is late in my country. If I have time, I will dig into the source code, docs, or issues for more info. But like you said, none of us has found evidence of a ReAct (reasoning-action), self-reflection, or plan-then-execute pattern/behaviour in this framework. Maybe it isn't implemented by default right now.

wooden iris
#

for clarity your looking for what happens to the failure in something like this

mental sentinel
#

Need one more person to check that. Thanks @wooden iris for your efforts. I love the guys who build the world through their engineering skills

mental sentinel
mental sentinel
wide condor
#

OpenAI agents library… those agents work in a loop with the tools.

  1. You start the agent with its prompt and its tool definitions, plus the user prompt.
    1.1 The agent context is initialised with the agent prompt, tools, and user message.
  2. A request is sent to the OpenAI LLM (whichever model). The LLM answers with a function call request. The lib runs the function, and the context is updated with the fact that a function call was requested with X parameters and that it returned Y.
  3. The lib calls the LLM again with the new context (that is, the instructions, tool definitions, initial user message, and tool I/O). Then the LLM answers with more requests to execute tools.

And repeat

#

Until the agent returns something that is not a function call

#

That is, text or json (depending on the agent output type you requested)
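
As a rough illustration of that loop, here is a self-contained sketch with a scripted fake model in place of the LLM. `Message`, `Model`, `Tools`, `agentLoop` and the `maxTurns` guard are hypothetical names, not the SDK's real types, and everything is synchronous for brevity, whereas real model calls are asynchronous.

```typescript
// Hypothetical message and model types; not the SDK's real types.
type Message =
  | { kind: "text"; text: string }
  | { kind: "tool_call"; name: string; args: string }
  | { kind: "tool_result"; name: string; output: string };

type Model = (context: Message[]) => Message; // returns text or a tool call
type Tools = Record<string, (args: string) => string>;

function agentLoop(model: Model, tools: Tools, userMessage: string, maxTurns = 10): string {
  const context: Message[] = [{ kind: "text", text: userMessage }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = model(context);
    if (reply.kind === "text") return reply.text; // not a function call: stop
    if (reply.kind !== "tool_call") throw new Error("model returned a tool result");
    // Run the tool and append both the request and its output to the context.
    const tool = tools[reply.name];
    const output = tool ? tool(reply.args) : `unknown tool ${reply.name}`;
    context.push(reply, { kind: "tool_result", name: reply.name, output });
  }
  throw new Error("max turns exceeded");
}

// Scripted fake model: retries with different code after seeing an error.
const fakeModel: Model = (context) => {
  const last = context[context.length - 1];
  if (last.kind === "text") return { kind: "tool_call", name: "runCode", args: "import numpy" };
  if (last.kind === "tool_result" && last.output.startsWith("error"))
    return { kind: "tool_call", name: "runCode", args: "sum([1, 2, 3])" };
  if (last.kind === "tool_result") return { kind: "text", text: `Result: ${last.output}` };
  return { kind: "text", text: "unexpected" };
};

const tools: Tools = {
  runCode: (code) => (code.includes("numpy") ? "error: numpy not installed" : "6"),
};

console.log(agentLoop(fakeModel, tools, "Add 1, 2 and 3")); // Result: 6
```

Note that the loop itself contains no retry logic. The second tool call happens only because the model sees the error text in its context and chooses to try again.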

#

Therefore, you do not need one agent to write code and one agent to execute code

#

You need one agent that writes code and executes it with a function

#

Then this agent by itself is going to be able to read the code output

#

And plan accordingly

#

Like “I want to execute this”
Function call happens. STD output is returned to the agent as input.
“Oh, I did it badly, looks like I tried to import numpy, and the lib is not installed. I’ll do it with regular lists”

#

And generate new code, to be executed by the tool

#

The philosophy is the normal agent loop with tools

#

Like langchain or pydantic ai agents

#

Most of those frameworks sprinkle some more stuff on top of the core “agent loop with tools”

#

The OpenAI agents lib sprinkles in some:

  • agent as tools, that is wrapping an agent as a tool for another agent.
  • agent handoffs
  • audio streaming agents
#

Langchain leans more towards context management, like being able to modify the context that is sent to the agent at each turn. The OpenAI lib is pretty closed in that sense

#

So going back to your ABC agents

#

If agent B and agent C are tools for agent A, then if agent C fails to execute the code, and that failure is propagated as a response (not as a failure where the software stops), agent A is going to see the failure and react according to the behaviour described in its prompt
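
One way to make sure a failure arrives "as a response" is to catch it at the tool boundary and return it as data. A minimal sketch, assuming a hypothetical wrapper `asSafeTool` and a result shape that echoes the `success` / `failureReason` fields suggested earlier in the thread; none of these names come from the SDK.

```typescript
// Hypothetical structured tool result; field names echo the thread's example.
type ToolResult = { success: boolean; result: string; failureReason: string | null };

// Wrap a function that may throw so the failure comes back as data
// the calling agent can read, rather than as an exception that ends the run.
function asSafeTool(fn: (input: string) => string): (input: string) => ToolResult {
  return (input) => {
    try {
      return { success: true, result: fn(input), failureReason: null };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      return { success: false, result: "", failureReason: reason };
    }
  };
}

// Toy executor that throws on any code mentioning numpy.
const rawExecute = (code: string): string => {
  if (code.includes("numpy")) throw new Error("ModuleNotFoundError: numpy");
  return "ok";
};

const safeExecute = asSafeTool(rawExecute);
console.log(safeExecute("import numpy").failureReason); // ModuleNotFoundError: numpy
```

With the wrapper in place, agent A receives a normal tool output in both cases and can decide, per its prompt, whether to retry, call the code generator again, or answer the user.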

#

However, I strongly recommend the simpler topology of having one agent with a tool to execute code (instead of agent B and agent C). Then call this coding agent as a tool from agent A.

mental sentinel
#

With agent as tool, the agent must try to achieve the goal or intention based on its external tools and a well-defined system prompt