I have an AI agent that loads data from an external API, analyzes it, and provides insights to the user.
As a first step, the model calls a tool to obtain data from the API.
Because the JSON data retrieved by the tool can be large (100-200 KB), I decided not to return it directly to the model;
instead, I store it in a storage layer and return a reference ID to the model, along with a short summary of the retrieved data.
If the tool is called multiple times during a single graph run, each result is stored under a different reference ID.
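For context, the tool is roughly structured like this (a simplified sketch; `api_client.fetch`, `storage.put`, and the key format are illustrative stand-ins, not the exact interfaces I use):

```python
import json
import uuid

def fetch_and_store(api_client, storage, run_id: str, query: str) -> dict:
    """Call the external API, persist the raw JSON, and hand only a reference back to the model."""
    data = api_client.fetch(query)          # large JSON payload (100-200 KB), hypothetical client
    ref_id = f"{run_id}:{uuid.uuid4()}"     # entries are keyed per graph run
    storage.put(ref_id, json.dumps(data))   # keep the full payload out of the model context
    summary = f"Retrieved {len(data)} top-level items for query '{query}'"
    return {"reference_id": ref_id, "summary": summary}  # only this goes back to the model
```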
Once the model decides that the data-retrieval phase is finished, it analyzes the data saved in the storage.
Currently, it does this by:
- generating Python code based on the JSON schemas of all the data stored in the storage
- executing the generated Python code, which loads the data from the storage using the reference IDs
At the end, the model generates insights based on the analysis results.
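The generated analysis code typically ends up looking something like this (illustrative only; `storage.get` mirrors the hypothetical `put` above, and the field names depend on whatever JSON schema the model was shown):

```python
import json

def analyze(storage, reference_ids: list[str]) -> dict:
    """Load stored payloads by reference ID and compute simple aggregates."""
    records = []
    for ref_id in reference_ids:
        payload = json.loads(storage.get(ref_id))   # hypothetical storage API
        records.extend(payload.get("items", []))    # "items" is a placeholder field from the schema
    counts = {}
    for item in records:
        category = item.get("category", "unknown")  # placeholder field
        counts[category] = counts.get(category, 0) + 1
    return {"record_count": len(records), "counts_by_category": counts}
```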
IMPORTANT! In the current version of the agent, all data in the storage is deleted after each graph run.
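Concretely, that cleanup step amounts to deleting everything keyed under the current run (again simplified; `list_keys` and `delete` stand in for whatever the storage backend actually exposes):

```python
def cleanup_run(storage, run_id: str) -> None:
    """Remove every entry written during this graph run."""
    for key in storage.list_keys(prefix=f"{run_id}:"):  # run-scoped keys, as in the sketch above
        storage.delete(key)
```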
I can see two approaches to managing the data kept in the storage:
1. Current approach: delete all data from the storage after each graph run.
Pros:
- keeps the storage clean and avoids accumulation of data over time
- ensures that each graph run starts with a fresh state
Cons:
- in multi-turn conversations, the model cannot access data from previous turns, so I must make sure that all necessary data is retrieved again in each turn
2. Alternative approach: keep data in the storage between conversation turns.
Pros:
- allows the model to access data from previous turns
Cons:
- I'm not sure how to prompt the model to access only the relevant data in the storage (i.e., to generate code that loads only the necessary entries by reference ID); one possible direction is sketched after this list
- I will also need to expire stored data after some amount of time
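For approach 2, the direction I have in mind (again a sketch with made-up names) is to store a small metadata record next to each payload, expose that metadata to the model as a manifest so it can pick only the relevant reference IDs, and expire entries by age:

```python
import json
import time

TTL_SECONDS = 24 * 60 * 60  # example TTL: one day

def put_with_metadata(storage, ref_id: str, data: dict, description: str) -> None:
    """Store the payload plus a small metadata record used for the manifest and for expiry."""
    storage.put(ref_id, json.dumps(data))
    storage.put(f"meta:{ref_id}", json.dumps({
        "description": description,   # short, model-facing summary of what this entry contains
        "created_at": time.time(),
    }))

def build_manifest(storage) -> list[dict]:
    """List stored entries (reference ID + description) to include in the prompt."""
    manifest = []
    for key in storage.list_keys(prefix="meta:"):
        meta = json.loads(storage.get(key))
        manifest.append({"reference_id": key.removeprefix("meta:"), **meta})
    return manifest

def expire_old_entries(storage) -> None:
    """Delete payloads (and their metadata) older than the TTL."""
    now = time.time()
    for key in storage.list_keys(prefix="meta:"):
        meta = json.loads(storage.get(key))
        if now - meta["created_at"] > TTL_SECONDS:
            storage.delete(key.removeprefix("meta:"))
            storage.delete(key)
```

The idea would be to put the manifest into the prompt for the code-generation step, so the generated code only loads the reference IDs the model actually selected.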
Which approach would be better for this problem? Are there other approaches I should consider?