How AI systems perceive their environment
You may own a doorbell video camera that can tell when a human is approaching your door, or when a package has been left. Ever wonder how that works? Environmental interaction involves the processes of sensing and acting: the AI agent perceives its surroundings through sensors, interprets the data, makes decisions, and then takes action to influence the environment based on those decisions. Data sources include images, audio, video, or anything else that gives the agent a comprehensive understanding of its surroundings.
Once the data is gathered, the AI system processes and interprets it to generate insights. That is, the agent works out what the data means: its structure, its relationships, and its potential implications. Is the person approaching your door delivering a package, or stealing one?
Then a feedback loop occurs, where the outcomes of actions, such as mistakenly flagging a person as a package thief, are sensed and reassessed. Based on that feedback, the AI system can adjust its future responses. The system learns that the person it once thought was stealing actually was not, and it won't make that assumption in the future.
Another example would be a fraud detection system on our phones, which happens to leverage agentic AI technology. After listening to a voicemail, the agent may conclude that it needs to protect the user, but the user could tell the agent that the caller is actually a known friend and not a fraud threat. After that, the agent won't suggest again that your friend is attempting to steal from you. It leverages a feedback loop to become smarter.
Agents also communicate with each other. In fact, the more complex a goal becomes, the more necessary agent-to-agent communication is. An agent that runs in isolation, never communicating with other agents, isn't worth much.
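To make the doorbell feedback loop concrete, here is a minimal Python sketch of the sense, interpret, act, and feedback steps. Everything in it is an assumption made for illustration: the `Observation` fields, the `DoorbellAgent` class, and the rule-based `interpret` method stand in for whatever sensors and models a real product would use.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """What the (hypothetical) camera pipeline reports about one event."""
    person_id: str              # assumed identifier from a vision model
    carrying_package: bool      # person arrives holding a package
    leaving_with_package: bool  # person walks away holding a package


@dataclass
class DoorbellAgent:
    # People the user has confirmed are harmless, learned from feedback.
    trusted: set = field(default_factory=set)

    def interpret(self, obs: Observation) -> str:
        """Turn raw sensor data into a label the agent can act on."""
        if obs.leaving_with_package and obs.person_id not in self.trusted:
            return "possible_theft"
        if obs.carrying_package:
            return "delivery"
        return "visitor"

    def act(self, label: str) -> str:
        """Choose an action based on the interpretation."""
        return "alert_owner" if label == "possible_theft" else "log_event"

    def step(self, obs: Observation) -> str:
        """One pass of sense -> interpret -> act."""
        return self.act(self.interpret(obs))

    def feedback(self, obs: Observation, was_false_alarm: bool) -> None:
        """The user corrects the agent; remember it for next time."""
        if was_false_alarm:
            self.trusted.add(obs.person_id)


agent = DoorbellAgent()
neighbor = Observation("neighbor", carrying_package=False,
                       leaving_with_package=True)

first = agent.step(neighbor)                   # agent raises an alert
agent.feedback(neighbor, was_false_alarm=True)  # user says it was fine
second = agent.step(neighbor)                  # same event, no alert now
```

In this sketch the "learning" is just a set of trusted identifiers. A real system would more likely update a model, but the shape of the loop is the same: act, observe the outcome, and change future behavior.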
In our agentic camera application, the agent in the camera could communicate with an agent in another camera, or with agents on a remote server, your personal computer, your phone; you get the idea. So how do agents talk? They use agent communication, which is essentially all the methods and protocols agents use to exchange information. Done correctly, this ensures that all agents work cohesively. That communication may include:
- Agent-to-agent messaging. This is basically two agents sharing commands, data, or anything else they need to do their jobs.
- Shared databases, or places where agents store and retrieve persistent data. This can be any database type and brand.
- Direct interaction, or an agent making a durable connection to share information. This is basically the middleware of agents.
Of course, this means that agents have been programmed to coordinate. Coordination involves the strategies and processes that manage interactions between many agents. For example, one agent monitors our doorbell video, another spots objects that need to be identified, and yet another decides what each object seen in the video is.
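The three doorbell agents just described can be sketched as a small message-passing pipeline. This is only an illustration under stated assumptions: the `Message` and `Agent` classes, the in-memory `Queue` inboxes, and the plain dictionary standing in for a shared database are all hypothetical, and a "frame" is faked as a string of words rather than real video.

```python
from dataclasses import dataclass
from queue import Queue


@dataclass
class Message:
    sender: str    # name of the agent that sent the message
    kind: str      # what the payload represents
    payload: dict  # commands or data shared between agents


class Agent:
    def __init__(self, name: str, store: dict):
        self.name = name
        self.inbox: Queue = Queue()  # agent-to-agent messaging
        self.store = store           # stand-in for a shared database

    def send(self, other: "Agent", kind: str, payload: dict) -> None:
        other.inbox.put(Message(self.name, kind, payload))


class MonitorAgent(Agent):
    def watch(self, frame: str, detector: "DetectorAgent") -> None:
        """Forward each captured frame to the detector."""
        self.send(detector, "frame", {"frame": frame})


class DetectorAgent(Agent):
    def run(self, classifier: "ClassifierAgent") -> None:
        """Split a frame into candidate objects and pass them on."""
        while not self.inbox.empty():
            msg = self.inbox.get()
            # Toy detection: every word in the fake frame is one object.
            for region in msg.payload["frame"].split():
                self.send(classifier, "region", {"region": region})


class ClassifierAgent(Agent):
    KNOWN = {"human": "person", "box": "package"}  # toy lookup table

    def run(self) -> None:
        """Label each object and persist the result for other agents."""
        while not self.inbox.empty():
            msg = self.inbox.get()
            label = self.KNOWN.get(msg.payload["region"], "unknown")
            self.store.setdefault("labels", []).append(label)


def run_pipeline(frame: str) -> list:
    """Coordinate the three agents in a fixed order for one frame."""
    store: dict = {}
    monitor = MonitorAgent("monitor", store)
    detector = DetectorAgent("detector", store)
    classifier = ClassifierAgent("classifier", store)

    monitor.watch(frame, detector)
    detector.run(classifier)
    classifier.run()
    return store.get("labels", [])


labels = run_pipeline("human box")
```

Here coordination is simply the fixed call order inside `run_pipeline`. In a larger system that role might fall to a scheduler or an orchestrating agent, and the inboxes and store would typically be a real message broker and database.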
Agentic AI architecture has two significant components: first, the architecture that handles processing within each agent, and second, the mechanisms that allow agents to talk to each other and coordinate their work.