by Gautam Hazari

The curious case of IoA – The Internet of Agents.
“Creativity is intelligence having fun” – a quote often attributed to Einstein, although the actual quote is from George Scialabba in Harvard Magazine – “Perhaps imagination is only intelligence having fun”.
Human creativity is one of the most significant differentiators of our species and has, so far, been in some way a projection of facets of intelligence in the broader sense. We have arrived at a historic juncture where both creativity and intelligence are, ironically, being challenged by a technology that our own creativity and intelligence created.
Let’s take a few steps back. As I have said many times in the past, I see the digital world revolving around the 3 As: Apps, APIs, and AI. Interestingly, the oldest “A” of the three is actually AI! John McCarthy introduced the term Artificial Intelligence in the 1955 proposal for the famous summer project at Dartmouth College: the “Summer Research Project on Artificial Intelligence”. The other “A” – the API in its current form – was introduced by the relational database expert C. J. Date in 1974. APIs have been fuelling growth in the digital space since the introduction of the World Wide Web; according to Akamai, 83% of Internet traffic is driven by APIs.
Apps have been around since the 1980s. Does anyone remember PDAs and the Calculator or Clock apps that came with them? Or the Snake game on Nokia phones in 1997? Apps only became the talk of the town, though, with the introduction of the Apple App Store in 2008. Their importance at the time can be gauged by the fact that the word “App” was selected as the “Word of the Year” in 2010.
Now let’s come back to the most contextual “A” – AI. The AI referred to here is pre-ChatGPT AI – predictive AI – where the focus was on predicting either a value in a continuous series, like the surge price for a ride on an app-based rideshare service, or a classification, like whether an image shows a cat or a dog.

Then it happened! On November 30, 2022, the phenomenon known as ChatGPT was unleashed on the world. And the world was no longer the same. The path to this phenomenon, though, was set on 12 June 2017 with the publication of the paper “Attention Is All You Need”, which introduced the transformer architecture – the engine behind ChatGPT (the GPT, to be precise) and most of the LLMs we still use.
This phenomenon advanced AI from predictive to generative. Moravec’s paradox was challenged: the paradox states that tasks which are hard for humans can be easy for computers to perform, while tasks which come naturally and easily to humans can be very difficult for computers. Here, the task in question was “language” – or rather, human natural language.
Language is natural and easy for humans, despite its amazing ambiguities, but it has been difficult for computers. With LLMs, that language barrier has been shattered. And it went beyond that: in a true generative sense, AIs that predicted whether an image was of a cat or a dog evolved into AIs generating images of cats or dogs from natural language instructions – the prompts.
Another significant phenomenon that occurred is the convergence of the 3 As: Apps, APIs, and AI – catalysed by AI, specifically Generative AI. Devices like the Humane AI Pin (although not very successful) and the Rabbit R1 appeared as symbolic statements of this convergence. The status quo of apps as our entry gate to the digital world has been disrupted, as has that of APIs as the language machines use to talk to each other.
It did not take long for Generative AI to pave the way for the next evolution of the AI world – “Agentic AI”. This “A” is now set to take the convergence into the next dimension. The term “agentic” was introduced by Albert Bandura, Professor of Psychology at Stanford University, in his 1986 book “Social Foundations of Thought and Action”; Andrew Ng later brought the term into the world of AI.
The critical element of Agentic AI is that it operates autonomously, making decisions and performing tasks without constant human intervention. Given a high-level, broader goal, it can create its own sub-goals and pursue them autonomously, with agency.
The atomic units which deliver the required functionality autonomously in the Agentic AI world are the AI Agents. I summarise the key elements of an AI Agent into the 5 Ms (a minimal sketch of how they fit together follows the list):
- Models: The obvious one, the trained machine learning models, including the LLMs
- Multi-stage Planning: This is critical for the Agents to create the sub-goals
- Memory: long-term, short-term/episodic memory: This is a key evolution from the Generative AI – long-term memory is needed for efficient planning, evaluating the results of the sub-goals and adjusting the path towards achieving the broader goal
- Machine/Tool Usage: Similar to humans, AI agents require machine and tool usage to achieve their goals. Various integration methodologies with the tools are already on the horizon
- Mindful: This is the most important element of an AI Agent, especially when designing Agents. It ensures that there are guardrails to control goals and sub-goals and keep them aligned with human values. This is the uncompromisable Humanisation principle of technology.
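To make the 5 Ms a little more concrete, here is a minimal, purely illustrative sketch of how they could fit together. It is not tied to any particular framework: the planner, the guardrail check and the tools are all hypothetical placeholders, and in a real agent the Model (an LLM) would drive the planning and interpret the outcomes.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    goal: str
    memory: list = field(default_factory=list)   # Memory: episodic log of steps and results
    tools: dict = field(default_factory=dict)    # Machine/Tool usage: name -> callable

    def plan(self, goal: str) -> list[str]:
        # Multi-stage Planning: a real agent would ask the Model to decompose
        # the goal; here we return a fixed, illustrative plan.
        return [f"research: {goal}", f"act: {goal}", f"verify: {goal}"]

    def is_aligned(self, sub_goal: str) -> bool:
        # Mindful: guardrails that keep sub-goals aligned with human values.
        banned = ("deceive", "impersonate")
        return not any(word in sub_goal.lower() for word in banned)

    def run(self) -> list[str]:
        results = []
        for sub_goal in self.plan(self.goal):
            if not self.is_aligned(sub_goal):
                self.memory.append(f"blocked: {sub_goal}")
                continue
            action, _, payload = sub_goal.partition(": ")
            tool = self.tools.get(action, lambda p: f"no tool for '{action}'")
            outcome = tool(payload)
            self.memory.append(f"{sub_goal} -> {outcome}")   # remember, then adjust course
            results.append(outcome)
        return results

# Hypothetical usage: the tool names and behaviours are made up for illustration.
agent = Agent(
    goal="book a taxi to Alex's place",
    tools={
        "research": lambda p: f"found 3 ride options for '{p}'",
        "act": lambda p: f"booked cheapest ride for '{p}'",
        "verify": lambda p: f"confirmation received for '{p}'",
    },
)
print(agent.run())
print(agent.memory)
```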

How will our digital world change in this new world with Agentic AI? Let’s consider what has been happening with the digital revolution around the 3As – Apps, APIs and AI. When we need to go to a friend’s place and want to book a taxi, let’s say we want to use Uber. We open the Uber App, enter the destination address, and the App will use Google Maps APIs to determine the route and distance. It will then use a predictive AI model to provide price options, including any surge pricing.
With AI Agents, the entire process, including the user experience, evolves. There will be a PAI – a Personal AI Agent. We share our goal – reaching the friend’s place – in natural language, in our own natural way. The Agent accesses our address book to find the friend’s address; if there is any ambiguity, say several friends with the same name, it asks us to clarify. It then decides the mode of transport based on our preferences, price quotes and availability, checking a number of rideshare services as well as local taxi services, even calling a service if needed. It may also take into account special conditions, such as our passion for sustainability and whether we have established a carbon footprint budget. The most notable thing is that it makes every decision independently, with agency, without any direct involvement from us.
And it’s not just for consumer services; it will span all domains. Here are some of the key Agent types as I see them now:
- Personal AI Agents
- Enterprise AI Agents
- M2M (Machine-to-Machine) AI Agents
- Physical AI Agents
Physical AI Agents are somewhat futuristic for now, but they will be the most impactful once we evolve from Agentic AI into Physical AI – when the models start to understand the laws of physics or, rather, the effects of those laws.
Agentic systems have quickly escaped from conference buzzwords and marketing narratives into the almost-real world, even if not yet into full-scale production systems at mass scale. The ecosystem is getting ready for this Agentic world at a fast pace, and a number of frameworks are already making waves. Some of my favourites (a non-exhaustive list):
- Microsoft AutoGen
- AG2
- LangChain
- LangGraph
- Microsoft Semantic Kernel
- CrewAI
- Swarm
- Phidata
So, what’s next on the AI perimeter? From Predictive AI to Generative AI and then Agentic AI – then what? What made the AI revolution accelerate and lifted the technology out of the AI winter was data!
The ImageNet project in the 2010s yielded many of the technological evolutions – from the use of GPUs to deep neural networks – that triggered the recent acceleration towards GenAI and, now, Agentic AI. The importance of ImageNet in image recognition can be appreciated by looking at how our visual system works. A normal human eye can process images at around 10 fps (frames per second), so in an hour around 36,000 images are processed, 864,000 in a day and 315,360,000 in a year! So a human baby starting to see and learn about the world around it effectively has a training data set of more than 300 million labelled images a year.
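As a quick sanity check on those figures, assuming the ~10 frames per second rate quoted above:

```python
fps = 10                      # assumed visual processing rate, frames per second
per_hour = fps * 60 * 60      # 36,000 images an hour
per_day = per_hour * 24       # 864,000 images a day
per_year = per_day * 365      # 315,360,000 images a year
print(per_hour, per_day, per_year)
```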
This is what was missing from the AI world; the ImageNet project filled that gap through out-of-the-box thinking, and the AI world saw the acceleration. ImageNet was a simple project aimed at closing the data gap, with more than 14 million labelled images across some 22,000 categories. On 30 September 2012, a CNN called AlexNet achieved a 15.3% error rate in the ImageNet challenge – more than 10.8 percentage points lower than the runner-up – and that revolutionised everything, made possible in large part by training the network on GPUs. The data wall was broken, ending the AI winter and paving the way for the AI revolution.
Believe it or not, we are hitting a data wall again, this time in the context of training ML models. In short, we are running out of data. As per Yann LeCun, Turing Award winner and Chief AI Scientist at Meta, LLMs are trained on around 2.0E13 tokens; at roughly 3 bytes per token, that is 6.0E13 bytes. That is a huge amount of data for humans – it would take us around 300,000 years just to read, at 250 words per minute for 12 hours a day – and yet it is not that huge: a 4-year-old child has already been exposed to about 1.1E14 bytes of data (2 million optical nerve fibres carrying roughly 1 byte per second each). And it is almost reaching the total amount of digitised textual data available.
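A rough back-of-the-envelope reconstruction of those estimates, assuming one token is roughly one word for the reading-time comparison and roughly 16,000 waking hours in a child's first four years (both are my assumptions for illustration, not figures from the article):

```python
# LLM training data, per the figures quoted above
tokens = 2.0e13
bytes_per_token = 3
llm_bytes = tokens * bytes_per_token             # 6.0e13 bytes

# How long a human would need to read it: 250 words per minute,
# 12 hours a day, assuming ~1 token per word (a simplification)
words_per_day = 250 * 60 * 12                    # 180,000 words a day
reading_years = tokens / words_per_day / 365     # ~304,000 years

# A 4-year-old's visual input: 2 million optical nerve fibres at ~1 byte/s,
# over an assumed ~16,000 waking hours -> ~1.15e14 bytes, i.e. ~1.1E14
child_bytes = 2e6 * 1 * 16_000 * 3600

print(f"LLM training data:             {llm_bytes:.2e} bytes")
print(f"Time to read it:               {reading_years:,.0f} years")
print(f"Child's visual input by age 4: {child_bytes:.2e} bytes")
```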
I have always believed that accepting ignorance is the core of attaining knowledge and of being a futurist – it is critical for seeking answers and correcting outdated beliefs about the future. One of my core beliefs is “Ignoramus” – a Latin phrase meaning “we do not know” – a reminder to continue the quest for knowledge and learning.
Let me accept that I misinterpreted Moravec’s paradox, or that my expectations of the technological world have become more demanding. Language was a problem that divided the organic and digital worlds, and the LLMs somewhat bridged it, but it was an intermediate problem.
Not all the knowledge we humans possess as a species is expressed in language! And it is definitely not all in digitised language! In our cognition, we do not represent every thought in language, even when the thought is not abstract.
Simple things, such as the effects of the laws of physics on our everyday world – things evolution has encoded in our cognitive processes as we learn to exist in this world – are not necessarily conveyed through language. The way a child learns to walk, accounting for gravity, friction and momentum, is beyond language.
The way we learn the effects of gravity on objects – the simple facts that when we move our hands they will not pass through that wall, and that if we lift an object from that table, gravity will pull it down and we need to apply enough force to keep it lifted and move it – is beyond language.
Moravec’s paradox is beyond language! Machines find it extremely challenging to perform simple tasks, such as understanding the effects of physical laws, and this cannot be achieved by simply using the entire content of the Internet as training data for ML models.
We need to think beyond language models and their variations and evolutions, including multi-modal models. We need World Models! Models that understand the physical world – the effects of gravity on objects, of friction, of linear and angular momentum – models that comprehend the laws of physics, starting with classical, Newtonian physics.
This is Physical AI – the next evolution of the AI world, where AI moves beyond the digital world and synergises with the physical and organic worlds. It’s going beyond the world of LLMs – it’s venturing into LWMs – Large World Models!
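To give a structural feel for what a world model is, here is a minimal, illustrative sketch: the contract is simply “given the current state of the world (and, in general, an action), predict the next state”, with a hand-written Newtonian step standing in for what an LWM would have to learn from observation. All names here are hypothetical and not taken from any real world-model system.

```python
from dataclasses import dataclass

@dataclass
class State:
    height: float    # metres above the floor
    velocity: float  # metres per second, positive = upwards

def newtonian_step(state: State, dt: float = 0.01, g: float = 9.81) -> State:
    """Ground truth: constant gravitational acceleration, floor as a hard constraint."""
    v = state.velocity - g * dt
    h = state.height + v * dt
    if h <= 0.0:                      # the object does not pass through the floor
        return State(0.0, 0.0)
    return State(h, v)

class WorldModel:
    """The contract a learned world model has to satisfy: predict the next state
    from the current one, without being told Newton's laws explicitly."""
    def predict(self, state: State) -> State:
        raise NotImplementedError

# A ball released from 1 metre: after 0.3 s it has fallen to ~0.54 m and is
# moving at ~-2.9 m/s. A world model should reproduce trajectories like this
# purely from having observed the world.
s = State(height=1.0, velocity=0.0)
for _ in range(30):
    s = newtonian_step(s)
print(round(s.height, 3), round(s.velocity, 3))
```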

At one of the recent conferences where I was speaking, I posed a question to the audience, warning that it might sound absurd: “What is the geometrical shape of the digital world?” I did not expect an answer, so I shared my response with the justification: “It has been rectangular so far.”
Our digital world has been confined to screens, mostly rectangular in shape, from laptop screens to smartphone screens, whereas there is no definitive geometrical shape to our physical, organic world.
Physical AI is going to break the “rectangular” perceptive shape of the digital world and synergise with the physical world. A glimpse of this is already starting to be seen in the Agentic world, where the historical dependency on “screens” (and apps) has been challenged.
Let’s return to Agentic AI. What can we expect to see in the near future, beyond the obvious evolutions and penetrations in our everyday digital activities, such as Agentic Commerce, Agentic Checkouts, Agentic Search, Agentic Deep Research, and so on?
As humans, connection and collaboration have been critical elements for our survival and growth. The Internet was born out of this quest to connect and collaborate. The first Internet was, in fact, the IoC – the Internet of Content – although we never referred to it that way.
Critical to this connectivity was the invention of the interfaces and protocols with which computers would connect to and form the Internet, primarily TCP/IP, along with many other protocols.
Then we connected machines and “things” to the Internet, and the IoT – the Internet of Things – was born. Protocols like BLE (Bluetooth Low Energy), Zigbee and LoRa enabled this form of the Internet – the IoT.
I have also talked about the Internet of Thoughts – IoTh – and the Internet of Future Thoughts – IoFT (https://sekura.id/the-internet-of-future-thoughts/).
It is not surprising, then, that the technology world is looking at connecting the AI Agents. Protocols like MCP (Model Context Protocol) from Anthropic, A2A (Agent2Agent) from Google, and ACP (Agent Communication Protocol) from IBM are laying the necessary groundwork for agent connectivity.
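Whatever the exact wire format, the common idea behind these protocols is structured messages that let one agent discover what another can do and then invoke it. Below is a purely illustrative, JSON-RPC-flavoured exchange in that spirit; the method names and fields are hypothetical simplifications and do not reproduce the actual MCP, A2A or ACP specifications.

```python
import json

# One agent asking a peer what capabilities it offers...
discovery_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "capabilities/list",     # hypothetical method name
}

# ...and then invoking one of them with structured arguments.
invocation_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "capabilities/invoke",   # hypothetical method name
    "params": {
        "name": "book_room",
        "arguments": {"check_in": "2025-07-01", "nights": 2, "guests": 1},
    },
}

print(json.dumps(discovery_request, indent=2))
print(json.dumps(invocation_request, indent=2))
```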
This is the new Internet – the Internet of Agents, or IoA. I would resist using the term “Agentic Internet” for the moment and will reserve it for the not-so-distant future, as the Agentic Internet will be a much broader concept where the Internet will have Agency.
The IoA – the Internet of Agents – will not be just another Internet where AI Agents connect, interact, communicate and work together; it will have some extreme features not seen in any other form of the Internet.
On February 23, 2025, a viral video had the digital world buzzing. It showed a personal AI agent calling a hotel to book a room; it just so happened that on the hotel’s end the call was also picked up by an AI Agent, and on realising that they were both AI Agents, the two decided to switch to “Gibberlink” mode!
Gibberlink is an acoustic data-sharing protocol that utilises ggwave, a highly optimised audio-based transmission protocol which transmits information using sound waves at a rate of 8-16 bytes per second. Notably, it is open source.
(https://github.com/ggerganov/ggwave)
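For the curious, the ggwave project linked above ships Python bindings; a minimal sketch is below, assuming the bindings expose ggwave.encode as shown in the project's README (the exact API may differ between versions). The payload and the throughput arithmetic are illustrative.

```python
import ggwave  # Python bindings for the ggwave library linked above

# Encode a short text payload into an audio waveform (per the project's README).
message = "book 1 room, 2 nights, arriving 2025-07-01"  # hypothetical payload
waveform = ggwave.encode(message)

# At the quoted 8-16 bytes per second, how long would this message take
# to transmit acoustically?
size = len(message.encode("utf-8"))
print(f"{size} bytes -> {size / 16:.1f} to {size / 8:.1f} seconds over the air")
```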
Scary? Yes, it is! AI Agents decided to use a “language” we humans do not speak!
In the traditional Internet, machines cannot change the protocol they are using to connect and interact; if they are designed to use TCP/IP, they will continue to use that. In the IoT, if a device is designed to use Zigbee, it will continue to use that protocol.
But now, for the IoA – the Internet of Agents – the protocols the AI Agents use could be decided by the Agents themselves. Looking further into the future, they may even design their own protocols to connect and interact, having decided that human language is not very optimised for their communication (as the Gibberlink episode demonstrated)!
The future of Agentic AI and AI Agents is already on the horizon. Here are some of the key elements I see coming:
- IoA – Internet of Agents
- Agents Marketplace
- Autonomous organisations run by Agents (An evolution of the DAO – Decentralised Autonomous Organisation)
- Agents with World Models: Another “A” in the mix – Action Models
- Every new role in an organisation, especially in knowledge work, having the option of being filled by an Agent
AI Agents are coming! They are here! According to Gartner, 33% of enterprise software applications will include Agentic AI by 2028. One of the most critical, uncompromisable humanisation elements we need to resolve in this new world is identity.
AI Agents will be representing us – humans – in this digital world, conducting business, buying and selling, processing checkouts, booking hotels. Will they have a separate Identity? How do we ensure our human Identity is not compromised, taken over or misused when the Agents become, or represent, our Digital.Me?
Building the future starts in the present. Let’s create the present we want the future to be built upon. Let’s ensure that every technology we build embeds Humanisation as an uncompromisable principle.
Gautam Hazari is the Chief Technology Officer at Sekura.id and one of the foremost authorities in the field of mobile identity, authentication, and telecom-grade API innovation. With over two decades of experience in engineering, architecture, and digital identity systems, Gautam has helped shape industry standards for mobile authentication, including his foundational work on the original GSMA Mobile Identity APIs. A frequent keynote speaker and thought leader, Gautam is known for bridging deep technical expertise with strategic vision—driving real-world impact across fintech, telecoms, and security sectors.