Developer of the Natural Language Interface behind Apple’s Siri
Babak Hodjat is known as the primary inventor of the natural language technology behind Apple's voice-controlled assistant, Siri. His unique approach to natural language interfaces enabled AIs to 'listen' to commands without needing to understand the entire language.
In 1997, under the guidance of Professor Emeritus Makoto Amamiya, Hodjat began publishing his groundbreaking work on a key aspect of that technology, called Adaptive Agent Oriented Software Architecture (AAOSA). This work laid the foundation for his startup, Dejima, which Hodjat founded with friends from Kyushu University. In 2012, he began a new venture with Sentient Investment Management, where he introduced the world's first AI-run hedge fund: a pioneering project utilizing the world’s largest evolutionary AI.
Today, Hodjat conducts AI research and development at Cognizant. He focuses on exploring the potential of artificial life to address pressing global challenges such as the climate crisis, seeking innovative solutions to the world’s problems while contributing to the advancement of AI technology.
* This article appeared in Issue 5 of CONNECT, published in February 2024.
How did your research enable Siri’s development?
Dejima played a vital role in the CALO project that ultimately led to the development of Apple's Siri. CALO, short for “Cognitive Assistant that Learns and Organizes,” was a project that attempted to integrate numerous AI technologies into a cognitive assistant.
In our work, we developed an AAOSA system, of which I was the primary inventor. This system involved mapping text converted from speech waveforms to functions and user interactions. In this approach, an 'agent' represented a specific function, like managing volume controls on your radio, controlling TV channels, or playing back programs. These agents "listened" to language commands from the user and staked claims on how they could assist, which ultimately resulted in a coherent response to the user's intention.
What set this approach apart from prior methods was that the system didn't require a complete understanding of an entire language. The AAOSA system was employed in the original Siri as its natural language understanding component, effectively mapping natural language commands to Siri’s functionality.
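As a rough illustration of the claim-based idea described above, the following Python sketch shows agents that each "listen" for the words they care about, stake a claim, and let a simple arbiter pick the strongest one. It is a toy model written for this article, not the AAOSA implementation: the names `Agent`, `Claim`, `route`, and `AGENTS`, the keyword sets, and the rule of scoring a claim by the number of matched words are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A hypothetical claim: which agent matched which words."""
    agent: str
    matched: frozenset

    @property
    def score(self) -> int:
        # Assumed scoring rule: more matched words means a stronger claim.
        return len(self.matched)


class Agent:
    """A toy agent standing in for one function (e.g. volume control)."""

    def __init__(self, name, keywords):
        self.name = name
        self.keywords = frozenset(keywords)

    def claim(self, tokens):
        # The agent only looks for its own keywords; it never needs
        # to understand the rest of the sentence.
        matched = self.keywords & set(tokens)
        return Claim(self.name, frozenset(matched)) if matched else None


def route(command, agents):
    """Collect claims from all agents and return the strongest, or None."""
    tokens = command.lower().split()
    claims = [c for a in agents if (c := a.claim(tokens)) is not None]
    if not claims:
        return None
    return max(claims, key=lambda c: c.score)


# Illustrative agents, loosely following the radio/TV examples in the text.
AGENTS = [
    Agent("volume", {"volume", "louder", "quieter", "mute"}),
    Agent("channel", {"channel", "switch"}),
    Agent("playback", {"play", "replay", "program", "pause"}),
]

if __name__ == "__main__":
    for text in ("please turn the volume up a little",
                 "switch to channel five",
                 "what is the weather"):
        winner = route(text, AGENTS)
        print(text, "->", winner.agent if winner else "no claim")
```

In this sketch, words such as "please" or "a little" are simply ignored because no agent claims them, which mirrors the point that the system does not need to understand the entire language.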
What were some memorable moments for you at Kyushu University?
I arrived at Kyushu University in April 1997, and it was a remarkable experience. Initially, I was enrolled in a Japanese language course for six months and commuted to the main campus on a bicycle. The course not only taught me Japanese but also provided valuable insights into the life and culture of Japan through exceptional instructors.
Was it difficult to learn Japanese and adjust to a new country?
The Japanese people are known for their remarkable kindness and generosity, but for a newcomer, it can be a bit challenging to decipher subtle cultural cues. Language can be learned, but understanding the intricacies of cultural signals can be elusive at times, leading to comical misunderstandings. Nonetheless, I had a supportive network of friends, instructors, and Professor Makoto Amamiya, who guided me through this journey.
Could you tell us more about your work with Professor Amamiya?
Even before coming to Japan, I had been corresponding with Professor Amamiya, a leading expert in distributed AI. He graciously invited me to start working at his laboratory, even though I was still learning Japanese.
I had a clear vision of scaling up AI technology to make it accessible to more people, and Amamiya encouraged me to try challenging and novel methods. Without his support, I might have been a lot more conservative. I had always been a more practical scientist, but Amamiya’s insistence on sound math and proofs gave my work a strong foundation.
After your involvement with Siri, what other projects have you pioneered?
After Siri, I felt a desire to explore new avenues beyond natural language processing. A colleague proposed the idea of using AI for stock trading, which intrigued me. Between 2007 and 2008, we established Genetic Finance, where we built the world's largest evolutionary AI system. We used spare computing resources from internet cafes and game centers around the world to run simulations and develop algorithmic approaches.
By 2009, we started trading with real money and later launched a successful hedge fund. However, we decided to remain a technology-driven entity and spun off Sentient Investment Management. Another notable spin-off, Evolv.ai, focuses on website optimization and is widely utilized today.
How important is Integrative Knowledge to you?
Integrative knowledge is crucial, as it lets us bridge different disciplines and leverage diverse perspectives. I had the opportunity to lead a workshop organized by the World Economic Forum that brought together AI scientists and neurologists. These small, interdisciplinary interactions have proven to be immensely fruitful, as they facilitate the exchange of ideas and generate innovative collaborations. Combining expertise from various fields provides valuable insights and helps us navigate AI’s complexities responsibly.