Posted By Wanda Rich
Posted on April 12, 2024

AI and Open-Source: How AI Architect Anjanava Biswas’ Latest Research is Shaping Open-Source Adoption
December 06, 2023
By Mae Cornes
Amazon AI engineers like Anjanava Biswas are helping make open-source AI mainstream through cutting-edge research
The year 2023 was marked by a whirlwind of artificial intelligence (AI) innovations, most notably OpenAI’s ChatGPT going mainstream. A long list of generative AI models followed, including Meta’s Llama, Google’s Bard, and many other large language models (LLMs) with varying abilities. The industry has just witnessed one of its most prominent technical conferences, AWS re:Invent 2023, held from November 27 to December 1, 2023.
The conference drew a staggering attendance this year, with more than 50,000 professionals from around the world pouring into Las Vegas. It focused heavily on AI and machine learning and introduced a slew of products now powered by generative AI. Amazon is throwing its hat into the generative AI ring with its own home-grown Titan LLM, Amazon Bedrock, Amazon Q, and a large selection of open-source LLMs made accessible via the cloud giant’s platform to anyone looking to use these powerful models.
While there was much excitement about these models’ capabilities, there were also questions about what they could and could not do. Much of this is tied to the fact that it has been historically challenging to integrate machine learning models into traditional software due to the steep learning curve involved in using them.
However, it was different this time. Not only are capable models available to solve many real-life problems, but a host of open-source tools are now being made available that simplify using these models.
Although several AI-related sessions were listed on the conference’s website, the most sought-after sessions were related to document processing with AI. Automated document processing has emerged as a leading AI use case over the past few years, mainly because people in different industries are increasingly interacting with documents. This is especially true in the banking and financial services industry, where organizations often have to deal with large amounts of paperwork.
Spotlight on Anjanava Biswas’ Open-Source Contributions
During the sessions related to document processing, the highlight was the research work done by AI engineer and solutions architect Anjanava Biswas. Biswas presented his work that week in two separate sessions entitled “Intelligent Document Processing at Scale with Generative AI.” Together, the sessions drew almost 400 attendees across the two days, and many more were turned away because the sessions were at capacity.
Biswas delivered an insightful presentation on his contributions to an open-source system related to a computer vision-based AI model named Amazon Textract. He dove deeper into the science behind the AI model and the open-source system’s utility to a room full of researchers and engineers.
In another session, entitled “Enable Generative AI Trust and Safety with Amazon Comprehend,” Biswas’ research and contribution to LangChain, a popular open-source generative AI framework, were highlighted. The contribution uses a natural language processing AI model called Amazon Comprehend to help enforce trust and safety mechanisms for software utilizing LLMs.
Both the open-source tools Biswas created and those he contributed to are available to the public via the Python Package Index (PyPI), the most popular repository of software for the Python programming language.
According to PyPI’s publicly available statistics, Biswas’ contributions to the open-source tools have drawn a large audience, with the software download numbers rising from a few thousand to tens of thousands and, in some cases, up to a million downloads per week.
For instance, according to pypistats.org and piptrends.com, one of Biswas’ open-source AI packages, amazon-textract-textractor, averages 1,500 daily downloads. LangChain, another popular open-source package to which Biswas is a major contributor, averages 200,000 daily downloads.
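To relate these daily averages to the weekly figures mentioned earlier, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the numbers are the approximate averages quoted above, the dictionary keys and the `weekly_estimate` helper are hypothetical names chosen for this example, and a real week would vary from day to day.

```python
# Approximate daily download averages as quoted in the article.
# The dictionary keys are illustrative labels, not data fetched from PyPI.
DAILY_AVERAGES = {
    "amazon-textract-textractor": 1_500,
    "langchain": 200_000,
}


def weekly_estimate(daily_average: int) -> int:
    """Scale an average daily download count to a seven-day estimate."""
    return daily_average * 7


# Hypothetical helper output: one weekly estimate per package.
WEEKLY_ESTIMATES = {
    name: weekly_estimate(daily) for name, daily in DAILY_AVERAGES.items()
}

if __name__ == "__main__":
    for name, weekly in WEEKLY_ESTIMATES.items():
        print(f"{name}: roughly {weekly:,} downloads per week")
```

Under these assumptions, the smaller package lands in the tens of thousands of downloads per week, while the larger one exceeds a million.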
Going by the sheer numbers, it is clear that there is a growing appetite for open-source software that others can utilize not just for research purposes but also to build commercial applications that solve a plethora of real-life challenges with AI.
Biswas’ open-source contribution is noteworthy and is considered a driving force behind the safe adoption of both open-source and proprietary AI models. What particularly enhances the value of this contribution is the public accessibility of the source code for these tools. This openness promotes transparency and establishes public trust in AI software.
Not only are these open-source tools an asset to the AI community at large, but they also come with highly permissive licensing terms. These terms make it possible to customize the tools to fit any use case, to audit them with traditional audit-trail mechanisms, which is essential for accountability, and to address any vulnerabilities with the help of the open-source community.
The Future of AI and Open Source
In his sessions at AWS re:Invent 2023, Biswas provided insights into the future of AI and open-source technology. The sessions underscored how AI tools and open-source adoption are transforming various sectors. By enhancing AI’s accessibility and scalability, open-source frameworks are leveling the playing field for AI development, allowing more organizations to innovate and deploy AI solutions.
Experts like Biswas play a significant role in this evolution, guiding the development and deployment of safe AI with an emphasis on ethics and responsibility. Their research and contributions are essential for ensuring that AI systems are not only advanced and practical but also fair, transparent, and accountable.