

Danish Bansal
AI/ML Consultant, Unthinkable Solutions
Danish Bansal is an AI/ML Consultant at Unthinkable Solutions, specializing in applying generative AI to enterprise-scale data engineering problems. With hands-on experience across industries like food tech, real estate, and education, he has built LLM-powered solutions that process millions of records in real time.

Anmol Satija
Host
Anmol Satija is driven by curiosity and a deep interest in how tech impacts our lives. As the host of The Unthinkable Tech Podcast, she breaks down big tech trends with industry leaders in a way that’s thoughtful, clear, and engaging.
Episode Overview
What if your data team could spend less time on cleaning and tagging, and more on solving strategic business problems?
That’s exactly the kind of shift Generative AI is bringing to the world of data engineering—and in this episode of The Unthinkable Tech Podcast, we unpack how.
Host Anmol Satija is joined by Danish Bansal, AI/ML Consultant at Unthinkable Solutions, who shares his experience building GenAI-powered data solutions across sectors like food tech and real estate. The conversation explores how GenAI is used to enrich data pipelines, automate structuring of massive datasets, generate insights, and empower non-technical users to query data using natural language. Danish also dives deep into ethical considerations, governance frameworks, and compliance hurdles that come with adopting GenAI at scale.
Chapters covered:
- What sets generative AI apart from traditional AI?
- Evolution of GenAI: From transformer models to real-world impact
- Ethical and governance challenges in Generative AI
- Measuring ROI from AI-powered data enrichment
- Long-term strategic benefits of GenAI in data workflows
- The road ahead
Transcript
Anmol: Did you know that organizations leveraging Generative AI can see productivity increases of up to 40%? In this exciting episode of the Unthinkable Tech Podcast, host Anmol Satija is joined by Danish Bansal (AI/ML Consultant, Unthinkable Solutions) to explore how this innovative technology is transforming data engineering practices. From automating mundane tasks to enriching data insights, discover the practical applications of Generative AI and the ethical considerations every organization must keep in mind. Don’t miss out on insights that could reshape your approach to data!
Anmol: Hello and welcome back to the Unthinkable Tech Podcast, the pulse on technology that is our future. I’m your host, Anmol Satija, and today we’ll be covering a topic everybody is going crazy about. Any guesses?
So yeah, you guessed it right. It is Generative AI. Gen AI has emerged as a game changer across various domains of product engineering, revolutionizing the way we approach design, development, and even data management. In fact, a recent report by McKinsey reveals that organizations implementing AI in their operations have seen productivity increases of up to 40%. That is huge, right?
From automating routine tasks to generating sophisticated predictive models, GenAI is reshaping the way teams collaborate and innovate. But today, we’ll be discussing something specific: how GenAI is reshaping the field of data engineering. As data volumes continue to explode and are expected to reach around 175 zettabytes by 2025, companies are under immense pressure to extract meaningful insights quickly and efficiently.
So, Gen AI is acting as a stepping stone to streamline data pipelines, automate data cleansing, and even assist in data modeling, allowing data professionals to focus on strategic tasks rather than mundane operations. To help us explore this dynamic intersection of Gen AI and data engineering, I am excited to welcome our guest for today’s episode, Danish Bansal, a tech expert from Unthinkable Solutions. Danish brings a wealth of experience in leveraging AI technologies to drive data-driven decision-making and has been at the forefront of integrating GenAI into data engineering practices.
Anmol: Hi, Danish. Welcome to the show.
Danish: Hi, Anmol. Thanks for inviting me. I am very excited to be here.
GenAI vs traditional AI: A paradigm shift
Anmol: Great! So, let’s get the excitement rolling and dive straight into our conversation. To set the stage for our discussion, let’s dive into the specifics of GenAI. I want to ask, what really sets Gen AI apart from traditional AI? And as you have been closely following this space, how have you seen Gen AI evolve over the years in terms of its capabilities and applications?
Danish: So, Anmol, that’s a great question. First of all, it’s essential to understand that this isn’t just a change in how we make predictions; it’s a whole paradigm shift. In traditional AI, we generally classify things: whether an image shows a cat or a dog, or whether there is a person in an image. With generative AI, however, we actually generate text, images, videos, and audio from scratch. We are not just predicting something.
How did GenAI evolve?
That is the main difference between traditional AI and generative AI. To talk about the evolution of generative AI, I first want to mention the transformer models that came before GPT and other large language models (LLMs). The shift from transformers to large language models required significant computational resources. For instance, training a model like GPT-3 required around 45 terabytes of textual data, and handling that takes dozens or even hundreds of GPUs.
To put it into perspective, training these models requires very large infrastructure, and not just training: deployment and inference also demand a high number of GPUs. When it comes to the impact on industries, take finance as an example: generative AI can generate synthetic data resembling fraudulent transactions. This synthetic data can be used to train fraud detection models, enhancing our datasets.
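To make the synthetic-data idea concrete, here is a minimal sketch, assuming the OpenAI Python SDK with an API key in the environment; the model name and transaction fields are illustrative, not details from the episode:

```python
# Hedged sketch: asking an LLM for synthetic fraud-like transactions to
# augment a fraud-detection training set. Model and fields are assumptions.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Generate 5 synthetic card transactions that resemble common fraud "
    "patterns (unusual amount, location, or merchant). Respond with JSON: "
    '{"transactions": [{"amount": ..., "currency": ..., "merchant": ..., '
    '"country": ..., "label": "fraud"}]}'
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any capable chat model works
    messages=[{"role": "user", "content": prompt}],
    response_format={"type": "json_object"},
)

records = json.loads(response.choices[0].message.content)["transactions"]
print(records[:2])  # inspect before mixing into real training data
```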
While we have seen improvements and advancements in generative AI, it’s important to note that the quality of generated content, particularly in video, is still not up to the mark. We have stable diffusion models for generating images and voice models that can generate voices, but research on video generation is ongoing and is expected to improve in the future.
Anmol: That was a great overview, Danish. The breadth of Generative AI’s application is certainly impressive. Now let’s dive deeper into this technology and see how it is specifically contributing to data engineering. What are some of the key ways you see Generative AI enhancing this particular field?
Real-world use cases of GenAI in data enrichment
Danish: Generative AI is indeed revolutionizing data engineering, particularly in the realm of data enrichment. When I say data enrichment, I mean adding new features or information that is not currently present in the data. One of the most significant contributions of generative AI is its ability to automate and enhance this refinement process without requiring a human to review every record.
For instance, if I provide GPT or an LLM with unstructured data, it can structure that data. For example, if I give a textual review, it can generate a JSON response indicating whether the review is positive, neutral, or negative, along with a rating. This structured response allows organizations to scale their data operations efficiently.
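A minimal sketch of that review-to-JSON flow might look like this, again assuming the OpenAI Python SDK; the model name and output schema are illustrative:

```python
# Minimal sketch of the review-structuring flow described above: an LLM
# turns a free-text review into a fixed JSON shape. The model name and
# output schema are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()

review = "The biryani was fantastic, but delivery took over an hour."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any capable chat model works
    messages=[
        {
            "role": "system",
            "content": (
                "You label restaurant reviews. Respond with JSON: "
                '{"sentiment": "positive" | "neutral" | "negative", '
                '"rating": 1-5}'
            ),
        },
        {"role": "user", "content": review},
    ],
    response_format={"type": "json_object"},
)

structured = json.loads(response.choices[0].message.content)
print(structured)  # e.g. {"sentiment": "neutral", "rating": 3}
```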
I’d like to give an example of a client in the food tech industry in the Middle East. They wanted to add a new feature for cuisines to their existing restaurant data. They had data on all the restaurants and their meals, but they lacked information on which cuisine each meal belonged to. We used generative AI to enhance this data.
While this might sound like a simple problem, at data engineering scale it wasn’t: we had around 8 million meals to process, and the catalog was updated every hour with about 150,000 new meals that also had to flow through the pipeline. With generative AI, they were able to offer a cuisine filter with only a 20-minute end-to-end pipeline delay.
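The shape of such a pipeline can be sketched as follows; the batch size, model name, and cuisine list are all assumptions for illustration:

```python
# Hedged sketch of the hourly enrichment pipeline: new and updated meals
# are chunked and sent to an LLM that maps each meal id to a cuisine.
# Batch size, model name, and the cuisine list are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()
CUISINES = ["Indian", "Lebanese", "Italian", "Japanese", "Other"]  # assumed

def tag_cuisines(meals: list[dict], chunk_size: int = 50) -> dict[str, str]:
    """Return a meal-id -> cuisine mapping for one hourly batch."""
    tags: dict[str, str] = {}
    for i in range(0, len(meals), chunk_size):
        chunk = meals[i:i + chunk_size]
        listing = "\n".join(f"{m['id']}: {m['name']}" for m in chunk)
        prompt = (
            f"Classify each meal into one of {CUISINES}. "
            "Respond with JSON mapping meal id to cuisine.\n" + listing
        )
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumption
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"},
        )
        tags.update(json.loads(resp.choices[0].message.content))
    return tags
```

Chunking keeps each request within context limits while letting the hourly batch of roughly 150,000 meals be spread across parallel workers.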
Another example is a UK client in the real estate sector. They had data on all properties in the UK and wanted to show users the best possible image of each property. They had around 10 to 15 images for each property, but selecting the best one was a manual task. We implemented a pipeline that uses LLMs to recommend the best-suited image automatically.
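In spirit, the selection step might look like this hypothetical sketch, where the scoring rubric and model choice are assumptions rather than the client’s actual system:

```python
# Hypothetical sketch of the best-image selection step: a vision-capable
# LLM scores each property photo and the pipeline keeps the top scorer.
import json
from openai import OpenAI

client = OpenAI()

def pick_best_image(image_urls: list[str]) -> str:
    """Return the URL of the photo best suited as a listing cover image."""
    scores: dict[str, float] = {}
    for url in image_urls:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumption: any vision-capable model
            messages=[{
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": (
                            "Rate this property photo from 0 to 10 as a "
                            "listing cover image. Respond with JSON: "
                            '{"score": <number>}'
                        ),
                    },
                    {"type": "image_url", "image_url": {"url": url}},
                ],
            }],
            response_format={"type": "json_object"},
        )
        scores[url] = json.loads(resp.choices[0].message.content)["score"]
    return max(scores, key=scores.get)
```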
Ethical risks and governance challenges in GenAI adoption
Anmol: Thank you for those insightful examples, Danish. It is evident that GenAI is not only enhancing data enrichment but also enabling organizations to operate at remarkable scales and efficiencies. However, with such powerful technology comes a set of responsibilities that can’t be overlooked, right?
As companies integrate Gen AI into their data engineering processes, they must grapple with a range of ethical and governance challenges. What specific ethical and governance considerations should organizations be aware of when adopting Gen AI? How can companies effectively address these challenges to harness the benefits of this transformative technology while maintaining integrity and compliance?
Danish: That’s a crucial topic to address as organizations embrace generative AI. There are several ethical and governance challenges that companies should be mindful of, especially regarding data safety. Nowadays, data is key for everything—it’s often referred to as the new oil. Protecting that data is the topmost priority for every organization, startup, and product.
Data privacy is paramount. Many APIs that you might be using, whether it’s OpenAI, Google’s Gemini, or any other model, may retain your data for research or training purposes. You need to be careful when sending your information out to these APIs. Regulations like GDPR in Europe and CCPA in California restrict how user information can be shared and generally require user consent.
Companies can face hefty fines if they fail to comply with these regulations. Many organizations, especially larger ones, opt to deploy their own models to avoid using external APIs, thereby protecting their data. However, deploying large language models requires significant computational resources. For instance, we found that accommodating just 10-15 users could cost around $8,000 to $9,000 per month.
For startups without such funding, there’s a trend towards edge deployments. In edge deployments, processing occurs on the user’s device or within their browser, meaning data doesn’t need to be sent to external servers. This way, organizations can protect their data.
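The privacy property of on-device processing can be sketched with a small model running entirely on local hardware; the checkpoint below is just an illustrative choice, and a true in-browser deployment would use a JavaScript runtime instead:

```python
# Sketch of the keep-data-on-device idea: a small model is downloaded once
# and runs entirely on local hardware, so no text leaves the machine.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # assumed choice
)

print(classifier("The service was quick and the food arrived hot."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```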
Another significant concern is the phenomenon of hallucination. In traditional AI, we dealt with fixed, constrained outputs, but generative AI produces open-ended text and images. Sometimes this leads to misleading or harmful content, often rooted in biased training data, which can result in reputational damage and ethical dilemmas for organizations.
Organizations should proactively conduct thorough audits of their data. They should implement review systems to ensure that generated data is checked by humans before pushing it into production.
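A minimal version of such a review gate might look like this sketch, where the schema and validity rules are assumptions:

```python
# Minimal sketch of a review gate: generated records are schema-checked,
# and anything that fails validation is routed to a human queue instead of
# production. The schema and validity rules are illustrative assumptions.
from dataclasses import dataclass

ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}

@dataclass
class EnrichedReview:
    review_id: str
    sentiment: str
    rating: int

def gate(record: EnrichedReview,
         human_queue: list, production: list) -> None:
    """Send valid records to production, suspect ones to human review."""
    valid = (
        record.sentiment in ALLOWED_SENTIMENTS
        and 1 <= record.rating <= 5
    )
    (production if valid else human_queue).append(record)
```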
Compliance is also critical, especially with upcoming regulations like the EU AI Act, which aims to regulate AI technologies based on risk levels. Organizations should prepare for these changes now because similar regulations are likely coming to other regions.
How to measure the ROI of GenAI in data engineering?
Anmol: That was quite an insightful answer, Danish. It’s clear that while the potential benefits are significant, the responsibilities that come with this technology are equally substantial. Now, shifting gears a bit, let’s talk about the tangible outcomes of these efforts. For executives looking to justify their investments in AI, how can they effectively measure and quantify the value generated by AI-powered data enrichment initiatives within their organizations?
Danish: It’s vital to quantify the value generated by AI-powered data enrichment. There are several key areas executives should focus on. First, let’s talk about development cost savings. One of the immediate benefits of integrating generative AI is the reduction of manual labor costs.
I can share a personal experience. In 2020, I worked on a project where teachers could upload images of questions, which would then be digitized into a question bank. That solution took about 500 to 600 hours to implement. If I were to implement that solution now, with the help of generative AI, I believe we could do it in around 200 hours.
The second aspect is faster time to market. In that example, development took roughly 600 hours, but with AI, we can significantly reduce that time. According to research by McKinsey, companies leveraging AI in their operations can see a 50% improvement in time to market.
Another critical metric is enhanced data accuracy. Traditional data handling is prone to human error, but AI can help achieve accuracy rates exceeding 95%. And let’s not forget about customer satisfaction. Ultimately, the goal is to improve the customer experience. When you enhance the accuracy and relevance of your data, you’re likely to see improved customer satisfaction and retention.
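As a back-of-envelope illustration using the hours from the digitization example above (the hourly rate is an assumed figure):

```python
# Back-of-envelope ROI arithmetic for the digitization example above:
# roughly 600 hours without GenAI versus 200 hours with it. The hourly
# rate is an assumed figure for illustration only.
hours_before, hours_after = 600, 200
hourly_rate = 50  # assumption, in USD

hours_saved = hours_before - hours_after           # 400 hours
cost_saving = hours_saved * hourly_rate            # $20,000
time_to_market_gain = hours_saved / hours_before   # ~0.67, i.e. 67% faster

print(f"Saved {hours_saved} hours, about ${cost_saving:,}; "
      f"{time_to_market_gain:.0%} faster to market")
```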
Anmol: Thank you for breaking that down, Danish. Those are some key metrics that every organization should consider while measuring AI’s impact. Now that we have explored the immediate value and metrics associated with AI-powered data enrichment, what strategic or long-term benefits should executives anticipate from integrating Gen AI with their data engineering practices?
Strategic, long-term benefits of GenAI adoption
Danish: That’s a great question, Anmol. When we talk about long-term benefits of integrating Gen AI with data engineering, it’s essential to understand that this technology transforms organizations operationally as well.
First, let’s address a misconception: generative AI itself doesn’t analyze data in the traditional sense. However, it is highly effective in generating scripts, recommendations, or code for models that can help organizations analyze data. By integrating with existing data infrastructures, generative AI can streamline complex workflows and automate tasks, allowing data teams to focus on higher-value strategic tasks.
The future of data workflows powered by GenAI
For instance, we developed an application for a client that needed a system capable of understanding queries in human language while answering from their own data. C-level executives know what they want to know, like revenue breakdowns, but they don’t know how to code it. We created a portal where they can simply write their requests in plain language, and generative AI converts them into code, runs it in the data warehouse, and returns the numbers.
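In spirit, that portal works like the following hedged sketch: the LLM drafts SQL from a question plus a known schema, and the application executes it. The schema, model, and database are assumptions, and a production system would validate the generated SQL before running it:

```python
# Hedged sketch of a natural-language analytics flow: the LLM drafts a
# read-only SQL query from a question and a known schema, and the app
# runs it. A real system would sanity-check the SQL before executing.
import sqlite3
from openai import OpenAI

client = OpenAI()
SCHEMA = "orders(order_id, region, amount, order_date)"  # illustrative

def answer(question: str, db_path: str) -> list[tuple]:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption
        messages=[{"role": "user", "content": (
            f"Schema: {SCHEMA}\n"
            f"Write a single read-only SQL query that answers: {question}\n"
            "Return only the SQL statement, with no markdown formatting."
        )}],
    )
    sql = resp.choices[0].message.content.strip()
    with sqlite3.connect(db_path) as conn:
        return conn.execute(sql).fetchall()

# Example (hypothetical database file):
# answer("What was revenue by region last month?", "warehouse.db")
```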
This allows organizations to explore new ideas and market opportunities in ways that were previously unimaginable. Imagine being able to simulate various product scenarios based on historical data before launching a product. This iterative testing allows companies to refine their offerings to meet customer demand right from the start.
Another significant benefit is scalability and agility. As businesses grow, their data needs become more complex. Generative AI can help them scale their data operations efficiently, handling larger datasets without a corresponding increase in manpower. That’s one of the most valuable aspects of using generative AI: once you create a solution that runs on 1 GB of data, the same solution runs on 1 TB; you just scale up the computational resources.
Let’s not overlook the competitive advantage. Companies that effectively harness generative AI can stand out in their markets. They can deliver better customer experiences and respond to market changes faster, which can increase their market share.
Lastly, integrating generative AI fosters a culture of innovation within the organization. Teams can think more creatively and use AI tools to expedite their work. It encourages a mindset of experimentation and agility.
Anmol: I think it’s fascinating to see how Gen AI can not only address immediate challenges but also lay the groundwork for long-term strategic advantages.
As we wrap up today’s episode, I want to leave you all with a thought: the journey of innovation is often filled with challenges, but it is also where the magic happens. Gen AI has the potential to unlock new pathways for creativity, efficiency, and growth.
Thank you, Danish, for your insights. I loved talking to you, and it was a pleasure to have you here.
Danish: Thank you, Anmol, for having me. I really enjoyed the conversation, and I hope my experience has answered a few questions for the curious minds out there.
Anmol: For sure! And thank you to our audience for joining us. We hope you are inspired to take the next step in your AI journey. Don’t forget to subscribe and share your thoughts with us. Until next time, keep listening to the Unthinkable Tech Podcast. Take care and goodbye!