

Amit Singh
Practice Head of Java and Enterprise Architecture at Unthinkable Solutions
Amit Singh is the Practice Head of Java and Enterprise Architecture at Unthinkable Solutions. With over 20 years of experience in quality assurance, enterprise architecture, and system performance, Amit brings deep expertise in designing high-performance, scalable digital platforms.

Anmol Satija
Host
Anmol Satija is driven by curiosity and a deep interest in how tech impacts our lives. As the host of The Unthinkable Tech Podcast, she breaks down big tech trends with industry leaders in a way that’s thoughtful, clear, and engaging.
Episode Overview
When working with technology, you always need to be prepared for unexpected scenarios, such as traffic spikes.
One of the primary issues companies encounter is identifying performance bottlenecks. These often turn out to be queues building up in various parts of the system, such as network sockets or I/O queues. Such bottlenecks can significantly degrade system performance, leading to latency issues and resource conflicts.
Tune in to the podcast to discover strategies for addressing the challenges that arise when scaling your system.
Chapters covered:
- Understanding performance vs. scalability in system architecture
- How to identify when it’s time to scale your application
- Common challenges in scaling digital platforms
- Building scalable applications: Key principles and components
- Strategies for horizontal scaling: Replication, Caching, Load Balancing & More
- Final thoughts on creating scalable digital products
Transcript
Anmol: Hello and welcome to another exciting episode of the Unthinkable Tech Podcast, the go-to source for the pulse on technology that is shaping our future. I’m your host Anmol Satija, and today’s topic is something that you won’t want to miss. Joining us today is Amit Singh, who works as a Practice Head of Java and Enterprise Architecture at Unthinkable Solutions. He’s an expert with 20+ years of experience in the field of quality assurance, architecture, and system performance.
Today’s topic is one that tech leaders often struggle with: the scalability challenge. How to handle exponential traffic growth on your digital platform is a question on many minds. Amit is here to share his valuable insights on understanding system performance, recognizing the right time to scale, and overcoming common challenges companies face during this critical process. So, let’s jump right into this exciting conversation. Hi Amit, welcome to the show. So good to have you here.
Amit: Hi Anmol, thank you for inviting me to the podcast. I’m excited to be here.
Anmol: Good to know that, Amit. I would like to start from the very basics. We hear a lot about performance and scalability when talking about software product capabilities, right? Can you break it down for us in simple terms and explain why it’s important for anyone involved in creating or maintaining digital products to understand these concepts?
Understanding performance vs. scalability in software architecture
Amit: Sure, Anmol. Let’s first understand the system architecture. When you have to architect a new system from scratch or review an existing legacy system, there are primarily two aspects to consider. The first one is application design or application architecture, which generally covers the functional requirements, use cases, database design, and what the code structure and design are going to be. That is good to start, but we have only done the application design work here and not dealt with any system requirements.
Another aspect is system design or system architecture, which covers the non-functional requirements. When dealing with a legacy system, we not only have to satisfy the functional requirements but also ensure that all the system requirements, which are non-functional requirements, are met. And that is the primary job of any architect. In non-functional requirements, we focus on six primary factors: performance, scalability, security, reliability, deployment, and technology selection.
Our focus today is on performance and scalability, which are part of the system requirements, or we can say the non-functional requirements. Let’s first define what performance is. The performance of any software system measures how fast or responsive that system is under a given workload and on given hardware. Here, we are keeping these two parameters fixed: the workload and the hardware. The key indicators of performance are speed (how quickly tasks are completed), responsiveness (how fast the system reacts to user input), and efficiency (how well the system utilizes resources like CPU, memory, and storage).
Now, let’s talk about scalability, which is the ability of a system to grow and manage increased demand. For example, if there is a requirement that our system should handle 1 million users, we need to ensure it can handle that load without compromising performance. There are two approaches to scalability: vertical scaling (adding more power to the existing machine) and horizontal scaling (adding more machines to share the load). Our main focus in this podcast will be on horizontal scalability.
Performance and scalability are fundamental to the success of any software system. Understanding these concepts is crucial for anyone involved in creating, designing, or maintaining software solutions. A well-designed system enhances user experience, supports business growth, and ensures long-term viability.
Anmol: That makes applications efficient and reliable, I think. Now, diving in a bit deeper, how can companies identify when it’s the right time to start thinking about scaling? What are the signs they should be looking out for?
When should you start thinking about scaling?
Amit: The first thing to note about performance problems is that they are usually the result of a queue building up somewhere. Almost any performance problem either looks like, or actually is, a queue of requests accumulating in some part of the system. It can be a network socket queue, an I/O queue, or the OS run queue. We should focus on identifying the areas where such a buildup can happen and design to avoid it. That way, we architect a system that is least likely to face performance problems.
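To make the queue idea concrete, here is a minimal Python sketch that is not from the episode. It assumes a single server with fixed arrival and service rates per time tick; the function name `simulate_queue` and the numbers are hypothetical. It shows that a backlog only appears once arrivals exceed what the server can drain.

```python
def simulate_queue(arrivals_per_tick, served_per_tick, ticks):
    """Return the queue length after each tick for fixed arrival and service rates."""
    queue_length = 0
    history = []
    for _ in range(ticks):
        queue_length += arrivals_per_tick                    # new requests join the queue
        queue_length -= min(queue_length, served_per_tick)   # the server drains what it can
        history.append(queue_length)
    return history


# Arrivals below capacity: the queue stays empty.
under_capacity = simulate_queue(arrivals_per_tick=8, served_per_tick=10, ticks=5)
print(under_capacity)   # [0, 0, 0, 0, 0]

# Arrivals above capacity: the backlog grows by 2 requests every tick.
over_capacity = simulate_queue(arrivals_per_tick=12, served_per_tick=10, ticks=5)
print(over_capacity)    # [2, 4, 6, 8, 10]
```

Real traffic is bursty rather than constant, so this only illustrates the direction of the effect, not actual latency figures.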
Certain metrics should be considered when planning to scale the system. These metrics can help determine whether it’s time to improve the system’s capacity. The first aspect is user growth. Do you expect growth in your user base? If yes, how long will it take? Analyze data and trends. Next, consider the expected annual user growth. Understanding the target audience and potential market size is also important.
Another point to consider is how long the current setup can serve the growing user base without losing performance. This includes assessing resource utilization, database security, and third-party service dependencies. Additionally, consider any event or holiday where you observe high demand for the application or heavy usage. Seasonal spikes or marketing campaigns can lead to a sudden increase in the number of users. Recognizing these key scalability metrics can provide insights into how well the current setup can handle growth.
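The question of how long the current setup can keep up can be roughed out with simple arithmetic. The sketch below is an illustration, not something described in the episode: it assumes compound monthly growth, and the function name `months_until_capacity` and all figures are hypothetical.

```python
import math


def months_until_capacity(current_users, capacity_users, monthly_growth_rate):
    """Months until the user base first reaches capacity under compound monthly growth.

    Returns 0 if capacity is already reached and None if there is no growth.
    """
    if current_users >= capacity_users:
        return 0
    if monthly_growth_rate <= 0:
        return None  # without growth, the capacity limit is never reached
    months = math.log(capacity_users / current_users) / math.log(1 + monthly_growth_rate)
    return math.ceil(months)


# Hypothetical figures: 100,000 users today, a setup sized for 400,000, 10% growth per month.
headroom = months_until_capacity(100_000, 400_000, 0.10)
print(headroom)  # 15
```

A seasonal spike or marketing campaign can consume this headroom much faster than the steady-growth estimate suggests, so the result is best treated as an upper bound.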
Anmol: Absolutely, Amit. Recognizing those early signs is key to staying ahead of potential issues and ensuring a smooth user experience. Once a company decides to scale, I assume the journey has just begun. Scaling isn’t without its hurdles. What are the common challenges companies face when scaling their systems, and what should they be prepared for?
Common challenges in scaling software systems
Amit: When scaling software systems, we always face multiple challenges. Addressing these challenges requires careful planning, strategic resource allocation, and a deep understanding of existing architecture and systems. Some common challenges include performance bottlenecks, such as latency issues. As the system scales, network latency and data access time can increase.
Resource conflicts can also arise. Increased load can lead to conflicts for resources like CPU, memory, and IO. Ensuring data consistency and integrity across distributed databases and services is another challenge. Handling transactions across multiple nodes and services can lead to consistency issues.
Scaling monolithic architectures horizontally can be difficult, though they can be scaled vertically by adding infrastructure within the same system. Service dependency is another challenge. Scaling services independently can be challenging when there are tight dependencies. Load balancing and traffic management are crucial for effective distribution, ensuring that load is balanced across multiple servers or instances.
Handling sudden spikes in traffic without degrading performance is also important. These are some common challenges that companies should be aware of and prepare for while planning changes to the system.
Anmol: Got it. Scaling definitely sounds tricky to deal with, but knowing what to expect can make a lot of difference. Once these challenges are identified, the next step would be to build scalable applications. How can companies build scalable applications, or what are the main components that can help increase the scalability of existing applications?
Principles for building scalable applications
Amit: Let’s first look at the scalability types and principles. There are two types of scalability: vertical and horizontal. Vertical scaling involves porting an application from smaller hardware to more powerful hardware. Horizontal scaling involves adding more hardware. For example, if an application is running on a server and not performing well as the load increases, we can bring in more servers to achieve the desired performance.
The two principles of scaling are decentralization and independence. Decentralization means that one component is not responsible for all the different kinds of work. If a single component is responsible for everything, then that is a monolith, which is an anti-pattern for scaling the application. We want more specialized workers for different kinds of work, increasing our workforce from 1 to 100 or 1000 as needed.
Independence means minimizing the requirement for a coordinator. If we have many workers that require coordination, the coordinator can become a bottleneck. By increasing the independence of our workers, we can accomplish more work in a scalable fashion. These two principles, decentralization and independence, go hand in hand.
Anmol: This is very insightful, Amit. Now, maybe we can discuss how to scale applications.
Horizontal scaling: Key techniques and architecture components
Amit: To scale a system, we can go with either vertical or horizontal scaling, but our focus is on horizontal scaling. There are several strategies and components involved in horizontal scaling. Let’s look at them one by one.
First, I’ll highlight replication. Replication involves creating multiple copies of data or services across different servers or instances to ensure availability and fault tolerance. We can use database replication, such as master-slave or master-master replication, to distribute read operations and ensure data availability. We can also deploy multiple instances of a service across different servers.
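As a rough illustration of how replication spreads read traffic, the following Python sketch models a primary with several read replicas using in-memory dictionaries. The class `ReplicatedStore` and the key names are hypothetical, and the copy to replicas is synchronous purely for simplicity; real database replication is typically asynchronous and must deal with replication lag.

```python
import itertools


class ReplicatedStore:
    """Toy primary/replica store: writes hit the primary, reads rotate over replicas."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))
        self.reads_per_replica = [0] * replica_count

    def write(self, key, value):
        self.primary[key] = value
        # Assumption: replicas are updated immediately; real systems replicate with some lag.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        index = next(self._next_replica)       # pick the next replica in rotation
        self.reads_per_replica[index] += 1
        return self.replicas[index].get(key)


store = ReplicatedStore(replica_count=3)
store.write("policy:42", "active")
values = [store.read("policy:42") for _ in range(6)]
print(values)                   # six reads, all returning "active"
print(store.reads_per_replica)  # [2, 2, 2]
```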
Second is scaling services horizontally by deploying multiple instances of a service to handle increased load. We can break down a monolithic application into microservices, allowing each service to scale independently. Services can also be designed to be stateless, so that any instance can handle any request, which makes scaling easier.
Third is caching, which involves storing frequently accessed data in temporary storage to reduce latency and improve performance. We can use distributed cache systems like Redis or implement application-level caching to reduce the load on the database.
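One common way to apply application-level caching is the cache-aside pattern, sketched below in Python with an in-memory dictionary standing in for a cache such as Redis. The class `CacheAside`, the loader `load_user_from_db`, and the TTL value are hypothetical names chosen for illustration; the episode does not prescribe a specific pattern.

```python
import time


class CacheAside:
    """Cache-aside: check the cache first, fall back to the loader on a miss."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader          # function that fetches from the slow source
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries = {}             # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        self.misses += 1
        value = self._loader(key)                       # go to the database
        self._entries[key] = (value, now + self._ttl)   # populate the cache
        return value

    def invalidate(self, key):
        """Drop a cached entry, for example after the underlying row is updated."""
        self._entries.pop(key, None)


db_calls = []


def load_user_from_db(user_id):
    """Hypothetical stand-in for a database query."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(load_user_from_db, ttl_seconds=60)
cache.get(7)   # miss: loads from the "database"
cache.get(7)   # hit: served from the cache
print(cache.hits, cache.misses, len(db_calls))  # 1 1 1
```

The trade-off is staleness: until an entry expires or is invalidated, readers may see data that is up to `ttl_seconds` old.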
Fourth is partitioning, which involves dividing data into smaller, more manageable pieces and distributing them across multiple servers. Database partitioning and consistent hashing can be used to distribute data evenly across multiple nodes.
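Consistent hashing can be sketched in a few lines of Python. This is a minimal illustration under assumptions of my own: the class `HashRing`, the node names, the use of SHA-256, and 100 virtual nodes per server are not from the episode. The property it demonstrates is that adding a node moves keys only onto the new node, instead of reshuffling everything.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes, virtual_nodes=100):
        self._virtual_nodes = virtual_nodes
        self._ring = []  # sorted list of (hash, node) points on the ring
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each node gets several points on the ring to even out the distribution.
        for i in range(self._virtual_nodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [point for point in self._ring if point[1] != node]

    def node_for(self, key):
        """Return the first node clockwise from the key's position on the ring."""
        if not self._ring:
            raise ValueError("ring is empty")
        index = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[index % len(self._ring)][1]


keys = [f"user:{n}" for n in range(1000)]
ring = HashRing(["db-1", "db-2", "db-3"])
before = {key: ring.node_for(key) for key in keys}

ring.add_node("db-4")
after = {key: ring.node_for(key) for key in keys}

moved = [key for key in keys if before[key] != after[key]]
# Every key that moved now lives on the new node; the rest stayed where they were.
print(len(moved), all(after[key] == "db-4" for key in moved))
```

With plain modulo hashing (`hash(key) % node_count`), going from three nodes to four would typically relocate most keys, which is why consistent hashing is preferred when nodes are added or removed often.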
Fifth is load balancing, which distributes incoming network traffic across multiple servers to ensure no single server becomes a bottleneck. We can use hardware or software load balancers. Software load balancers like Nginx and cloud-based load balancers like AWS ELB or Google Cloud Load Balancing are generally more cost-effective than hardware ones.
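The simplest distribution strategy a load balancer can use is round robin. The Python sketch below is illustrative only: the class `RoundRobinBalancer` and the server names are hypothetical, and production balancers such as Nginx offer additional strategies and active health checks. It rotates through servers and skips any that have been marked unhealthy.

```python
class RoundRobinBalancer:
    """Rotate requests across servers, skipping ones marked as down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._position = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        # Try each server at most once per call, starting from the current position.
        for _ in range(len(self._servers)):
            server = self._servers[self._position]
            self._position = (self._position + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_round = [balancer.next_server() for _ in range(4)]
print(first_round)        # ['app-1', 'app-2', 'app-3', 'app-1']

balancer.mark_down("app-2")
after_failure = [balancer.next_server() for _ in range(3)]
print(after_failure)      # ['app-3', 'app-1', 'app-3']
```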
Sixth is service discovery, which is the process by which services automatically detect and connect to other services within the network. We can use static service discovery with manual configuration or dynamic service discovery with tools like Eureka or Kubernetes.
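Dynamic service discovery usually rests on a registry that instances report to with heartbeats. The following Python sketch shows that idea in miniature; it is not the API of Eureka or Kubernetes, and the class `ServiceRegistry`, the addresses, and the 30-second TTL are hypothetical. An instance that stops sending heartbeats drops out of lookups once its TTL expires.

```python
import time


class ServiceRegistry:
    """In-memory registry: instances register, send heartbeats, and expire after a TTL."""

    def __init__(self, heartbeat_ttl, clock=time.monotonic):
        self._ttl = heartbeat_ttl
        self._clock = clock        # injectable so expiry can be demonstrated
        self._instances = {}       # service name -> {address: last heartbeat time}

    def register(self, service, address):
        self._instances.setdefault(service, {})[address] = self._clock()

    def heartbeat(self, service, address):
        # A heartbeat refreshes the timestamp of an already known instance.
        if address in self._instances.get(service, {}):
            self._instances[service][address] = self._clock()

    def deregister(self, service, address):
        self._instances.get(service, {}).pop(address, None)

    def lookup(self, service):
        """Return addresses whose last heartbeat is within the TTL."""
        now = self._clock()
        return sorted(
            address
            for address, last_seen in self._instances.get(service, {}).items()
            if now - last_seen <= self._ttl
        )


now = [0.0]  # fake clock so the example is deterministic
registry = ServiceRegistry(heartbeat_ttl=30, clock=lambda: now[0])
registry.register("billing", "10.0.0.1:8080")
registry.register("billing", "10.0.0.2:8080")

now[0] = 20.0
registry.heartbeat("billing", "10.0.0.1:8080")   # only the first instance stays alive

now[0] = 40.0
live = registry.lookup("billing")
print(live)  # ['10.0.0.1:8080']
```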
Seventh is microservices architecture, which involves designing an application as a collection of loosely coupled, independently deployable services. We can define clear service boundaries based on business capabilities and use API gateways to manage and route requests to the appropriate microservices.
Finally, handling transactions in a microservices architecture can be complex due to the distributed nature of services. We can use the saga pattern to decompose a transaction into a series of smaller transactions, with each step being handled by different services and providing compensating actions in case of failure.
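The saga pattern can be illustrated with a small orchestrator in Python. This is a sketch under stated assumptions: the function `run_saga`, the step names, and the simulated payment failure are hypothetical, and each step here is a local function rather than a call to a separate service. When a step fails, the compensations of the steps that already completed run in reverse order.

```python
class SagaFailed(Exception):
    """Raised after a failed saga has been rolled back through compensations."""


def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If an action raises, undo the completed steps in reverse order and raise SagaFailed.
    """
    completed = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception as error:
            for _, undo in reversed(completed):
                undo()
            raise SagaFailed(f"step '{name}' failed: {error}") from error
        completed.append((name, compensation))
    return [name for name, _ in completed]


log = []


def fail_payment():
    raise RuntimeError("card declined")


steps = [
    ("create_order", lambda: log.append("order created"), lambda: log.append("order cancelled")),
    ("charge_payment", fail_payment, lambda: log.append("payment refunded")),
    ("ship_order", lambda: log.append("order shipped"), lambda: log.append("shipment recalled")),
]

failure_message = None
try:
    run_saga(steps)
except SagaFailed as error:
    failure_message = str(error)

print(failure_message)  # step 'charge_payment' failed: card declined
print(log)              # ['order created', 'order cancelled']
```

Unlike a database transaction, a saga does not provide isolation: other services can observe the intermediate state (an order that exists but is later cancelled), so compensations need to be designed with that in mind.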
Anmol: Those are some great techniques, Amit. It is clear how these techniques can make a huge difference when managing how a system handles increased demand and delivers a great user experience. Now, shifting gears a little, it is always inspiring to hear real-world examples. Can you share some success stories where you have helped your clients overcome scalability challenges, and how did you do it?
Real-world examples of solving scalability challenges
Amit: There are many use cases where we have helped our clients scale their systems. To highlight a few, I would like to mention our latest work. Recently, we migrated the application for one of India’s leading insurance companies. The client was facing considerable operational challenges due to the substantial volume of data generated daily. Thousands of individuals were using the client’s application to submit crucial insurance policy documents, causing significant hurdles for the system and their internal team. They consistently faced system outages due to such load.
We migrated the application from a monolithic to a service-based architecture and also migrated approximately 60 TB of data from their on-premise storage to cloud-based storage. We now maintain 99.99% system uptime, and the data migration was completed with zero data loss. They are able to retrieve data and process requests within the expected time frame.
Another example is related to the healthcare industry, where we developed a robust emergency response system. The old application struggled to handle increasing service requests. The ambulance booking process was partially manual, causing inefficiencies due to third-party solutions. These issues made it hard for the system to keep up with the growing needs of healthcare.
To address these challenges, we developed a scalable solution based on a microservices architecture. Now, the system is able to handle 70,000+ emergency calls daily, with over 30,000 ambulances onboarded, and more than 2 million health consultations take place through the application every day.
Anmol: Those were some great success stories, Amit. Thank you for sharing. I think these are very inspiring and clearly highlight the importance of thoughtful planning and execution when scaling. Before we wrap up our conversation, are there any final thoughts you would like to share?
Final thoughts: Scalability as a business enabler
Amit: Yes, in the end, I would like to say that scalability is the foundation of successful digital products. Scalable applications provide users with a positive experience, which helps your business grow. If your application is scalable enough, you can manage user load with minimal downtime and without affecting application performance.
Anmol: Thank you so much, Amit, for sharing your deep insights into the world of system performance and scalability. It’s been an enlightening discussion, and I’m sure our listeners have gained a lot of valuable knowledge from it. As we wrap up this episode of the Unthinkable Tech Podcast, I hope everyone has a better understanding of the challenges and strategies involved in scaling digital platforms effectively. Whether you are just starting your tech journey or you are a seasoned professional, these insights can help you navigate the complex landscape of system performance and scalability. Thank you once again, Amit, for joining us today and sharing your expertise.
Amit: Thank you, Anmol. It was a pleasure being here.
Anmol: And thank you to our listeners for tuning in. If you enjoyed this episode, be sure to subscribe, leave a review, and share it with your network. Stay tuned for more exciting discussions on “The Unthinkable Tech Podcast.”