From Performance Tuning to System Design: The Power of Asynchronous Workflows
Unlock the power of asynchronous system design. This article revisits a classic interview question, shifting focus from code performance to building scalable, responsive applications.
About six or seven years ago, I encountered an interview question for a mid-level .NET role that has resonated with me ever since:
"A user clicks a button on the UI to generate an Excel or PDF report. The report generation takes around five minutes (time can be arbitrary). The user has to wait for it to finish. How would you optimize this flow?"
At the time, my immediate focus was on performance. I began brainstorming ways to accelerate the report generation: optimizing SQL queries, minimizing data transformations, or caching parts of the results. I thought that if I could reduce the process from five minutes to one, it would be a significant achievement. However, even with a five-fold speed increase, the user would still be forced to wait. Furthermore, a browser crash, a network drop, or a closed tab would mean lost progress. I realized then that the real problem wasn't a performance bottleneck; it was fundamentally a design issue.
What I Missed Back Then
Looking back, I was stuck in the mindset of "make the code faster." While performance optimization is an invaluable skill, I initially overlooked the bigger problem: the application was performing all this work synchronously, effectively holding the user captive until completion. With some helpful nudges from the interviewer, I eventually grasped the core issue.

The more pertinent question wasn't "How can I make this faster?" but rather "Why is the user waiting in the first place?"
If a process requires minutes (or even hours or days) to complete, it should never block the user. Instead, it should execute in the background, outside the main request flow, allowing the user to continue with their work. Of course, this doesn't negate the importance of optimizing the code itself. Database queries, data processing, and file generation still matter. There might be a missing index, an inefficient loop, or a more suitable library for creating Excel files. However, these optimizations are just one component of the solution, not the complete picture.
How I'd Solve It Today
Today, I would still begin with the same UI button. The user clicks "Generate Report," but instead of waiting, the backend accepts the request, saves it (perhaps as a job record in a database), and returns immediately. This approach embodies the essence of building asynchronous APIs.
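The accept-and-return step can be sketched in a few lines. This is a minimal illustration in Python rather than .NET, chosen only to keep the example self-contained; the names `submit_report_request`, `get_report_status`, and the in-memory `JOBS` dictionary are hypothetical stand-ins for a real endpoint and a real jobs table.

```python
import uuid
from dataclasses import dataclass
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class ReportJob:
    id: str
    user_id: str
    report_type: str
    status: JobStatus = JobStatus.PENDING


# Hypothetical in-memory stand-in for a jobs table in a database.
JOBS = {}


def submit_report_request(user_id, report_type):
    """Handle the 'Generate Report' click: record the job and return at once."""
    job = ReportJob(id=str(uuid.uuid4()), user_id=user_id, report_type=report_type)
    JOBS[job.id] = job
    # 202 Accepted tells the client the work was accepted, not finished.
    return {
        "status_code": 202,
        "job_id": job.id,
        "status_url": f"/reports/{job.id}",  # assumed route, for illustration only
    }


def get_report_status(job_id):
    """Let the client check on the job later instead of holding a request open."""
    job = JOBS.get(job_id)
    if job is None:
        return {"status_code": 404}
    return {"status_code": 200, "job_id": job.id, "status": job.status.value}
```

The handler does no report work at all. It only records intent and hands back an identifier, which is what lets the request finish in milliseconds regardless of how long the report takes.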
The job is then picked up by a background worker. This worker could be a hosted service, a Quartz.NET job, or even an AWS Lambda function triggered by a queue message. It does the heavy lifting: pulling data, building the file, and uploading it to cloud storage such as Amazon S3 or Azure Blob Storage.
Once the report is ready, the worker updates the job status to "completed" and notifies the user. This notification could be an email containing a download link or a real-time SignalR message displayed within the application. The link points to the stored report, which the backend serves only to authorized users.
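The worker side of this flow might look roughly like the sketch below. It is written in Python rather than .NET purely for illustration and makes several assumptions: `queue.Queue` stands in for a real message queue, the `storage` dictionary for S3 or Azure Blob, the `notifications` list for email or SignalR, and `build_report` for the slow data pulling and file generation. All of these names are hypothetical.

```python
import queue
import threading

jobs = {}           # job_id -> job record (stand-in for a jobs table)
storage = {}        # stand-in for cloud object storage
notifications = []  # stand-in for email / real-time notifications
job_queue = queue.Queue()


def enqueue(job_id):
    """Record a pending job and put it on the queue for a worker."""
    jobs[job_id] = {"status": "pending"}
    job_queue.put(job_id)


def build_report(job_id):
    """Placeholder for the slow part: querying data and building the file."""
    return f"report for {job_id}".encode()


def process_job(job_id):
    jobs[job_id]["status"] = "running"
    content = build_report(job_id)
    key = f"reports/{job_id}.pdf"
    storage[key] = content                 # "upload" the finished file
    jobs[job_id]["download_key"] = key
    jobs[job_id]["status"] = "completed"
    notifications.append((job_id, key))    # "notify" the user


def worker_loop():
    """Pull job ids until a None sentinel arrives."""
    while True:
        job_id = job_queue.get()
        try:
            if job_id is None:
                return
            process_job(job_id)
        except Exception:
            # A failed job is recorded rather than crashing the worker.
            jobs[job_id]["status"] = "failed"
        finally:
            job_queue.task_done()
```

To try it, start `worker_loop` in a `threading.Thread`, call `enqueue("job-1")`, then put `None` on the queue to stop the worker. The web request that created the job never appears in this code, which is the point: the heavy work runs entirely outside the request flow.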

With this asynchronous flow, the user is no longer waiting on a long-running HTTP request, nor is the server holding connections open for minutes. If something fails, the job can be retried automatically. You also gain the flexibility to track progress or cancel the job if necessary. Moreover, if a hundred users request reports at the same time, the requests queue up as jobs and the workers process them at their own pace, instead of a hundred long-running requests tying up the web server. The experience feels significantly faster even if the actual report generation time remains unchanged, because users care more about responsiveness than about raw generation time.
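Automatic retries are one of the easier benefits to illustrate. The sketch below, in Python rather than .NET for brevity, retries a task with exponential backoff. The `TransientError` type, the `run_with_retries` helper, and the default attempt count and delay are all assumptions made for this example, not part of any particular library.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def run_with_retries(task, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run task(), retrying on TransientError with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2*base_delay, ...
    The last failure is re-raised so the caller can mark the job as failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A worker would wrap the report generation call in `run_with_retries`. Because the job runs in the background, a retry costs the user nothing but a slightly later notification, whereas in the synchronous design a failure would have surfaced as an error after minutes of waiting.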
Why I Still Use This Question
Several years later, I began using this exact question when interviewing other developers. My aim isn't to trick anyone, but rather to reveal their thought processes. Some candidates immediately focus on optimizing code and queries, much like I did initially. This indicates a solid understanding of performance tuning and opens the door to deeper technical discussions around algorithms, data structures, or database optimization.
Others pause, considering user experience, background processing, and fault tolerance. This is where the truly engaging conversations begin, exploring topics such as queues, retries, notifications, and secure file sharing. This single scenario offers numerous avenues for broader system design discussions. There's no single right answer, but there's a significant distinction between a developer who solely focuses on code and one who can design a scalable system.
The Lesson
When I first encountered this question, my thoughts gravitated toward making the code faster. Now, I consider how to make the entire user experience better. While optimizing a query or loop can certainly help, it doesn't address the fundamental issues of waiting, potential failures, or scalability challenges. If many users initiate the same report concurrently, a synchronous design quickly ties up the server with long-running requests. An asynchronous flow, however, keeps the system responsive and resilient under load.
This crucial shift from optimizing individual functions to designing scalable systems is often the hallmark of a great developer. If you're interested in building systems that truly scale, my Clean Architecture course provides comprehensive guidance on structuring applications, separating concerns, and designing systems that evolve without breaking.
I hope this was insightful. See you next week.