Working with Clients on Complex Software Estimations
The following is a retrospective account of working with team leads, project managers, and a client on the most challenging technical refactor I have ever worked on. Though I write this as a developer, my hope is that it will be useful for people in any role, at any level, who run into a similar situation.
This will not be a deep dive into the technical challenges, but I will explain them in enough detail to hopefully communicate what made them particularly tricky.
In this post, I will discuss:
- What made this work into a perfect storm of complexity
- How I gathered information and communicated some concerns about timeline with my Team Lead and Project Manager
- How our team worked with our client in order to ensure that we all had alignment on how the work would be done
- A summary of how the refactoring process actually went
A Perfect Storm of Complexity
In 2024, our team became aware that the core feature of one of our clients’ mobile applications might be due for a refactor. The EM:RAP (Emergency Medicine Reviews and Perspectives) application provides high-quality multimedia playback aimed primarily at emergency medicine physicians.
For many years, the application had used the ExoPlayer 2 framework to drive the core functionality of media playback. However, ExoPlayer 2 was being deprecated in favour of its successor, the Media3 framework. Although we knew that migrating such a large feature had risks, we also knew that not migrating could lead to application-breaking errors on new Android versions down the road.
In 2025, we were given the go-ahead by EM:RAP to reduce technical debt, which was when this work resurfaced as a top priority. Naturally, EM:RAP wanted a timeline estimate, and we set about trying to provide one.
With that background in mind, we will set the resulting process aside for a moment and discuss what made this into a particularly challenging refactoring and estimation process.
What were the Technical Challenges?
The following is a non-exhaustive list of the different technical (using a broad definition of the term) challenges which we encountered with this proposed work:
- Our team values fixing problems instead of patching over them, so we wanted a balance between avoiding unnecessary rewrites and fixing technical debt wherever possible
- ExoPlayer 2 was the primary driver of the core functionality of the application, which was presenting and playing high quality multimedia
- The existing implementation was complex, brittle, and needed to meet the unique requirements of EM:RAP’s product specification and content structures
- The documentation for Media3 was very sparse and the technical samples we found, while helpful, never fully captured our requirements (technical, architectural, or functional)
- Various complex subcomponents relied on a core media player in such a way that, until the media player was refactored, we could not effectively estimate the work to update them
This truly is an incomplete list. However, I think this is enough to get us into some practical discussions on how we all (including our client) worked through this together. The following short sections are meant to give a bit more insight into these technical challenges, but feel free to skip ahead if you are not here for the technical stuff. The next non-technical section is titled “Tempering Expectations.”
Where Do We Even Get Started?
I knew that no matter how difficult it seemed at first, I needed to wrap my head around the existing system. My approach was to go through each of roughly seven medium-to-large sized classes and document their behaviour, interactions, and structure in a simple way. I did so in Excalidraw, a program that lets me create quick, UML-like diagrams: boxes, text, and arrows, essentially.
The system turned out to be substantially more complex than I had imagined. In any case, having this diagram was important for more than just my own process. As I will discuss later, it helped me communicate this complexity to my team.
Reducing Tech Debt
The existing implementation seriously violated some core principles of software design. The main problems were:
- Shared mutable state (relying on variables or shared instances of classes which could be modified in non-deterministic ways)
- At times, a lack of separation of concerns (classes doing too many things or having too many responsibilities)
- Tight coupling of classes and data across the software architecture (making the architecture brittle and difficult to follow)
Though there were some areas we could improve upon, many of these issues were a result of how ExoPlayer 2 was designed. It had UI components, client-side back-end components (network I/O), internal state (which was only reliable as a source of truth in some cases), and service-level components (notifications and playback while in the background, among other things).
Our Android team (including our tech lead) came up with a proposed new architecture which substantially reduced these issues. Though we did end up having to rewrite the majority of the driver and service layer, we were able to keep the interface of the media playback subsystem largely intact, reducing the impact of the refactor on the rest of the application substantially.
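To make the idea of keeping the interface intact more concrete, here is a minimal sketch in Java. It is not our actual architecture, and every name in it (MediaPlayback, PlaybackState, NewDriverPlayback) is hypothetical; none of them come from ExoPlayer 2 or Media3. It only illustrates the general shape: the rest of the application talks to a small interface, state is published as immutable snapshots from one place, and the driver behind the interface can be rewritten without callers changing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical names throughout; nothing here is an ExoPlayer 2 or Media3 API.
public class PlaybackSketch {

    /** Immutable snapshot of playback state: callers read it but cannot modify it. */
    static final class PlaybackState {
        final String mediaId;   // null when nothing is loaded
        final boolean playing;

        PlaybackState(String mediaId, boolean playing) {
            this.mediaId = mediaId;
            this.playing = playing;
        }
    }

    /** The interface the rest of the app depends on; it stays the same across a migration. */
    interface MediaPlayback {
        void play(String mediaId);
        void pause();
        PlaybackState state();
        void addListener(Consumer<PlaybackState> listener);
    }

    /** Stand-in for a rewritten driver layer; a real one would delegate to the media framework. */
    static final class NewDriverPlayback implements MediaPlayback {
        private PlaybackState state = new PlaybackState(null, false);
        private final List<Consumer<PlaybackState>> listeners = new ArrayList<>();

        @Override public void play(String mediaId) { update(new PlaybackState(mediaId, true)); }
        @Override public void pause() { update(new PlaybackState(state.mediaId, false)); }
        @Override public PlaybackState state() { return state; }
        @Override public void addListener(Consumer<PlaybackState> listener) { listeners.add(listener); }

        // State is replaced in exactly one place, then published to every listener.
        private void update(PlaybackState next) {
            state = next;
            for (Consumer<PlaybackState> listener : listeners) {
                listener.accept(next);
            }
        }
    }

    public static void main(String[] args) {
        MediaPlayback playback = new NewDriverPlayback();
        playback.addListener(s -> System.out.println(s.mediaId + " playing=" + s.playing));
        playback.play("episode-1");
        playback.pause();
    }
}
```

In a sketch like this, UI code only ever sees MediaPlayback and PlaybackState, so swapping the class behind the interface touches one construction site rather than every screen that plays media.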
Navigating the Documentation
The documentation for Media3 was often unhelpful despite it being officially supported. To summarize what the documentation for most classes and functions looked like, imagine something like:
/**
 * A Capacitor for Flux
 */
class FluxCapacitor //…
In any case, I always prefer to learn from working implementations and was able to track down the official samples on GitHub. Though finding these samples helped, they often demonstrated simplified usage and toy problems. Fortunately, the issues section in the GitHub repository ended up being a good resource for the many roadblocks and challenges we encountered.
Using an LLM was extremely helpful for describing how these different components worked together. I would paste code directly from the public samples and also encourage the LLM to reference the hundreds of GitHub issues for context. To protect our client's IP, I did not paste in any existing application code.
I cannot stress enough how useful it was to leverage an LLM here. Though I did not trust the implementations it provided, its output functionally became the documentation I would use to complete the work in most cases.
Tempering Expectations
To the best of my recollection, our early discussions with EM:RAP indicated that this was likely to take roughly three weeks to complete. Part of the reason for our initial estimate of three weeks was that Media3 advertised itself as “simplifying” many aspects of its interface compared to the previous version. Though this was true in isolated cases, overall it remains the most complex aspect of the codebase.
While this was a reasonable estimate at that stage, I communicated to both my project manager and team lead that I was concerned it could take significantly longer. The following section describes how that process went from my perspective.
Working with the Developer Team
As mentioned before, my first task was to put together a diagram which captured the complexity of the existing system. I also made sure to have relevant official documentation on hand to back up my opinions.
Next, I requested a meeting with our Tech Lead and my fellow Android team members. The key issue I wanted to highlight was that I did not believe the new framework simplified most aspects of our existing flows; it mainly simplified how notifications worked. I also wanted to emphasize that while it was clear we would need a whole new architecture, it was not clear to me what the new architecture would look like.
In short, after doing some initial research, I was pretty strongly convinced that three weeks was not going to be enough time. This was not presented with the intention of arguing with anyone; it was my genuine assessment with the information I had at that point.
Apart from this initial meeting, I was extremely grateful to have my team members come together for us to all work out a new architecture that made sense. We have a very good dynamic where the emphasis is on finding the best solution instead of ego tripping. I took advantage of that as much as I could.
Working with our Project Managers
Once the development team was in agreement that this was almost certainly not a three-week task, we discussed it with our project manager.
Together, the development and project management teams did our best to lay out what we thought would be the general order and shape of tasks in a task management system, with any uncertainty stated explicitly within the tasks. For example, suppose we were building something called MediaService. We might have five or six subtasks describing the responsibilities of MediaService, yet we did not know whether some of those might actually need to be built within MediaBrowser, which represented a different task.
While we tried to break things down into small tasks when possible, a huge challenge was that we needed several large components to be at least partially built in order to get a testable baseline. In such cases we explicitly stated why the tasks seemed larger than the ideal, bite-sized chunks that we prefer.
Working with our Client
I was not privy to most of the direct discussions with EM:RAP, but they gave us the go-ahead on this plan. Our goals were to provide frequent updates and keep our task management system in the best shape possible. We also made sure to do this work in such a way that it would not block any bug fixes or minor releases along the way.
Not every client relationship will have this level of trust, but I do believe this was a fundamentally important part of this process. EM:RAP needed to trust that we were not drawing anything out arbitrarily and that this work was better done now than scrambled through later.
Having that trust, as the developer doing much of the refactoring work, I felt like I could do what I thought was the most important thing in the long term: a good job. To me, that means balancing all of the concerns at play, including the client's goals, a reasonable trajectory, and the long-term stability of the application code.
How did it go?
The whole process went through roughly three phases:
- Establishing a functional baseline with the new architecture and basic playback operations
- Gradually implementing every remaining feature in an order where least used or non-core features were done last
- Fixing dozens of bugs which were often one or two line fixes that required days of investigation
Including things such as investigation steps, beta testing, and fixing regressions, the whole process took roughly 12 weeks. Particularly for the first phase, it was impractical to have more than one developer doing the implementation work. This was due to the nature of the problem and to avoid blocking concurrent work and releases.
Almost every step of the way, we had to work around complexities, shortcomings, and bugs within Media3. It became somewhat of a mantra for our team: “Everything with Media3 takes longer.”
In the end, we were able to successfully migrate to Media3, maintain and optimize existing functionality, improve this aspect of the application’s architecture, and introduce some subtle changes to improve the user experience. Perhaps most importantly, we know that the next Android OS will not break the application due to a dependency on an unmaintained library!
Summary
I had never envisioned myself in a situation where I would have to come to my team and explain to them that I had no idea how long it would take to migrate to a new version of a single framework. I assume my project manager did not envision having to communicate to EM:RAP that our initial estimate might be far too small. In any case, we all took a hard look at what was in front of us, did not sugar coat it, and worked out something that we were all happy with.
To me, the most important qualities in this situation were that we did our best to demonstrate honesty and accountability. The honest truth was that we did not know how long this refactor was going to take. However, I kept myself accountable by researching the problem and ensuring that I had some understanding of the scope and complexities before I shared my assessment. Our project management team kept us accountable by providing regular progress updates and tracking the work as best we could.
Whether you are a developer, project manager, or client, my hope is that this article provided some guidance on working with complex software estimations.
