At Massive Scale, Nothing Is Just a Technical Decision
Derry Berni Cahyady
There is a version of engineering that most developers understand intuitively: you identify a problem, you find an efficient solution, you ship, you iterate. The logic is clean. The feedback loops are tight. If something breaks, the cost is contained.
Then there is engineering at media scale. And almost everything about how you were trained to think will mislead you.
I do not mean that the problems become harder, though they do. I mean that the nature of the problem changes entirely. At a certain threshold of daily traffic, the distance between a technical decision and a business outcome collapses to zero.
A configuration choice that any senior engineer could make in an afternoon stops being a technical choice. It becomes a revenue argument. And most engineers, including very good ones, are not equipped to recognise that shift when it happens.
Let me be specific about why this matters.
When you operate a system serving tens of millions of sessions per month, time-to-first-byte is not a performance metric. It is an inventory metric.
Digital advertising revenue on a media property is tied to how many ads actually render in front of a human being before they navigate away.
A latency increase of 200 milliseconds, compounded across a session volume that large, creates a measurable reduction in rendered impressions. That gap is not an engineering number sitting in a monitoring dashboard. It is a finance number sitting in a revenue report, and the engineering and finance teams will rarely trace the two numbers back to the same root cause.
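To make the arithmetic concrete, here is a minimal sketch of how such an estimate could be framed. Everything in it is an assumption for illustration: the function name, the linear relationship between added latency and render rate, and every number in the example are hypothetical, not measurements from any real property.

```python
def estimated_lost_impressions(
    monthly_sessions: int,
    ads_per_session: float,
    baseline_render_rate: float,
    render_rate_drop_per_100ms: float,
    added_latency_ms: float,
) -> float:
    """Rough monthly estimate of rendered impressions lost to added latency.

    ASSUMPTION: render rate falls linearly with added latency. Real
    behaviour is unlikely to be linear; the coefficient would have to be
    measured on your own traffic.
    """
    if not 0.0 <= baseline_render_rate <= 1.0:
        raise ValueError("baseline_render_rate must be between 0 and 1")
    if added_latency_ms < 0:
        raise ValueError("added_latency_ms must not be negative")

    # Share of ad slots that stop rendering because of the extra delay.
    drop = render_rate_drop_per_100ms * (added_latency_ms / 100.0)
    new_render_rate = max(0.0, baseline_render_rate - drop)

    # Ad slots requested per month, before any fail to render.
    requested = monthly_sessions * ads_per_session
    return requested * (baseline_render_rate - new_render_rate)


# Hypothetical inputs: 30M sessions, 4 ad slots per session, 80% baseline
# render rate, 0.5 percentage points lost per extra 100 ms, 200 ms added.
lost = estimated_lost_impressions(30_000_000, 4, 0.80, 0.005, 200)
print(f"Estimated lost impressions per month: {lost:,.0f}")
```

With these made-up inputs the estimate lands at roughly 1.2 million impressions a month. The point is not the figure. It is that the input is an engineering number and the output belongs in a revenue report.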
The same logic applies further down the system.
Crawl budget, for instance, is a concept that lives almost entirely in the SEO domain of most organisations. Engineers touch it only when they are explicitly asked to. But at media scale, crawl budget is not an SEO concern. It is a content asset utilisation concern. If a platform publishes four hundred articles in a day and Google's crawler processes a fraction of them before moving on, a significant portion of that editorial output generates no organic distribution. The economics of content production assume indexation. The engineering defaults often do not account for it. That gap is structural, not communicative. More meetings between the SEO team and the engineering team will not close it. Rethinking how the system is designed to be read by machines is the only thing that closes it.
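One way to make that gap visible is to measure it directly. The sketch below compares the URLs published in a window against the URLs a crawler actually requested in the same window. It assumes you already have both lists, for example from the CMS and from server logs filtered to verified crawler requests; that filtering is not shown, and the function names are hypothetical.

```python
from typing import Iterable
from urllib.parse import urlsplit


def normalise(url: str) -> str:
    """Reduce a URL to host + path so trivial variants compare equal.

    ASSUMPTION: query strings and trailing slashes do not identify
    distinct articles on this site. Adjust if they do.
    """
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return f"{parts.netloc.lower()}{path}"


def crawl_coverage(
    published_urls: Iterable[str], crawled_urls: Iterable[str]
) -> tuple[float, list[str]]:
    """Return (share of published URLs that were crawled, URLs missed)."""
    published = {normalise(u) for u in published_urls}
    crawled = {normalise(u) for u in crawled_urls}
    if not published:
        return 1.0, []
    missed = sorted(published - crawled)
    coverage = 1 - len(missed) / len(published)
    return coverage, missed


# Hypothetical day: four articles published, two requested by the crawler.
published = [
    "https://example.com/a/",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
]
crawled = ["https://EXAMPLE.com/a", "https://example.com/c?utm=1"]
coverage, missed = crawl_coverage(published, crawled)
print(coverage, missed)
```

A number like this, tracked daily, turns crawl budget from an SEO talking point into an asset utilisation metric that engineering can own.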
Structured data follows the same pattern. Schema markup implementation at most media organisations is treated as a feature, something that gets added to a product backlog and shipped when bandwidth allows. At scale, it is closer to a distribution mechanism. The signal quality of your structured data determines how confidently downstream systems can classify, rank, and surface your content. A media property that does not treat structured data as a core engineering standard is, in effect, volunteering to be misunderstood by the systems that allocate its audience.
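Treating structured data as a standard rather than a feature usually means that every template emits it from one shared function, not from hand-written snippets. A minimal sketch, assuming schema.org's NewsArticle type serialised as JSON-LD; the function name and the choice of fields are illustrative, and a real implementation would cover more properties.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def news_article_jsonld(
    headline: str,
    url: str,
    author_name: str,
    publisher_name: str,
    published: datetime,
    modified: Optional[datetime] = None,
) -> str:
    """Build a JSON-LD string for a schema.org NewsArticle.

    Timezone-aware datetimes are required so the emitted ISO 8601
    timestamps are unambiguous to machine readers.
    """
    modified = modified or published
    if published.tzinfo is None or modified.tzinfo is None:
        raise ValueError("published and modified must be timezone-aware")

    doc = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "mainEntityOfPage": url,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
        "author": {"@type": "Person", "name": author_name},
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    return json.dumps(doc, ensure_ascii=False, indent=2)


# Hypothetical article; all values are placeholders.
print(
    news_article_jsonld(
        headline="Example headline",
        url="https://example.com/news/example-headline",
        author_name="Example Author",
        publisher_name="Example News",
        published=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    )
)
```

The design choice that matters is the single entry point: when every page type goes through the same builder, authorship, dates, and publisher identity cannot silently drift between templates.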
I am describing a pattern, not pointing at a single failure.
What I am pointing at is a failure of conceptual framing that is endemic to how engineering and media intersect.
The vocabulary of technical decisions is precise and bounded. The vocabulary of revenue and distribution is diffuse and lagging. When you operate at scale, those two vocabularies are describing the same phenomena. The engineer who understands this is not a better engineer.
They are a different kind of engineer.
And now there is a new layer of the same problem, and almost no one at a media organisation is treating it with the seriousness it deserves.
Generative AI has introduced a new class of discovery. When a user opens a conversational AI interface and asks a question your platform should be the authoritative source for, the system does not crawl the web in real time. It retrieves from a model of the web, built on signals that are structurally different from traditional search signals. Entity clarity, citation density, document structure, and semantic consistency across a corpus all matter here, and raw domain authority does not automatically translate into any of them.
Here is the uncomfortable observation: the largest media properties in a given market are not automatically well-positioned for this layer of discovery. Scale gives you authority in traditional search because historical backlink structures and brand recognition translate into ranking signals. In generative retrieval, what matters is whether your content is structurally legible to a model trying to summarise, cite, and attribute. That is a different kind of legibility, and it is almost entirely an engineering concern.
A platform that publishes millions of pieces of content but maintains inconsistent entity references, contradictory structured data across page types, and legacy template patterns that obscure authorship and context from machine readers is, from the perspective of a generative model, a noisy signal source. Brand scale does not override structural noise at this layer.
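Structural noise of this kind can be audited mechanically. The sketch below takes already-parsed JSON-LD from a sample of page types and reports whether they disagree about the publisher's name. It is one narrow check under stated assumptions: the function name is hypothetical, extraction of JSON-LD from HTML is not shown, and a real audit would cover authors, dates, and entity identifiers as well.

```python
from collections import defaultdict
from typing import Any, Iterable


def publisher_name_conflicts(
    pages: Iterable[tuple[str, dict[str, Any]]]
) -> dict[str, list[str]]:
    """Map each publisher name seen to the page types that emit it.

    Returns an empty dict when every page type agrees. Note that it also
    returns an empty dict when every page type omits the publisher, so a
    separate presence check is still needed.
    """
    seen: dict[str, set[str]] = defaultdict(set)
    for page_type, doc in pages:
        publisher = doc.get("publisher")
        name = publisher.get("name") if isinstance(publisher, dict) else None
        seen[name or "<missing>"].add(page_type)

    if len(seen) <= 1:
        return {}
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical sample: three templates, three different answers.
sample = [
    ("article", {"publisher": {"name": "Example News"}}),
    ("video", {"publisher": {"name": "Example News Network"}}),
    ("gallery", {}),
]
for name, page_types in publisher_name_conflicts(sample).items():
    print(name, page_types)
```

Run against a crawl of your own templates, a check like this tends to surface legacy patterns that nobody remembers shipping.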
The implication is this:
The media organisations that will be cited, surfaced, and recommended by AI systems are not necessarily the ones with the longest history or the largest traffic base.
They are the ones whose engineering teams understood that machine readability is an infrastructure concern, treated it as such years before it became urgent, and built systems that emit clean, structured, attributable signals by default.
Most engineering roadmaps at media companies do not contain a line item for this. Most product discussions do not frame it in these terms. The work that determines your distribution footprint in the next paradigm of discovery is happening, or not happening, inside conversations that are still labelled as technical.
That is the problem worth solving.
As a note, YPYM experts have successfully completed multiple direct projects for media companies. If you are a decision-maker in the media industry and require direct consultation, please contact us.
The team I work alongside at YPYM operates across this entire range of complexity, from properties with ten thousand monthly visits to those measuring in the millions, as a matter of ordinary work. What that breadth produces is not a library of solved problems. It produces a way of reading new ones.
The configurations that tend to surface the most structural tension in the SEO-and-engineering intersection are not random. Website revamps that were scoped as design projects and shipped as traffic problems. Traffic recovery cycles triggered by algorithm updates that exposed what the underlying architecture was quietly doing wrong for years.
New product or channel launches where discoverability was treated as an afterthought until the numbers made it impossible to ignore. Content localisation efforts that broke entity consistency at scale.
CMS migrations where the technical handoff was clean and the search footprint was not. Market share recapture attempts that treated ranking positions as the target rather than the signal.
YMYL compliance frameworks that required engineering and editorial and legal to finally speak the same language. Multilingual architectures and domain consolidation decisions where the wrong structural choice compounded for years before anyone traced the revenue impact back to it.
These are not edge cases. They are the recurring coordinates of online publishing at any meaningful scale.
When we encounter a challenge we have not seen before in a specific configuration, we do not treat that as a boundary. We treat it as the beginning of something worth thinking about seriously. We are, frankly, less interested in problems we already know how to solve. The routine is not where the thinking happens.
If the argument in this piece describes something familiar inside your organisation, the structural gap between how your engineering decisions are made and how they actually behave as business and distribution choices at scale, then it is worth a conversation. Not because we have a prepared answer waiting. But because we find these problems genuinely interesting, and that tends to produce better outcomes than confidence alone ever does.
Derry Berni Cahyady is a software engineer whose career spans Indonesia's largest digital media operations and education technology product development.