The sweat on the back of my neck is turning cold as the air conditioning in the 19th-floor boardroom kicks into overdrive, a system that probably believes it is being efficient while it freezes me solid. I am watching the CTO of a 109-year-old logistics firm stare at a spreadsheet that represents 29 months of wasted labor. He has just realized that the ‘proprietary intelligence’ his team built is not actually theirs. It is a collection of 999,999 multidimensional vectors sitting in a black box, and the key belongs to a vendor who just raised their seat price by 49 percent.
It is the kind of realization that makes your stomach do a slow, nauseating roll, like the time I accidentally laughed during a funeral because the priest used a Microsoft PowerPoint transition shaped like a spinning star. Sometimes the absurdity of our technical ‘progress’ is so profound that the only biological response left is a nervous, misplaced chuckle.
We are currently living through the Great Re-Locking. For a brief, shimmering moment, the world of open-source models suggested a future of total data sovereignty, but the enterprise software playbook is 89 years old and still functions perfectly. The strategy is simple: don’t lock the door; just make the air inside so specific that the customer dies if they try to breathe anywhere else.
The Copyrighted Wavelength of Light
In the world of Generative AI, this air is the embedding. We were told that ‘data is the new oil,’ but that is a lie designed by people who sell engines. Data is actually more like raw sunlight; it is everywhere, but it is only useful if you have the right panels to capture it.
The vendors have sold us the panels, but they have also copyrighted the specific wavelength of light they absorb. If you want to move your ‘knowledge base’ to a different provider, you find that your 59 gigabytes of curated enterprise wisdom have been transformed into a proprietary embedding format that no other model can interpret without a 99-day manual re-indexing process that costs more than the original implementation.
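To make that incompatibility concrete, here is a toy sketch in pure Python. No real models are involved: the embedder is a deterministic stand-in I invented for illustration, and the model names are hypothetical. The point it demonstrates is real, though: two models can emit vectors of identical shape for the same document, and the numbers still mean nothing to each other.

```python
import hashlib
import math
import random

def embed(text: str, model_seed: str, dim: int) -> list[float]:
    """Toy stand-in for a vendor embedding model: deterministic
    pseudo-random unit vectors seeded by (model, text). Real models
    learn their spaces; the point here is only that each model's
    space is its own."""
    seed = hashlib.sha256(f"{model_seed}:{text}".encode()).hexdigest()
    rng = random.Random(seed)
    vec = [rng.gauss(0, 1) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

doc = "Q3 routing rules for refrigerated freight"

vec_vendor = embed(doc, "vendor-a-v2", 1536)   # hypothetical closed model
vec_open = embed(doc, "openmodel-v1", 1536)    # hypothetical open model

# Same document, same dimensionality, two models:
# within one space, similarity is meaningful (self-similarity is 1.0);
# across spaces, it is statistical noise near zero.
print(cosine(vec_vendor, vec_vendor))
print(cosine(vec_vendor, vec_open))
```

A vendor does not need to encrypt anything to lock you in; the geometry of their model does it for them.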
The Meteorologist Prisoner
I think about Blake J.D. often when I look at these architectural traps. Blake is a cruise ship meteorologist who spends 219 days a year staring at the horizon and trying to reconcile what the satellites say with what the ocean is actually doing.
He was a prisoner of a format, a man who knew exactly where the hurricane was but could not explain to a new system how he knew it. This is exactly what is happening in the current AI boom. Companies are rushing to build Retrieval-Augmented Generation (RAG) systems, pouring their most sensitive internal documents into vector databases provided by third-party ‘AI-first’ platforms.
But the actual value, the mathematical relationship between those pieces of information, is stored in a format determined by the specific embedding model the vendor forced them to use. If that model is a closed-source, proprietary black box, then your entire corporate brain is now a tenant in someone else’s basement. You are not buying a tool; you are renting a consciousness.
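The escape hatch is structural: keep the raw chunk and its provenance alongside the vector, so the vector stays disposable and the knowledge stays yours. A minimal sketch of that principle, with illustrative field names that belong to no vendor's schema:

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass
class PortableChunk:
    """One ingested chunk in a format any engine can re-read.
    The raw text is the asset; the embedding is a regenerable cache."""
    chunk_id: str
    source_uri: str          # where the raw text lives, under your control
    text: str                # the chunk itself; never store only the vector
    embedding: list[float]   # model-specific, disposable
    embedding_model: str     # exact model and version that produced it
    embedding_dim: int

def ingest(text: str, source_uri: str, embed_fn, model_name: str) -> PortableChunk:
    vec = embed_fn(text)
    return PortableChunk(
        chunk_id=hashlib.sha256(text.encode()).hexdigest()[:16],
        source_uri=source_uri,
        text=text,
        embedding=vec,
        embedding_model=model_name,
        embedding_dim=len(vec),
    )

# Placeholder embedder standing in for whatever model you use today.
def fake_embed(text: str) -> list[float]:
    return [0.0] * 8

record = ingest("Dock 4 accepts hazmat after 18:00.",
                "s3://ours/ops.md", fake_embed, "open-embed-v1")
print(json.dumps(asdict(record))[:80])  # plain JSON, no proprietary envelope
```

The design choice is the whole argument: if the record survives as plain JSON with its source text, the basement has a door.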
The Anti-Commodification Ecosystem
The irony is that the technology itself is ostensibly open. You can download a Llama or a Mistral model and run it on your own hardware, but the ecosystem surrounding these models is designed to re-introduce the friction that open source was supposed to eliminate. It is a sophisticated form of anti-commodification. If the models themselves are becoming commodities, the vendors must commodify the implementation.
The Digital Cul-de-Sac
Over lunch, I once asked an executive at one of these companies what would happen if the vendor went bankrupt or doubled their prices. He stopped chewing his calamari and just blinked at me. It hadn’t occurred to him that the ‘Space’ they were building in didn’t belong to them. It was a digital cul-de-sac.
This lack of foresight is why organizations like AlphaCorp AI are seeing a surge in interest from leaders who have been burned before. The smart money is moving toward architectures that prioritize portability: using open embedding standards, maintaining raw data pipelines, and ensuring that the ‘knowledge’ is stored in a way that can be re-indexed by a different model in under 59 hours if necessary.
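Re-indexing only stays a bounded, single-pass job if the raw text was kept. A sketch of that walk-away path, with hypothetical model names; the records here are plain dicts for brevity:

```python
def reindex(records: list[dict], new_embed_fn, new_model_name: str) -> list[dict]:
    """Discard the old vectors; recompute everything from raw text we own.
    One pass over the corpus, no archaeology on the old model's space."""
    out = []
    for rec in records:
        out.append({
            "chunk_id": rec["chunk_id"],
            "text": rec["text"],                     # the source of truth
            "embedding": new_embed_fn(rec["text"]),  # vectors in the new space
            "embedding_model": new_model_name,
        })
    return out

# Toy corpus with vectors from a departing (hypothetical) vendor.
old = [{"chunk_id": "a1", "text": "Pallet limits for deck 3.",
        "embedding": [0.1, 0.2], "embedding_model": "vendor-a-v2"}]

# Toy replacement embedder standing in for a local open model.
new = reindex(old, lambda t: [float(len(t))], "local-model-v1")
print(new[0]["embedding_model"])  # the dependency is gone; the text never moved
```

The old embeddings are never read, which is the point: if your migration plan requires interpreting the departing vendor's vectors, you do not have a migration plan.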
The choice on the table is between an instant, shallow answer and a sovereign, deep integration.
We often mistake ‘ease of use’ for ‘innovation.’ Modularity is harder, but that hardness is where your freedom lives. If you cannot take your embeddings and move them to a local instance of Chroma or Qdrant without losing 89 percent of your curation work, then you don’t have an AI strategy; you have a dependency.
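Before you need to leave, prove you can. A hedged sketch of the seam itself: exporting the store as newline-delimited JSON with ids, raw text, vectors, and metadata. Stores such as Chroma and Qdrant can ingest records shaped like this through their own client libraries (not shown here; this sketch sticks to the stdlib, and the field names are illustrative).

```python
import io
import json

def export_jsonl(records: list[dict], fp) -> int:
    """Write one self-describing JSON line per chunk to any file-like
    object. Returns the number of records written."""
    count = 0
    for rec in records:
        fp.write(json.dumps({
            "id": rec["chunk_id"],
            "text": rec["text"],
            "vector": rec["embedding"],
            "metadata": {"model": rec["embedding_model"]},
        }) + "\n")
        count += 1
    return count

buf = io.StringIO()
n = export_jsonl([{"chunk_id": "a1", "text": "Deck 3 rules",
                   "embedding": [0.1, 0.2],
                   "embedding_model": "local-model-v1"}], buf)
print(n)  # one line per chunk
```

Run this drill on a schedule, not during the crisis; an export path you have never exercised is a dependency with better marketing.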
The Vendor Lock-In Feature
When a salesperson tells you their platform is ‘seamless,’ what they often mean is that there are no seams for you to grab onto when you want to pull yourself away. The vendor lock-in isn’t an accidental byproduct of the technology; it is the primary feature being sold to the venture capitalists funding these platforms.
Blake J.D. once told me that the most dangerous part of a storm isn’t the wind; it’s the debris. In the corporate world, the debris is the fragmented, proprietary data formats left behind after a failed partnership. We need to stop being enamored with the ‘magic’ of the output and start being obsessed with the hygiene of the input and the portability of the processed state.
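That obsession with input hygiene and portable state can be mechanized. A sketch of a portability audit that fails loudly when the store has stopped being movable; the required fields and error strings are illustrative, not a standard:

```python
REQUIRED_KEYS = {"chunk_id", "text", "embedding", "embedding_model"}

def audit(records: list[dict], expected_dim: int) -> list[str]:
    """Return a list of portability problems; an empty list means
    the store can still be re-indexed elsewhere from raw text."""
    problems = []
    for i, rec in enumerate(records):
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            problems.append(f"record {i}: missing {sorted(missing)}")
        elif len(rec["embedding"]) != expected_dim:
            problems.append(
                f"record {i}: dim {len(rec['embedding'])} != {expected_dim}")
        elif not rec["text"]:
            problems.append(f"record {i}: vector without raw text")
    return problems

# A store that quietly dropped its raw text: the vector survives,
# the knowledge does not.
store = [{"chunk_id": "a1", "text": "", "embedding": [0.1, 0.2],
          "embedding_model": "local-model-v1"}]
print(audit(store, expected_dim=2))
```

Wire a check like this into ingestion and the debris never accumulates in the first place.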
The Tension: Sharing vs. Capturing
Build for the day you want to walk away. Ensure your vector dimensions are documented, your metadata is standard, and your ‘intelligence’ isn’t just a series of 199-character strings that only one API can understand. The cursor is still blinking on that CTO’s screen. 59 times a minute. He has realized he is a captive. Don’t be the one sitting there 29 months from now, wondering why you paid $999,999 to lose control of your own thoughts.