As advancements in AI continue to surprise and accelerate, one of the areas that excites us most is new “brains” for legacy industries. Specifically, we mean domain-specific models, knowledge bases, advanced semantic search, RAG systems, and agents to automate manual workflows. The potential impact of this type of AI ranges from significantly improving productivity to reducing the carbon intensity of our physical economy.
We believe the opportunity here is immense: given the size and diversity of real-world sectors, we expect domain-specific AI solutions to proliferate accordingly. In this vein, we’ve made investments in companies like Trunk Tools and Parspec in the construction industry, Afresh in food and agriculture, and Cadstrom in hardware engineering. We expect many analogous companies to be built across industrial segments like energy (e.g., oil and gas, sustainable aviation fuels, geothermal), real estate, chemicals, mining, manufacturing, and the like.
Why are these AI advancements particularly suited to revolutionizing the operations of legacy sectors?
- Legacy systems-of-record — What ‘system of record’ (SOR) means varies widely across industrials. For some organizations, it means millions of PDFs stored in highly siloed databases, often on-prem. More sophisticated companies live in a variety of systems: ERP, WMS, BIM and more. It’s an alphabet soup! Even for these more advanced organizations, the solutions are brittle and fragmented compared to the best-in-class tools found in modern enterprises. There is tremendous opportunity not only to migrate to the cloud and enable open, API-friendly integrations, but also to layer on AI. In our experience, it can be very challenging to entirely displace an existing system of record (e.g., SAP seems to be cemented in place in many organizations), but we think there will be huge value creation in developing AI agents and workflows on top of or between these systems of record. This could look like a knowledge system that accesses and synthesizes information from fragmented data silos across numerous data types, or an agent that automates a nagging workflow by reading from or writing to the SOR. Eventually, this AI layer could even become the command-and-control center, such that going into the legacy systems directly is unnecessary.
- Vast amounts of unstructured documentation — Inside or outside of these systems of record, these industries hold huge amounts of unstructured and highly varied data: think schematics, diagrams, engineering specs, blueprints, drawings, logs, and inspection reports. Much of this data is non-textual and highly technical. In many cases, expensive engineers and analysts spend 30%+ of their time just trying to locate documents in decades-old filing systems. Even for organizations that store everything in the cloud, the struggle of knowledge management remains, as their cloud instances aren’t designed to support the industry-specific, unified data models customers need for easy querying. This problem is particularly acute in industries like energy and real estate, where assets are constantly bought and sold, and the volume of heterogeneous documents grows with this transaction velocity. Emerging AI can readily structure this treasure trove of data and make simple questions like “what’s the history of this asset?” easily answerable.
- Walled gardens = little high-quality public data — Legacy industries are distinctly tight-lipped about their data, given that they view it as their core IP. Everyone from construction companies to auto OEMs to energy majors keeps decades of documentation under lock and key. Major model developers like OpenAI have effectively crawled the entire internet, but don’t have access to the specialized, technical documentation that is necessary for training performant models in these contexts. While tools like Glean have built great products for the enterprise, they too fall short of the unique needs of these specialized industries and don’t perform as well for companies that are less mature in their digitization. The opportunity for industry veterans to develop specialized models and workflows, trained and tuned on customer data behind the firewall, is incredibly compelling.
- The “great crew change”, and loss of industry knowledge — Traditional industry faces a crisis of retiring subject matter experts (SMEs) and a lack of new talent to fill their shoes. In conversations with leaders at these companies, many of them view the impending loss of expertise, dubbed the “great crew change,” as the single greatest threat to their business. In physical industries like energy, construction and manufacturing, around 30-50% of skilled workers currently in the labor force are set to retire by the end of the decade, taking their decades of specialized (and often never written down) expertise with them. Building knowledge retention and operational systems that can maintain this expertise, let alone scale it, is an urgent market challenge. We believe advanced knowledge bases could make this task far easier.
- DeepSeek and the falling costs of domain-specific models — The first month of 2025 suggested a new path to scaling AI, one that is not based purely on compute capacity. While some bold technologists are proclaiming the ‘death of RAG’ due to increasing context windows, one thing is clear: the cost to train a highly performant, optimized, smaller model for a specific domain is rapidly declining. Given that customer data is non-public, we expect that some amount of proprietary training, tuning, embedding or model-building will be required to unlock the full value of AI for the industrial domain. Going forward, we hope it will be economically feasible for startups with the right expertise to build not only amazing applications, but also highly tuned models and infrastructure that are likely to outperform the marquee foundation models when deployed into this customer segment.
The opportunity for ambitious teams with deep industry knowledge to build domain-specific AI tools for the physical economy is immense. Industrial giants are unlikely to build these assets themselves: modern software DNA is critical to building these tools, and it rarely resides within legacy industries. We are excited to partner with bold teams building new brains for these domains. If you’re working on something, or just want to trade notes, please reach out.