
What Halliburton's 95% Workflow Cut Tells Us About Custom Enterprise AI

Halliburton cut seismic workflow creation time by over 95% using custom enterprise AI on Amazon Bedrock. Learn the routing, RAG, and agent architecture behind this blueprint.

Zyfolks Team

Most enterprise AI pitches sound the same: a chatbot bolted onto a knowledge base, a copilot that summarizes your documents, a demo that wows the boardroom and gathers dust by Q3. Halliburton just published something different. The oilfield services giant cut seismic workflow creation time by over 95% — not by replacing geoscientists, but by letting them stop wrestling with 100 manual tool configurations and start talking to their software in plain English. That’s the real story of custom enterprise AI in 2026, and it has very little to do with chatbots.

Why Halliburton’s Seismic Engine Project Is a Blueprint, Not a Demo

Halliburton partnered with the AWS Generative AI Innovation Center to build an AI-powered assistant for its DS365 Seismic Engine, a cloud-native seismic data processing platform that previously required manual configuration of approximately 100 specialized tools to build a single workflow. The new system, built on Amazon Bedrock, Bedrock Knowledge Bases, Amazon Nova, and DynamoDB, converts natural-language requests into executable YAML workflows by selecting from 82 available Seismic Engine tools.

This matters because it’s the opposite of the generic copilot pattern. Halliburton didn’t drop a foundation model into a chat window and call it transformation. The team built an intent router on Amazon Nova Lite that classifies queries into three buckets — Workflow_Generation, QnA, or General_Question — and routes each to a purpose-built pipeline. Q&A goes through Bedrock Knowledge Bases with OpenSearch Serverless and Titan Text Embeddings V2. Workflow generation goes through Claude agents orchestrated with LangChain. Chat history lives in DynamoDB so users can refine results across turns.
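To make the routing idea concrete, here is a minimal sketch of the classify-then-dispatch shape. It is not Halliburton’s code: `classify_intent` is a hypothetical keyword stand-in for the small-model call, and the three pipeline functions are placeholders. Only the three labels come from the published design.

```python
from typing import Callable, Dict


def classify_intent(query: str) -> str:
    """Hypothetical keyword stand-in for the small-model classifier."""
    text = query.lower()
    if any(word in text for word in ("build", "create", "generate")):
        return "Workflow_Generation"
    if any(word in text for word in ("tool", "parameter", "manual")):
        return "QnA"
    return "General_Question"


# Placeholder pipelines; real ones would call a RAG layer or an agent.
def workflow_pipeline(query: str) -> str:
    return f"[workflow] {query}"


def qna_pipeline(query: str) -> str:
    return f"[qna] {query}"


def general_pipeline(query: str) -> str:
    return f"[general] {query}"


# One handler per label named in the article.
PIPELINES: Dict[str, Callable[[str], str]] = {
    "Workflow_Generation": workflow_pipeline,
    "QnA": qna_pipeline,
    "General_Question": general_pipeline,
}


def route(query: str, classifier: Callable[[str], str] = classify_intent) -> str:
    """Classify the query, then dispatch; unknown labels fall back to general."""
    label = classifier(query)
    handler = PIPELINES.get(label, general_pipeline)
    return handler(query)


if __name__ == "__main__":
    print(route("Build a workflow that applies a bandpass filter"))
    print(route("Which parameter controls the filter cutoff?"))
```

The classifier is passed in as an argument so the keyword stub can be swapped for a real model call without touching the dispatch logic. The fallback to the general pipeline is a design choice of this sketch: a classifier that returns an unexpected label should degrade gracefully rather than raise.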

If you’re an enterprise software vendor with a complex configuration surface — think ETL builders, ERP modules, security policy editors, CI/CD pipeline configs — this architecture is the playbook. You don’t need to rebuild your product. You need a routing layer, a RAG layer, and an agent layer sitting in front of the existing tools. Our take: in 18 months, every serious enterprise application with more than 50 configurable components will ship something resembling this pattern, or it will lose ground to competitors that do.

What the 95% Workflow Acceleration Number Actually Means

The numbers in Halliburton’s evaluation are worth reading carefully. According to the published results, Claude 3.5 Sonnet v2 hit a 97% success rate on medium-complexity workflows with a median generation time of 16.6 seconds. Claude 3.5 Haiku came in at 90% success on the same complexity tier, with a median time of 9.1 seconds. The human baseline? Experienced users took 2 minutes for simple flows and 5 minutes for complex ones, with an 85% success rate. New users took 4 to 20 minutes with a 70% success rate. The AI solution finishes in 0.13 to 0.28 minutes, a time reduction of over 95%, and beats both user groups on accuracy.
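If you want to sanity-check figures like these for your own rollout, the arithmetic is simple: put everything in minutes and compute one minus the ratio. The sketch below uses the times quoted above. Which AI time gets paired with which human baseline is our assumption, not something stated in the published results, so treat the printed percentages as illustrative; the result moves noticeably with the baseline you pick.

```python
def seconds_to_minutes(seconds: float) -> float:
    """Convert a duration in seconds to minutes."""
    return seconds / 60.0


def time_reduction(baseline_minutes: float, ai_minutes: float) -> float:
    """Fraction of the baseline time saved: 1 - ai / baseline."""
    if baseline_minutes <= 0:
        raise ValueError("baseline_minutes must be positive")
    return 1.0 - ai_minutes / baseline_minutes


if __name__ == "__main__":
    # Median generation times quoted in the article, converted to minutes.
    ai_times = {
        "Sonnet median": seconds_to_minutes(16.6),
        "Haiku median": seconds_to_minutes(9.1),
    }
    # Human baselines quoted in the article, in minutes.
    baselines = {
        "experienced, simple": 2.0,
        "experienced, complex": 5.0,
        "new user, low end": 4.0,
        "new user, high end": 20.0,
    }
    # The pairing of every model with every baseline is an assumption.
    for baseline_name, baseline in baselines.items():
        for model_name, ai_time in ai_times.items():
            saved = time_reduction(baseline, ai_time)
            print(f"{baseline_name} vs {model_name}: {saved:.1%}")
```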

The productivity story isn’t really about speed. It’s about accessibility. A new geoscientist at a Halliburton client used to need ramp-up time before producing reliable workflows. Now they hit experienced-user accuracy on day one. That collapses training costs, reduces error-driven rework, and expands the pool of people who can do meaningful work with the tool. Practically, if you’re a mid-market energy operator paying for Seismic Engine seats, you can put the software in front of analysts who previously couldn’t justify the learning curve.

The Haiku-versus-Sonnet split also has a practical implication for anyone planning AI-integrated software solutions: cheaper, faster models are now good enough for most production paths, and you only escalate to the heavier model when complexity demands it. Our prediction: by late 2026, the default architecture for enterprise AI assistants will be tiered routing — small models for triage and simple generation, large models reserved for the 10–20% of queries that genuinely need the horsepower.
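One way to implement tiered routing is to try the cheaper model first and escalate only when its output fails a check. The sketch below assumes that strategy; routing on an up-front complexity estimate is an equally reasonable alternative. `small_model` and `large_model` are hypothetical stubs rather than real model calls, and `non_empty` is a deliberately trivial validity check.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tier:
    name: str
    generate: Callable[[str], str]


def generate_with_escalation(
    request: str,
    tiers: List[Tier],
    is_valid: Callable[[str], bool],
) -> Tuple[str, str]:
    """Try tiers from cheapest to most capable; return (tier name, output)."""
    for tier in tiers:
        output = tier.generate(request)
        if is_valid(output):
            return tier.name, output
    raise RuntimeError("no tier produced a valid output")


# Hypothetical stub: the small model gives up on requests with many steps.
def small_model(request: str) -> str:
    steps = request.split(" then ")
    return "" if len(steps) > 2 else "steps: " + ", ".join(steps)


# Hypothetical stub: the large model handles any number of steps.
def large_model(request: str) -> str:
    return "steps: " + ", ".join(request.split(" then "))


# Ordered cheapest first.
TIERS = [Tier("small", small_model), Tier("large", large_model)]


def non_empty(output: str) -> bool:
    """Trivial validity check; a real one would validate the workflow."""
    return bool(output.strip())


if __name__ == "__main__":
    print(generate_with_escalation("load then filter", TIERS, non_empty))
    print(generate_with_escalation("load then filter then stack", TIERS, non_empty))
```

The trade-off with escalate-on-failure is latency on hard queries, since they pay for both calls. It only pays off when most traffic clears the first tier.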

The Architectural Decisions That Actually Matter

Strip away the marketing and three engineering choices stand out in Halliburton’s design. First, Bedrock Knowledge Bases as managed RAG. The team explicitly chose it to avoid running their own vector database, chunking pipeline, and embedding infrastructure. They used hierarchical chunking with default settings for long manuals, and left short tool docs unchunked to preserve full context. That’s a deliberate trade-off: less control, faster delivery, lower operational cost.
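The managed service handles chunking for you, but the underlying idea is easy to picture. Here is a rough sketch of the "short docs whole, long docs as parent and child chunks" decision. The function name, the word-count thresholds, and the use of words instead of tokens are all our assumptions for illustration; they are not Bedrock’s defaults or its API.

```python
from typing import Dict, List


def split_words(words: List[str], size: int) -> List[List[str]]:
    """Split a word list into consecutive groups of at most `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]


def chunk_document(
    text: str,
    short_doc_limit: int = 200,  # hypothetical threshold, in words
    parent_size: int = 120,      # hypothetical parent chunk size, in words
    child_size: int = 30,        # hypothetical child chunk size, in words
) -> List[Dict[str, object]]:
    """Keep short docs whole; split long docs into parent and child chunks."""
    words = text.split()
    if len(words) <= short_doc_limit:
        # Short tool docs stay unchunked so the full context is preserved.
        return [{"parent": text, "children": [text]}]
    chunks: List[Dict[str, object]] = []
    for parent in split_words(words, parent_size):
        children = [" ".join(child) for child in split_words(parent, child_size)]
        chunks.append({"parent": " ".join(parent), "children": children})
    return chunks


if __name__ == "__main__":
    manual = " ".join(f"word{i}" for i in range(300))
    for chunk in chunk_document(manual):
        print(len(chunk["children"]), "child chunks")
```

In a hierarchical scheme, the small child chunks are what you match against a query, and the larger parent is what you hand to the model, so retrieval stays precise without starving the model of context.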

Second, agent-based workflow generation with tool binding. The Claude agent is given detailed specifications for each of the 82 Seismic Engine tools (inputs, parameters, outputs) and asked to compose them into valid YAML. This is where most generic copilots fail. They hallucinate API shapes because nobody fed them the schema. Halliburton’s agent works because the tool definitions are first-class context, not an afterthought. If you’re building anything similar, this is the part you can’t skip: your agent is only as good as the structured tool catalog you hand it. The same principle drives the choice between AI agents and AI automation: agents earn their cost when they need to reason over a structured tool surface, not when a deterministic pipeline would do.
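A structured catalog also lets you check what the agent produces before anyone runs it. The sketch below assumes the generated workflow has already been parsed into a list of steps, which is roughly what you would have after loading the YAML. The tool names and parameters are invented for illustration; they are not real Seismic Engine tools.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class ToolSpec:
    name: str
    required: Set[str]
    optional: Set[str] = field(default_factory=set)


# Hypothetical two-tool catalog; a real one would cover every tool.
CATALOG: Dict[str, ToolSpec] = {
    "load_dataset": ToolSpec("load_dataset", required={"path"}),
    "bandpass_filter": ToolSpec(
        "bandpass_filter", required={"low_hz", "high_hz"}, optional={"taper"}
    ),
}


def validate_workflow(
    steps: List[Dict[str, Any]], catalog: Dict[str, ToolSpec]
) -> List[str]:
    """Return a list of problems; an empty list means the workflow checks out."""
    errors: List[str] = []
    for index, step in enumerate(steps):
        tool = step.get("tool")
        spec = catalog.get(tool)
        if spec is None:
            errors.append(f"step {index}: unknown tool {tool!r}")
            continue
        params = set(step.get("params", {}))
        missing = spec.required - params
        unexpected = params - spec.required - spec.optional
        if missing:
            errors.append(f"step {index}: missing {sorted(missing)}")
        if unexpected:
            errors.append(f"step {index}: unexpected {sorted(unexpected)}")
    return errors


if __name__ == "__main__":
    workflow = [
        {"tool": "load_dataset", "params": {"path": "survey.segy"}},
        {"tool": "bandpass_filter", "params": {"low_hz": 5}},
    ]
    for problem in validate_workflow(workflow, CATALOG):
        print(problem)
```

Error messages like these can be fed back to the agent as a retry prompt, which is one common way to turn a hallucinated parameter into a corrected one.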

Third, conversation state in DynamoDB. Multi-turn refinement isn’t a nice-to-have for technical workflows; it’s the entire UX. A geoscientist rarely gets the workflow right on the first prompt. They iterate: “now add a bandpass filter,” “change the input dataset,” “merge with the workflow from yesterday.” Without persistent state, every turn becomes a fresh battle. With it, the assistant feels like a colleague.
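Here is a minimal sketch of what persistent state buys you. `SessionStore` is an in-memory stand-in we made up for a table keyed by session id; it is not the DynamoDB API. The prompt format is also our assumption. The point is only that each new turn is sent along with recent history.

```python
from typing import Dict, List


class SessionStore:
    """In-memory stand-in for a table keyed by session id."""

    def __init__(self) -> None:
        self._turns: Dict[str, List[Dict[str, str]]] = {}

    def append(self, session_id: str, role: str, text: str) -> None:
        """Record one turn at the end of the session's history."""
        self._turns.setdefault(session_id, []).append({"role": role, "text": text})

    def recent(self, session_id: str, limit: int = 6) -> List[Dict[str, str]]:
        """Return up to `limit` most recent turns, oldest first."""
        if limit <= 0:
            return []
        return list(self._turns.get(session_id, [])[-limit:])


def build_context(
    store: SessionStore, session_id: str, new_message: str, limit: int = 6
) -> str:
    """Prefix the new message with recent turns so the model can refine them."""
    lines = [f"{turn['role']}: {turn['text']}" for turn in store.recent(session_id, limit)]
    lines.append(f"user: {new_message}")
    return "\n".join(lines)


if __name__ == "__main__":
    store = SessionStore()
    store.append("s1", "user", "Build a workflow that loads the survey")
    store.append("s1", "assistant", "Created a workflow with one load step")
    print(build_context(store, "s1", "now add a bandpass filter"))
```

Capping history with `limit` keeps prompts from growing without bound. How many turns to keep is a tuning decision that depends on how long your users’ refinement sessions run.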

For SaaS teams building configuration-heavy software, the pattern translates directly. Imagine a cybersecurity vendor whose customers configure firewall policies through 60 different rule types. The Halliburton pattern says: index the rule documentation, define each rule type as a tool, route incoming questions through an intent classifier, and let an agent compose policy YAML from natural language. You ship in months, not years. Our take: the companies still building drag-and-drop visual workflow editors in 2027 will look like the ones still shipping jQuery plugins in 2018.

Why This Pattern Generalizes Beyond Oil and Gas

Halliburton’s engineers note explicitly that the approach generalizes to other domains with complex, multi-step agentic workflows requiring specialized tool knowledge. They’re right, and the implication is bigger than they’re letting on. Any vertical with high-skill configuration work — clinical trial design, manufacturing process planning, financial product structuring, legal document automation — has the same shape: a long tail of specialized tools, deep documentation, expert users, and a painful onboarding curve. The Halliburton solution is a template for cutting that curve.

That’s where the build-versus-buy question sharpens. Bedrock Knowledge Bases, Nova Lite, Claude on Bedrock, and DynamoDB are all off-the-shelf. The custom work is the orchestration: the routing logic, the tool catalog, the prompt engineering, the evaluation harness. That’s where domain expertise lives, and that’s where most enterprises should invest. Buying the model layer and building the integration layer is the right split for 2026, and projects that need custom API development and integration work to stitch existing tool surfaces into an agent-friendly catalog are exactly the kind of glue work this pattern depends on.

One concrete prediction: within 12 months, AWS will package this exact architecture — intent routing plus managed RAG plus tool-bound agent plus session state — as a reference solution or even a higher-level service. Halliburton’s post namechecks Strands Agents SDK and Amazon Bedrock AgentCore as next steps for multi-agent extensions. That’s the direction the platform is heading, and early adopters building with this pattern now will be ahead when those higher-level services ship.

FAQ

Q: What is custom enterprise AI?
A: Custom enterprise AI refers to AI systems built around a specific company’s tools, data, and workflows rather than generic chatbots or off-the-shelf copilots. The Halliburton Seismic Engine assistant is a textbook example: it uses general-purpose foundation models from Amazon Bedrock but wraps them in a routing layer, a domain-specific knowledge base, and an agent bound to 82 proprietary tools. The result is an AI that understands one company’s problem deeply, not one that understands everything shallowly.

Q: How much faster is the Halliburton AI assistant compared to manual workflow creation?
A: According to Halliburton’s published evaluation, the AI assistant generates workflows in 0.13 to 0.28 minutes, compared to 2 to 20 minutes for human users. That’s a time reduction of over 95%, with success rates of 84–97% depending on model and complexity, compared with the 70–85% success rate of manual creation by new and experienced users.

Q: Can this approach work for software outside the energy industry?
A: Yes. The architecture (intent routing on a small model, RAG over indexed documentation, an agent bound to a structured tool catalog, and persistent conversation state) is domain-agnostic. Any software with a large configuration surface, deep documentation, and a steep learning curve can adopt the same pattern. Healthcare, fintech, manufacturing, and legal tech are obvious next candidates.

Key Takeaways

  • Treat your tool catalog as a first-class data asset; agents that compose workflows are only as accurate as the structured specifications you give them.
  • Use tiered model routing — small models for classification, larger models for generation — instead of sending every query to the most expensive endpoint.
  • Invest engineering time in orchestration, evaluation harnesses, and conversation state, not in rebuilding RAG infrastructure that managed services already handle.
  • Measure your AI’s success against both new-user and expert baselines; productivity gains that only beat novices won’t survive procurement scrutiny.
  • Enterprise software vendors with complex configuration surfaces should plan for natural-language interfaces as table stakes by 2027, not as differentiators.
