Building an enterprise AI company on a "foundation of shifting sand" is the central challenge for founders today, according to the leadership at Palona AI.
The Palo Alto-based startup—led by former Google and Meta engineering veterans—is making a decisive vertical push into the restaurant and hospitality space with today's launch of Palona Vision and Palona Workflow.
The new offerings transform the company’s multimodal agent suite into a real-time operating system for restaurant operations — spanning cameras, calls, conversations, and coordinated task execution.
The news marks a strategic pivot from the company’s debut in early 2025, when it first emerged with $10 million in seed funding to build emotionally intelligent sales agents for broad direct-to-consumer enterprises.
Now, by narrowing its focus to a "multimodal native" approach for restaurants, Palona is providing a blueprint for AI builders on how to move beyond "thin wrappers" to build deep systems that solve high-stakes physical world problems.
“You’re building a company on top of a foundation that is sand—not quicksand, but shifting sand,” said co-founder and CTO Tim Howes, referring to the instability of today’s LLM ecosystem. “So we built an orchestration layer that lets us swap models on performance, fluency, and cost.”
VentureBeat spoke with Howes and co-founder and CEO Maria Zhang in person recently at — where else? — a restaurant in NYC about the technical challenges and hard lessons learned from their launch, growth, and pivot.
For the end user—the restaurant owner or operator—Palona’s latest release is designed to function as an automated "best operations manager" that never sleeps.
Palona Vision uses existing in-store security cameras, without requiring any new hardware, to analyze operational signals in real time. It monitors front-of-house metrics like queue lengths, table turns, and cleanliness, while simultaneously flagging back-of-house issues like prep slowdowns or station setup errors.
Palona Workflow complements this by automating multi-step operational processes. This includes managing catering orders, opening and closing checklists, and food prep fulfillment. By correlating video signals from Vision with Point-of-Sale (POS) data and staffing levels, Workflow ensures consistent execution across multiple locations.
“Palona Vision is like giving every location a digital GM,” said Shaz Khan, founder of Tono Pizzeria + Cheesesteaks, in a press release provided to VentureBeat. “It flags issues before they escalate and saves me hours every week.”
Palona’s journey began with a star-studded roster. CEO Zhang previously served as VP of Engineering at Google and CTO of Tinder, while co-founder Howes is the co-inventor of LDAP and a former Netscape CTO.
Despite this pedigree, the team’s first year was a lesson in the necessity of focus.
Initially, Palona served fashion and electronics brands, creating "wizard" and "surfer dude" personalities to handle sales. However, the team quickly realized that the restaurant industry presented a unique, trillion-dollar opportunity that was "surprisingly recession-proof" but "gobsmacked" by operational inefficiency.
"Advice to startup founders: don't go multi-industry," Zhang warned.
By verticalizing, Palona moved from being a "thin" chat layer to building a "multi-sensory information pipeline" that processes vision, voice, and text in tandem.
That clarity of focus opened access to proprietary training data (like prep playbooks and call transcripts) while avoiding generic data scraping.
1. Building on ‘Shifting Sand’
To accommodate the reality of enterprise AI deployments in 2025 — with new, improved models coming out on a nearly weekly basis — Palona developed a patent-pending orchestration layer.
Rather than being "bundled" with a single provider like OpenAI or Google, Palona’s architecture allows them to swap models on a dime based on performance and cost.
They use a mix of proprietary and open-source models, including Gemini for computer vision benchmarks and specific language models for Spanish or Chinese fluency.
For builders, the message is clear: Never let your product's core value be a single-vendor dependency.
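Palona has not published its orchestration layer, but the underlying pattern is simple to illustrate. Below is a minimal, hypothetical sketch of vendor-agnostic model routing; the task types, quality scores, and per-provider call wrappers are invented for illustration and are not Palona's actual code:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModelProfile:
    name: str                      # e.g. "hosted-vision-model", "open-weights-es"
    cost_per_1k_tokens: float      # blended input/output cost
    quality_score: float           # internal eval score for this task type
    call: Callable[[str], str]     # provider-specific invocation, wrapped once

class ModelRouter:
    """Route each request to the cheapest model that clears a quality bar.

    Because the registry is data, swapping vendors means editing a table,
    not rewriting product code -- the point of an orchestration layer.
    """

    def __init__(self, registry: dict[str, list[ModelProfile]]):
        self.registry = registry   # task type -> candidate models

    def route(self, task_type: str, prompt: str, min_quality: float = 0.8) -> str:
        candidates = [m for m in self.registry[task_type]
                      if m.quality_score >= min_quality]
        best = min(candidates, key=lambda m: m.cost_per_1k_tokens)
        return best.call(prompt)
```

Re-benchmarking the registry whenever a new model ships keeps the quality scores current, which is what makes building on "shifting sand" tolerable.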
2. From Words to ‘World Models’
The launch of Palona Vision represents a shift from understanding words to understanding the physical reality of a kitchen.
While many developers struggle to stitch separate APIs together, Palona’s new vision model transforms existing in-store cameras into operational assistants.
The system identifies "cause and effect" in real-time—recognizing if a pizza is undercooked by its "pale beige" color or alerting a manager if a display case is empty.
"In words, physics don't matter," Zhang explained. "But in reality, I drop the phone, it always goes down... we want to really figure out what's going on in this world of restaurants".
3. The ‘Muffin’ Solution: Custom Memory Architecture
One of the most significant technical hurdles Palona faced was memory management. In a restaurant context, memory is the difference between a frustrating interaction and a "magical" one where the agent remembers a diner’s "usual" order.
The team initially utilized an unspecified open-source tool, but found it produced errors 30% of the time. "I think advisory developers always turn off memory [on consumer AI products], because that will guarantee to mess everything up," Zhang cautioned.
To solve this, Palona built Muffin, a proprietary memory management system named as a nod to web "cookies". Unlike standard vector-based approaches that struggle with structured data, Muffin is architected to handle four distinct layers:
Structured Data: Stable facts like delivery addresses or allergy information.
Slow-changing Dimensions: Loyalty preferences and favorite items.
Transient and Seasonal Memories: Adapting to shifts like preferring cold drinks in July versus hot cocoa in winter.
Regional Context: Defaults like time zones or language preferences.
The lesson for builders: If the best available tool isn't good enough for your specific vertical, you must be willing to build your own.
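Palona hasn't published Muffin's internals, but the four layers map naturally onto a typed record with a different update and expiry policy per layer. A minimal sketch, with field names invented for illustration:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class StructuredFacts:          # stable, rarely-changing facts
    delivery_address: str | None = None
    allergies: list[str] = field(default_factory=list)

@dataclass
class SlowChangingDimensions:   # loyalty signals and favorite items
    favorite_items: list[str] = field(default_factory=list)
    loyalty_tier: str = "standard"

@dataclass
class TransientMemory:          # seasonal or short-lived preferences
    note: str
    expires_at: datetime        # an "iced drinks" entry lapses after summer

@dataclass
class RegionalContext:          # environment defaults, not personal data
    time_zone: str = "America/Los_Angeles"
    language: str = "en"

@dataclass
class DinerMemory:
    """One guest's memory record, split by how fast each layer changes."""
    facts: StructuredFacts = field(default_factory=StructuredFacts)
    dimensions: SlowChangingDimensions = field(default_factory=SlowChangingDimensions)
    transient: list[TransientMemory] = field(default_factory=list)
    region: RegionalContext = field(default_factory=RegionalContext)
```

The useful property is that each layer can be validated and expired on its own schedule, rather than treating everything as one undifferentiated vector store.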
4. Reliability through ‘GRACE’
In a kitchen, an AI error isn't just a typo; it’s a wasted order or a safety risk. A recent incident at Stefanina’s Pizzeria in Missouri, where an AI hallucinated fake deals during a dinner rush, highlights how quickly brand trust can evaporate when safeguards are absent.
To prevent such chaos, Palona’s engineers follow its internal GRACE framework:
Guardrails: Hard limits on agent behavior to prevent unapproved promotions.
Red Teaming: Proactive attempts to "break" the AI and identify potential hallucination triggers.
App Sec: Locking down APIs and third-party integrations with TLS, tokenization, and attack-prevention systems.
Compliance: Grounding every response in verified, vetted menu data to ensure accuracy.
Escalation: Routing complex interactions to a human manager before a guest receives misinformation.
This reliability is verified through massive simulation. "We simulated a million ways to order pizza," Zhang said, using one AI to act as a customer and another to take the order, measuring accuracy to eliminate hallucinations.
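The simulation harness itself hasn't been released; the sketch below only outlines the shape of such an AI-versus-AI evaluation loop, with customer_agent and order_agent standing in for whichever models play each role:

```python
def simulate_order(customer_agent, order_agent, menu: dict[str, float]) -> bool:
    """Run one synthetic ordering conversation and check the result.

    customer_agent(menu) -> (utterances, intended_order)   # model playing the diner
    order_agent(utterances) -> captured_order              # model taking the order
    Returns True when the captured order matches the diner's intent exactly.
    """
    utterances, intended = customer_agent(menu)
    captured = order_agent(utterances)
    return captured == intended

def run_suite(customer_agent, order_agent, menu: dict[str, float], n: int = 1_000_000) -> float:
    """Measure order-capture accuracy over n simulated customers."""
    correct = sum(simulate_order(customer_agent, order_agent, menu) for _ in range(n))
    return correct / n  # anything below ~1.0 flags hallucination or capture bugs
```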
With the launch of Vision and Workflow, Palona is betting that the future of enterprise AI isn't in broad assistants, but in specialized "operating systems" that can see, hear, and think within a specific domain.
In contrast to general-purpose AI agents, Palona’s system is designed to execute restaurant workflows, not just respond to queries. It can remember customers, hear them order their "usual," and monitor operations to make sure that food is delivered according to the restaurant's internal processes and guidelines, flagging when something goes wrong or, crucially, is about to.
For Zhang, the goal is to let human operators focus on their craft: "If you've got that delicious food nailed... we’ll tell you what to do."
Anthropic said on Wednesday it would release its Agent Skills technology as an open standard, a strategic bet that sharing its approach to making AI assistants more capable will cement the company's position in the fast-evolving enterprise software market.
The San Francisco-based artificial intelligence company also unveiled organization-wide management tools for enterprise customers and a directory of partner-built skills from companies including Atlassian, Figma, Canva, Stripe, Notion, and Zapier.
The moves mark a significant expansion of a technology Anthropic first introduced in October, transforming what began as a niche developer feature into infrastructure that now appears poised to become an industry standard.
"We're launching Agent Skills as an independent open standard with a specification and reference SDK available at https://agentskills.io," Mahesh Murag, a product manager at Anthropic, said in an interview with VentureBeat. "Microsoft has already adopted Agent Skills within VS Code and GitHub; so have popular coding agents like Cursor, Goose, Amp, OpenCode, and more. We're in active conversations with others across the ecosystem."
Skills are, at their core, folders containing instructions, scripts, and resources that tell AI systems how to perform specific tasks consistently. Rather than requiring users to craft elaborate prompts each time they want an AI assistant to complete a specialized task, skills package that procedural knowledge into reusable modules.
The concept addresses a fundamental limitation of large language models: while they possess broad general knowledge, they often lack the specific procedural expertise needed for specialized professional work. A skill for creating PowerPoint presentations, for instance, might include preferred formatting conventions, slide structure guidelines, and quality standards — information the AI loads only when working on presentations.
Anthropic designed the system around what it calls "progressive disclosure." Each skill takes only a few dozen tokens when summarized in the AI's context window, with full details loading only when the task requires them. This architectural choice allows organizations to deploy extensive skill libraries without overwhelming the AI's working memory.
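The full specification lives at agentskills.io; the sketch below illustrates only the progressive-disclosure idea in generic Python, assuming each skill is a folder whose SKILL.md carries name and description lines that supply the always-in-context summary. The parsing here is deliberately simplified and is not the reference SDK:

```python
from pathlib import Path

def skill_summary(skill_dir: Path) -> str:
    """Return the few-dozen-token summary that sits in the model's context
    for every installed skill (name plus a one-line description)."""
    lines = (skill_dir / "SKILL.md").read_text().splitlines()
    name = next(l for l in lines if l.lower().startswith("name:")).split(":", 1)[1].strip()
    desc = next(l for l in lines if l.lower().startswith("description:")).split(":", 1)[1].strip()
    return f"- {name}: {desc}"

def load_full_skill(skill_dir: Path) -> str:
    """Load the complete instructions only when the task actually needs them."""
    return (skill_dir / "SKILL.md").read_text()

def build_context(skills_root: Path, active_skill: str | None) -> str:
    summaries = "\n".join(
        skill_summary(d) for d in sorted(skills_root.iterdir()) if d.is_dir()
    )
    context = "Available skills:\n" + summaries
    if active_skill:
        context += "\n\n" + load_full_skill(skills_root / active_skill)
    return context
```

The point of the pattern is that a large skill library costs almost nothing in working memory until a specific skill is actually invoked.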
The new enterprise management features allow administrators on Anthropic's Team and Enterprise plans to provision skills centrally, controlling which workflows are available across their organizations while letting individual employees customize their experience.
"Enterprise customers are using skills in production across both coding workflows and business functions like legal, finance, accounting, and data science," Murag said. "The feedback has been positive because skills let them personalize Claude to how they actually work and get to high-quality output faster."
The community response has exceeded expectations, according to Murag: "Our skills repository already crossed 20k stars on GitHub, with tens of thousands of community-created and shared skills."
Anthropic is launching with skills from ten partners, a roster that reads like a who's who of modern enterprise software. The presence of Atlassian, which makes Jira and Confluence, alongside design tools Figma and Canva, payment infrastructure company Stripe, and automation platform Zapier suggests Anthropic is positioning Skills as connective tissue between Claude and the applications businesses already use.
The business arrangements with these partners focus on ecosystem development rather than immediate revenue generation.
"Partners who build skills for the directory do so to enhance how Claude works with their platforms. It's a mutually beneficial ecosystem relationship similar to MCP connector partnerships," Murag explained. "There are no revenue-sharing arrangements at this time."
For vetting new partners, Anthropic is taking a measured approach. "We began with established partners and are developing more formal criteria as we expand," Murag said. "We want to create a valuable supply of skills for enterprises while helping partner products shine."
Notably, Anthropic is not charging extra for the capability. "Skills work across all Claude surfaces: Claude.ai, Claude Code, the Claude Agent SDK, and the API. They're included in Max, Pro, Team, and Enterprise plans at no additional cost. API usage follows standard API pricing," Murag said.
The decision to release Skills as an open standard is a calculated strategic choice. By making skills portable across AI platforms, Anthropic is betting that ecosystem growth will benefit the company more than proprietary lock-in would.
The strategy appears to be working. OpenAI has quietly adopted structurally identical architecture in both ChatGPT and its Codex CLI tool. Developer Elias Judin discovered the implementation earlier this month, finding directories containing skill files that mirror Anthropic's specification—the same file naming conventions, the same metadata format, the same directory organization.
This convergence suggests the industry has found a common answer to a vexing question: how do you make AI assistants consistently good at specialized work without expensive model fine-tuning?
The timing aligns with broader standardization efforts in the AI industry. Anthropic donated its Model Context Protocol to the Linux Foundation on December 9, and both Anthropic and OpenAI co-founded the Agentic AI Foundation alongside Block. Google, Microsoft, and Amazon Web Services joined as members. The foundation will steward multiple open specifications, and Skills fit naturally into this standardization push.
"We've also seen how complementary skills and MCP servers are," Murag noted. "MCP provides secure connectivity to external software and data, while skills provide the procedural knowledge for using those tools effectively. Partners who've invested in strong MCP integrations were a natural starting point."
The Skills approach is a philosophical shift in how the AI industry thinks about making AI assistants more capable. The traditional approach involved building specialized agents for different use cases — a customer service agent, a coding agent, a research agent. Skills suggest a different model: one general-purpose agent equipped with a library of specialized capabilities.
"We used to think agents in different domains will look very different," Barry Zhang, an Anthropic researcher, said at an industry conference last month, according to a Business Insider report. "The agent underneath is actually more universal than we thought."
This insight has significant implications for enterprise software development. Rather than building and maintaining multiple specialized AI systems, organizations can invest in creating and curating skills that encode their institutional knowledge and best practices.
Anthropic's own internal research supports this approach. A study the company published in early December found that its engineers used Claude in 60% of their work, achieving a 50% self-reported productivity boost—a two to threefold increase from the prior year. Notably, 27% of Claude-assisted work consisted of tasks that would not have been done otherwise, including building internal tools, creating documentation, and addressing what employees called "papercuts" — small quality-of-life improvements that had been perpetually deprioritized.
The Skills framework is not without potential complications. As AI systems become more capable through skills, questions arise about maintaining human expertise. Anthropic's internal research found that while skills enabled engineers to work across more domains—backend developers building user interfaces, researchers creating data visualizations—some employees worried about skill atrophy.
"When producing output is so easy and fast, it gets harder and harder to actually take the time to learn something," one Anthropic engineer said in the company's internal survey.
There are also security considerations. Skills provide Claude with new capabilities through instructions and code, which means malicious skills could theoretically introduce vulnerabilities. Anthropic recommends installing skills only from trusted sources and thoroughly auditing those from less-trusted origins.
The open standard approach introduces governance questions as well. While Anthropic has published the specification and launched a reference SDK, the long-term stewardship of the standard remains undefined. Whether it will fall under the Agentic AI Foundation or require its own governance structure is an open question.
The trajectory of Skills reveals something important about Anthropic's ambitions. Two months ago, the company introduced a feature that looked like a developer tool. Today, that feature has become a specification that Microsoft builds into VS Code, that OpenAI replicates in ChatGPT, and that enterprise software giants race to support.
The pattern echoes strategies that have reshaped the technology industry before. Companies from Red Hat to Google have discovered that open standards can be more valuable than proprietary technology — that the company defining how an industry works often captures more value than the company trying to own it outright.
For enterprise technology leaders evaluating AI investments, the message is straightforward: skills are becoming infrastructure. The expertise organizations encode into skills today will determine how effectively their AI assistants perform tomorrow, regardless of which model powers them.
The competitive battles between Anthropic, OpenAI, and Google will continue. But on the question of how to make AI assistants reliably good at specialized work, the industry has quietly converged on an answer — and it came from the company that gave it away.
Enterprises can now harness a large language model whose performance approaches that of Google’s state-of-the-art Gemini 3 Pro, but at a fraction of the cost and with greater speed, thanks to the newly released Gemini 3 Flash.
The model joins the flagship Gemini 3 Pro, Gemini 3 Deep Think, and Gemini Agent, all of which were announced and released last month.
Gemini 3 Flash, now available on Gemini Enterprise, Google Antigravity, Gemini CLI, AI Studio, and on preview in Vertex AI, processes information in near real-time and helps build quick, responsive agentic applications.
The company said in a blog post that Gemini 3 Flash “builds on the model series that developers and enterprises already love, optimized for high-frequency workflows that demand speed, without sacrificing quality.”
The model is also the default for AI Mode on Google Search and the Gemini application.
Tulsee Doshi, senior director, product management on the Gemini team, said in a separate blog post that the model “demonstrates that speed and scale don’t have to come at the cost of intelligence.”
“Gemini 3 Flash is made for iterative development, offering Gemini 3’s Pro-grade coding performance with low latency — it’s able to reason and solve tasks quickly in high-frequency workflows,” Doshi said. “It strikes an ideal balance for agentic coding, production-ready systems and responsive interactive applications.”
Early adoption by specialized firms points to the model's reliability in high-stakes fields. Harvey, an AI platform for law firms, reported a 7% jump in reasoning on their internal 'BigLaw Bench,' while Resemble AI discovered that Gemini 3 Flash could process complex forensic data for deepfake detection 4x faster than Gemini 2.5 Pro. These aren't just speed gains; they are enabling 'near real-time' workflows that were previously impossible.
Enterprise AI builders have become more aware of the cost of running AI models, especially as they try to convince stakeholders to put more budget into agentic workflows that run on expensive models. Organizations have turned to smaller or distilled models, focusing on open models or other research and prompting techniques to help manage bloated AI costs.
For enterprises, the biggest value proposition for Gemini 3 Flash is that it offers the same level of advanced multimodal capabilities, such as complex video analysis and data extraction, as its larger Gemini counterparts, but is far faster and cheaper.
While Google’s internal materials highlight a 3x speed increase over the 2.5 Pro series, data from independent benchmarking firm Artificial Analysis adds a layer of crucial nuance.
In the latter organization's pre-release testing, Gemini 3 Flash Preview recorded a raw throughput of 218 output tokens per second. This makes it 22% slower than the previous 'non-reasoning' Gemini 2.5 Flash, but it is still significantly faster than rivals including OpenAI's GPT-5.1 high (125 t/s) and DeepSeek V3.2 reasoning (30 t/s).
Most notably, Artificial Analysis crowned Gemini 3 Flash as the new leader in their AA-Omniscience knowledge benchmark, where it achieved the highest knowledge accuracy of any model tested to date. However, this intelligence comes with a 'reasoning tax': the model more than doubles its token usage compared to the 2.5 Flash series when tackling complex indexes.
This high token density is offset by Google's aggressive pricing: when accessed through the Gemini API, Gemini 3 Flash costs $0.50 per 1 million input tokens (compared to $1.25/1M for Gemini 2.5 Pro) and $3/1M output tokens (compared to $10/1M for Gemini 2.5 Pro). This allows Gemini 3 Flash to claim the title of the most cost-efficient model for its intelligence tier, despite being one of the most 'talkative' models in terms of raw token volume. Here's how it stacks up to rival LLM offerings:
| Model | Input (/1M) | Output (/1M) | Total Cost |
| --- | --- | --- | --- |
| Qwen 3 Turbo | $0.05 | $0.20 | $0.25 |
| Grok 4.1 Fast (reasoning) | $0.20 | $0.50 | $0.70 |
| Grok 4.1 Fast (non-reasoning) | $0.20 | $0.50 | $0.70 |
| deepseek-chat (V3.2-Exp) | $0.28 | $0.42 | $0.70 |
| deepseek-reasoner (V3.2-Exp) | $0.28 | $0.42 | $0.70 |
| Qwen 3 Plus | $0.40 | $1.20 | $1.60 |
| ERNIE 5.0 | $0.85 | $3.40 | $4.25 |
| Gemini 3 Flash Preview | $0.50 | $3.00 | $3.50 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 |
| Qwen-Max | $1.60 | $6.40 | $8.00 |
| Gemini 3 Pro (≤200K) | $2.00 | $12.00 | $14.00 |
| GPT-5.2 | $1.75 | $14.00 | $15.75 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $18.00 |
| Gemini 3 Pro (>200K) | $4.00 | $18.00 | $22.00 |
| Claude Opus 4.5 | $5.00 | $25.00 | $30.00 |
| GPT-5.2 Pro | $21.00 | $168.00 | $189.00 |
But enterprise developers and users can cut costs further, because the model avoids the runaway "thinking" overhead that larger models often rack up on simple prompts. Google said the model “is able to modulate how much it thinks,” spending more thinking, and therefore more tokens, on complex tasks than on quick prompts. The company noted Gemini 3 Flash uses 30% fewer tokens than Gemini 2.5 Pro.
To balance this new reasoning power with strict corporate latency requirements, Google has introduced a 'Thinking Level' parameter. Developers can toggle between 'Low,' to minimize cost and latency for simple chat tasks, and 'High,' to maximize reasoning depth for complex data extraction. This granular control allows teams to build 'variable-speed' applications that only consume expensive 'thinking tokens' when a problem actually demands PhD-level logic.
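At the API level, a minimal sketch with the google-genai Python SDK might look like the following; the model ID and the thinking_level field are written as described in Google's announcement and should be checked against the current SDK documentation:

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

def ask(prompt: str, complex_task: bool) -> str:
    """Spend 'thinking tokens' only when the task warrants it."""
    config = types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_level="high" if complex_task else "low"
        )
    )
    response = client.models.generate_content(
        model="gemini-3-flash-preview",  # assumed model ID at time of writing
        contents=prompt,
        config=config,
    )
    return response.text

print(ask("Summarize this receipt in one line.", complex_task=False))
```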
The economic story extends beyond simple token prices. With the standard inclusion of Context Caching, enterprises processing massive, static datasets—such as entire legal libraries or codebase repositories—can see a 90% reduction in costs for repeated queries. When combined with the Batch API’s 50% discount, the total cost of ownership for a Gemini-powered agent drops significantly below that of competing frontier models.
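To see how those discounts interact with the list prices above, here is a back-of-the-envelope cost model; the 90% caching and 50% batch reductions are applied as flat discounts for simplicity, and actual billing terms may differ:

```python
# Rough monthly cost model for a Gemini 3 Flash agent workload.
INPUT_PER_M, OUTPUT_PER_M = 0.50, 3.00  # $ per 1M tokens (Gemini API list price)

def monthly_cost(input_m: float, output_m: float,
                 cached_fraction: float = 0.0, batch: bool = False) -> float:
    """input_m / output_m are millions of tokens per month."""
    cached = input_m * cached_fraction
    fresh = input_m - cached
    cost = fresh * INPUT_PER_M + cached * INPUT_PER_M * 0.10 + output_m * OUTPUT_PER_M
    return cost * 0.5 if batch else cost  # Batch API: 50% discount

# Example: 500M input tokens (80% hitting a cached legal library), 50M output tokens.
print(monthly_cost(500, 50))                                    # no discounts: 400.0
print(monthly_cost(500, 50, cached_fraction=0.8))               # with caching: 220.0
print(monthly_cost(500, 50, cached_fraction=0.8, batch=True))   # caching + batch: 110.0
```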
“Gemini 3 Flash delivers exceptional performance on coding and agentic tasks combined with a lower price point, allowing teams to deploy sophisticated reasoning across high-volume processes without hitting cost barriers,” Google said.
By offering a model that delivers strong multimodal performance at a more affordable price, Google is making the case that enterprises concerned with controlling their AI spend should choose its models, especially Gemini 3 Flash.
But how does Gemini 3 Flash stack up against other models in terms of its performance?
Doshi said the model achieved a score of 78% on the SWE-Bench Verified benchmark for coding agents, outperforming both the preceding Gemini 2.5 family and even the newer Gemini 3 Pro.
For enterprises, this means high-volume software maintenance and bug-fixing tasks can now be offloaded to a model that is both faster and cheaper than previous flagship models, without a degradation in code quality.
The model also performed strongly on other benchmarks, scoring 81.2% on the MMMU Pro benchmark, comparable to Gemini 3 Pro.
While most Flash-type models are explicitly optimized for short, quick tasks like generating code, Google claims Gemini 3 Flash’s performance “in reasoning, tool use and multimodal capabilities is ideal for developers looking to do more complex video analysis, data extraction and visual Q&A, which means it can enable more intelligent applications — like in-game assistants or A/B test experiments — that demand both quick answers and deep reasoning.”
So far, early users have been largely impressed with the model, particularly its benchmark performance.
With Gemini 3 Flash now serving as the default engine across Google Search and the Gemini app, we are witnessing the "Flash-ification" of frontier intelligence. By making Pro-level reasoning the new baseline, Google is setting a trap for slower incumbents.
The integration into platforms like Google Antigravity suggests that Google isn't just selling a model; it's selling the infrastructure for the autonomous enterprise.
As developers hit the ground running with 3x faster speeds and a 90% discount on context caching, the "Gemini-first" strategy becomes a compelling financial argument. In the high-velocity race for AI dominance, Gemini 3 Flash may be the model that finally turns "vibe coding" from an experimental hobby into a production-ready reality.
Presented by T-Mobile for Business
Small and mid-sized businesses are adopting AI at a pace that would have seemed unrealistic even a few years ago. Smart assistants that greet customers, predictive tools that flag inventory shortages before they happen, and on-site analytics that help staff make decisions faster — these used to be features of the enterprise. Now they’re being deployed in retail storefronts, regional medical clinics, branch offices, and remote operations hubs.
What’s changed is not just the AI itself, but where it runs. Increasingly, AI workloads are being pushed out of centralized data centers and into the real world — into the places where employees work and customers interact. This shift to the edge promises faster insights and more resilient operations, but it also transforms the demands placed on the network. Edge sites need consistent bandwidth, real-time data pathways, and the ability to process information locally rather than relying on the cloud for every decision.
The catch is that as companies race to connect these locations, security often lags behind. A store may adopt AI-enabled cameras or sensors long before it has the policies to manage them. A clinic may roll out mobile diagnostic devices without fully segmenting their traffic. A warehouse may rely on a mix of Wi-Fi, wired, and cellular connections that weren’t designed to support AI-driven operations. When connectivity scales faster than security, it creates cracks — unmonitored devices, inconsistent access controls, and unsegmented data flows that make it hard to see what’s happening, let alone protect it.
Edge AI only delivers its full value when connectivity and security evolve together.
Businesses are shifting AI to the edge for three core reasons:
Real-time responsiveness: Some decisions can’t wait for a round trip to the cloud. Whether it’s identifying an item on a shelf, detecting an abnormal reading from a medical device, or recognizing a safety risk in a warehouse aisle, the delay introduced by centralized processing can mean missed opportunities or slow reactions.
Resilience and privacy: Keeping data and inference local makes operations less vulnerable to outages or latency spikes, and it reduces the flow of sensitive information across networks. This helps SMBs meet data sovereignty and compliance requirements without rewriting their entire infrastructure.
Mobility and deployment speed: Many SMBs operate across distributed footprints — remote workers, pop-up locations, seasonal operations, or mobile teams. Wireless-first connectivity, including 5G business lines, lets them deploy AI tools quickly without waiting for fixed circuits or expensive buildouts.
Technologies like Edge Control from T-Mobile for Business fit naturally into this model. By routing traffic directly along the paths it needs — keeping latency-sensitive workloads local and bypassing the bottlenecks that traditional VPNs introduce — businesses can adopt edge AI without dragging their network into constant contention.
Yet the shift introduces new risk. Every edge site becomes, in effect, its own small data center. A retail store may have cameras, sensors, POS systems, digital signage, and staff devices all sharing the same access point. A clinic may run diagnostic tools, tablets, wearables, and video consult systems side by side. A manufacturing floor might combine robotics, sensors, handheld scanners, and on-site analytics platforms.
This diversity increases the attack surface dramatically. Many SMBs roll out connectivity first, then add piecemeal security later — leaving the blind spots attackers rely on.
When AI is distributed across dozens or hundreds of sites, the old idea of a single secure “inside” network breaks down. Every store, clinic, kiosk, or field location becomes its own micro-environment — and every device within it becomes its own potential entry point.
Zero trust offers a framework to make this manageable.
At the edge, zero trust means:
Verifying identity rather than location — access is granted because a user or device proves who it is, not because it sits behind a corporate firewall.
Continuous authentication — trust isn’t permanent; it’s re-evaluated throughout a session.
Segmentation that limits movement — if something goes wrong, attackers can’t jump freely from system to system.
This approach is especially critical given that many edge devices can’t run traditional security clients. SIM-based identity and secure mobile connectivity — areas where T-Mobile for Business brings significant strength — help verify IoT devices, 5G routers, and sensors that otherwise sit outside the visibility of IT teams.
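Reduced to code, the three principles amount to a simple, continuously re-evaluated access decision. A toy sketch, in which the device registry, the SIM-attestation flag, and the segment names are all invented for illustration:

```python
from datetime import datetime, timedelta

SESSION_RECHECK = timedelta(minutes=5)

def allowed(device_id: str, segment: str, registry: dict, now: datetime) -> bool:
    """Zero-trust check: identity + posture + segment, re-evaluated continuously."""
    device = registry.get(device_id)
    if device is None:                      # identity, not network location
        return False
    if not device["sim_attested"]:          # e.g. SIM-layer attestation for IoT
        return False
    if segment not in device["segments"]:   # segmentation limits lateral movement
        return False
    # Trust is not permanent: a stale check forces re-verification.
    return now - device["last_verified"] <= SESSION_RECHECK

registry = {
    "camera-07": {"sim_attested": True,
                  "segments": ["video-analytics"],
                  "last_verified": datetime.now()},
}
print(allowed("camera-07", "video-analytics", registry, datetime.now()))  # True
print(allowed("camera-07", "pos-payments", registry, datetime.now()))     # False
```

Pushing checks like these into the connectivity layer itself is what makes them enforceable for devices that cannot run a security agent.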
This is why connectivity providers are increasingly combining networking and security into a single approach. T-Mobile for Business embeds segmentation, device visibility, and zero-trust safeguards directly into its wireless-first connectivity offerings, reducing the need for SMBs to stitch together multiple tools.
A major architectural shift is underway: networks that assume every device, session, and workload must be authenticated, segmented, and monitored from the start. Instead of building security on top of connectivity, the two are fused.
T-Mobile for Business shows how this is evolving. Its SASE platform, powered by Palo Alto Networks Prisma SASE 5G, blends secure access with connectivity into one cloud-delivered service. Private Access gives users the least-privileged access they need, nothing more. T-SIMsecure authenticates devices at the SIM layer, allowing IoT sensors and 5G routers to be verified automatically. Security Slice isolates sensitive SASE traffic on a dedicated portion of the 5G network, ensuring consistency even during heavy demand.
A unified dashboard like T-Platform brings it together, offering real-time visibility across SASE, IoT, business internet, and edge control — simplifying operations for SMBs with limited staff.
As AI models become more dynamic and autonomous, we’ll see the relationship flip: the edge won’t just support AI; AI will actively run and secure the edge — optimizing traffic paths, adjusting segmentation automatically, and spotting anomalies that matter to one specific store or site.
Self-healing networks and adaptive policy engines will move from experimental to expected.
For SMBs, this is a pivotal moment. The organizations that modernize their connectivity and security foundations now will be the ones best positioned to scale AI everywhere — safely, confidently, and without unnecessary complexity.
Partners like T-Mobile for Business are already moving in this direction, giving SMBs a way to deploy AI at the edge without sacrificing control or visibility.
Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.
Presented by SAP
In an era where anyone can spin up an LLM, the real differentiator isn’t the AI technology itself, but the institutional knowledge it’s grounded in. Internal and partner consultants leading operational transformation can’t risk hallucinated guidance when their recommendations impact integrated processes across supply chain, manufacturing, finance, and other core functions.
"Grounded AI is non-negotiable, because accuracy isn’t optional when we’re doing million-dollar transformation projects within the SAP ecosystem, for example," says Natalie Han, VP and chief product officer, gen AI at SAP Business AI. "Retrieval-augmented generation technology, and the ability to anchor responses in trusted enterprise knowledge, helps ensure accurate code interpretation, best-practice guidance, and clean-core decision support. It's how we bring real trust into AI-powered consulting."
A fully grounded AI assistant like SAP Joule for Consultants has tremendous value in production use cases, she adds. SAP Joule has terabytes of institutional data that's continuously curated and updated, so a consultant is assured they're getting up-to-the-minute SAP best practices and methodologies when relying on Joule, while at the same time accelerating project delivery.
"We’re saving rework time by 14%, and saving consultants 1.5 hours per day per user, which is huge when you consider how expensive consultants are now," Han says. "Early adopters like Wipro have estimated they've saved 7 million hours on a manual basis for their consultants."
SAP Joule is as certified as any consultant, says Sachin Kaura, chief architect, SAP Business AI. The tool was born in 2023, when GPT-4 famously passed a simulated bar exam and ignited buzz around the ability of LLMs to handle large amounts of context. The SAP ecosystem, along with its associated domain ontology and taxonomy, is famously vast and complex to navigate. The question became: how could an AI co-pilot grounded within the SAP ecosystem itself help consultants navigate that complexity?
Kaura began experimenting with frontier LLMs by putting them through the same certification exams SAP consultants take. The early results were poor, but after extensive context tuning and a focus on delivering value to the partner ecosystem, Joule now consistently scores 95% or higher.
"Not only were we testing from a data perspective, but we were able to work with all of our consultants to get what we call the golden data set," Han added. "It’s non-deterministic, language-based, and thoroughly grounded in human consultant expertise. We partnered with the whole consulting organization to manually label the golden data set across all of the products. That’s become the foundation for everything we do even now."
Joule for Consultants stays up-to-date in real time. A state-of-the-art indexing pipeline pushes new SAP documentation and release content into the model as soon as it’s published, giving consultants confidence that every answer reflects the most current guidance.
"This is pure engineering work done by our data scientists and engineers, using a lot of underlying SAP technology," Kaura explains. "We leverage the SAP business foundation layer, document grounding services, and a lot of purpose-built systems to stay on top of current events in the system."
SAP Business AI also has board-level alignment, ensuring this isn’t just a one-team effort but a company-wide priority. They’ve built strong internal partnerships with content owners across SAP — including SAP Learning, SAP Community, SAP Help, product teams, and consultant teams. Together, they continuously update proprietary content such as SAP Notes, Knowledge Base Articles (KBAs), and other domain-specific guidance that reflects SAP’s evolving best practices.
All of this means Joule for Consultants can take that continuously refreshed data and deliver answers in near real time. It's the kind of research that would otherwise take a consultant hours. But information pulled directly from the source gives consultants the most current and authoritative guidance available, helping eliminate the early-stage missteps that can derail a project months later when scoping wasn’t aligned with the latest capabilities.
SAP is building a product that is relevant, reliable, and responsible, Han says. As a company founded in Europe, it takes data privacy seriously, adhering to GDPR and other EU regulations. At the core of SAP Business AI is the AI Foundation, the AI operating system that governs AI with built-in security, ethics, and orchestration, using automation and intelligence to manage lifecycles, optimize resources, and boost resilience.
All the LLMs SAP and its customers use operate within the AI foundation, which protects private and proprietary data from being leaked. Beyond data protection, SAP treats bias, ethics, and security at an enterprise level as well, with humans in the loop to run checks and balances.
"We have an enterprise-grade security framework as well as prompt injection and guardrail testing," Kaura says. "The orchestration layer, built within the AI Foundation, anonymizes inputs as well as moderates them to prevent malicious content. That ensures that the output we give to our customers is relevant to the SAP ecosystem, relevant to the domain they’re asking about, and not just generic LLM excess. This set of tools, from the framework layer to the application layer to the product standards, and also the very thorough testing is critical to securing our product. Then and only then can it reach our customers and partners."
"We’re barely scratching the surface of what LLMs and agentic AI can offer," Han says. "Accessing knowledge is just the beginning. We’re going to have a much deeper understanding of customers’ SAP systems and be able to help them implement and transform their journey. The product team and our engineers are working to make the tool more transformative, able to unearth more insights, connect with customers’ systems, and understand and optimize their processes, including generating code and handling customer code migration."
The next step is adding a second layer of grounding. SAP’s customer base is vast, and its partner ecosystem has implemented countless business scenarios. Grounding Joule in SAP’s institutional knowledge was the first milestone; the next is layering in each customer’s own proprietary context — historical system data, process designs, implementation blueprints, and internal documentation. This turns Joule from SAP-aware to customer-aware, delivering guidance that aligns with how a business actually operates.
“Think of it as grounding your knowledge on top of SAP knowledge — giving you more accurate and relevant guidance,” Kaura says. “Information that might otherwise be lost can sit on top of Joule for Consultants. Our system processes it and ensures it comes to you in the right manner and at the right time.”
This expanded grounding also lets Joule adjust its guidance to the consultant’s role — whether they’re working as an architect, a functional consultant, or a technical consultant.
"We deliver the information they need for a particular customer configuration," Han explains. "Then we can not only answer generic questions, but we can answer their particular configuration. From there it’s one step ahead to generating more insights and taking more actions."
Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.