Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Particularly in this emerging era of generative AI, cloud costs are at an all-time high. But that is not only because companies are using more compute; they are also not using it efficiently. In fact, just this year, companies are expected to waste $44.5 billion on unnecessary cloud spending.
This problem is amplified for Akamai Technologies: The company runs a large, complex cloud infrastructure across multiple clouds, not to mention numerous strict security requirements.
To resolve this, the cybersecurity and content delivery provider turned to the Kubernetes automation platform Cast AI, whose AI agents help optimize cost, security and speed across cloud environments.
Ultimately, the platform helped Akamai cut its cloud costs by 40% to 70%, depending on the workload.
“We needed a continuous way to optimize our infrastructure and reduce our cloud costs without sacrificing performance,” Dekel Shavit, senior director of cloud engineering at Akamai, told VentureBeat. “We are the ones processing security events. Delay is not an option. If we are unable to respond to a security attack in real time, we have failed.”
Kubernetes manages the infrastructure that runs applications, making them easier to deploy, scale and manage, particularly in cloud-native and microservices architectures.
Cast AI has integrated into the Kubernetes ecosystem to help customers scale their clusters and workloads, select the best infrastructure and manage compute lifecycles, explained founder and CEO Laurent Gil. Its core platform is Application Performance Automation (APA), which operates through a team of specialized agents that monitor, analyze and take action to improve application performance, security, efficiency and cost. Companies provision only the compute they need from AWS, Microsoft, Google or others.
APA is powered by several machine learning (ML) models with reinforcement learning (RL) based on historical data and learned patterns, enhanced by an observability stack and heuristics. It is coupled with infrastructure-as-code (IaC) tools on several clouds, making it a completely automated platform.
Gil explained that APA was built on the principle that observability is only a starting point; as he put it, observability is “the foundation, not the goal.” Cast AI also supports incremental adoption, so customers do not have to rip and replace; they can integrate it into existing tools and workflows. In addition, nothing ever leaves the customer's infrastructure; all analysis and actions occur within their dedicated Kubernetes clusters, offering more security and control.
Gil also stressed the importance of human-centricity. “Automation complements human decision-making,” he said, with APA keeping humans in the loop in its workflows.
Shavit explained that Akamai's large and complex cloud infrastructure powers the content delivery network (CDN) and cybersecurity services it provides to “some of the most demanding customers and industries in the world” while complying with strict service level agreements (SLAs) and performance requirements.
He noted that for some of the services they consume, they are probably their vendor's largest customer, adding that they have done “tons of core engineering and reengineering” with their hyperscaler to support their needs.
In addition, Akamai serves customers of different sizes and industries, including large financial institutions and credit card companies. The company’s services are directly linked to the security posture of its customers.
Ultimately, Akamai had to balance all this complexity against cost. Shavit noted that real-life attacks on customers could drive capacity demand up 100x or 1,000x on specific components of its infrastructure. But “scaling our cloud capacity by 1,000x in advance is simply not financially feasible,” he said.
His team considered optimizing on the code side, but the inherent complexity of their business model required focusing on the core infrastructure itself.
What Akamai really needed was a Kubernetes automation platform that could optimize the cost of running its entire core infrastructure in real time across several clouds, Shavit explained, and scale applications up and down based on constantly changing demand. But all of this had to be done without sacrificing application performance.
Before implementing Cast AI, Shavit noted, Akamai's DevOps team manually tuned all of its Kubernetes workloads just a few times a month. Given the scale and complexity of its infrastructure, this was difficult and costly. By analyzing workloads only sporadically, they clearly missed any potential for real-time optimization.
“Now, hundreds of Cast agents do the same tuning, except they do it every second of every day,” said Shavit.
The core APA features Akamai uses are Kubernetes automation with bin packing (minimizing the number of bins used), automatic selection of the most cost-efficient compute instances, workload rightsizing, spot instance automation throughout the entire instance lifecycle and cost analytics capabilities.
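To make the bin packing idea concrete, here is a minimal sketch in Python. It is not Cast AI's algorithm, which the article does not describe; it uses the textbook first-fit-decreasing heuristic, and the function name, the CPU requests and the 4-core node size are all hypothetical values chosen for illustration.

```python
from typing import List


def first_fit_decreasing(requests: List[float], node_capacity: float) -> List[List[float]]:
    """Place CPU requests onto equal-sized nodes using first-fit decreasing.

    Illustrative heuristic only: it usually finds a small number of nodes,
    but it is not guaranteed to find the true minimum.
    """
    if any(r > node_capacity for r in requests):
        raise ValueError("a request exceeds the capacity of a single node")

    nodes: List[List[float]] = []  # requests placed on each node
    free: List[float] = []         # remaining capacity of each node

    # Largest requests first, each into the first node that still has room.
    for req in sorted(requests, reverse=True):
        for i, remaining in enumerate(free):
            if req <= remaining:
                nodes[i].append(req)
                free[i] -= req
                break
        else:
            # No existing node fits this request, so open a new one.
            nodes.append([req])
            free.append(node_capacity - req)
    return nodes


# Hypothetical pod CPU requests (in cores) and a hypothetical 4-core node.
pod_cpu_requests = [2.0, 0.5, 1.5, 3.0, 1.0, 0.5, 2.5, 1.0]
placement = first_fit_decreasing(pod_cpu_requests, node_capacity=4.0)
print(len(placement), placement)
```

With these made-up numbers the requests total 12 cores, so three 4-core nodes is the floor, and the heuristic reaches it. Fewer nodes for the same workloads is where the savings come from.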
“We got insight into cost analytics two minutes into the integration, which is something we had never seen before,” said Shavit. “Once the active agents were deployed, the optimization kicked in automatically and the savings started to come in.”
Spot instances, where companies can access unused cloud capacity at discounted prices, obviously made business sense, but they turned out to be complicated because of Akamai's complex workloads, particularly Apache Spark, Shavit noted. This meant they needed to either overengineer workloads or put more working hands on them, which turned out to be financially counterintuitive.
With Cast AI, they were able to use spot instances on Spark with “zero investment” from the engineering team or operations. The value of spot instances was “super clear”; they just needed to find the right tool to be able to use them. This was one of the reasons they moved forward with Cast, Shavit noted.
While saving 2x or 3x on their cloud bill is great, Shavit stressed that automation without manual intervention is “invaluable.” It has led to “massive” time savings.
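For readers comparing the “2x or 3x” figure with the 40% to 70% range cited earlier, the conversion can be sketched in a few lines of Python. This assumes, and the article does not confirm, that an “Nx saving” means the bill shrinks to 1/N of its former size; the function name is hypothetical.

```python
def savings_multiple_to_reduction(multiple: float) -> float:
    """Fractional bill reduction, assuming an 'Nx saving' means the bill
    falls to 1/N of its previous size (an interpretation, not a sourced
    definition)."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return 1.0 - 1.0 / multiple


for m in (2, 3):
    print(f"{m}x saving -> {savings_multiple_to_reduction(m):.0%} lower bill")
```

Under that reading, 2x corresponds to a 50% reduction and 3x to roughly 67%, both inside the 40% to 70% range reported above.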
Before implementing Cast AI, his team was “constantly moving around knobs and switches” to ensure that their production environments and customers were on par with the level of service they needed to invest in.
“By far the biggest benefit has been the fact that we no longer need to manage our infrastructure,” said Shavit. “Cast's team of agents now does this for us. That has freed up our team to focus on what matters most: releasing features faster to our customers.”
Editor's note: At this month's VB Transform, Google Cloud CTO Will Grannis and Highmark Health SVP and Chief Analytics Officer Richard Clarke will discuss the new AI stack in healthcare and the real-world challenges of deploying multi-model AI systems in a complex, regulated environment.