iFood Scales Machine Learning With Amazon SageMaker Inference


Headquartered in São Paulo, Brazil, iFood is the leading food-tech company in Latin America, handling millions of orders every month. iFood has distinguished itself by integrating state-of-the-art technology into its business processes. The company has established a strong infrastructure for machine learning (ML) inference, developing and deploying ML models with services like Amazon SageMaker. Through this collaboration, iFood has streamlined its internal operations and provided its delivery partners and restaurants with cutting-edge solutions.

The tools, procedures, and workflows that make up iFood’s ML platform were created with the following goals in mind:

Accelerate the development and training of AI/ML models and make them more dependable and reproducible.
Ensure that the production deployment of these models is traceable, scalable, and dependable.
Provide a clear, accessible, and consistent way to test, monitor, and assess models in production.

To accomplish these goals, iFood leverages SageMaker, which simplifies model training and deployment. Incorporating SageMaker features into iFood's infrastructure also automates crucial procedures such as creating training datasets, training models, deploying models to production, and regularly assessing their performance.

In this article, we demonstrate how iFood is transforming its ML processes with SageMaker. From model training to deployment, iFood optimizes the entire ML lifecycle using SageMaker's capabilities, automating important operations and streamlining complicated processes.

AI Inference At iFood

By harnessing a robust AI/ML platform, iFood has improved the customer experience across all of its touchpoints. Using state-of-the-art AI/ML capabilities, the company has created a suite of solutions that address a wide range of customer use cases:

Personalized recommendations: iFood's AI-powered recommendation algorithms examine a customer's order history, preferences, and contextual information to surface the best restaurants and menu items. This individualized approach helps customers discover new cuisines and dishes suited to their tastes, increasing satisfaction and order volumes.

Intelligent order tracking: iFood's AI systems track orders in real time and make highly accurate delivery time predictions. By accounting for variables such as traffic patterns, restaurant preparation times, and courier locations, the AI can proactively inform customers of their order status and expected arrival, easing worry and uncertainty throughout the delivery process.

Automated customer service: iFood built an AI-driven chatbot to promptly address the thousands of customer inquiries it receives each day. By understanding natural language, retrieving pertinent information, and responding with tailored answers, this intelligent virtual assistant offers prompt, reliable help without putting undue strain on the human customer care staff.

Grocery shopping assistance: Using sophisticated language models, iFood's app lets users write or speak their grocery list or recipe requirements, and the AI automatically creates a thorough shopping list. This voice-enabled grocery planning tool saves customers time and effort, improving their overall shopping experience.

Through these AI-powered initiatives, iFood can anticipate customer needs, optimize key processes, and deliver a consistently outstanding experience, solidifying its position as the top food-tech platform in Latin America.


Overview Of The Solution

The following figure depicts iFood's legacy architecture, in which the data science and engineering teams had separate workflows. This separation made it difficult to deploy accurate, real-time ML models into production systems.


The engineering and data science departments at iFood used to work separately. Data scientists would create models in notebooks, tune weights, and publish them to services. Engineering teams would then struggle to incorporate these models into operational systems. This mismatch between the two teams made it difficult to deploy accurate, real-time ML models.

To close this gap, iFood developed an internal ML platform. The platform smooths the process of developing, training, and delivering models for inference. It offers a centralized place where data scientists can create, train, and deploy models using an integrated approach that respects the teams' development workflow. Engineering teams can then consume these models and incorporate them into offline and online applications, allowing for a more streamlined and effective workflow.


By removing the boundaries between data science and engineering, the AI platform lets iFood leverage its data to the fullest and expedite the development of AI applications. SageMaker's scalable inference capabilities and automated deployment ensure that models are always accessible to support intelligent applications and deliver accurate predictions when needed. For iFood, centralizing ML services as a product has been transformative, freeing teams to concentrate on creating high-performing models rather than the minute intricacies of inference.

Providing the infrastructure to support predictions is one of the primary functions of iFood's ML platform. ML Go! provides inference for a number of use cases and is in charge of setting up SageMaker pipelines and endpoints: the pipelines are used to schedule offline prediction jobs, while the endpoints back the model services consumed by application services. The redesigned architecture of iFood is depicted in the accompanying figure. It includes an internal ML platform designed to facilitate the deployment of ML models into production systems by streamlining workflows between the data science and engineering teams.
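As a hypothetical sketch of what standing up the online half of this setup involves, the following assembles the three request payloads a platform like ML Go! might pass to the SageMaker API through boto3 (create_model, create_endpoint_config, create_endpoint). All names, instance types, and ARNs are placeholders, not iFood's actual configuration.

```python
def build_realtime_endpoint_requests(model_name, image_uri, model_data_url, role_arn):
    """Build the three payloads needed to stand up a real-time SageMaker endpoint."""
    # 1. The model: a container image plus the trained artifact in S3.
    model = {
        "ModelName": model_name,
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
        "ExecutionRoleArn": role_arn,
    }
    # 2. The endpoint config: which model, how many instances, what type.
    endpoint_config = {
        "EndpointConfigName": f"{model_name}-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.xlarge",
        }],
    }
    # 3. The endpoint itself, pointing at the config.
    endpoint = {
        "EndpointName": f"{model_name}-endpoint",
        "EndpointConfigName": endpoint_config["EndpointConfigName"],
    }
    return model, endpoint_config, endpoint
```

A pipeline scheduling offline predictions would instead build a batch transform job request against the same registered model.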


Incorporating model deployment into the service development process was an important step in helping data scientists and ML engineers deploy and manage those models. The ML platform facilitates the development of ML systems, and several additional integrations with other significant platforms, such as the feature and data platforms, were provided to improve the overall user experience. Consuming ML-based decisions became easier, but the work is not finished: ML Go! is currently concentrating on new inference capabilities, bolstered by recent features that the iFood team helped conceptualize and build. The final architecture of iFood's ML platform is depicted in the accompanying diagram, which highlights the platform's linkages to the feature and data platforms, its emphasis on new inference capabilities, and the integration of model deployment into the service development process.


One of the most significant changes is the development of ML Go!, a single abstraction for integrating with SageMaker endpoints and jobs. Serving is made quicker and more effective by the gateway and by dividing responsibilities among endpoints using the Inference Components feature. ML Go! is also in charge of managing the endpoints in this new inference structure: CI/CD handles only model promotions, leaving the infrastructure itself to the pipelines. This shortens the time needed to make modifications and lowers the failure rate during deployments.
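The Inference Components feature mentioned above packs several models onto one endpoint, each with its own compute reservation and copy count. As a hedged illustration (field values are placeholders, not iFood's setup), a create_inference_component payload for boto3 might be assembled like this:

```python
def build_inference_component(name, endpoint_name, model_name,
                              accelerators=1, min_memory_mb=4096, copies=1):
    """Payload shape for boto3 sagemaker.create_inference_component.

    Each component reserves its own slice of the endpoint's compute,
    so several models can share one endpoint independently.
    """
    return {
        "InferenceComponentName": name,
        "EndpointName": endpoint_name,
        "VariantName": "AllTraffic",
        "Specification": {
            "ModelName": model_name,
            "ComputeResourceRequirements": {
                "NumberOfAcceleratorDevicesRequired": accelerators,
                "MinMemoryRequiredInMb": min_memory_mb,
            },
        },
        # Number of copies of the model to keep loaded for this component.
        "RuntimeConfig": {"CopyCount": copies},
    }
```

Because each component scales its copy count independently, promoting a new model becomes a component-level change rather than a whole-endpoint redeployment.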


Using Serving Containers For SageMaker Inference

The standardization of AI and ML services is one of the main characteristics of contemporary ML platforms. By encapsulating models and dependencies as Docker containers, these platforms provide consistency and portability across environments and ML stages. With SageMaker, data scientists and developers can use pre-built Docker containers, which makes managing and deploying ML services simple; as a project develops, they can spin up new instances and configure them to meet their specific needs. The containers SageMaker offers are designed to integrate easily with the service and provide a scalable, uniform environment for ML workloads.

SageMaker offers a collection of pre-built containers for well-known ML frameworks and algorithms, such as TensorFlow, PyTorch, and XGBoost. With all the required dependencies and libraries pre-installed and optimized for performance, these containers make it simple to begin working on your ML applications. In addition to the pre-built containers, you can bring your own custom containers, complete with your own ML code, dependencies, and libraries, into SageMaker. This is especially helpful if you're using a less common framework or have specialized needs that the pre-built containers don't address.


iFood placed a lot of emphasis on custom containers for training and deploying ML workloads, offering a reliable and consistent environment for ML experiments and making it simple to track and reproduce results. The first step was standardizing the custom ML code, which is essentially the code the data scientists should concentrate on. With BruceML, the way code for training and serving models is written changed: instead of starting in notebooks, it is packaged as container images from the beginning. BruceML generated the scaffolding needed to connect smoothly with the SageMaker platform and let the teams use its many features, including model deployment, monitoring, and hyperparameter tuning. By standardizing ML services and utilizing containerization, modern platforms democratize ML, allowing iFood to quickly develop, deploy, and scale intelligent applications.
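For context on what a serving container must do: SageMaker expects the image to answer two HTTP routes on port 8080, GET /ping for health checks and POST /invocations for predictions. A minimal stdlib-only sketch of that contract follows; the prediction logic is a stand-in, not BruceML's actual scaffolding.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(payload: dict) -> dict:
    # Stand-in for real model inference.
    return {"score": 0.5, "inputs_seen": len(payload)}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/ping":          # SageMaker health check
            self.send_response(200)
        else:
            self.send_response(404)
        self.end_headers()

    def do_POST(self):
        if self.path == "/invocations":   # SageMaker inference request
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or "{}")
            body = json.dumps(predict(payload)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

def serve(port: int = 8080):
    """The container's ENTRYPOINT would call this; not invoked here."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Because the contract is just HTTP, scaffolding like BruceML's can generate this boilerplate once and let data scientists supply only the predict function.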


Automating Model Deployment And Retraining

Having a reliable, automated procedure for deploying and recalibrating ML models across various use cases is essential when using them in production; it ensures that the models remain accurate and effective over time. The iFood team understood this challenge well: deploying the model is only part of the job. To keep everything running properly, they rely on a further concept: ML pipelines.

To provide automated retraining and model deployment, they developed a CI/CD system for ML using Amazon SageMaker Pipelines. They also linked the entire system with the business's existing CI/CD pipeline, improving its effectiveness and upholding iFood's DevOps practices. The process starts with ML Go!: the CI/CD pipeline pushes the most recent code artifacts containing the model training and deployment logic. It includes the training procedure, which runs the complete pipeline using various containers. When training finishes, the inference pipeline can be run to start the model deployment, whether for a brand-new model or the promotion of an updated version of an existing one. ML Go! also automatically registers every model available for deployment in the Amazon SageMaker Model Registry, offering tracking and versioning features.
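As a rough illustration, a SageMaker pipeline ultimately boils down to a JSON definition listing ordered steps. The sketch below assembles a minimal two-step definition, a training step followed by model registration; the argument shapes are simplified placeholders and the group name is hypothetical, not iFood's pipeline.

```python
def build_pipeline_definition(image_uri, input_s3, output_s3, role_arn):
    """Assemble a minimal pipeline definition: train, then register."""
    return {
        "Version": "2020-12-01",  # pipeline definition schema version
        "Steps": [
            {
                "Name": "TrainModel",
                "Type": "Training",
                "Arguments": {
                    "AlgorithmSpecification": {
                        "TrainingImage": image_uri,
                        "TrainingInputMode": "File",
                    },
                    "InputDataConfig": [{
                        "ChannelName": "train",
                        "DataSource": {"S3DataSource": {"S3Uri": input_s3}},
                    }],
                    "OutputDataConfig": {"S3OutputPath": output_s3},
                    "RoleArn": role_arn,
                },
            },
            {
                "Name": "RegisterModel",
                "Type": "RegisterModel",
                # Hypothetical model package group for versioned registration.
                "Arguments": {"ModelPackageGroupName": "example-model-group"},
            },
        ],
    }
```

In practice the SageMaker Python SDK generates this definition from step objects; the point is that the whole retrain-and-register flow is a declarative artifact CI/CD can version and promote.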


The intended inference requirements determine the last step. For batch prediction use cases, the pipeline creates a SageMaker batch transform job to execute large-scale predictions. For real-time inference, it deploys the model to a SageMaker endpoint, carefully choosing the right instance type and container variant for the anticipated production traffic and latency requirements. This end-to-end automation has enabled iFood to iterate quickly on its ML models and safely release updates and recalibrations across its diverse use cases. SageMaker Pipelines offers a streamlined way to orchestrate these intricate workflows, ensuring reliable and effective model operationalization.
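To make the batch branch concrete, here is a hedged sketch of a create_transform_job payload (the request boto3 sends for a batch transform); bucket paths, content type, and instance choices are illustrative placeholders.

```python
def build_transform_job(model_name, input_s3, output_s3):
    """Payload shape for boto3 sagemaker.create_transform_job."""
    return {
        "TransformJobName": f"{model_name}-batch",
        "ModelName": model_name,  # a model already registered in SageMaker
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3}
            },
            "ContentType": "text/csv",
        },
        # Predictions are written back to S3 when the job completes.
        "TransformOutput": {"S3OutputPath": output_s3},
        "TransformResources": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
        },
    }
```

Unlike an endpoint, the transform job provisions instances only for the duration of the run, which suits scheduled offline prediction workloads.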


Executing Inference Under Various SLAs

To run its intelligent apps and provide its customers with accurate predictions, iFood leverages SageMaker's inference capabilities. By adopting the powerful inference features SageMaker offers, iFood has deployed ML models and made them accessible for both batch and real-time predictions. For its online, real-time prediction use cases, iFood deploys models on SageMaker hosted endpoints. These endpoints are embedded in iFood's customer-facing applications, so incoming user data can be inferred on immediately. SageMaker manages and scales these endpoints, ensuring that iFood's models are always available to deliver accurate predictions and improve the user experience.

In addition to real-time predictions, iFood performs large-scale, asynchronous inference on datasets using SageMaker batch transform. This is especially helpful for batch prediction and data preprocessing needs, such as generating recommendations or insights for its restaurant partners. SageMaker batch transform jobs efficiently process enormous volumes of data, further strengthening iFood's data-driven decision-making.


Building on the success of standardizing on SageMaker Inference, iFood has collaborated closely with the SageMaker Inference team to develop and improve key AI inference features within the platform. From early on, iFood's contributions and experience have helped the SageMaker Inference team ship numerous new features and enhancements:


Cost and performance optimizations for generative AI inference: iFood helped the SageMaker Inference team create novel methods to maximize accelerator utilization, enabling SageMaker Inference to lower latency with inference components by 20% on average and foundation model (FM) deployment costs by 50% on average. For customers running generative AI workloads on SageMaker, this translates into substantial cost reductions and performance gains.

Scaling enhancements for AI inference: Drawing on iFood's experience with distributed systems and auto scaling, the SageMaker team built sophisticated tools to better manage the scaling needs of generative AI models. These enhancements cut auto scaling detection time by a factor of six and auto scaling times by up to 40%, so users can quickly scale their inference workloads on SageMaker to meet demand surges without sacrificing performance.


Simplified generative AI model deployment for inference: Seeing the need for a more straightforward deployment process, iFood helped deliver the capability to deploy open-source large language models (LLMs) and FMs with a few clicks. By removing the complexity typically involved in deploying these sophisticated models, this user-friendly functionality enables more customers to take advantage of AI's potential.

Scale to zero for inference endpoints: iFood's partnership with SageMaker Inference was instrumental in developing and implementing the scale-to-zero feature for SageMaker inference endpoints. This feature lets inference endpoints spin up on demand when new requests arrive and shut down when not in use. It reduces idle resource costs while preserving the capacity to serve requests promptly when needed, making it especially advantageous for dev/test environments, low-traffic applications, and inference use cases with variable demand. Scale to zero is a significant improvement in the cost-efficiency of AI inference, making it financially feasible for a much wider variety of use cases.
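Scale to zero is configured through Application Auto Scaling by allowing an inference component's copy count to drop to zero. A hedged sketch of the register_scalable_target payload (boto3 "application-autoscaling" client; the component name and capacity limits are placeholders):

```python
def build_scale_to_zero_target(inference_component_name, max_copies=4):
    """Payload shape for application-autoscaling register_scalable_target.

    MinCapacity=0 is what enables scale to zero: when traffic stops, the
    component's copy count can fall to zero and idle compute is released.
    """
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"inference-component/{inference_component_name}",
        "ScalableDimension": "sagemaker:inference-component:DesiredCopyCount",
        "MinCapacity": 0,
        "MaxCapacity": max_copies,
    }
```

A scaling policy on the same target then brings copies back up when requests arrive.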


Improved AI model packaging for inference: iFood collaborated with the SageMaker team to improve SageMaker's capacity to package LLMs and other models for deployment, greatly streamlining the AI model lifecycle. These advancements make preparation and deployment simple, speeding the adoption and integration of these AI models.

Multi-model endpoints for GPU: iFood and the SageMaker Inference team worked together to introduce multi-model endpoints for GPU-based instances. This improvement enables the deployment of many AI models on a single GPU-backed endpoint, greatly improving resource utilization and cost-efficiency. Leveraging iFood's experience in model serving and GPU optimization, SageMaker now offers the ability to dynamically load and unload models on GPUs, which can save up to 75% on infrastructure costs for customers with many models and varying traffic patterns.
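In practice, a multi-model endpoint is created with a single container in MultiModel mode pointing at an S3 prefix of model artifacts, and each invocation names the artifact to load via TargetModel. A hedged sketch of both payloads (all names and paths are placeholders):

```python
def build_mme_model(name, image_uri, models_s3_prefix, role_arn):
    """create_model payload: one container serving many artifacts."""
    return {
        "ModelName": name,
        "PrimaryContainer": {
            "Image": image_uri,
            "Mode": "MultiModel",           # enables dynamic model loading
            "ModelDataUrl": models_s3_prefix,  # S3 prefix holding *.tar.gz artifacts
        },
        "ExecutionRoleArn": role_arn,
    }

def build_mme_invocation(endpoint_name, artifact, payload):
    """invoke_endpoint payload: TargetModel picks the artifact to use."""
    return {
        "EndpointName": endpoint_name,
        "TargetModel": artifact,            # e.g. a per-city ranking model
        "ContentType": "application/json",
        "Body": payload,
    }
```

SageMaker loads the named artifact onto the GPU on first use and evicts cold models under memory pressure, which is what makes packing many low-traffic models onto one endpoint economical.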

Asynchronous inference: Recognizing the need to handle long-running inference requests, the iFood team worked closely with the SageMaker Inference team to create and implement Asynchronous Inference in SageMaker. With this functionality, you can handle long inference requests or large payloads without being constrained by real-time API calls. Shaped in part by iFood's experience with large-scale distributed systems, this approach enables better management of resource-intensive inference tasks and handles requests that may take several minutes to finish. This capability opens up new applications for AI inference, especially in fields like genomics, video analysis, and financial modeling that involve intricate data processing jobs.
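Asynchronous inference is enabled at the endpoint-config level: requests are queued and results are written to S3 rather than returned inline. A hedged sketch of such a config payload for create_endpoint_config (instance type, names, and paths are placeholders):

```python
def build_async_endpoint_config(config_name, model_name, output_s3):
    """Endpoint config with AsyncInferenceConfig: results land in S3."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            "InstanceType": "ml.g5.xlarge",
        }],
        # Presence of this block makes the endpoint asynchronous:
        # callers get a reference immediately and poll S3 for the result.
        "AsyncInferenceConfig": {
            "OutputConfig": {"S3OutputPath": output_s3},
        },
    }
```

Callers then submit work with invoke_endpoint_async, passing an S3 input location instead of an inline body, which is what lifts the real-time payload and timeout limits.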


iFood's close collaboration with the SageMaker Inference team has greatly accelerated the development of AI inference and generative AI inference capabilities in SageMaker. Thanks to the capabilities and optimizations this partnership produced, customers can now realize the transformative potential of inference more easily, affordably, and efficiently.

Our collaboration with the SageMaker Inference product team has been crucial in shaping the direction of AI applications, and at iFood we have been at the forefront of implementing transformative machine learning and AI technologies. Together we have created methods for handling inference workloads effectively, which enables us to run models quickly and affordably. The lessons we've learned shaped our internal platform, which can serve as a model for other businesses wishing to harness the potential of AI inference. By addressing persistent and significant issues in machine learning engineering, we believe the features we developed together will significantly benefit other businesses that use SageMaker for inference workloads, opening up new avenues for innovation and business transformation.

– Daniel Vieira, ML Platform manager at iFood


In Conclusion

By utilizing SageMaker's capabilities, iFood revolutionized its approach to machine learning and AI, opening new avenues for improving the consumer experience. By building a strong, centralized ML platform that bridges the gap between its data science and engineering teams, iFood has streamlined the model lifecycle from development to deployment. The integration of SageMaker functionality lets iFood build ML models for both batch-oriented and real-time use cases. For real-time, customer-facing applications, iFood uses SageMaker hosted endpoints to deliver instant predictions and improve the user experience; to efficiently process massive datasets and produce insights for its restaurant partners, it uses SageMaker batch transform. This flexibility in inference options has been key to iFood's ability to support a wide variety of intelligent applications.


Automated deployment and retraining with ML Go!, backed by SageMaker Pipelines and SageMaker Inference, has transformed how iFood operates. It allows the business to confidently release updates, iterate quickly on its ML models, and preserve the continuous functionality and dependability of its intelligent apps. Furthermore, iFood's strategic alliance with the SageMaker Inference team has greatly advanced the platform's AI inference capabilities. The cost and performance optimizations, scalability enhancements, and simplified model deployment features that iFood helped develop through this partnership now benefit a much larger spectrum of customers.

By utilizing SageMaker’s capabilities, iFood has been able to unleash the revolutionary potential of AI and ML, providing cutting-edge solutions that improve customer satisfaction and solidify its position as Latin America’s top food-tech platform. The strength of cloud-based AI infrastructure and the importance of strategic alliances in promoting technology-driven business transformation are demonstrated by this journey.

You can maximize SageMaker’s potential for your company, spur innovation, and maintain your competitive edge by taking a cue from iFood.