
Modular MAX: High-Performance GenAI Serving

Serve the latest GenAI models with the MAX Container, a GPU-accelerated serving environment with support for 500+ optimized models (https://builds.modular.com/), OpenAI API compatibility (https://docs.modular.com/max/api/serve), and enterprise-grade performance across diverse hardware and compute services.
Purchase this listing from Webvar in AWS Marketplace using your AWS account. In AWS Marketplace, you can quickly launch pre-configured software with just a few clicks. AWS handles billing and payments, and charges appear on your AWS bill.

About

The Modular Platform is an open and fully-integrated suite of AI libraries and tools that accelerates model serving and scales GenAI deployments. It abstracts away hardware complexity so you can run the most popular open models with industry-leading GPU and CPU performance without any code changes.

Our ready-to-deploy Docker container removes the complexity of deploying your own GenAI endpoint. And unlike other serving solutions, Modular enables customization across the entire stack: you can tailor everything from the serving pipeline and model architecture all the way down to the metal by writing custom ops and GPU kernels in Mojo. Most importantly, Modular is hardware-agnostic and free from vendor lock-in (no CUDA required), so your code runs seamlessly across diverse systems.

MAX is a high-performance AI serving framework tailored for GenAI workloads. It provides low-latency, high-throughput inference via advanced model serving optimizations like prefix caching and speculative decoding. An OpenAI-compatible serving endpoint executes native MAX and PyTorch models across GPUs and CPUs, and can be customized at the model and kernel level.
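
Because the endpoint speaks the OpenAI API, tooling built on the official OpenAI SDK can talk to it by changing only the base URL. The following is a minimal sketch, assuming a MAX container already running locally on port 8000 and serving a Llama 3.1 model under the Hugging Face ID modularai/Llama-3.1-8B-Instruct-GGUF; the URL, placeholder key, and model name are assumptions to adjust for your deployment.

    from openai import OpenAI

    # Point the standard OpenAI Python client at the MAX endpoint instead of
    # api.openai.com. Base URL, API key, and model name are assumptions --
    # adjust them to match your running container.
    client = OpenAI(
        base_url="http://localhost:8000/v1",
        api_key="EMPTY",  # MAX does not require a real key by default (assumption)
    )

    response = client.chat.completions.create(
        model="modularai/Llama-3.1-8B-Instruct-GGUF",  # the model the container is serving
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "In two sentences, what does speculative decoding speed up?"},
        ],
        max_tokens=128,
    )
    print(response.choices[0].message.content)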

The MAX Container (max-nvidia-full) is a Docker image that packages the MAX Platform, pre-configured to serve hundreds of popular GenAI models on NVIDIA GPUs. This container is ideal for users seeking a fully optimized, out-of-the-box solution for deploying AI models.
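
As a rough illustration of the out-of-the-box flow, the sketch below starts the container from Python with the docker SDK (docker-py) and publishes the serving port. The image reference (modular/max-nvidia-full:latest), the --model-path flag, the HF_TOKEN variable, and port 8000 are assumptions drawn from the public MAX container documentation and may differ for the Marketplace build; see https://docs.modular.com/max/container for the authoritative instructions.

    import os
    import docker  # docker-py: pip install docker

    client = docker.from_env()

    # Start the MAX container on all visible NVIDIA GPUs and expose the
    # OpenAI-compatible endpoint on port 8000. Image name, flags, and env
    # vars are assumptions -- verify them against the MAX container docs.
    container = client.containers.run(
        "modular/max-nvidia-full:latest",                          # assumed image reference
        command=["--model-path", "modularai/Llama-3.1-8B-Instruct-GGUF"],
        environment={"HF_TOKEN": os.environ.get("HF_TOKEN", "")},  # needed only for gated models
        ports={"8000/tcp": 8000},
        device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])],
        volumes={os.path.expanduser("~/.cache/huggingface"): {"bind": "/root/.cache/huggingface", "mode": "rw"}},
        detach=True,
    )
    print(f"MAX container started: {container.short_id}")

Mounting the Hugging Face cache is optional; it simply avoids re-downloading model weights on every restart.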

Key capabilities include:

High-performance serving: Serve 500+ AI models from Hugging Face with industry-leading performance across NVIDIA GPUs.

Flexible, portable serving: Deploy with a single Docker container across various GPUs (B200, H200, H100, A100, A10, L40 and L4) and compute services (EC2, EKS, AWS Batch, etc.) without compatibility issues.

OpenAI API compatibility: Seamlessly integrate with applications adhering to the OpenAI API specification; see the sketch after this list.
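
To make that compatibility concrete, here is a hedged sketch, reusing the same assumed local endpoint as above, that discovers the served models through the standard /v1/models route and streams a chat completion exactly as an application would against the hosted OpenAI API.

    from openai import OpenAI

    # Same assumed local MAX endpoint as in the earlier example.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    # List the models the container is currently serving (standard /v1/models route).
    served = [m.id for m in client.models.list()]
    print("Serving:", served)

    # Stream tokens from the first served model, as with any OpenAI-compatible backend.
    stream = client.chat.completions.create(
        model=served[0],
        messages=[{"role": "user", "content": "In one sentence, what is prefix caching?"}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
    print()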

For detailed information on container contents and instance compatibility, refer to the MAX Containers Documentation (https://docs.modular.com/max/container).

To access our full Modular platform, check out https://www.modular.com/


