Increased delay in starting replicas using A100 GPUs

Incident Report for Baseten

Resolved

This incident has been resolved. Start times for replicas using A100 GPUs are back to normal.
Posted Dec 08, 2023 - 08:24 PST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Dec 08, 2023 - 08:01 PST

Update

We are rolling out a fix and are seeing improvement in A100 start times.
Posted Dec 08, 2023 - 07:08 PST

Identified

The issue has been identified and a fix is being implemented.
Posted Dec 08, 2023 - 06:46 PST

This incident affected: Model Inference.