Beyond transformers: Nvidia’s MambaVision aims to unlock faster, cheaper enterprise computer vision

Nvidia has expanded its MambaVision family of computer vision and image recognition models, releasing a series of updated models on Hugging Face. The release includes the L and L2 variants, which are scaled-up versions of models released in 2024.
The new models are trained on the larger ImageNet-21K dataset rather than ImageNet-1K, and can handle higher input resolutions of 256 and 512 pixels, up from the original 224 pixels.
Independent AI consultant Alex Fazio told VentureBeat that the new MambaVision models’ training on larger datasets helped make them better at handling more diverse and complex tasks.
The models’ balance of performance and efficiency means they can be deployed more easily at the enterprise level, which could make enterprise computer vision systems faster and cheaper to run.
