ViT-Base (Patch16, 224px, AugReg, IN21k)
A Vision Transformer base model with patch size 16, pretrained on ImageNet-21k with AugReg for general-purpose image representation.
Capabilities
vision
Dates
released Oct 2021
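As a rough illustration of what "Patch16, 224px" implies for the input sequence, the sketch below computes how many tokens the encoder would see for one image. The function name `vit_sequence_length` is hypothetical, and the extra class token is an assumption based on the standard ViT design, not something this entry states.

```python
def vit_sequence_length(image_size: int = 224,
                        patch_size: int = 16,
                        class_token: bool = True) -> int:
    """Return the number of tokens a ViT encoder receives for one square image.

    Assumes non-overlapping square patches and, optionally, one prepended
    class token (an assumption; this catalog entry does not confirm it).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")

    patches_per_side = image_size // patch_size   # 224 // 16 = 14
    num_patches = patches_per_side ** 2           # 14 * 14 = 196
    return num_patches + (1 if class_token else 0)


if __name__ == "__main__":
    print(vit_sequence_length())
```

Under these assumptions a 224px image yields a 14 x 14 grid of 196 patches, or 197 tokens with the class token included.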