DPT BEiT Large 512

intel vision
image

A monocular depth estimation model that pairs a large BEiT transformer backbone with a DPT head and processes 512x512 inputs.
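As a rough illustration of how a model like this might be used, the sketch below runs depth estimation through the Hugging Face `transformers` depth-estimation pipeline and rescales the result for display. The checkpoint ID `Intel/dpt-beit-large-512` is an assumption about where the weights are hosted, and `estimate_depth` and `to_uint8` are hypothetical helper names introduced here; none of this comes from the card itself.

```python
def to_uint8(depth_rows):
    """Linearly rescale a 2D list of depth values to the 0..255 range for display."""
    flat = [v for row in depth_rows for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no contrast; return all zeros.
        return [[0 for _ in row] for row in depth_rows]
    return [[round(255 * (v - lo) / span) for v in row] for row in depth_rows]


def estimate_depth(image_path, model_id="Intel/dpt-beit-large-512"):
    """Return a nested list of relative (not metric) depth values for one image.

    model_id is an assumed Hugging Face Hub identifier, not taken from the card.
    """
    # Imported lazily so to_uint8 works without these packages installed.
    from PIL import Image
    from transformers import pipeline

    estimator = pipeline(task="depth-estimation", model=model_id)
    result = estimator(Image.open(image_path))
    # "predicted_depth" is a torch tensor; squeeze drops any singleton batch axis.
    return result["predicted_depth"].squeeze().tolist()
```

Calling `to_uint8(estimate_depth("photo.jpg"))` would give an 8-bit grayscale map, where `photo.jpg` is a placeholder path. The pipeline's image processor handles resizing to the model's input resolution, so the caller does not need to resize to 512x512 by hand.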

released Sep 2022