Prompt Depth Anything ViT-L
A promptable depth estimation model built on a Vision Transformer Large (ViT-L) backbone that predicts depth conditioned on user-supplied prompts.
Capabilities
vision
Dates
Released: Nov 2024