I built a straightforward demo to showcase Gemini's object detection capabilities. Upload any image and get back bounding boxes plus clean JSON output. So far, it's the only model I've found that returns accurate bounding boxes.
Demo: https://langtail.com/gemini-bounding-boxes
The tool is pretty simple:
- Drop in an image or pick one of the examples (apples, llamas)
- Hit detect
- Get visual bounding boxes + JSON output
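If you want to draw the boxes yourself, here's a rough sketch of turning the JSON into pixel coordinates. The demo's exact schema isn't spelled out in this post, so this assumes the format described in Gemini's docs: each detection has a `label` and a `box_2d` of `[ymin, xmin, ymax, xmax]` scaled to 0-1000. The function name `to_pixel_boxes` and the sample data are made up for illustration.

```python
import json


def to_pixel_boxes(detections_json, image_width, image_height):
    """Convert 0-1000 normalized boxes into pixel coordinates.

    Assumes each detection looks like:
    {"label": "...", "box_2d": [ymin, xmin, ymax, xmax]}
    """
    results = []
    for det in json.loads(detections_json):
        # Assumed order: y comes before x in box_2d.
        ymin, xmin, ymax, xmax = det["box_2d"]
        results.append({
            "label": det.get("label", ""),
            "left": round(xmin * image_width / 1000),
            "top": round(ymin * image_height / 1000),
            "right": round(xmax * image_width / 1000),
            "bottom": round(ymax * image_height / 1000),
        })
    return results


# Hypothetical sample output for an 800x600 image.
sample = '[{"label": "apple", "box_2d": [100, 250, 500, 750]}]'
print(to_pixel_boxes(sample, image_width=800, image_height=600))
# → [{'label': 'apple', 'left': 200, 'top': 60, 'right': 600, 'bottom': 300}]
```

The main gotcha is that the coordinates are relative to a 0-1000 grid rather than the image's actual size, so they have to be rescaled per image before drawing.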
Happy to answer any questions!