Overview

Taam Cloud supports visual recognition through several vision-enabled models, so your applications can accept images alongside text prompts and answer questions about them.

Supported Models

GPT-4 Vision

  • High accuracy recognition
  • Detailed scene analysis
  • Complex visual reasoning

Gemini Pro Vision

  • Fast processing
  • Multiple object detection
  • Context understanding

Features

Using Vision Recognition

Select Vision Model

Choose a vision-enabled model from the model selector.

Upload Image

Drag and drop your image, or click to upload it.

Ask Questions

Enter your queries about the image:

  • “What’s in this image?”
  • “Can you describe the scene?”
  • “What text appears in this image?”

Example Interactions

Example API request body:
{
  "model": "gpt-4-vision",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What's in this image?"
        },
        {
          "type": "image_url",
          "image_url": "https://example.com/image.jpg"
        }
      ]
    }
  ]
}
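The request body above can also be built and sent from code. The following is a minimal Python sketch, not an official client: the endpoint URL, the API key placeholder, the bearer-token header, and the helper names `build_vision_request` and `send_vision_request` are all assumptions made for illustration. Only the payload shape is taken from the example above. Check the Taam Cloud API reference for the actual endpoint and authentication scheme.

```python
import json
import urllib.request

# Hypothetical endpoint and key: replace with the values for your account.
API_URL = "https://api.example.com/v1/chat/completions"
API_KEY = "YOUR_API_KEY"


def build_vision_request(question, image_url, model="gpt-4-vision"):
    """Return a request body with the same shape as the JSON example above."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": image_url},
                ],
            }
        ],
    }


def send_vision_request(payload, api_url=API_URL, api_key=API_KEY):
    """POST the payload as JSON and return the decoded response.

    Assumes bearer-token authentication and a JSON response body.
    """
    request = urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build the payload only; call send_vision_request(payload) once the
# endpoint and key are set to real values.
payload = build_vision_request(
    "What's in this image?", "https://example.com/image.jpg"
)
print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the network call makes it possible to inspect or log the request body before any credits are consumed.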

Best Practices

Image Quality

Use clear, well-lit images so that objects and text are legible.

Specific Questions

Ask focused, specific questions, such as “What text appears in this image?”, rather than open-ended ones.

Context Matters

Provide relevant context, such as what the image shows or what you want to learn from it.
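As an illustration of the last two practices, the text part of a request can combine a short piece of context with one focused question. This is a minimal sketch: the helper name `build_question` and the "Context: / Question:" layout are assumptions for illustration, not a format the models require.

```python
def build_question(question, context=""):
    """Combine optional background context with one focused question."""
    question = question.strip()
    context = context.strip()
    if not context:
        return question
    return f"Context: {context}\nQuestion: {question}"


prompt = build_question(
    "What is the total amount due?",
    context="This is a scanned invoice from a supplier.",
)
print(prompt)
```

The resulting string can be used as the text entry that accompanies the image in the request.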

Images are processed securely and temporarily. See our privacy policy for details.

Vision recognition consumes additional credits based on image size and complexity.