API Documentation

Complete guide to using the Video Analyzer API

Overview

The Video Analyzer API analyzes video content with two AI models:

  • Microsoft Florence-2-large: Multi-modal vision model for object detection, activity recognition, and OCR
  • OpenAI Whisper-large-v3: Advanced speech recognition model for audio transcription

The API samples one frame per second of video and returns time-stamped analysis results.

Main Endpoint

POST /api/analyze

Analyzes a video from an S3 URL and returns comprehensive results.

Request Body:
{
  "file_url": "https://s3.amazonaws.com/bucket/video.mp4"
}
Parameters:
Parameter   Type     Required   Description
file_url    string   Yes        Valid S3 URL pointing to a video file
Supported Video Formats:
  • MP4
  • AVI
  • MOV
  • MKV
  • WEBM
  • FLV
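
A client can catch obviously bad input before sending the request. The following is a minimal sketch under stated assumptions: the helper name `build_request` is hypothetical, the host check is only a rough guess at what counts as an S3 URL, and the file extension is used as a stand-in for the video format. The server's own validation may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions matching the supported formats listed above.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".webm", ".flv"}


def build_request(file_url: str) -> dict:
    """Return a request body for POST /api/analyze, or raise ValueError.

    Hypothetical client-side pre-check; not the API's actual validation.
    """
    parsed = urlparse(file_url)
    host = (parsed.hostname or "").lower()
    # Assumption: S3 hosts end in ".amazonaws.com" and contain an
    # "s3" or "s3-<region>" label, e.g. "s3.amazonaws.com".
    is_aws = host.endswith(".amazonaws.com")
    has_s3_label = any(
        label == "s3" or label.startswith("s3-") for label in host.split(".")
    )
    if parsed.scheme not in ("http", "https") or not (is_aws and has_s3_label):
        raise ValueError("URL must be a valid S3 URL")

    # Use the file extension as a rough proxy for the container format.
    extension = PurePosixPath(parsed.path).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported video format: {extension or '(none)'}")

    return {"file_url": file_url}
```

The returned dictionary can be passed directly as the JSON body of the request.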

Response Format

Success Response (200 OK):
{
  "objects": [],
  "activities": [],
  "text": [],
  "dialogue": ""
}
Response Fields:
Field        Type     Description
objects      array    Objects detected in each second of video
activities   array    Activities/actions detected in each second
text         array    Text/OCR results from each second
dialogue     object   Complete audio transcription with segments
Occurrence Object Structure (each entry in objects, activities, and text):
Field       Type      Description
sec_in      integer   Start time in seconds
sec_out     integer   End time in seconds
frame_in    integer   Starting frame number
frame_out   integer   Ending frame number
label       string    Detected object/activity/text label
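
On the client side, occurrence objects can be loaded into a small typed structure. This is a sketch, not part of the API: the names `Occurrence`, `parse_occurrences`, and `labels_at` are hypothetical, and treating sec_out as inclusive is an assumption the documentation does not confirm.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Occurrence:
    """One entry of the objects, activities, or text arrays."""

    sec_in: int
    sec_out: int
    frame_in: int
    frame_out: int
    label: str

    @property
    def duration_seconds(self) -> int:
        # Difference between end and start time, in seconds.
        return self.sec_out - self.sec_in


def parse_occurrences(items: list) -> list:
    """Convert raw JSON dictionaries into Occurrence objects."""
    names = [f.name for f in fields(Occurrence)]
    # Pick only the documented fields; a missing one raises KeyError.
    return [Occurrence(**{name: item[name] for name in names}) for item in items]


def labels_at(occurrences: list, second: int) -> list:
    """Return labels active at a given second.

    Assumption: sec_out is treated as inclusive.
    """
    return [o.label for o in occurrences if o.sec_in <= second <= o.sec_out]
```

For example, `parse_occurrences(result["objects"])` would turn the objects array of a response into a list that can be filtered with `labels_at`.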

Error Responses

400 Bad Request:
{
  "error": "Missing parameter",
  "message": "file_url parameter is required"
}
400 Invalid URL:
{
  "error": "Invalid S3 URL",
  "message": "URL must be a valid S3 URL"
}
500 Analysis Failed:
{
  "error": "Analysis failed",
  "message": "An error occurred during video analysis: [error details]"
}
Note: Processing large videos may take several minutes. The API will return results once processing is complete.
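
All three error responses share the same "error" and "message" fields, so a client can handle them in one place. The sketch below assumes that shape holds for every non-200 response. `AnalyzerError` and `check_response` are hypothetical names, not part of the API.

```python
class AnalyzerError(Exception):
    """Raised when the API returns a non-200 response."""

    def __init__(self, status_code: int, error: str, message: str):
        super().__init__(f"{status_code} {error}: {message}")
        self.status_code = status_code
        self.error = error
        self.message = message

    @property
    def is_client_error(self) -> bool:
        # 4xx responses indicate a problem with the request itself.
        return 400 <= self.status_code < 500


def check_response(status_code: int, payload: dict) -> dict:
    """Return the payload on 200 OK, otherwise raise AnalyzerError."""
    if status_code == 200:
        return payload
    raise AnalyzerError(
        status_code,
        payload.get("error", "Unknown error"),
        payload.get("message", ""),
    )
```

With the requests library this could be called as `check_response(response.status_code, response.json())`.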

Health Check

GET /health

Check API health and model status.

Response:
{
  "status": "healthy",
  "service": "video-analyzer-api",
  "models": {
    "florence2": "microsoft/Florence-2-large",
    "whisper": "openai/whisper-large-v3"
  }
}
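
A deployment script might verify the health payload before sending analysis requests. This is a minimal sketch: the function name `is_ready` is hypothetical, and the expected values are simply copied from the sample response above.

```python
# Model identifiers as shown in the sample /health response.
EXPECTED_MODELS = {
    "florence2": "microsoft/Florence-2-large",
    "whisper": "openai/whisper-large-v3",
}


def is_ready(health: dict) -> bool:
    """Return True if the payload reports a healthy service with both models."""
    if health.get("status") != "healthy":
        return False
    models = health.get("models", {})
    # Every expected model must be present with the expected identifier.
    return all(models.get(key) == name for key, name in EXPECTED_MODELS.items())
```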

Usage Examples

cURL Example:
curl -X POST http://localhost:8000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"file_url": "https://s3.amazonaws.com/bucket/video.mp4"}'
Python Example:
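import requests

url = "http://localhost:8000/api/analyze"
data = {"file_url": "https://s3.amazonaws.com/bucket/video.mp4"}

# Analysis can take several minutes, so allow up to the
# 30-minute processing timeout (1800 seconds).
response = requests.post(url, json=data, timeout=1800)
response.raise_for_status()  # Raise on 400/500 error responses
result = response.json()

print(f"Found {len(result['objects'])} object occurrences")
print(f"Dialogue: {result['dialogue']}")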
JavaScript Example:
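const response = await fetch('/api/analyze', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    file_url: 'https://s3.amazonaws.com/bucket/video.mp4'
  })
});

// Error responses (400, 500) carry a JSON body with "error" and "message"
if (!response.ok) {
  const { error, message } = await response.json();
  throw new Error(`${error}: ${message}`);
}

const result = await response.json();
console.log('Analysis complete:', result);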

Performance & Limitations

Performance Notes:
  • Processing time scales with video length
  • GPU acceleration available when supported
  • Memory usage optimized for efficiency
  • Frames analyzed at 1-second intervals
Limitations:
  • S3 URLs must be publicly accessible
  • Video files should be under 1 GB for optimal performance
  • Processing timeout: 30 minutes
  • Concurrent requests may be limited
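
Because concurrent requests may be limited, a client that submits several videos may want to cap how many run at once. The sketch below is one way to do that, assuming a caller-supplied `analyze` function (hypothetical) that sends a single request. The default of 2 parallel requests is an arbitrary illustration, not a documented limit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def analyze_all(
    file_urls: Iterable[str],
    analyze: Callable[[str], dict],
    max_concurrent: int = 2,
) -> dict:
    """Run analyze(url) for each URL with at most max_concurrent in flight.

    Returns a mapping of URL to result. If any call raises, the
    exception propagates to the caller.
    """
    urls = list(file_urls)
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # pool.map yields results in the same order as the input URLs.
        results = list(pool.map(analyze, urls))
    return dict(zip(urls, results))
```

Here `analyze` could wrap a `requests.post` call with `timeout=1800`, matching the 30-minute processing timeout.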