AI-Powered Object Detection
eCommerce App

The Challenge
Fashion items are complex to identify due to variations in style, color, texture, size and background noise. Traditional image search systems struggle with:
- Accurate detection of multiple fashion items in a single image
- High-precision similarity matching across diverse catalogs
- Real-time performance on mobile platforms
Solution
Advanced Object Detection
- Implemented YOLO + DINO models to detect and isolate fashion items such as tops, trousers, bags, and accessories
- Achieved ~90% detection accuracy across real-world images with complex backgrounds
- Enabled precise cropping and normalization to improve downstream similarity search
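As a rough illustration of the cropping step, the sketch below expands a detected box by a small margin and clamps it to the image bounds before cropping. The function name `padded_crop_box` and the 10% padding are assumptions made for this example, not details of the actual pipeline.

```python
def padded_crop_box(box, image_size, pad_ratio=0.1):
    """Expand a box by pad_ratio of its width and height, clamped to the image.

    box is (x_min, y_min, x_max, y_max) in pixels; image_size is (width, height).
    pad_ratio is an assumed margin, chosen only for illustration.
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    pad_x = (x_max - x_min) * pad_ratio
    pad_y = (y_max - y_min) * pad_ratio
    return (
        max(0, round(x_min - pad_x)),
        max(0, round(y_min - pad_y)),
        min(width, round(x_max + pad_x)),
        min(height, round(y_max + pad_y)),
    )
```

The returned tuple follows the (left, upper, right, lower) order, so it could be passed directly to an image library's crop call.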


Key Features
AI-Based Fashion Object Detection
The platform uses a production-grade OWL-ViT based detection pipeline enhanced with a custom multi-stage filtering architecture.
Supported Detection Categories
The system detects multiple fashion categories including:
Shirts, T-Shirts, Blouses, Jackets, Blazers, Pants, Jeans, Dresses, Shoes, Bags, Hats, Jewelry, Sunglasses
The platform also supports simultaneous multi-object detection within a single image.
Three-Gate Detection Optimization Pipeline
To improve detection quality and reduce false positives, AppleTech engineered a custom three-stage detection pipeline.
Gate 1 — Label-Specific Thresholding
Each fashion category uses custom confidence thresholds.
Examples:
- Rings require stricter confidence
- Tops allow broader recall
This eliminated low-confidence noisy detections early in the pipeline.
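A minimal sketch of label-specific thresholding follows. The `Detection` record, the threshold values, and the `DEFAULT_THRESHOLD` fallback are all illustrative assumptions; the production thresholds are not stated here.

```python
from dataclasses import dataclass

# Hypothetical per-label thresholds: small items such as rings are held to a
# stricter score than large garments such as tops.
LABEL_THRESHOLDS = {"ring": 0.60, "top": 0.25}
DEFAULT_THRESHOLD = 0.35  # assumed fallback for labels without an entry


@dataclass(frozen=True)
class Detection:
    label: str
    score: float
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def apply_label_thresholds(detections):
    """Keep only detections whose score meets the threshold for their label."""
    return [
        d for d in detections
        if d.score >= LABEL_THRESHOLDS.get(d.label, DEFAULT_THRESHOLD)
    ]
```

With these example values, a ring scored 0.5 would be dropped while a top scored 0.3 would be kept, which mirrors the stricter-for-rings, broader-for-tops behavior described above.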
Gate 2 — Layer-Aware Filtering
Different filtering rules were applied to each fashion layer:
- Garments
- Accessories
- Jewelry
This improved precision across fashion layers while maintaining recall.
Gate 3 — Semantic Validation
Final semantic filtering ensured:
- Stable detections
- Cleaner crops
- Better downstream retrieval quality
This significantly improved similarity-search performance.
Intelligent Bounding Box Optimization
One of the major challenges was inaccurate torso-level detections caused by face and head overlap.
AppleTech introduced:
- IoU-based overlap suppression
- Face-aware bounding-box corrections
- Garment hierarchy rules
- Dress vs top-bottom conflict resolution
These enhancements improved crop quality and significantly boosted retrieval accuracy.
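IoU-based overlap suppression can be sketched as a greedy pass over detections sorted by score. This is a generic, class-agnostic version under assumed inputs (dicts with `score` and `box` keys) and an assumed threshold of 0.5. It does not reproduce the face-aware corrections or garment hierarchy rules, whose details are not given here.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def suppress_overlaps(detections, iou_threshold=0.5):
    """Keep the highest-scoring boxes; drop any box that overlaps a kept one.

    The 0.5 default is an assumption for illustration only.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if all(iou(det["box"], k["box"]) < iou_threshold for k in kept):
            kept.append(det)
    return kept
```

A lower threshold suppresses more aggressively, which helps with duplicate torso boxes but risks dropping genuinely layered garments, so the value would need tuning per category.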
Fashion Similarity Search Engine
The platform includes a highly optimized fashion similarity retrieval engine.
AI Embedding Pipeline
The workflow includes:
1. Detect fashion item
2. Crop detected region
3. Generate FashionCLIP embeddings
4. Search embeddings in Weaviate
5. Re-rank results using MMR (Maximal Marginal Relevance)
This enabled visually relevant and diversified product recommendations.
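The re-ranking step can be illustrated with a small, dependency-free MMR sketch. It assumes candidates arrive as (id, embedding) pairs from the vector search and uses cosine similarity. The function names and the `lambda_` default are assumptions for this example, not the production configuration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def mmr_rerank(query, candidates, k, lambda_=0.5):
    """Select k candidate ids, trading off relevance against redundancy.

    candidates is a list of (id, vector) pairs. lambda_=1.0 ranks purely by
    relevance to the query; lower values penalize similarity to items that
    have already been selected.
    """
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(item):
            relevance = cosine(query, item[1])
            redundancy = max(
                (cosine(item[1], s[1]) for s in selected), default=0.0
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [item_id for item_id, _ in selected]
```

In practice, `lambda_` controls how strongly near-duplicates are pushed down: values closer to 1.0 favor the most similar products, and lower values favor variety.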
Vector Database & Semantic Search
The system leveraged:
- Weaviate for vector similarity search
- FashionCLIP embeddings for semantic understanding
- PostgreSQL for metadata storage
Both image embeddings and text embeddings were indexed for:
- Visual search
- Semantic product search
- AI chatbot retrieval
- Outfit matching
The indexing strategy reduced duplicate embeddings and improved retrieval diversity by indexing unique product-color combinations instead of every SKU individually.
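One way to picture the product-color indexing strategy is a pass that collapses SKU records to a single representative per (product, color) pair before embedding. The field names `product_id` and `color` are hypothetical, as is the choice to keep the first SKU seen.

```python
def unique_product_colors(skus):
    """Collapse SKU records to one representative per (product_id, color).

    skus is an iterable of dicts. Size variants of the same product and
    color share a key, so only the first one seen is kept for indexing.
    """
    representatives = {}
    for sku in skus:
        # Normalize the color string so "Red " and "red" map to the same key.
        key = (sku["product_id"], sku["color"].strip().lower())
        representatives.setdefault(key, sku)
    return list(representatives.values())
```

Because size variants usually share the same product image, embedding each of them separately would fill the index with near-identical vectors.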
Automated Fashion Data Pipeline
AppleTech built a scalable fashion scraping and indexing infrastructure.
Features Included
- Sitemap-based URL discovery
- Automated product scraping
- Variant extraction
- Image normalization
- Lazy-loading image handling
- Metadata extraction
- Offline recovery pipelines
The system supported multiple fashion brands and could continue operating even when third-party websites changed frontend structures.
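Sitemap-based URL discovery can be sketched with the Python standard library alone. The example parses a sitemap document that has already been downloaded and keeps only URLs that look like product pages. The `/products/` path marker and the function name are assumptions; real brand sites vary.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_product_urls(sitemap_xml, path_marker="/products/"):
    """Return the <loc> URLs in a sitemap that contain path_marker.

    sitemap_xml is the sitemap document as a string. path_marker is an
    assumed convention for product pages and would differ per site.
    """
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if path_marker in url]
```

Reading URLs from the sitemap rather than crawling rendered pages is one reason such a pipeline can tolerate frontend redesigns, since the sitemap format tends to stay stable when page markup changes.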
Results & Benefits
Detection Accuracy
- Achieved approximately 90% detection accuracy for supported fashion categories
- Reliable performance across real-world images with complex backgrounds
Similarity Search Improvements
- High-quality image-to-image retrieval
- Improved recommendation diversity using MMR reranking
- Faster semantic fashion search using vector embeddings
Operational Benefits
- Scalable microservice-based architecture
- Reduced duplicate retrievals
- Improved indexing efficiency
- Faster product discovery experience for end users
Technical Challenges Solved
Complex Fashion Classification
Resolved ambiguity between:
- Dresses vs tops
- Jackets vs blazers
- Shorts vs skirts
Small Object Detection
Improved detection of:
- Rings
- Earrings
- Bracelets
- Sunglasses
Multi-Garment Understanding
Implemented logic to better handle:
- Layered clothing
- One-piece garments
- Overlapping apparel
Retrieval Diversity
MMR-based reranking reduced near-duplicate recommendations and improved user experience.






