The AI photo feature in Feiniu NAS (fnOS) is typically built by integrating mainstream open-source models, rather than training all core algorithms from scratch.
1) Face recognition: InsightFace
For face-related functions, InsightFace is usually the core.
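- Common feature-learning method: ArcFace (an additive angular margin loss used to train the face embedding model)
- Main role: face detection and embedding extraction; the embeddings then drive clustering and person recognition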
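To make the clustering step concrete, here is a minimal sketch of how face embeddings could be grouped into people by cosine similarity. It is an illustration only, not how fnOS or InsightFace does it: the function names, the greedy grouping strategy, the 0.6 threshold, and the 3-dimensional toy vectors are all assumptions (real face embeddings have hundreds of dimensions).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def group_faces(embeddings, threshold=0.6):
    """Greedy grouping (hypothetical helper): put each face into the first
    group whose first member is similar enough, else start a new group.
    Returns a list of groups, each a list of indices into `embeddings`."""
    groups = []
    for idx, emb in enumerate(embeddings):
        for group in groups:
            if cosine_similarity(emb, embeddings[group[0]]) >= threshold:
                group.append(idx)
                break
        else:
            # No existing group matched, so this face starts a new person.
            groups.append([idx])
    return groups

# Toy vectors standing in for real face embeddings (assumed data).
faces = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.8, 0.2, 0.1],
    [0.0, 0.8, 0.2],
]
groups = group_faces(faces)
print(groups)  # [[0, 2], [1, 3]]
```

A production system would more likely use a density-based or hierarchical clustering method, but the underlying idea is the same: faces of the same person end up close together in embedding space.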
2) Object and scene understanding: YOLO family
Object detection in photos (for example cats, dogs, cars, computers) and part of scene-level understanding are generally handled by YOLO models (often YOLOv8 or lightweight variants).
- Strength: good speed/accuracy balance
- Fit: edge-like NAS environments with limited compute budgets
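One way detector output might feed a photo library is a label-to-photos index. The sketch below assumes the detector has already produced (label, confidence) pairs per photo; the data, the `build_tag_index` helper, and the 0.5 confidence cutoff are hypothetical and not taken from fnOS or any YOLO release.

```python
from collections import defaultdict

def build_tag_index(detections_by_photo, min_confidence=0.5):
    """Map each label to the sorted list of photos in which it was
    detected with at least `min_confidence` (hypothetical helper)."""
    index = defaultdict(set)
    for photo, detections in detections_by_photo.items():
        for label, confidence in detections:
            if confidence >= min_confidence:
                index[label].add(photo)
    return {label: sorted(photos) for label, photos in index.items()}

# Assumed detector output: (label, confidence) pairs per photo.
detections_by_photo = {
    "IMG_001.jpg": [("dog", 0.91), ("car", 0.32)],
    "IMG_002.jpg": [("cat", 0.88), ("dog", 0.67)],
    "IMG_003.jpg": [("car", 0.79)],
}
tag_index = build_tag_index(detections_by_photo)
print(tag_index["dog"])  # ['IMG_001.jpg', 'IMG_002.jpg']
```

Dropping low-confidence detections (the 0.32 "car" above) trades recall for cleaner tags; the right cutoff depends on the model and the photo collection.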
3) Semantic search: CLIP / Chinese-CLIP
A key capability is natural-language photo search, such as “a dog on the grass” or “a man wearing sunglasses.”
A typical implementation uses CLIP:
- Images and text are projected into the same embedding space, so a text query can be matched to photos by vector similarity
- Chinese deployments usually add Chinese-CLIP or similar localized variants
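The shared embedding space is what makes text search possible: the query is encoded once and compared against precomputed image vectors. The sketch below shows only that ranking step, under the assumption that the embeddings already exist; the vectors, file names, and the `search` helper are made up for illustration, and no real CLIP model is called.

```python
import math

def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def search(query_embedding, image_embeddings, top_k=2):
    """Return the names of the `top_k` images whose embeddings have the
    highest cosine similarity to the query (hypothetical helper)."""
    q = normalize(query_embedding)
    scored = []
    for name, emb in image_embeddings.items():
        e = normalize(emb)
        # Dot product of unit vectors equals cosine similarity.
        score = sum(a * b for a, b in zip(q, e))
        scored.append((score, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

# Toy vectors standing in for image-encoder output (assumed data).
image_embeddings = {
    "dog_on_grass.jpg": [0.9, 0.4, 0.1],
    "man_sunglasses.jpg": [0.1, 0.2, 0.95],
    "cat_on_sofa.jpg": [0.6, 0.7, 0.2],
}
# Stand-in for the text-encoder output of "a dog on the grass".
query_embedding = [0.85, 0.5, 0.05]

results = search(query_embedding, image_embeddings)
print(results)  # ['dog_on_grass.jpg', 'cat_on_sofa.jpg']
```

Because image embeddings can be computed once at import time, only the short text query needs to be encoded at search time, which suits hardware with a limited compute budget.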
Summary
A simple way to view the fnOS AI photo stack:
- InsightFace for faces
- YOLO for objects and scenes
- CLIP for text-image semantic alignment
The main engineering value lies in integration quality, localization, and hardware acceleration rather than in inventing models from scratch.