News

This paper proposes a simple but practically important and effective approach to improve phoneme duration expansion and contraction control in neural text-to-speech (TTS) systems for modifying the ...
Cross-Modal Search: Query with text to find relevant entities regardless of how they were described. Multi-Model Support: Compatible with multiple embedding models (OpenAI text-embedding-3-small/large) ...
Person text-image matching, also known as text-based person search, aims to retrieve images of specific pedestrians using text descriptions. Although person text-image matching has made great research ...