Spatio-temporal grounding describes the task of localizing events in space and time, e.g., in video data, based on verbal descriptions only. Models for this task are usually trained with ...