Utterances are too long; it's hard to stay within the 15-second limit. Noise: what counts as loud noise, and what counts as faint noise? Background speech: should we segment it as normal words?