Let's talk about the present and future of Query by Humming (QbH). I don't want to talk about the past, because there was nothing there. Computer scientists are researching QbH now, and I think it will become a dream technology in the future.
How far have we come so far? Let me give some real examples. In 2004, KTF (a Korean mobile carrier) launched its "Search Music" service, and Verizon (a US mobile carrier) also launched its "Song ID" service.
http://www.muncle.com/media/mobile_news_view.jsp?seq=1396
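The ad begins with an unreleased new song by the singer Seo Taiji playing, along with the tagline "Ready to listen to a song?" If viewers follow the instructions that appear throughout the ad, such as "take out your mobile phone," "press 1515 + the call button," and "hold your phone close to the TV," the song title amazingly appears on the phone and the song can be downloaded. (The music must be played for at least 6 seconds for the search to work.)
...KTF promoted the service with Seo Taiji, who has the most powerful fandom. He released a new song through this service, but even his fans didn't use it.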
http://www.fiercemobilecontent.com/story/verizon-launches-song-identification-service/2007-05-20
Verizon Wireless announced V Cast Song ID, a new service enabling subscribers to identify a piece of music by capturing a 10-second sample via mobile handset and seconds later receiving title and artist information as well as corresponding download, ringtone and ringback offers.
...This is almost the same as KTF's "Search Music". Have you ever heard of this service? Probably not.
Bugs Music also launched a "Humming Search" service in October 2007, but it closed in March 2008.
Why were these services unpopular? Why did they fail? When I used them, their performance was poor. They could sometimes find my songs, but not very well.
What are the problems?
1. Singers may sing incorrectly
2. There may be noise when singing
3. The program may abstract the voice into notes incorrectly
4. The program may search incorrectly
5. The Music DB may be too small or incorrect
Among these possible problems, we want to focus on the search method. We set aside noise and the abstraction from the wave sound to notes, because they have already been researched enough.
I think the abstraction of data into metadata is similar to MPEG-7. Actually, QbH is already included in the MPEG-7 standard, though only as a provision.
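To make the search problem concrete, here is a minimal sketch of one common way to compare a hummed query against a melody database. It is only an illustration under my own assumptions, not how KTF, Verizon, or Bugs Music actually searched. Melodies are assumed to be lists of MIDI note numbers, the function names and the tiny `MELODY_DB` are hypothetical, and the matching technique is plain edit distance over pitch intervals, which tolerates humming in the wrong key and a few wrong notes.

```python
from typing import Dict, List, Tuple


def intervals(notes: List[int]) -> List[int]:
    """Turn MIDI note numbers into semitone steps between neighbors.

    Using steps instead of absolute pitches makes the match independent
    of the key the singer happens to hum in.
    """
    return [b - a for a, b in zip(notes, notes[1:])]


def edit_distance(a: List[int], b: List[int]) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def search(query: List[int], db: Dict[str, List[int]]) -> List[Tuple[str, int]]:
    """Rank every melody in db by distance to the query, best first."""
    q = intervals(query)
    scored = [(title, edit_distance(q, intervals(melody)))
              for title, melody in db.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical two-song database (opening notes only).
MELODY_DB = {
    "Twinkle Twinkle Little Star": [60, 60, 67, 67, 69, 69, 67],
    "Ode to Joy": [64, 64, 65, 67, 67, 65, 64, 62],
}

# The first song hummed a whole tone too high, with one wrong note.
HUMMED = [62, 62, 69, 69, 71, 70, 69]

if __name__ == "__main__":
    for title, distance in search(HUMMED, MELODY_DB):
        print(distance, title)
```

This sketch ignores rhythm and compares the query against whole melodies only. A real system would also have to match a query against any fragment of a song and scale to a much larger database, which is exactly where the search problem gets hard.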
http://en.wikipedia.org/wiki/Query_by_humming
What is MPEG-7? It makes audio and video readable by machines. For example, we can look at a JPEG file and know it shows Beckham, but computers can't. So we add metadata indicating that this part of the JPEG file is Beckham.
MPEG-7 is similar to the Semantic Web in that both abstract metadata from data. They try to abstract semantic meanings like "It's Beckham!" from binary data, but I doubt whether that is possible.
Back to QbH: we can divide vocal abstraction into two parts, the soft part and the hard part. We can abstract wave data into note data easily, but it's still hard to abstract notes into semantic data like "It's Beckham!".
I'll talk about how they are soft and hard in the next article.
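As a rough illustration of why the wave-to-note step is the easy one, the sketch below maps pitch estimates to notes. It assumes a pitch tracker has already produced one fundamental-frequency estimate (in Hz) per audio frame; that input and all function names here are hypothetical. The only fixed fact used is the standard tuning convention that A4 = 440 Hz is MIDI note 69, with 12 semitones per octave.

```python
import math
from typing import List

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_midi(freq_hz: float) -> int:
    """Round a frequency to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def midi_to_name(midi: int) -> str:
    """Name a MIDI note, e.g. 60 -> 'C4' (middle C)."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def frames_to_notes(frame_freqs: List[float]) -> List[int]:
    """Collapse runs of frames that land on the same note into one note.

    Assumption: every frame is voiced. Silence detection is left out.
    """
    notes: List[int] = []
    for freq in frame_freqs:
        note = frequency_to_midi(freq)
        if not notes or notes[-1] != note:
            notes.append(note)
    return notes


if __name__ == "__main__":
    frames = [440.0, 441.5, 438.9, 523.25, 523.0]
    print([midi_to_name(n) for n in frames_to_notes(frames)])
```

One known limitation of this simple collapsing step is that the same note sung twice in a row merges into a single note, so a real system would also need onset detection. Even so, this direction is mechanical arithmetic, while nothing comparable exists for going from notes to meaning.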