Thursday, 8 April 2010

Teezir noticed by the music industry


Teezir, the maker of whorules, has been noticed by one of the leading digital magazines in the music industry for its work for the Dutch copyright organization Buma/Stemra. The magazine is written by Susan Butler of the company Music Confidential. We are more than a little proud of that. "The technology Teezir developed for detecting protected music is unique in the world," says Victor van Tol, Managing Director of Teezir. The article follows below; we ask you to become a member of Music Confidential if you want to keep following all the ins and outs of the music industry.



Trying to picture ‘bots’ crawling around the Web searching for content like spiders weaving sticky traps is… well, kind of creepy. This type of stealth-like activity can trigger an outpouring of protests from privacy advocates and Internet-freedom bloggers. Yet search engines, which are so vital for anyone who wants to find anything on the Web, rely on Web crawlers to find Web sites. Now, companies or agencies that own or administer a significant amount of commercial music or control well-known brands may find that Web crawlers, combined with other technology and services, could increase license revenues.
One company that is pulling together all these parts—find and identify to help license—is the Netherlands-based Teezir.

THE TECH SPEAK
Web Crawling: A Web crawler is a computer program, also called a ‘bot’ or Web robot, that scours Web sites looking for… something.
Different Web crawlers serve different purposes. Sometimes they search for specific types of content, and other times they check sites to make sure they are working properly (e.g., making sure that the links on a site work).
For a search, usually the crawler begins with a list of Internet Web page addresses (e.g., the Uniform Resource Locators, or URLs, such as http://www.thatcompany.com) to visit, adding links from each site to its list of places to also search.
What the crawler does at each site—and whether or not it revisits each site—depends on a set of policies that are essentially built into the particular program.
For example, a ‘selection policy’ would typically define what the crawler should select to look at and how to prioritize each selection. It may, perhaps, browse only the most highly visited pages among a group of linked pages or ignore certain types of computer code found on sites.
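
The seed list, link-following and selection policy described above can be sketched as a small crawl loop. This is a minimal illustration only, not Teezir's crawler: the in-memory `LINKS` table stands in for real Web pages, and the names `crawl`, `allowed` and `max_pages` are hypothetical, chosen for this example.

```python
from collections import deque

# Hypothetical link graph standing in for the Web: page -> links found on it.
LINKS = {
    "http://www.thatcompany.com": ["http://www.thatcompany.com/shop",
                                   "http://www.othercompany.com"],
    "http://www.thatcompany.com/shop": ["http://www.thatcompany.com"],
    "http://www.othercompany.com": ["http://www.othercompany.com/style.css"],
}


def allowed(url):
    """Toy selection policy: ignore stylesheets, visit everything else."""
    return not url.endswith(".css")


def crawl(seeds, max_pages=10):
    """Visit pages breadth-first, starting from the seed list."""
    frontier = deque(seeds)   # URLs waiting to be visited
    seen = set(seeds)         # URLs already queued, so nothing is revisited
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in LINKS.get(url, []):
            if link not in seen and allowed(link):
                seen.add(link)
                frontier.append(link)
    return visited
```

Calling `crawl(["http://www.thatcompany.com"])` visits the seed first and then the pages it links to. A revisit policy or a priority order would replace the simple queue used here.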

Legitimate Web crawlers (e.g., programs that do not collect email addresses from sites to use for sending spam) can normally be identified by a Web site administrator because the crawlers often identify themselves to a Web server by including a URL where more information about the crawler can be found.
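
A crawler commonly identifies itself through the User-Agent header of its HTTP requests. The sketch below only builds such a request with Python's standard library and does not send it. The bot name and the information URL are hypothetical.

```python
from urllib.request import Request

# Hypothetical crawler name and information URL, for illustration only.
USER_AGENT = "ExampleMusicBot/1.0 (+http://www.thatcompany.com/bot.html)"


def build_request(url):
    """Build (but do not send) a request that identifies the crawler."""
    return Request(url, headers={"User-Agent": USER_AGENT})
```

A site administrator who sees this string in the server log can follow the URL in it to learn what the crawler does.
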
One of the benefits of a crawler for music is that the program automates the process of finding audio files. It could potentially eliminate—or at least dramatically reduce—the need for manual searches to find the sites that are using unlicensed music. Another benefit is that crawlers could locate music files that would not appear in a search result on a traditional Internet search engine.
Acoustic Fingerprinting: Acoustic fingerprinting is another type of technology, which many companies have developed to electronically identify audio recordings.
The fingerprint is basically a summary of what is digitally detected from a sample of the sounds. The fingerprint is then compared to a particular company’s database of audio information, such as recorded music titles and right holder identification.
The accuracy of a particular fingerprinting technology will depend on the ability of the technology to detect and distinguish the sounds, the accuracy and amount of data contained in the company’s database and the effectiveness of the matching of sounds with the database.
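
The idea of summarizing a sample and comparing the summary to a database can be illustrated with a toy example. This is a deliberately simplified stand-in, not a real acoustic fingerprinting algorithm and not the method of any company named in this article: the "fingerprint" here is just one bit per frame recording whether the signal's energy rose, and all function names are hypothetical.

```python
def fingerprint(samples, frame=4):
    """Summarize a signal as bits: 1 where frame energy rises, else 0."""
    energies = [sum(s * s for s in samples[i:i + frame])
                for i in range(0, len(samples) - frame + 1, frame)]
    return [1 if later > earlier else 0
            for earlier, later in zip(energies, energies[1:])]


def hamming(a, b):
    """Count differing positions between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))


def best_match(query, database):
    """Return the (title, distance) whose stored fingerprint is closest."""
    return min(((title, hamming(query, stored))
                for title, stored in database.items()),
               key=lambda pair: pair[1])
```

Because only the rise or fall of energy is stored, a slightly noisy copy of a recording can still produce the same bits. This is the property that lets a fingerprint match a file that is not byte-for-byte identical to the reference.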

TEEZIR
Teezir was formed in 2006. When the European government funded a project on text retrieval innovation for the academic community, the company was able to hire its first two employees for six months. This kick-started the company’s technology development.

Managing director Victor van Tol joined the company in 2008 after spending several years working in the travel industry (KLM Airlines and BCD Travel) and consulting (Nolan, Norton & Co.).

Teezir offers ‘tech solutions’ that all revolve around ‘unstructured information,’ says van Tol, whether it relates to content from a large publisher (like a newspaper), consumer products or music.

The Technology: Teezir’s core technology is the robots sent on the Web (i.e., the Web crawler) to find specific information and the various searching and matching algorithms (i.e., methods for solving problems).

In the summer of 2008, Teezir began working on a ‘copyright solution’ with the Netherlands collecting society Buma/Stemra, which van Tol calls the company’s “launching customer” in the copyright detection world.

“We basically used our existing technology and developed a couple new components, as needed, to find music in flash files, but also for more difficult things like address extraction,” he says. “We have been constantly improving these components over the last two years.”

How It Works: The focus is on finding commercial businesses within a specific country that use music online without a license. These companies can then be contacted, the music properly licensed and the use invoiced by Buma/Stemra.

Teezir begins by using online company directories such as the Yellow Pages or ilocal to gather URLs within a country, says van Tol.

“We basically send our crawler onto the Web of a country, [which] goes to the company sites, scrapes those sites and sees whether there is music in some shape or form—downloaded, background, embedded like YouTube [videos], streaming like radio,” he says. “We can detect about 98% of the music.”
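
To make the categories in the quote concrete, here is a rough sketch of how a page's HTML might be scanned for signs of music. It is an assumption-laden illustration, not Teezir's detection logic: the tag names, the file extensions and the `MusicFinder` class are all chosen for this example, and a real detector would also have to look inside Flash files and scripts.

```python
from html.parser import HTMLParser

# Assumed list of audio file extensions; a real list would be longer.
AUDIO_EXTENSIONS = (".mp3", ".wav", ".ogg")


class MusicFinder(HTMLParser):
    """Collect (kind, url) pairs for elements that may carry music."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = attrs.get("src") or attrs.get("href") or ""
        if tag in ("audio", "bgsound"):
            self.found.append(("background", url))
        elif tag in ("embed", "iframe") and "youtube.com" in url:
            self.found.append(("embedded", url))
        elif tag == "a" and url.lower().endswith(AUDIO_EXTENSIONS):
            self.found.append(("download", url))


def find_music(html):
    """Return every music candidate found in one page of HTML."""
    finder = MusicFinder()
    finder.feed(html)
    return finder.found
```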

While the specific rules used to determine whether or not a site is run as a business may vary country by country, says van Tol, in general such indicators include:
• A site operator listing a Value Added Tax (VAT) number on the site;
• Goods or services offered for sale on the site;
• Use of a shopping cart;
• A statement that the operator is a member of a Chamber of Commerce or similar business group; or
• An address, which rarely appears on a consumer site.
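
The indicators above lend themselves to a simple rule-based check. The sketch below is one possible reading, not Teezir's actual rules: the patterns (including the Dutch terms BTW and KvK and the Dutch postcode format used as an address hint), the function names and the threshold of two indicators are all assumptions made for illustration.

```python
import re

# Hypothetical indicator patterns; real rules would vary by country.
INDICATORS = {
    "vat_number": re.compile(r"\b(?:VAT|BTW)\b"),
    "shopping_cart": re.compile(r"shopping cart|add to cart", re.IGNORECASE),
    "chamber_of_commerce": re.compile(r"chamber of commerce|\bKvK\b",
                                      re.IGNORECASE),
    "dutch_postcode": re.compile(r"\b\d{4}\s?[A-Z]{2}\b"),
}


def business_indicators(text):
    """Return the sorted names of all indicators found in the page text."""
    return sorted(name for name, pattern in INDICATORS.items()
                  if pattern.search(text))


def looks_like_business(text, threshold=2):
    """Treat a site as a business when enough indicators are present."""
    return len(business_indicators(text)) >= threshold
```

Requiring more than one indicator is a design choice: a single hit, such as an address on a personal page, is weak evidence on its own.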

The crawler also determines the country in which the business is likely operating by detecting the country's domain extension (e.g., nl for the Netherlands) and by checking the site's Internet Protocol (IP) address against the address ranges associated with each country.

“If you know the IP address, you can assess whether it’s French, British, etcetera,” says van Tol.
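
Both signals can be sketched with Python's standard library. The lookup tables here are assumptions: the extension table is a tiny sample, and the IP ranges are reserved documentation blocks used as placeholders, not real country allocations. A production system would need a maintained geolocation database.

```python
import ipaddress
from urllib.parse import urlparse

# Small sample of country-code domain extensions.
COUNTRY_BY_EXTENSION = {"nl": "Netherlands", "fr": "France",
                        "uk": "United Kingdom"}

# Hypothetical ranges (reserved documentation blocks, not real allocations).
COUNTRY_BY_RANGE = {
    ipaddress.ip_network("192.0.2.0/24"): "Netherlands",
    ipaddress.ip_network("198.51.100.0/24"): "France",
}


def country_from_url(url):
    """Guess the country from the last label of the host name, if known."""
    host = urlparse(url).hostname or ""
    extension = host.rsplit(".", 1)[-1].lower()
    return COUNTRY_BY_EXTENSION.get(extension)


def country_from_ip(address):
    """Guess the country from the range that contains the IP address."""
    ip = ipaddress.ip_address(address)
    for network, country in COUNTRY_BY_RANGE.items():
        if ip in network:
            return country
    return None
```

A generic extension such as .com gives no answer from the first function, which is where the IP lookup becomes useful as a second signal.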

The technology looks to confirm (with the directory listing) or obtain new ‘aggregate’ information such as the company name, address and other contact data.

The technology may also obtain additional information for a particular customer. For example, it gathers for Buma/Stemra data about the kind of audio file, its length and similar information needed to determine the type of use and necessary license that should be issued. This data is then stored in Teezir’s database.

The crawler also detects links on a site and follows them to find other business sites. The process then begins again—determining whether the site is a consumer or business site, whether music is available, and so on.
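
Finding the links that lead away from a site, so that the process can begin again elsewhere, might look like the following sketch. The class and function names are hypothetical, and the rule used to decide that a link is "another site" (a different host name) is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the absolute form of every link on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, href))


def external_links(base_url, html):
    """Return the Web links on a page that point to a different host."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    own_host = urlparse(base_url).hostname
    return [link for link in collector.links
            if urlparse(link).scheme in ("http", "https")
            and urlparse(link).hostname != own_host]
```

Each address returned would then be queued for the same business check and music scan as the site it was found on.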

Although Teezir detects the kind of audio used, when a customer like Buma/Stemra wants to know exactly what song was found, the file needs to be fingerprinted.

Teezir currently works with SoundAware for fingerprinting.

“There are lots of fingerprinting companies, so we decided not to create our own fingerprinting solution,” says van Tol. “We can partner with them since they have the databases already.”

Teezir sends the files to SoundAware, which identifies the song and sends the information back to Teezir.

The Teezir customer can then automatically import the file with the information into its back office system, perform sample checks, forward the data to its invoicing department and send out an invoice to the business using the music. The customer can also access the Teezir Web site and select any or all of the downloads from the prior month to review or work with, says van Tol.

The Controversy: Last fall, Buma/Stemra and Teezir came under attack in the press and online for the Web crawling that resulted in surprise invoices to some Web site operators.

“We detect different types of music including embedded music,” explains van Tol. “Ninety-five percent of embedded music is from YouTube [videos]. Buma/Stemra decided in November 2009 to invoice for embedded files. From a philosophical point of view, they wanted to invoice consumers since they also have to pay. But in practice, [Buma/Stemra was] not going to follow up; you don’t have the aggregate information on consumers so you can’t invoice them. That wasn’t communicated. The blogging scene thought they would get invoices for using YouTube content on their personal Web sites, and they began shouting at Buma/Stemra and sending emails to us, creating [something very] negative. After a press conference on the topic—Buma/Stemra said we’re only going to do commercial sites—the whole thing went away again.”

Rather than initially sending an invoice demanding payment for newly-discovered music on sites, the new approach is to send a letter. It essentially states that music was found on the site, and the operator has a certain amount of time to remove the music or to license it and pay a fee, says van Tol.

“About 80% of them pay,” he adds.

Although van Tol won’t discuss the price of Teezir’s services, he says the investment a customer like a copyright holder or collecting society would “need to make is basically earned back within a couple months, and the rest is profit.”

Brands & Live Events: Teezir is also working on developing similar solutions for publishers of text and pictures, which would include logos. For artists and companies trying to protect use of their logos, this could be a welcome addition.

One initiative, which has already begun in the Netherlands and involves text, supplements performance licenses for live events.

The crawler searches event listings from online news sources and event calendar Web sites. Currently, it follows the main calendar sites in the country and more than 300 online local Dutch newspapers.

It then aggregates the event information and delivers the data in an Excel file, which can be directly imported into a customer’s IT system.
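
The aggregation step can be pictured as removing duplicate listings of the same event before export. The sketch below makes several assumptions: the field names are invented, an event is treated as a duplicate when date, venue and artist match, and CSV stands in for the Excel file because Python's standard library cannot write Excel workbooks directly (Excel can open CSV files).

```python
import csv
import io

# Hypothetical field names for one event listing.
FIELDS = ["date", "venue", "artist"]


def aggregate_events(listings):
    """Keep one listing per (date, venue, artist), ignoring letter case."""
    unique = {}
    for event in listings:
        key = (event["date"], event["venue"].lower(), event["artist"].lower())
        unique.setdefault(key, event)   # first listing seen wins
    return list(unique.values())


def to_csv(events):
    """Render the events as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(events)
    return buffer.getvalue()
```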

Working with Buma/Stemra on this initiative, Teezir’s searches have turned up about 700 unique music events per week, including information about newly discovered venues and artists.

THE MARKET
There are other companies providing Web crawlers or audio fingerprinting technology, but there do not appear to be other companies offering essentially a one-stop shop to detect and identify music from such a broad scope of online uses, and then to sync delivery of that data for companies or agents to increase licensing efforts.

It may take some time to determine whether such searches would result in a significant increase in licensing revenue and a corresponding reduction in costs, but Teezir presents a potential solution that may be worth exploring.

[Editor’s Note: Music Confidential did not demo this technology; the company came to my attention from sources who had positive things to say about the company.]
