how digitalWEED works

Technical overview of the system:

The basic concept of this system is the modification of the amplitude of a single frequency component in audio, taking advantage of the limitations of the human hearing system. How is this achieved? The pertinent point from research into human hearing is that our ears, and then our brain, process sounds with certain limitations: we are continuously filtering which sounds are processed and which are not. One sound in the presence of another may therefore be undetectable if the second sound is much louder. Put simply, consider a live concert where the vocalist may not be completely audible because his or her ‘sound’ is drowned out by louder sounds such as electric guitar and drums. This happens because the auditory system responds more strongly to louder (higher-amplitude) sounds than to quieter ones. The vocals are there; we just cannot hear them. This phenomenon of ‘amplitude masking’ can be exploited to manipulate the amplitudes of selected frequencies in a given signal so that the human auditory system cannot detect the changes in the presence of other, louder frequencies.

In our watermarking system, we design a process that takes in a signal (a music track, for example) and modifies the amplitude of a single frequency so that its relationship to a second, reference frequency is precisely defined. Put simply, one frequency is modified very, very slightly so that the pair can represent a desired bit sequence as follows:

o If the bit to be encoded is a ‘0’, frequency A is set higher in amplitude than frequency B, if it is not already

o If the bit to be encoded is a ‘1’, frequency B is set higher in amplitude than frequency A, if it is not already

The frequencies are also controlled relative to the section of audio they sit in. In this way, a continuously varying pair of frequencies can be manipulated in relation to each other so long as they are kept at a low enough amplitude compared with the rest of the audio frequencies in that particular frame, meaning they will be ‘masked’ by louder sounds and so remain undetectable.
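The embedding rules above can be sketched per audio frame as a spectral adjustment. Everything specific in the sketch below — the FFT frame size, the two bin indices, and the 5% margin — is an illustrative assumption, not a parameter of the actual system:

```python
import numpy as np

def embed_bit(frame, bit, bin_a, bin_b, margin=1.05):
    """Force the amplitude ordering of two FFT bins to encode one bit:
    bit 0 -> bin A louder than bin B; bit 1 -> bin B louder than bin A.
    Only the bin that needs raising is touched, and its phase is kept."""
    spectrum = np.fft.rfft(frame)
    mag_a = abs(spectrum[bin_a])
    mag_b = abs(spectrum[bin_b])
    if bit == 0 and mag_a <= mag_b:
        # scale bin A just past bin B (guard against a near-zero magnitude)
        spectrum[bin_a] *= (mag_b * margin) / max(mag_a, 1e-12)
    elif bit == 1 and mag_b <= mag_a:
        spectrum[bin_b] *= (mag_a * margin) / max(mag_b, 1e-12)
    return np.fft.irfft(spectrum, n=len(frame))
```

A real embedder would additionally cap the adjusted bin against that frame's masking threshold, so the change stays below audibility as described above.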

We repeat this process in a sequential pattern matching the 1/0 bits we want to represent; in this manner we can embed any form of data within the audio. The length of the audio and the length of the watermark (currently approx. 3-5 seconds) determine how many times the watermark can be looped into the track. Looping increases the likelihood that the audio will be identified no matter where in the track monitoring begins, and it strengthens recovery of the watermark data overall. With one of the techniques we use, recovery is 99.98%; with the other it is 97%+, but this second technique allows the watermark to be recovered even after heavy MP3 compression.
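As a rough illustration of the looping headroom, the arithmetic is simply track length divided by watermark length. The figures below are assumed example values, not measurements from the system:

```python
# Hypothetical figures: a 3.5-minute track and a 4-second watermark
# (within the stated 3-5 second range).
track_seconds = 210
watermark_seconds = 4

# Number of complete copies of the watermark that fit in the track.
full_repeats = track_seconds // watermark_seconds
print(full_repeats)  # 52
```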

In the current work, we modified the frequencies to represent a bit sequence identifying the artist and title of the track, but any type of information could be used. In the music industry, all publicly released audio carries a unique identifier called the International Standard Recording Code (ISRC), based on an ISO standard (ISO 3901:2001). It would make sense to embed that identifier into the audio, as it allows easy and consistent identification of the ownership of the audio under consideration.
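One straightforward way to turn a 12-character ISRC into an embeddable bit sequence is plain 8-bit ASCII. This is only one possible payload format, not the system's own, and the ISRC below is a made-up, format-shaped example:

```python
def isrc_to_bits(isrc):
    """Map a 12-character ISRC to a list of 0/1 bits via 8-bit ASCII.
    One simple choice of payload encoding; the real system's format
    is not specified here."""
    return [int(b) for ch in isrc for b in format(ord(ch), '08b')]

bits = isrc_to_bits("AASKG1912345")  # hypothetical ISRC-shaped code
# 12 characters x 8 bits = 96 watermark bits per repetition
```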

Decoding the watermark:

When decoding the audio, it is only necessary to know the two frequencies being used: the one that is modified very slightly and the reference (perhaps used as a ‘public key’) it is set against. We then perform continuous analysis of only these two frequencies to see how they relate to each other. If frequency A is ‘louder’ than frequency B, a ‘0’ bit is assumed; conversely, if frequency B is ‘louder’ than frequency A, a ‘1’ bit is assumed. Once the analysis has been performed over a specified length of the audio signal, we have a complete pattern of bits that can be reconstructed to reveal the original message.
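The comparison step can be sketched as follows. The Goertzel algorithm is a cheap way to measure power at just the two bins of interest without a full FFT; as before, the frame length and bin indices are assumed values standing in for the two known frequencies:

```python
import math

def goertzel_power(frame, k):
    """Power at DFT bin k of `frame`, computed with the Goertzel algorithm."""
    w = 2.0 * math.pi * k / len(frame)
    coeff = 2.0 * math.cos(w)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev**2 + s_prev2**2 - coeff * s_prev * s_prev2

def decode_bits(signal, frame_len, bin_a, bin_b):
    """Recover one bit per frame: 0 if bin A is louder, else 1."""
    bits = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        a = goertzel_power(frame, bin_a)
        b = goertzel_power(frame, bin_b)
        bits.append(0 if a > b else 1)
    return bits
```

Because only two frequencies are probed per frame, monitoring can run continuously at very low cost compared with analysing the whole spectrum.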

In the scenario outlined above, where the information in the audio is some form of unique identification code, the user would query the holder of the database of these codes to find the identity of the copyright owner (such registrars exist in each country). If some other data was embedded in the audio instead, all that would be needed to decode it would be the two frequencies used to embed it, one modified and one as reference. Both decoding systems therefore work as a form of semi-blind decoding, meaning there is no need to add anything to a database to be compared against. Once a watermark is in a piece of audio, and assuming the audio was not publicly released without the watermark, it is practically impossible to create an unwatermarked copy by removing the watermark. This is a very important consideration.

Full technical details are available in published papers and in Patent application documents. A collection of published papers can be found at www.digitalWEED.info/papers