With professional cameras sporting over 39 megapixels, photography
has said goodbye to the analogue age. Today, most cameras sold are
digital, and in a few years everybody will own several. One big
advantage of storing pictures on computers is that taking a shot
costs nothing beyond recharging the battery. The future is an ocean
of snapshots.
These developments, however, raise the question of how people are
going to find a particular image among the zillions stored around
them. This actually involves two issues.
The more difficult one concerns how humans store images in memory
and how that internal representation can be formulated (described)
in terms of the "physical" image. Let us call this the "query
protocol" problem.
Google and other search engines use text tags found close to, or
directly related to, the image to perform image search via text
search. Hence, the query here is formulated as text. By contrast,
content-based image retrieval (CBIR) finds images based on their
"similarity". The query is now an image, and the system must find
and rank all images similar to the query from a very large database.
This is the "content addressable protocol" problem.
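To make the "find and rank" step concrete, here is a minimal sketch of similarity ranking. It assumes each image has already been reduced to a short feature vector (for instance a coarse colour histogram) and uses cosine similarity as the score. The feature values, file names, and the function `rank_by_similarity` are hypothetical and chosen only for illustration; real CBIR systems may use very different features and similarity measures.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, database, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(cosine_similarity(query, feats), name)
              for name, feats in database.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


# Hypothetical 3-bin colour histograms (red, green, blue share).
database = {
    "sunset.jpg": [0.8, 0.1, 0.1],
    "forest.jpg": [0.1, 0.8, 0.1],
    "ocean.jpg": [0.1, 0.2, 0.7],
}
query = [0.7, 0.2, 0.1]  # features extracted from the query image

ranking = rank_by_similarity(query, database)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

This sketch scans every stored vector, which is fine for a handful of images but would not scale to a very large database without some form of indexing.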
In the near future, as hardware quality continues its exponential
growth, the CBIR protocol will gain ground. However, there will be
no real solution to the image-search task unless a very accurate and
robust image and image-object-scene representation in terms of
simple features is found. The AMASS platform will excel at searching
such feature arrays, which code for the content of images AND of the
correlated text, in a content-dependent, robust way. For example,
the AMASS platform could easily perform the retrieval part of
biometric iris recognition, because that problem has a simple and
effective representation in terms of binary features.
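As an illustration of why binary features make retrieval simple, the sketch below compares bit strings by Hamming distance, a common way to match binary codes. It is not a description of how the AMASS platform works internally. The 16-bit codes, the names in `enrolled`, the function `best_match`, and the 0.3 acceptance threshold are all assumptions made for this example; real iris codes are far longer.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two integer-encoded codes differ."""
    return bin(a ^ b).count("1")


def best_match(probe, enrolled, n_bits, threshold=0.3):
    """Return (identity, fraction of differing bits) for the closest code.

    The identity is None when even the closest code differs in more than
    `threshold` of its bits (threshold is an illustrative assumption).
    """
    best_id, best_frac = None, 1.0
    for identity, code in enrolled.items():
        frac = hamming_distance(probe, code) / n_bits
        if frac < best_frac:
            best_id, best_frac = identity, frac
    if best_frac <= threshold:
        return best_id, best_frac
    return None, best_frac


# Hypothetical 16-bit binary codes standing in for enrolled iris templates.
enrolled = {
    "alice": 0b1011001110001111,
    "bob": 0b0100110001110000,
}
probe = 0b1011001010001101  # alice's code with two bits flipped by noise

match = best_match(probe, enrolled, n_bits=16)
print(match)
```

Because the comparison is a single XOR followed by a bit count, it stays cheap even when the probe has to be checked against many stored codes.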
Searching digital videos might actually be simpler, because
additional cues, such as text, the audio track, and camera-motion
events, are also available.