The viewpoint complexity of an object-recognition task

Vision Research, Volume 38 (15) – Aug 1, 1998

Publisher: Elsevier
Copyright: © 1998 Elsevier Science Ltd
ISSN: 0042-6989
eISSN: 1878-5646
DOI: 10.1016/S0042-6989(97)00255-1

Abstract

There is an ongoing debate about the nature of perceptual representation in human object recognition. Resolution of this debate has been hampered by the lack of a metric for assessing the representational requirements of a recognition task. To recognize a member of a given set of 3-D objects, how much detail must the objects’ representations contain in order to achieve a specific accuracy criterion? From the performance of an ideal observer, we derived a quantity called the view complexity (VX) to measure the required granularity of representation. VX is an intrinsic property of the object-recognition task, taking into account both the object ensemble and the type of decision required of an observer. It does not depend on the visual representation or processing used by the observer. VX can be interpreted as the number of randomly selected 2-D images needed to represent the decision boundaries in the image space of a 3-D object-recognition task. A low VX means the task is inherently more viewpoint invariant and a high VX means it is inherently more viewpoint dependent. By measuring the VX of recognition tasks with different object sets, we show that the current confusion about the nature of human perceptual representation is partly due to a failure in distinguishing between human visual processing and the properties of a task and its stimuli. We find general correspondence between the VX of a recognition task and the published human data on viewpoint dependence. Exceptions in this relationship motivated us to propose the view–rate hypothesis: human visual performance is limited by the equivalent number of 2-D image views that can be processed per unit time.
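
As a rough illustration of the "number of randomly selected 2-D images" reading of VX, the sketch below counts how many random views per object a simple observer needs to reach an accuracy criterion. It is not the ideal-observer derivation used in the paper. The nearest-stored-view observer, the point-set objects, the orthographic projection, the noise level, the doubling search, and every function name are assumptions made only for illustration.

```python
import math
import random


def project(points, theta):
    """Orthographic image of 3-D points after rotating by theta about the y-axis."""
    c, s = math.cos(theta), math.sin(theta)
    image = []
    for x, y, z in points:
        image.extend((c * x + s * z, y))  # keep image-plane x and y, drop depth
    return image


def sq_dist(a, b):
    """Squared Euclidean distance between two flattened images."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def accuracy(objects, n_views, noise_sd, n_trials, rng):
    """Proportion correct for an observer that stores n_views random views per object
    and labels a noisy test image with the object of its nearest stored view."""
    stored = [(k, project(obj, rng.uniform(0, 2 * math.pi)))
              for k, obj in enumerate(objects) for _ in range(n_views)]
    correct = 0
    for _ in range(n_trials):
        k = rng.randrange(len(objects))
        clean = project(objects[k], rng.uniform(0, 2 * math.pi))
        noisy = [v + rng.gauss(0, noise_sd) for v in clean]
        guess = min(stored, key=lambda kv: sq_dist(kv[1], noisy))[0]
        correct += guess == k
    return correct / n_trials


def views_needed(objects, criterion, noise_sd=0.05, n_trials=300, max_views=64, seed=0):
    """Smallest tested number of stored views per object (1, 2, 4, ...) that reaches
    the accuracy criterion, or None if max_views is not enough."""
    rng = random.Random(seed)
    n = 1
    while n <= max_views:
        if accuracy(objects, n, noise_sd, n_trials, rng) >= criterion:
            return n
        n *= 2
    return None


if __name__ == "__main__":
    # Points on the rotation axis look the same from every viewpoint.
    invariant_set = [[(0, 1, 0), (0, 2, 0)], [(0, -1, 0), (0, -2, 0)]]
    # Random point sets change their image as the viewpoint changes.
    rng = random.Random(1)
    dependent_set = [[tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(4)]
                     for _ in range(3)]
    print("invariant set:", views_needed(invariant_set, criterion=0.95))
    print("dependent set:", views_needed(dependent_set, criterion=0.95))
```

In this toy setting, an object set whose images do not change with viewpoint is recognized from a single stored view, which corresponds loosely to the low-VX, viewpoint-invariant end of the scale described in the abstract. The view count here depends on the assumed observer, whereas VX is defined from an ideal observer and does not depend on the observer's representation or processing.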

Journal

Vision Research (Elsevier)

Published: Aug 1, 1998
