Search

Filter

  • Advanced Filters:

  • to
  • Specific Data Sources:

    All Edit

    Select All  |  Select None

Reset filters

Data sets in large applications are often too massive to fit completely inside the computers internal memory. The resulting input/output communication (or I/O) between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. In this article we survey the state of the art in the design and analysis of external memory (or EM) algorithms and data structures, where the goal is to exploit locality in order to reduce the I/O costs. We consider a variety of EM paradigms for solving batched and online problems efficiently in external memory. For the batched problem of sorting and related problems such as permuting and fast Fourier transform, the key paradigms include distribution and merging. The paradigm of disk striping offers an elegant way to use multiple disks in parallel. For sorting, however, disk striping can be nonoptimal with respect to I/O, so to gain further improvements we discuss distribution and merging techniques for using the disks independently. We also consider useful techniques for batched EM problems involving matrices (such as matrix multiplication and transposition), geometric data (such as finding intersections and constructing convex hulls), and graphs (such as list ranking, connected components, topological sorting, and shortest paths). In the online domain, canonical EM applications include dictionary lookup and range searching. The two important classes of indexed data structures are based upon extendible hashing and B-trees. The paradigms of filtering and bootstrapping provide a convenient means in online data structures to make effective use of the data accessed from disk. We also reexamine some of the above EM problems in slightly different settings, such as when the data items are moving, when the data items are variable-length (e.g., text strings), or when the allocated amount of internal memory can change dynamically. Programming tools and environments are available for simplifying the EM programming task. During the course of the survey, we report on some experiments in the domain of spatial databases using the TPIE system (transparent parallel I/O programming environment). The newly developed EM algorithms and data structures that incorporate the paradigms we discuss are significantly faster than methods currently used in practice.

End of preview. The entire article is 63 pages. To view the full-text, please rent this article to continue.

/lp/association-for-computing-machinery/external-memory-algorithms-and-data-structures-dealing-with-massive-RTOAcf5o2T
Welcome to DeepDyve! Rent Premier Research Articles and Save Up to 90%

Learn more

Bookmark

External memory algorithms and data structures: dealing with massive data

Vitter, Jeffrey Scott
ACM Computing Surveys (CSUR) , Volume 33 (2)
Association for Computing MachineryJun 1, 2001

More Info

More Like This Article

View All dataSource[]=actageo&dataSource[]=aspet&dataSource[]=aaos&dataSource[]=aacc&dataSource[]=aacr&dataSource[]=aea&dataSource[]=aip&dataSource[]=ajnr&dataSource[]=ams&dataSource[]=aps_physical&dataSource[]=appi_book&dataSource[]=appi_journal&dataSource[]=apha&dataSource[]=asip&dataSource[]=asm&dataSource[]=asn&dataSource[]=aspb&dataSource[]=avs&dataSource[]=annual_reviews&dataSource[]=arxiv&dataSource[]=acm&dataSource[]=berghahn&dataSource[]=cabi&dataSource[]=clinical_trials&dataSource[]=dailymed&dataSource[]=degruyter&dataSource[]=du_press&dataSource[]=esa&dataSource[]=eu_press&dataSource[]=elsevier&dataSource[]=emerald&dataSource[]=ejtr&dataSource[]=emea&dataSource[]=epo&dataSource[]=faseb&dataSource[]=gsa&dataSource[]=health_affairs&dataSource[]=hindawi&dataSource[]=imanager&dataSource[]=imedpub&dataSource[]=informa_healthcare&dataSource[]=informs&dataSource[]=iop&dataSource[]=iucr&dataSource[]=iospress&dataSource[]=jbjs&dataSource[]=leftcoast&dataSource[]=lu_press&dataSource[]=mesharpe&dataSource[]=mary_ann_liebert&dataSource[]=medline&dataSource[]=mit_press&dataSource[]=nature&dataSource[]=oxford&dataSource[]=pier_professional&dataSource[]=pnas&dataSource[]=portlandpress&dataSource[]=psyc_articles&dataSource[]=psyc_books&dataSource[]=psyc_critiques&dataSource[]=plos_journal&dataSource[]=pubmed_central&dataSource[]=rsna&dataSource[]=rockefeller&dataSource[]=rcn&dataSource[]=ria&dataSource[]=rsc&dataSource[]=sage&dataSource[]=spie&dataSource[]=springer_journal&dataSource[]=springer&dataSource[]=taylor_francis&dataSource[]=aps&dataSource[]=the_scientist&dataSource[]=uc_press&dataSource[]=uspto_abstract&dataSource[]=wiley&dataSource[]=pct

Browse: Subject Areas | Journals | Publishers

Sign Up for a DeepDyve Account

Bookmark an Article

To bookmark an article, please log in first, or sign up for a DeepDyve account if you don't already have one.

OK

Subscribe to Journal Email Alerts

To subscribe to email alerts, please log in first, or sign up for a DeepDyve account if you don't already have one.

OK

Thank you for renting with DeepDyve

Your PayPal account has been charged $2.99. You now have access to the full text of this article. A rental receipt has also been sent to your email address.

Your credit card has been charged $2.99. You now have access to the full text of this article. A rental receipt has also been sent to your email address.

OK

New! You can now keep track of new articles from ACM Computing Surveys (CSUR) on your personalized homepage! Learn more

PDF Download — Not Available

Thanks for your interest in purchasing the PDF. Your request has been noted and we will work with our publisher partner to discuss enabling this feature.

In the meantime, you can get the PDF by visiting the publisher site.

Thank you for purchasing with DeepDyve

Your PayPal account has been charged $.

Your credit card has been charged $.

You can now download this article. A purchase receipt has also been sent to your email address.

Download This Article or I'm done with my download

Print Page — Not Available

Thanks for your interest in printing individual pages. Your request has been noted and we will work with our publisher partner to discuss enabling this feature.

In the meantime, you can get the PDF by visiting the publisher site.

Thank you for printing with DeepDyve

Your PayPal account has been charged $0.

Your credit card has been charged $0.

You can now print this article. A purchase receipt has also been sent to your email address.

Print the Selected Pages or I'm done with my printing

Please refresh to generate a new download link

Your article download link has expired. Please refresh this page to obtain a new download link and try again.

Follow a Journal

To get new article updates from a journal on your personalized homepage, please log in first, or sign up for a DeepDyve account if you don't already have one.

OK