World’s Largest Data Repositories

Library of Congress Main Reading Room

Business Intelligence Lowdown purports to rank the “Top 10 Largest Databases in the World” as of February 2007. Of course, no one really knows who holds the largest repositories of data, counting both paper and electronically stored information (“ESI”), but the article presents an educated guess. It differs substantially from a 2005 list of the largest databases compiled by the Winter Corporation. The 2005 list ranked Yahoo number one, whereas the 2007 list does not include Yahoo at all. Some of the other choices are surprising too, especially the guess as to who has the most ESI: essentially, the “weatherman.” The supposition that the venerable Library of Congress is the tenth largest repository is also almost certainly wrong, since it is generally accepted that most large corporations today have far more information stored in their computers than the equivalent of the Library’s entire paper and digital collection. Still, it helps to include the Library, because it provides a benchmark for grasping the sheer size of the other repositories.

Here are the rankings in reverse order, with a little detail provided on the tenth and first places: 

10. Library of Congress. According to the Library of Congress (whose main public reading room is shown above), it is the largest library in the world, with more than 130 million items on approximately 530 miles of bookshelves. The collections include more than 29 million books and other printed materials, 2.7 million recordings, 12 million photographs, 4.8 million maps, and 58 million manuscripts. The Library expands at a rate of 10,000 items per day.
9. The Central Intelligence Agency.
8. Amazon.
7. YouTube.
6. ChoicePoint.
5. Sprint.
4. Google.
3. AT&T.
2. National Energy Research Scientific Computing Center.
1. The World Data Centre for Climate. This ESI is located on one of the world’s largest supercomputers, owned by the Max Planck Institute for Meteorology and the German Climate Computing Centre. The WDCC has 220 terabytes of data readily accessible on the web, including information on climate research and anticipated climatic trends, as well as 110 terabytes (or 24,500 DVDs) worth of climate simulation data. In addition, six petabytes of further information are stored on magnetic tapes for easy access. According to Business Intelligence Lowdown, six petabytes is three times the amount of ALL the U.S. academic research libraries’ contents combined.
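
For readers who want to sanity-check the WDCC figures, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that are mine and not the article's: decimal units (1 terabyte = 10^12 bytes, 1 petabyte = 10^15 bytes) and a single-layer DVD holding about 4.7 GB. The names used are purely illustrative.

```python
import math

# Assumed decimal (SI) units; the article does not say which convention it uses.
TB = 10**12
PB = 10**15

# Assumed capacity of a single-layer DVD (about 4.7 GB); not stated in the article.
DVD_BYTES = 4_700_000_000

def dvds_needed(num_bytes, dvd_bytes=DVD_BYTES):
    """Number of whole DVDs needed to hold num_bytes."""
    return math.ceil(num_bytes / dvd_bytes)

# Figures quoted in the article.
web_data = 220 * TB          # readily accessible on the web
simulation_data = 110 * TB   # climate simulation data
tape_data = 6 * PB           # stored on magnetic tape
quoted_dvds = 24_500         # the article's DVD equivalent for 110 TB

# DVD count under the 4.7 GB assumption, for comparison with the quoted 24,500.
estimated_dvds = dvds_needed(simulation_data)

# Per-DVD capacity implied by the article's own numbers, in gigabytes.
implied_dvd_gb = simulation_data / quoted_dvds / 10**9

# If six petabytes is three times all U.S. academic research library contents,
# the implied total for those libraries is one third of that.
implied_libraries_pb = tape_data / 3 / PB

# How much larger the tape archive is than the web-accessible data.
tape_to_web_ratio = tape_data / web_data

print(estimated_dvds, round(implied_dvd_gb, 2), implied_libraries_pb, round(tape_to_web_ratio, 1))
```

Under these assumptions the DVD count comes out in the same ballpark as the quoted 24,500, and the "three times" comparison implies roughly two petabytes for all U.S. academic research libraries combined. A different unit convention or DVD capacity would shift the numbers somewhat.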
