Microdata are the individual records that contain the information collected about each person and housing unit. They include the census basic record types, which are computerized versions of the questionnaires collected from households, as coded and edited during census processing. The Census Bureau uses these confidential microdata to produce the summary data that go into the various reports, summary files, and special tabulations. Public use microdata samples are extracts from the confidential microdata, taken in a manner that avoids disclosure of information about households or individuals. For Census 2000, the microdata are available to the public only through the Public Use Microdata Sample (PUMS) products.
Public Use Microdata Sample (PUMS) files contain records representing 5-percent or 1-percent samples of the occupied and vacant housing units in the U.S. and the people in the occupied units. People living in group quarters are also included. Each file contains individual weights for each person and housing unit, which, when applied to the individual records, expand the sample to the relevant total. Please see Chapter 6 - Data Dictionary for a complete list of the variables and recodes.
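The way weights expand the sample can be illustrated with a minimal sketch. It assumes the records have already been parsed into dictionaries; the field names `pweight` and `age` and the three sample records are made up for illustration and are not taken from the data dictionary.

```python
def weighted_total(records, weight_field, predicate=lambda record: True):
    """Sum the weights of the records that satisfy the predicate.

    Each sample record stands for roughly `weight` units in the full
    population, so the sum of the weights estimates the population total.
    """
    return sum(r[weight_field] for r in records if predicate(r))


# Made-up person records; the field names are assumptions for this sketch.
persons = [
    {"age": 34, "pweight": 21},
    {"age": 8, "pweight": 19},
    {"age": 67, "pweight": 20},
]

total_persons = weighted_total(persons, "pweight")
adults = weighted_total(persons, "pweight", lambda r: r["age"] >= 18)
```

Here the three sample records represent an estimated 60 people, 41 of them adults. Housing unit estimates would follow the same pattern using the housing unit weight on the housing records.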
Some of the items included on the housing record are: acreage; agricultural sales; allocation flags for housing items; bedrooms; condominium fee; contract rent; cost of utilities; family income in 1999; family, subfamily, and relationship recodes; farm residence; fire, hazard, and flood insurance; fuels used; gross rent; heating fuel; household income in 1999; household type; housing unit weight; kitchen facilities; linguistic isolation; meals included in rent; mobile home costs; mortgage payment; mortgage status; plumbing facilities; presence and age of own children; presence of subfamilies in household; real estate taxes; rooms; selected monthly owner costs; size of building (units in structure); state code; telephone service; tenure; vacancy status; value (of housing unit); vehicles available; year householder moved into unit; and year structure built.
Some of the items included on the person record are: ability to speak English; age; allocation flags for population items; ancestry; citizenship; class of worker; disability status; earnings in 1999; educational attainment; grandparents as caregivers; Hispanic origin; hours worked; income in 1999 by type; industry; language spoken at home; marital status; means of transportation to work; migration Public Use Microdata Area (PUMA); migration state; mobility status; veteran period of service; years of military service; occupation; person's weight; personal care limitation; place of birth; place of work PUMA; place of work state; poverty status in 1999; race; relationship; school enrollment and type of school; time of departure for work; travel time to work; vehicle occupancy; weeks worked in 1999; work limitation status; work status in 1999; and year of entry.
The Public Use Microdata Sample (PUMS) files contain geographic units known as super-Public Use Microdata Areas (super-PUMAs) and Public Use Microdata Areas (PUMAs). To maintain the confidentiality of the PUMS data, minimum population thresholds are set for PUMAs and super-PUMAs. For the 1-percent state-level files, the super-PUMAs contain a minimum population of 400,000 and are composed of a PUMA or a group of contiguous PUMAs delineated on the 5-percent state-level PUMS files. Super-PUMAs are a new geographic entity for Census 2000. The 5-percent state-level files contain PUMAs, each having a minimum population of 100,000; the 5-percent files also show the corresponding super-PUMA codes. Each state is separately identified and may comprise one or more super-PUMAs or PUMAs. Large metropolitan areas may be subdivided into super-PUMAs and PUMAs. PUMAs and super-PUMAs do not cross state lines. Super-PUMAs and PUMAs also are defined for place of residence on April 1, 1995 and place of work.
The 1-percent files give users the maximum amount of social, economic, and housing information available. There is no national minimum threshold for the identification of variable categories, with the exception of a national minimum population of 8,000 for race and Hispanic origin. The goal of these files is to provide a level of detail similar to that available in the 1990 PUMS files (and, in some cases, more detail). The 1% Public Use Microdata Sample Equivalency (PUMEQ1) file is intended to help users understand the relationships between standard Census 2000 geographic concepts (counties, county subdivisions, places, and census tracts) and the super-PUMA geographic units of the 1% PUMS data files.
In order to provide the level of characteristic detail for the 1-percent files described above, the minimum geographic population threshold needed to be raised above 100,000 (the PUMA minimum). A new geographic entity was created -- the super-PUMA. Super-PUMAs have a minimum population of 400,000 and are composed of a PUMA or PUMAs delineated on the 5-percent PUMS files. Each state will be identified, and any state with a population of 800,000 or greater can be subdivided into two or more super-PUMAs.
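The arithmetic behind these thresholds can be sketched as follows. This is only an illustration of the 400,000 minimum stated above, not the Census Bureau's delineation procedure: the function names, PUMA codes, and populations are hypothetical, and contiguity is not checked.

```python
SUPER_PUMA_MIN_POPULATION = 400_000  # minimum stated in the text


def max_super_pumas(state_population):
    """Upper bound on the number of super-PUMAs the minimum allows.

    Every state is identified, so the result is never below 1.
    """
    return max(1, state_population // SUPER_PUMA_MIN_POPULATION)


def undersized_super_pumas(grouping, puma_populations):
    """Return the super-PUMA codes whose member PUMAs total under the minimum."""
    return [
        code
        for code, pumas in grouping.items()
        if sum(puma_populations[p] for p in pumas) < SUPER_PUMA_MIN_POPULATION
    ]


# Made-up PUMA codes and populations, for illustration only.
puma_populations = {
    "00100": 150_000,
    "00200": 260_000,
    "00300": 120_000,
    "00400": 110_000,
}
grouping = {"A": ["00100", "00200"], "B": ["00300", "00400"]}
too_small = undersized_super_pumas(grouping, puma_populations)
```

With these numbers, group A totals 410,000 and passes, while group B totals 230,000 and is flagged. A state of exactly 800,000 yields an upper bound of two super-PUMAs, consistent with the rule that only states of 800,000 or more can be subdivided.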
To maintain confidentiality while retaining as much characteristic detail as possible, a minimum threshold of 10,000 nationally is set for the identification of variable categories within categorical variables in the 5-percent PUMS files. The 5% Public Use Microdata Sample Equivalency (PUMEQ5) file is intended to help users understand the relationships between standard Census 2000 geographic concepts (counties, county subdivisions, places, and census tracts) and the super-PUMA and PUMA geographic units of the 5% PUMS data files.
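The effect of a national category threshold can be shown with a small sketch. The text does not describe how categories are actually collapsed, so the rule used here (folding every category below the threshold into a single residual category) is an assumption, as are the function name, the category labels, and the counts.

```python
def collapse_small_categories(national_counts, threshold=10_000, other_label="Other"):
    """Fold categories whose national count is below the threshold into one
    residual category. The single-residual rule is an assumption of this sketch.
    """
    collapsed = {}
    for category, count in national_counts.items():
        key = category if count >= threshold else other_label
        collapsed[key] = collapsed.get(key, 0) + count
    return collapsed


# Made-up national counts for a hypothetical categorical variable.
national_counts = {"A": 250_000, "B": 12_000, "C": 4_000, "D": 3_500}
collapsed_counts = collapse_small_categories(national_counts)
```

In this example categories C and D fall below 10,000 and are combined into a residual of 7,500. The residual is itself below the threshold, so a real procedure would need a further step, such as merging it into a related category.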
Each PUMA in the 5-percent files must meet a minimum population threshold of 100,000. The minimum PUMA threshold was held at 100,000 by increasing the degree of variable collapsing as described above. The 100,000 minimum population threshold (the threshold set for both the 1980 and 1990 PUMS files) permits greater historical comparability.