NIST Diverse Community Excerpts Data

Description

The Diverse Community Excerpts are a set of tabular demographic data on households in the United States, drawn from real records released in the American Community Survey, a product of the US Census Bureau. The data contain 24 features and are partitioned into three geographic regions: Boston area (7634 records), Dallas-Fort Worth area (9276 records), and US national (27254 records). The feature set is identical for all partitions, but the demographics vary radically between the geographic regions. Therefore, these data are well suited for comparing the performance of synthetic demographic data generators. Detailed documentation on the usage, design, and purpose of the data is included in the repository, including brief descriptions of the localities that the data represent. These data are incorporated into the "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist
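
The partition structure described above (three regions sharing one feature set) can be sanity-checked after download. The following is a minimal sketch, assuming each partition has been saved locally as a CSV file with a header row. The partition labels, the file paths passed in, and the helper names `summarize_partition` and `check_partitions` are hypothetical and are not part of SDNist. Only the record counts in `EXPECTED_RECORDS` come from the description.

```python
import csv

# Record counts as stated in the dataset description.
# The dictionary keys are hypothetical labels, not official file names.
EXPECTED_RECORDS = {
    "boston": 7634,
    "dallas_fort_worth": 9276,
    "national": 27254,
}


def summarize_partition(csv_path):
    """Return (column_names, record_count) for one partition CSV."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # first row holds the feature names
        count = sum(1 for row in reader if row)  # skip blank lines
    return header, count


def check_partitions(paths):
    """Check that all partitions share one feature set.

    paths: dict mapping a partition label to a local CSV path.
    Returns a dict mapping each label to its record count.
    Raises ValueError if the column names differ between partitions.
    """
    summaries = {label: summarize_partition(p) for label, p in paths.items()}
    distinct_headers = {tuple(header) for header, _ in summaries.values()}
    if len(distinct_headers) > 1:
        raise ValueError("partitions do not share an identical feature set")
    return {label: count for label, (_, count) in summaries.items()}
```

Calling `check_partitions` with a dictionary of local paths keyed by the labels in `EXPECTED_RECORDS` returns the record counts, which can then be compared against `EXPECTED_RECORDS` to confirm that the download matches the description.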

Resources

Name Format Description Link
0 This repository contains limited-feature (22 columns) excerpts from the American Community Survey, partitioned into three geographic regions. It is available as part of the SDNist synthetic data evaluation package. https://github.com/usnistgov/SDNist/tree/main/nist%20diverse%20communities%20data%20excerpts

Tags

  • demographic-data
  • privacy
  • synthetic-data
  • american-community-survey
  • sdnist
