June 3, 2011 By Steve DuScheid
Editor’s Note: Steve DuScheid is marketing director of Maponics, a developer of polygonal map data, such as neighborhood boundaries, ZIP codes and school attendance zones.
Every year, federal and state government agencies collect, analyze and publish an enormous amount of data — directly and through grants to universities and foundations. Researchers and policymakers often segment this data by geographic area to compare regions, analyze trends and draw conclusions. One challenge to effectively grouping data by geography is finding the right level of granularity suited to answering particular questions. Too often, researchers simply use what’s readily available or must be satisfied with the level of geography inherent in the processes or organizations used to collect it.
Common geographic entities used to segment and analyze data include counties, ZIP codes and U.S. Census Bureau geographies (e.g., block groups).
While there are real benefits to using these defined areas, including wide availability, broad geographic coverage, and the ability to link and compare multiple data sets, none of them truly reflects social and cultural boundaries at the local level. Therefore, they may not answer fundamental research questions or address key factors for policy decisions. ZIP codes and similar entities were defined to facilitate and administer government operations and services, and while some may take population characteristics into account, their borders aren’t meaningful to local citizens.
Standard geographic entities will always be important in how researchers analyze data and how policymakers draw conclusions. But with the availability of new geographic data sets and the growing volume of geotagged data, it’s now possible for researchers to consider questions in new ways that align data to the geographic areas most relevant to answering them.
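To make the idea of aligning geotagged data to custom areas concrete, here is a minimal Python sketch. It assumes each area is a simple polygon given as a list of (x, y) vertices and treats coordinates as planar; the area names, polygons and points are made up for illustration, and the helper names `point_in_polygon` and `count_points_by_area` are hypothetical. Real boundary data would normally be handled with a GIS library and proper map projections.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the last vertex connects
    back to the first. Points exactly on an edge are not handled
    specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_by_area(points, areas):
    """Count points per named area; unmatched points are counted under None."""
    counts = Counter()
    for x, y in points:
        name = next(
            (n for n, poly in areas.items() if point_in_polygon(x, y, poly)),
            None,
        )
        counts[name] += 1
    return counts


# Hypothetical data: two made-up square "neighborhoods" and four geotagged records.
areas = {
    "Eastside": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "Westside": [(-2, 0), (0, 0), (0, 2), (-2, 2)],
}
points = [(1.0, 1.0), (0.5, 1.5), (-1.0, 0.5), (5.0, 5.0)]
counts = count_points_by_area(points, areas)
```

The same grouping step works for any set of polygons, so a researcher could swap county outlines for neighborhood or school-zone boundaries without changing how the records are counted.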
Below are some of the pros and cons of using the standard geographic entities in research, along with some alternatives that offer new ways to look at data.
County. Many data sets are collected and managed at the county level and made available to federal, state and local government agencies. There are many reasons for this, not least the established infrastructure within county governments. County-level data is also manageable to work with because there are only about 3,100 counties in the U.S. But counties are far too large (averaging more than 3,000 square miles) and too varied in population (from as few as 45 to as many as 9 million people) to get at many local socio-economic questions. Population groups within counties are often too diverse for researchers to characterize behaviors or outcomes.