Government Technology


How Chicago’s Data Dictionary is Enhancing Open Government



October 21, 2013

Right now, the City of Chicago is working on documenting all of its data. That’s right: all of Chicago’s public data, across all databases, in all its departments and sister agencies.

The project, called the Chicago Data Dictionary, is a massive, public metadata repository—a searchable archive of “data describing data”—that gives users information about the variety of data in the City of Chicago’s numerous databases. As the next phase of Chicago’s government transparency initiative, the Data Dictionary complements the City’s open data portal by providing background information on where such data comes from.

While it may not be the city’s chicest tech initiative, the Data Dictionary is nonetheless an ambitious and colossal project that is enhancing the city’s data landscape.

Why Does a City Need a Data Dictionary?

A database is only as good as the data it contains. The validity of that data, however, can suffer if it is not defined clearly. Data dictionaries, or metadata repositories, are important because they give database users key “ground rules” for understanding complex, often jargon-riddled databases. Data dictionaries also allow users to find data quickly with a simple query.
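To make the idea of "a simple query" concrete, here is a minimal sketch in Python of what a data dictionary entry and a keyword search over entries might look like. It is only an illustration and not the Chicago Data Dictionary's actual design: the `FieldEntry` structure, the `search` function, and every database, table, and column name below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FieldEntry:
    """Metadata for one column: "data describing data" (hypothetical layout)."""
    database: str
    table: str
    column: str
    description: str
    keywords: list[str] = field(default_factory=list)


def search(entries, term):
    """Return entries whose column name, description, or keywords mention term."""
    needle = term.lower()
    return [
        e for e in entries
        if needle in e.column.lower()
        or needle in e.description.lower()
        or any(needle in k.lower() for k in e.keywords)
    ]


# Invented sample entries; these names do not come from any real city database.
ENTRIES = [
    FieldEntry("permits_db", "building_permits", "issue_dt",
               "Date the permit was issued", ["permit", "date"]),
    FieldEntry("service_db", "requests", "ward_no",
               "Ward in which the service request was filed", ["ward", "311"]),
]

if __name__ == "__main__":
    # Look up every documented column that mentions "ward".
    for hit in search(ENTRIES, "ward"):
        print(f"{hit.database}.{hit.table}.{hit.column}: {hit.description}")
```

The point of the sketch is that a cryptic column name such as `ward_no` becomes findable through its plain-language description, which is the "ground rules" role described above.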

In the case of a public data dictionary, “users” can include just about anyone who accesses municipal data. Public data dictionaries can benefit academic researchers and software developers who want to know what kinds of data a City holds, and how they can access it for research or application development. They can also assist city staff who manage city databases and work to improve their efficiency.

Since government open data initiatives are still new, public data dictionaries are uncommon. In Cambridge, Massachusetts, the Cambridge Information Technology Department (ITD) is creating a Data Dictionary for its Geographical Information Systems (GIS) division. Cambridge’s dictionary provides information about the city’s geographical data use, coding, history, and other attributes.

Like Cambridge’s program, most metadata repositories cover only a single department, project or database. The Chicago Data Dictionary is a radical step: it takes the standard metadata repository model and amplifies it across an entire city.

Building a Metadata Repository in Chicago

The Chicago Data Dictionary is part of Mayor Rahm Emanuel’s vision to use technology to make government more efficient and transparent. The initiative also expands upon Chicago’s goal to be the nation’s leader in open data. 

In March 2012, the Mayor sponsored an ordinance for the Data Dictionary, and it quickly passed through City Council. With the assistance of a $300,000 grant from the John D. and Catherine T. MacArthur Foundation, Chapin Hall, a research center at the University of Chicago, led the initiative along with Chicago’s Department of Innovation and Technology (DoIT).

Nine months later, an Executive Order issued by the Mayor mandated that city agencies regularly publish and update their public data on the City’s data portal. The Order specifically mentioned the Data Dictionary as a tool that would “improve City operations, services and analytical decision-making.” 

Now one year into the project, Chapin Hall and DoIT are continuing work on the first of a three-phase plan to develop the Data Dictionary. In the past year, Chapin Hall has added more than 12 city databases to the Dictionary; currently, it is identifying, processing, and cataloguing over 100 additional municipal databases.

However, Chicago’s government contains far more than 100 databases. If the new repository is to include every City and sister agency database, how can Chicago ever complete such a herculean task?

This is the wrong question to ask. As the project’s scope implies, compiling the Dictionary is no quick job, nor is it ever a “done” job. Because new municipal databases may be added or changed, the Data Dictionary requires continued maintenance to ensure that its users receive useful and up-to-date information.

This brings us to the right question: how can the Data Dictionary improve the way Chicago’s citizens and government understand and use their City’s data?   

One way to do so is to make the Data Dictionary available online, even as its development continues. Chapin Hall designed the Dictionary’s homepage to be simple and efficient, helping convey its purpose as a querying tool for users.

A second way to do so is to share the design of the Dictionary itself, so that outside cities and organizations may benefit by adopting it. As with many of Chicago’s other open-source projects, DoIT will make the source code for the Chicago Data Dictionary available on Github for anyone who wishes to build a metadata repository of their own. 

Moreover, while some of Chicago’s open-source initiatives, such as the SmartData predictive analytics platform, are intended to be replicated by other cities, Chicago’s Data Dictionary model can serve a purpose for any type of organization. A ready-made API could be a gift to database administrators in nonprofits and private companies alike who use databases regularly.  

A New Tool for the Public

When thinking of new and innovative ways data can improve cities, most people don’t think of metadata repositories. But without a better understanding of the “data about the data,” many of these new benefits may not develop in the first place.

Chicago’s Data Dictionary, a bibliographic giant growing bigger by the day, is providing the City with just that resource. The next time someone in Chicago has a question about their city’s data, they know where to look first.

This story originally appeared on Data-Smart City Solutions.
 

