More and more cities across the country have begun to make available for download the data that they collect and manage as part of their day-to-day administrative tasks. For many parties, including neighborhood groups, entrepreneurs, and community-minded “civic hackers,” this information holds great value for understanding a city’s people, places, and challenges.
The City of Minneapolis joined this nationwide trend by unveiling its own open data portal in December 2014, making dozens of datasets associated with the city’s physical assets, environmental resources, and administrative records available for download at opendata.minneapolismn.gov. As of June 15, 2015, the city’s open data portal had logged more than 400,000 visits, 5,400 dataset views, and 1,200 data downloads.
Community Dividend recently spoke with Otto Doll, chief information officer for the City of Minneapolis, about the city’s decision to open up its data to the public. Doll shared his insights about the benefits associated with an open data policy; how open data can help communities, including low- and moderate-income neighborhoods; and what other municipalities should consider when contemplating the creation of their own open data portal.
Community Dividend: The City of Minneapolis recently adopted an open-data approach. What benefits do you associate with making some of the city’s administrative data available to the public?
Otto Doll: I think it’s fair to say that transparency is the number one benefit in our minds. The City of Minneapolis makes use of a lot of data—more than 100 trillion characters of data, in fact—and personnel from different departments within the city are making decisions based on that information. Being able to share much of that data so members of the public can see how decisions that affect them are being made is really the first motivator.
Another benefit of making data publicly available is the time savings for the city departments themselves. Today, people request information all the time, and some of those requests are repetitive, with people asking for a lot of the same information every month. By having data available on our website for requesters to download themselves, we save staff time and resources, which in turn helps us operate more efficiently.
We’re hopeful that our being transparent and getting information out there will leave the public better informed about their city, and that special interest groups or the civic hacking community* can use the data to create applications that those of us who live in or visit the city can use.
CD: What’s an example of how administrative data can be used to help or improve communities in the city, including low- and moderate-income neighborhoods?
OD: A good example would be the data associated with the whole world of development. The city collects information from developers before deciding whether to grant construction and other types of related permits, and that sort of information is useful to neighborhoods that would like to know more about the project that is being proposed. The data would show how the city believes the development would affect the area, including the effects on traffic. This type of information is useful for residents who wish to voice concerns about a specific project. We also release data about all of the 311 calls the city receives. A community group could analyze the data to determine some of the more frequent complaints and concerns deriving from their neighborhood and could then, if it’s in their power, work to alleviate some of the quality-of-life issues that residents are voicing.
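As a rough illustration of the kind of analysis Doll describes, a community group with basic scripting skills could tally the most frequent request types for its area from a downloaded 311 spreadsheet. The sketch below is not based on the city's actual data: the column names `neighborhood` and `request_type`, the area labels, and the sample rows are all hypothetical placeholders.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a downloaded 311 spreadsheet.
# The column names and rows are assumptions for illustration only,
# not the portal's actual schema or data.
SAMPLE_CSV = """neighborhood,request_type
Area A,Pothole
Area A,Graffiti
Area A,Pothole
Area B,Abandoned Vehicle
Area A,Street Light Out
Area B,Pothole
"""


def top_requests(csv_text, neighborhood, limit=3):
    """Return the most frequent request types for one neighborhood."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        row["request_type"]
        for row in reader
        if row["neighborhood"] == neighborhood
    )
    return counts.most_common(limit)


if __name__ == "__main__":
    for request_type, count in top_requests(SAMPLE_CSV, "Area A"):
        print(f"{request_type}: {count}")
```

With a real download, the sample text would be replaced by the contents of the file, and the column names would need to match whatever the dataset actually uses.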
CD: Who’s a typical consumer of the data you offer?
OD: There’s a variety of stakeholders out there: special interest groups and people with a business intention; civic hackers, who are sometimes associated with neighborhood groups or just working by themselves; even the guy on the street—anyone and everyone.
CD: What formats do you make your data available in? Is it readily usable for someone with average computer skills?
OD: Our open data portal itself offers just raw information. We present that data in several ways: you can pull it down in all its raw glory; you can look at the data in columnar format; you can push it down as a spreadsheet to yourself; or you can make use of the API [application programming interface] that we have available, if you’re developing an application or map of some sort.
I don’t think the average user wants to use the raw data, and maybe doesn’t have the technical know-how to deal with it, even though we make that available. They want it put in context, such as on a map, so for some of the data we release, we offer a service called MapIT Minneapolis [at cityoflakes.maps.arcgis.com/home]. It features interactive maps of the location of foreclosures, of vacant and condemned properties, even of dangerous dogs that the city has identified. This resource is not what the city views as its open data portal; it’s a separate but connected resource. But I think that’s where more and more of our effort will probably go over time.
CD: Data privacy is a huge issue for all kinds of institutions right now. How do you determine what data can be made publicly available and what can’t? Are there laws governing that?
OD: Yes, and they start at the federal level. Disclosures of things like health information and law enforcement information are covered nationally through federal laws. Then you get to the state level. For example, in Minnesota, we have the Data Practices Act, which regulates data management and presumes government data is public unless it’s specifically classified as non-public. Then sometimes there are local ordinances, such as rules prohibiting the disclosure of certain things. I advise cities that are considering a move to open up their data to look at the data governance rulesets within the full government stack, from federal all the way down to local.
We also talk to our city departments, because while we in information technology are just caretakers of the data, the departments are the owners. Sometimes they’ll know the data requirements a lot better than we do, or are at least more sensitive to them, so we rely on the departments to confirm everything before we release data to the public.
CD: If a city decides to move ahead with making its data public, what steps should it pursue to get the process going?
OD: One of the first things they have to do is figure out the proper approach to take with other departments. While some city personnel might be on board with releasing departmental data, others may not have given it much thought. One of the biggest worries that people from the departments might have is that if they put this information out there, someone from the public might misinterpret it, and then the city would have to spend a lot of time explaining why, say, someone’s assessment looks inaccurate when in fact the homeowner is misreading the information. To help avoid this friction, we created an open data policy that gave general direction to the departments. That’s one thing.
Another critical thing you have to do is come up with a very tight process for vetting all of the data sufficiently. You want a method to ensure that no personally identifiable information or other information that’s restricted from public exposure gets released—because obviously, that would be a really bad scene if you let loose with information that should never have been made public. As I mentioned before, you have to consider the various laws that might govern data: federal, state, and perhaps local statutes that you have to abide by. At the City of Minneapolis we actually do triple checks on anything we’re going to release.
The third thing is the need to engage with users and stakeholders to learn if the tool itself—the portal—is helpful. We want to know if the open data portal works or if it doesn’t work.
CD: As far as personally identifiable information goes, what’s an example of a dataset or a type of data that you avoid releasing?
OD: An example would be some of the information submitted to our 311 system, which city residents can use to report non-emergency things like code violations. Today a person can use our 311 smartphone app to notify the city about something that needs attention, such as a pothole. So the person takes a picture of the pothole, writes a short description, the app records the geographic coordinates of where the picture is taken, and that goes to our centralized 311 system. As I mentioned before, we currently release some of our 311 data, but here’s the challenge: let’s say you took that photo of a pothole but you also captured the license plate of a nearby car in the shot. If we were to release the photo to the public, the license plate would have to be blotted out, because that is personally identifiable information. The technology is available to recognize license plates and other personally identifiable details in photos and to blot them out, but we simply don’t have the money or resources to do that. So rather than risk putting something out there that contains that kind of information, we’ll exclude that data from public access until we can come up with a cost-effective solution. In the meantime, we make available the text portion of the 311 data.
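One common way to implement the kind of exclusion Doll describes is an allowlist: only fields explicitly approved for release are copied into the public record, so photos and anything unexpected stay out by default. The sketch below is an assumption about how such a filter might look, not a description of the city's actual process; the field names and the sample record are hypothetical.

```python
# Fields assumed safe to publish. These names are hypothetical,
# not the city's actual 311 schema.
PUBLIC_FIELDS = ("request_type", "description", "opened_date")


def to_public_record(record):
    """Copy only allowlisted fields; everything else is left out."""
    return {field: record[field] for field in PUBLIC_FIELDS if field in record}


if __name__ == "__main__":
    # Made-up record for illustration only.
    raw = {
        "request_type": "Pothole",
        "description": "Deep pothole near the corner",
        "opened_date": "2015-06-01",
        "photo": b"...image bytes...",  # could show a license plate
        "reporter_phone": "555-0100",
    }
    print(to_public_record(raw))
```

An allowlist fails safe: a newly added field stays private until someone reviews it and approves it for release, which fits the vetting-first approach described earlier in the interview.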
CD: Including the 311 data, you currently have 62 datasets available on the portal. Besides updates to older datasets, do you plan to release any new categories of datasets—from a department that hasn’t contributed data yet, for instance?
OD: Yes, historical election data, snow emergency tow data, bicycle trail updates, inspection data. Like I mentioned before, we haven’t made public anywhere near the 100 trillion characters of data the city has. A good bit of that information still needs to be organized, structured, and automated so that we can consistently put it out on the portal. It’ll just take time to go through all of it.
CD: What do you think the future holds for the open data movement?
OD: One of the traps that people in information technology fall into with our users is that we create systems of record, meaning we release the information and that’s it. But to me, the future is really about creating systems of engagement, where we plan in advance around our ability to share information, to share it in context, and to interact with the public regarding the information.
One of the things Minneapolis’s open data policy requires is that all future procurements of information systems must ensure the ability to extract data from that information system and easily make it public through the open data portal. But beyond making it available, how are we engaging the public? We still need the portal to serve as a system of record, but we also need to ask, how can we get feedback? How do we ensure people can question the validity of the information we’ve released and how can we correct any errors? I think this type of engagement with our constituency will pay a lot more dividends.