At times we may assume the value of specific data is self-evident when it may not be. This misses an opportunity to expand the audience for the data beyond those already interested. People who are not deeply immersed in a subject matter community, but who could nonetheless bring new ideas to the table, may benefit from some data literacy.
Note: Recently Project Open Data launched Next.Data.gov as a vision for where Data.gov might go. It does a nice job of promoting the use of government data and modelling examples of companies that have successfully used that data. The launch was accompanied by requests for feedback. I put together some thoughts on an area I think could be explored a bit: profiling prominent data sets through conversations with subject matter experts who understand the value of the data and can share it with people who may not immediately see that value.
A very simplified life cycle for government data:
- The data is collected
- The data is released
- The value of the data is understood
- The data is used
We’ve been good at the first step for a long time.
In the last few years we’ve made great strides in the second step. Data.gov and related data portals provide unprecedented access to data that can be reused in innumerable ways.
There has been a lot of effort put into the fourth step, advocating for and incentivizing the use of government data through the Health Datapalooza, challenges, and other efforts. Next.data.gov (and alpha.data.gov before it) does a great job of profiling successful use of data by entrepreneurs and folks in the private sector.
The third step is one that I think could use some more love. Helping people to see the value in a specific data set could help start the gears turning in the minds of creative people who may not immediately understand the contents of a new release.
I’ve put together a rough concept of a Conversation page for next.data.gov that would accompany prominent or undervalued data releases to provide context. These conversations are intended to help “unpack” the raw data from multiple perspectives, helping those who might use or share the data understand its potential value.
This concept was also submitted to Data.gov’s GitHub.
This example is based on the Inpatient Medicare Provider Charge Data. All the copy is for demonstration purposes only. It’s based on a rough understanding of the data and is likely to be inaccurate. But hopefully it helps articulate the concept.
I hope this concept is helpful in some way and look forward to any discussion it may provoke.