I just finished reading the book “Data Driven” by Thomas C. Redman. First of all, here is a short review I just wrote for my LinkedIn Reading List:
A Brief Summary
“Often, we speak of our data as a database table and field. As this book points out, data in a database isn’t interesting or useful. People only care about ‘data on the move,’ or data that is being used. For everyone who thinks their data is excellent when it is 99% correct, who talks about a data strategy, or who too casually makes plans to do ‘data mining,’ this is a book worth reading. It is written in a manner that everyone can read and understand. At the same time, it carefully lays out the problems with our data and our plans to use it, and provides lists of issues to consider in order to make better use of this data.”
The Issues, Examples and Solutions
The book covers the issues that prevent data sharing. First of all, sharing data can reduce a person’s power. There are privacy and security issues. The data that one person works with and cares about is different from the data another group might benefit from, and that’s not always apparent. The book addresses all these issues. On privacy and security policy, for example, it explains that you can’t make data so difficult to get that people can’t get at it, but that there must be procedures to follow to make sure the data doesn’t get into the wrong hands. There’s also the issue that different people and groups need and want to see the data in different ways. He includes excellent and simple examples to illustrate his points.
Our Problem
One issue the author mentions is that a computer, well, it “computes.” It manages the routine. Data is managed when it comes from routine processes. Here’s a key for us. We’re always running into problems when we discuss non-routine work such as ad-hoc samples, non-routine testing, etc. This is where our industry seems to have the most trouble finding a good way to manage and mine that data. We’re coming up with more strategies for it, but even this expert points to this as difficult.
It’s not unreasonable for us to try to capture and use this data, but maybe it is true that we’re using a computer for something it doesn’t handle particularly well: the unexpected and non-routine. If we recognize that as the challenge, we can work to overcome it.




