Recently my colleague introduced us to a book that he finds interesting. It is about big data and how it would transform how we live, work and think. One key point in understanding big data is that it does not necessarily mean having access to gigabytes (1 073 741 824 bytes each) or exabytes (about 1.1529215 × 10^18 bytes each) of data. It is more about making use of all the data that you have access to right now. You turn seemingly junk data into something that provides insights.
Everything around us can be “datafied”.
To datafy a phenomenon is to put it in a quantified format so it can be tabulated and analyzed.
Logging is one form of datafication. As a software engineer, I find logging essential for understanding complex software and spotting potential issues in it. The information gathered through logging gives us the insight to know rather accurately which parts of the software are causing issues. It makes things crystal clear.
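To make the idea concrete, here is a minimal sketch in Python of logging as datafication: log records are collected in memory and then tabulated to show which component reports the most errors. The component names (`app.checkout`, `app.search`), the log messages, and the `ListHandler` and `errors_by_component` helpers are all made up for illustration; only the standard `logging` and `collections` modules are assumed.

```python
import logging
from collections import Counter


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps log records in memory for later analysis."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def errors_by_component(records):
    """Tabulate ERROR-or-worse records by the name of the logger that emitted them."""
    return Counter(r.name for r in records if r.levelno >= logging.ERROR)


# Attach the handler to a parent logger; child loggers propagate to it.
handler = ListHandler()
app_logger = logging.getLogger("app")
app_logger.setLevel(logging.INFO)
app_logger.addHandler(handler)
app_logger.propagate = False  # keep the example's output out of the root logger

# Illustrative events from two made-up components.
logging.getLogger("app.checkout").error("payment gateway timed out")
logging.getLogger("app.checkout").error("payment gateway timed out")
logging.getLogger("app.search").info("query served")
logging.getLogger("app.search").error("index unavailable")

counts = errors_by_component(handler.records)
print(counts.most_common())
```

In this toy run, the checkout component stands out with the most errors. A real system would more likely write logs to files or a log service and analyze them there, but the principle is the same: once events are recorded in a quantified form, they can be counted and compared instead of guessed at.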
With all the data available to you, you eliminate the guesswork. You don’t have to make assumptions (that may or may not be true) based on your gut feeling. Things usually fail when they rest on unrealistic assumptions. With the data that you have collected, you no longer work in the dark. Everything is in the data.
My key takeaway from reading the book is that we should let the data think for us. The data reflects reality. It is the true measurement of reality. We no longer have to be geniuses to make the right decisions. This makes me think of the legendary investor, Warren Buffett.
Invest Using Big Data
In some ways, I think Warren Buffett datafies companies. He turns every aspect of a company into data that he can analyze. He reads tons and tons of financial reports. He talks to CEOs. He understands a company inside out. He, together with other legendary investors like Peter Lynch, invests using big data in one way or another. Nothing is guesswork for them. The data tells them what is good and what is bad, so they invest based on the available data. They are using common sense.
Based on the data they have, they choose to invest in companies that are quietly making tons of money. They don’t have to do complicated calculations or use advanced theory to make their investment decisions. The data is guiding them. The data tells them whether a company is making a profit or losing money. And they invest only in companies that make money. Nothing more complicated than that.
That being said, we cannot ignore the fact that it takes huge human effort to sort and analyze such a large amount of data. We still need to put in the work to extract the information we want. There is no shortcut with big data.
Final thought: we may have limited resources, but we are not deprived of data. Data is a cheap resource that has high potential value. It is available everywhere, like the air we breathe. Documents, books, the web, … all contain valuable data. We ourselves generate tons of data every day. The places we go, the foods we eat, the clothes we wear, the movies we watch, … are all data. We can leverage the data we have to understand the world better.