Tuesday, August 22, 2017

The Future of Data Mining: Fast Data

Firstly, here are some statistics from the article I read for this particular blog post:
  • Every minute:
    • 48 hours of video are uploaded to YouTube
    • 204 million e-mails are sent
    • 600 new websites pop up
    • 600,000 pieces of content are shared on Facebook
    • Upwards of 100,000 tweets are sent

This article stresses the idea that data mining is a matter of time. Author Alissa Lorentz states that we must be able to mine data as quickly as we produce it. Because of the plethora of electronic information available today, data mining is extremely important, and it is a concept I was not previously aware of. Lorentz distinguishes smart data, which provides insight into large data sets, from big data, a term we apply to extremely large data sets. She then elaborates on a concept she calls "fast data." Fast data means analyzing data sets in real time, and it will eventually be extremely useful. If one were able to analyze, in a meaningful way, all of the data available on a specific company on any given day, let's just say I'd be looking at the stock market.

In class, we have mainly discussed archiving data, that is, organizing data in a historical sense. This article discusses a different concept: streaming data, i.e., processing data live rather than storing it for future use. To me, this is ideal. Rather than storing every message on Facebook, showing users a list of a certain number of friends who have recently been in contact on the social network would save memory and computing power, and it would be more useful to a user whose inbox holds conversations from years ago. Applying this concept to other situations, Lorentz talks about how streaming data would provide important information on traffic or on public health issues such as flu outbreaks. With the abundance of information that is constantly being added to the web, storing and archiving this information will, I believe, become obsolete. After reading this article, I think the best direction for the data mining world is to chase the data rather than store it, instead of focusing on analyzing past data. Updating data sets in real time would not only reduce the need for large storage systems, but it would also better indicate the trends occurring in the here and now.
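To make the idea of chasing the data a little more concrete, here is a rough sketch in Python of a sliding-window counter. It is my own illustration, not something from Lorentz's article: the class name `SlidingWindowCounter`, the 60-second window, and the sample topics are all made up, and it assumes events arrive in time order. It keeps only the events from the last window and forgets everything older, so the counts always reflect what is happening now.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Hypothetical helper: counts events seen in the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, key) pairs, oldest first
        self.counts = Counter()  # live count per key inside the window

    def add(self, timestamp, key):
        """Record one event, then drop everything that has aged out.

        Assumes timestamps arrive in non-decreasing order.
        """
        self.events.append((timestamp, key))
        self.counts[key] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Remove events whose timestamp is at or before the window's start.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_key = self.events.popleft()
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]

    def top(self, n):
        """Return the n most frequent keys currently inside the window."""
        return self.counts.most_common(n)


# Made-up stream of (seconds, topic) events.
stream = [(0, "flu"), (10, "traffic"), (20, "flu"), (70, "traffic"), (75, "traffic")]

counter = SlidingWindowCounter(window_seconds=60)
for seconds, topic in stream:
    counter.add(seconds, topic)

print(counter.top(1))
```

Memory use here depends on how many events fall inside the window, not on how long the stream has been running, which is the trade-off the article is pointing at: you give up the history in exchange for a cheap, current picture.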

Link to article:
http://www.wired.com/insights/2013/04/big-data-fast-data-smart-data/
