WHAT IS DATA SCIENCE?
We live in a world surrounded by data. Whenever we use a social media application such as WhatsApp, Twitter, LinkedIn, Facebook, or any of thousands of others, we are dealing with data.
You visit a website, and before you can start using its contents you may be asked for details such as your name, age, sex, educational qualification, profession, income, location, mobile number, and email address, along with other questions related to your use of the site. By answering, we agree to hand over our personal information. Website owners can then analyze this data to decide their future course of action.
This is where data science comes into play. About 20 years ago, data was collected by statisticians, who summarized it with measures such as the mean, median, mode, and standard deviation, and found correlations between variables using a limited set of statistical methods.
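To make those classic measures concrete, here is a minimal sketch using only Python's standard library. The ages and incomes are made-up numbers for illustration, not real survey data, and the helper names `describe` and `pearson` are hypothetical names chosen for this example.

```python
import math
import statistics

# Hypothetical sample data (not from any real survey): ages and monthly incomes
ages = [23, 25, 25, 31, 38, 42, 47]
incomes = [21000, 24000, 26000, 35000, 41000, 52000, 58000]


def describe(values):
    """Return the classic summary statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


summary = describe(ages)
r = pearson(ages, incomes)
print(summary)
print(f"correlation between age and income: {r:.3f}")
```

A correlation close to 1 means the two variables rise together in this sample. It says nothing about why they do, which is part of the reason these summaries alone gave only a limited picture.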
Then came SAS, a statistical software package that processed data on computers. Yet the ability to interpret data and make predictions was still miles away. One could get only a rough idea of where the data might lead, and those estimates were often inaccurate.
In many situations the data would be misleading and give wrong results. A typical example is the Indian elections of the '90s, when exit polls would predict a win for a particular party but the results would turn out exactly the opposite.
Data handling changed in the 2000s as more software became available and relational database management systems such as Oracle and Microsoft SQL Server came into extensive use. Yet data remained loosely defined and was not clearly interpreted.
Today, data science has gone beyond finding meaningful relationships. The focus is now on predictive analytics, which uses both the qualitative and quantitative aspects of data. The Python programming language has been a revolution for data science.
Python is open-source software (meaning it is free to use, modify, and distribute under its license), with a large community that keeps improving the language and its libraries.
There is another open-source language called R, which is also extremely popular among data researchers, although its community is not as large as Python's.
Many data science researchers prefer Python over R because it is easier to use and more versatile. Anyone interested in learning data science techniques can download Python free of cost from its website, https://www.python.org. An interesting feature of Python is that it is independent of platform and data format.
R can be downloaded from www.r-project.org. In my personal experience, Python is more robust and versatile, and it can handle large amounts of data without any hassle. In fact, Python has become the darling of data science researchers thanks to its facilities for handling complex data.
