Python vs. R for Data Science

In data science, two languages compete for users: R and Python. Both have a huge user base, and there is an ongoing discussion about which is the better choice in a data science context. Let's explore both a bit:

R
R is a language and environment developed specifically for statistical computing and graphics. It has been around for some time and has several thousand packages for tackling statistical problems. Together with the RStudio IDE, it also offers an interactive programming environment that makes analysing data pretty easy.

Python
Python is a general-purpose programming language, which makes it easy to integrate into a company-wide system. With the packages NumPy, pandas, scikit-learn and Matplotlib, in combination with IPython, it also provides a full suite for statistical computing and an interactive programming environment.

R was developed solely for statistical computing, so it has some advantages there: it is specialized and has been around for years. Python comes from general-purpose programming and is now moving into the data analysis field. Its strength is the combination with everything else it can do, such as building websites and integrating easily with Hadoop Streaming or Apache Spark.
And people who want the best of both sides can always use rpy2, which integrates R into Python.

I have recently been working with Python for my ETL processes, including MapReduce, and for analysing data, which works great in combination with IPython as an interactive development tool.
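
To give an idea of what a MapReduce step in Python can look like, here is a minimal sketch of a word count in the style of a Hadoop Streaming job. It is not taken from my actual ETL code: the word-count task and the names mapper, reducer and run_local are assumptions made up for illustration, and everything runs locally in one process instead of on a cluster.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit one (word, 1) pair per word in the input lines."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce step: sum the counts per word.

    Expects the pairs sorted by key, which is how a streaming
    reducer receives them after the shuffle/sort phase.
    """
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Hypothetical helper: simulate map -> sort -> reduce in one process."""
    return dict(reducer(sorted(mapper(lines))))
```

Calling run_local on a list of text lines in IPython returns a dictionary of word counts, which makes it easy to try the logic on a small sample before running it on the full data. In a real streaming job the mapper and reducer would typically be separate scripts that read from standard input and write tab-separated key/value lines to standard output.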

Author: Marc

My career so far has let me look at the potential of analysis and data mining across a broad range of industries and data sources. My experience ranges from customer relationship management in several industries to optimizing the acquisition of new customers through data mining. I can squeeze information and knowledge from all available kinds of data to optimize processes in a company.
