What tools are used for Data Science? There are a lot of them out there, and in this post I want to tell you about the ones I currently use or have used before.
- KNIME is a graphical tool to analyse data. It uses a visual interface to build process flows that cover everything from data transformation and initial analysis to predictive analysis, visualization and reporting. One of its advantages is its huge community, and because it is an open source tool, that community is encouraged to contribute.
- RapidMiner from Rapid-I is also a graphical tool to analyse data. Processes are built from predefined steps. It provides data transformation, initial analysis, predictive analysis, visualization and reporting. Since it is based on Java, it is platform independent. It also has a community that helps to improve the program and expands the available resources.
- SAS has a whole suite of tools for data manipulation and analysis. They provide OLAP tools, predictive analytics, reporting and visualization. Having been in the market for a long time, they have a huge customer base and lots of experience. There is also a system of training courses with exams that lead to certified qualifications in using their tools.
- R is a free tool, originally developed by statisticians for statistical computing and adopted early on in fields such as biology. It has since spread through all kinds of industries thanks to its wide range of packages. R itself is a programming language rather than a graphical workflow tool, but the language is easy to learn. R provides data manipulation, initial analysis, predictive analysis, visualization and reporting. There is also an integration with Hadoop for better interaction with Big Data.
- Splunk is a tool primarily for analysing unstructured data, like log files. It provides real-time statistics and outstanding visualization for reports. Its query language is related to SQL, so it is pretty easy to learn if you have used SQL queries before.
- Jedox provides an OLAP server with a web interface that looks like MS Excel, and there is a plugin for MS Excel too. It caters mainly to controlling needs, but also has some advantages regarding self-service BI. Based on PHP and Java, it is available in a community version and a professional version.
- FastStats from Apteco offers an easy-to-understand graphical interface and some basic predictive methods. It enables business users to analyse their data themselves and even build small models. It also provides visualization tools. Like Jedox, it caters to self-service BI.
If you use and like other tools, please feel free to share them with me. I am always interested in learning about new tools.