DataScientists: a blog about everything data related.

  • SQL and Hadoop

    Bringing SQL to Hadoop has been one of the major trends in Big Data over the last twelve months. Reason enough for me to take a closer look at that scene right now. One reason to build an SQL-based interface for Hadoop is to make the technology available to more people. Companies that have […]

  • REST (Representational State Transfer) APIs and Big Data

    Getting data, especially huge amounts of it, out of some systems can be quite a hassle. Often you are required to use techniques such as FTP or SSH for transferring files. But with RESTful APIs getting more attention in the last few years, there is a new way to get your data. The charm […]

  • Big Data in Learning

    There are many fields in which big data can improve results. One of these is (e-)learning. Until recently, the analysis of learning focused on exam results, but with big data and analytics there are new possibilities to enhance the experience of learning as a whole. For example, there is the possibility to […]

  • Hadoop and MPP

    With Big Data, Map/Reduce is always the first term that comes to mind. But it’s not the only way to handle large amounts of data. There are database systems built especially to deal with huge amounts of data, called Massively Parallel Processing (MPP) databases. MPP database systems have been around for a longer […]

  • Visualization: Enhancing the Palo Suite with NVD3.js

    After my previous post, How to visualize data?, I was unsatisfied with the visualization offered by the Palo Suite from Jedox. This could have several reasons, not least that I may not have been able to get the most out of it. But the quality of the resulting diagrams and their interactivity were […]

  • How to visualize data?

    Data visualization is something like an art. How do you make the results of your data research easy to understand for management, business users, or just everyone out there? A list of data, like an Excel sheet, is not what catches the eye. The art in visualization is shown perfectly on the site of Martin Wattenberg. […]

  • Data Science Tools

    What tools are used for Data Science? There are a lot of them out there, and in this post I want to tell you about the ones I currently use or have used before. KNIME is a graphical tool to analyse data. It uses an interface to build process flows that contain everything, from data transformation, […]

  • Data Science and Machine Learning

    Machine Learning is acknowledged as a part of Data Science, but will it be able to replace a Data Scientist? There have been several articles on that topic in the last few years and months. It’s true there has been some major progress in the field of machine learning and there are already articles about […]

  • Data Scientist: Hype or Sexy?

    Data Scientists seem to be everywhere nowadays. This title has seen a huge increase in appearances in job descriptions, as Indeed.com demonstrates in its data. There are several sites and articles that even describe the job as sexy: Forbes, Harvard Business Review. The combination of handling Big Data and Analytics is what makes this title […]

  • Data Science: What is it?

    Data Science is an interdisciplinary field of sciences. It includes Data Engineering, Math, Statistics, Advanced Computing, Visualization, and Domain Expertise. It revolves around, as the name suggests, working with data. As our society creates ever larger sets of data, the need to analyze this data grows across all industries. And this calls for […]

Got any book recommendations?

