DATA DO – データ道

DataScientists: a blog about everything data related.

SQL and Hadoop

Bringing SQL to Hadoop has been one of the major trends in Big Data over the last twelve months. Reason enough for me to take a closer look at that scene right now. One reason to build an SQL-based interface for Hadoop is to make the technology available to more people. Companies that have […]

October 9, 2013
REST (Representational state transfer) APIs and Big Data

Getting data, huge amounts of data, out of some systems can be quite a hassle. Often you are required to use techniques such as FTP or SSH for transferring files. But with RESTful APIs getting more attention in the last few years, there is a new way to get your data. The charm […]

August 23, 2013
Big Data in Learning

There are many fields in which big data can improve results. One of these is (e-)learning. Until recently, the analysis of learning focused on exam results, but with big data and analytics there are new possibilities to enhance the experience of learning as a whole. For example there is the possibility to […]

June 12, 2013
Hadoop and MPP

With Big Data, Map/Reduce is always the first term that comes to mind. But it’s not the only way to handle large amounts of data. There are database systems built especially to deal with huge amounts of data, and they are called Massively Parallel Processing (MPP) databases. MPP database systems have been around for a longer […]

April 26, 2013
Visualization: Enhancing the Palo Suite with NVD3.js

After my previous post How to visualize data? I was unsatisfied with the visualization offered by the Palo Suite from Jedox. There could be several reasons for this, not least that I may not have been able to get the most out of it. But the quality of the resulting diagrams and their interactivity were […]

March 11, 2013
How to visualize data?

Data visualization is something of an art. How do you make the results of your data research easy to understand for management, business users or just everyone out there? A list of data, like an Excel sheet, is not what catches the eye. The art in visualization is shown perfectly on the site of Martin Wattenberg. […]

February 27, 2013
Data Science Tools

What tools are used for Data Science? There are a lot of them out there and in this post I want to tell you about the ones I currently use or used before. KNIME is a graphical tool to analyse data. It uses an interface to build process flows that contain everything, from data transformation, […]

December 6, 2012
Data Science and Machine Learning

Machine Learning is acknowledged as a part of Data Science, but will it be able to replace a Data Scientist? There have been several articles on that topic in the last few years and months. It’s true there has been some major progress in the field of machine learning and there are already articles about […]

October 25, 2012
Data Scientist: Hype or Sexy?

Data Scientists seem to be everywhere nowadays. This title has seen a huge increase in appearances in job descriptions, as Indeed.com demonstrates in its data. There are several sites and articles that even describe the job as sexy: Forbes, Harvard Business Review. The combination of handling Big Data and Analytics is what makes this title […]

October 15, 2012
Data Science: What is it?

Data Science is an interdisciplinary field of sciences. It includes: Data Engineering, Math, Statistics, Advanced Computing, Visualization, Domain Expertise. It revolves around, as the name suggests, working with data. As our society creates ever larger sets of data, the need for analyzing this data grows across all industries. And this calls for […]

September 28, 2012
