Wednesday, April 13, 2016

Which big data programming language should I use?

An article posted just this month asks how, with the vast number of data programming languages available, users can know which one to use. The article describes four different languages, R, Python, Scala, and Java, with reasons when or why users choose one over another.1 R is good for analysis and plotting. Python is better suited to working with big data processing frameworks. Scala succeeds in combining the functional and object-oriented paradigms. Finally, Java is not a favorite because, according to the article, it is now only used by drones.

While reading this article, I noticed that deciding which data programming language to use really depends on the task; there is no one definitively better language. That being said, it is still frowned upon to incorporate too many different languages, or the control system will become overwhelmed and cluttered. At the same time, using more than one language helps diversify the workplace and lets each task be done in the language that fits it best. That is why it depends. A separate poll found that eighty-five percent of big data users preferred to do their projects in either R or Python, suggesting that both are effective and easy to use.2 Another interesting thing about Python is that it originated the idea of the web-based notebook, which allows sharing basically anything in a logbook format. Due to the success of this idea, almost all of the other languages have adopted it as well.

The first article on big data describes R as a little confusing and as requiring some adjustment before one fully understands how to use it properly; the second article, however, describes R as an everyday language used across social media, including at Facebook and Twitter. Something the first article does not fully explain is how Scala and Java are intertwined. It says Scala gets “access to the Java ecosystem for free”1 but does not explain how this is so, or, if it is true, whether the two use the same language or whether code is converted from one language to another depending on the system. A final question the article raises is whether Java, once Java 9 is released, will be more competitive with the other big data languages because it will have improved on the version released 20 years ago, or whether it will still not be up to par with the others.



2. http://www.datasciencecentral.com/profiles/blogs/ten-top-languages-for-crunching-big-data

1 comment:

  1. From my limited experience with coding in my Computer Science 111 class, and from talking to my roommates, one of whom is a math major and another a computer science minor, I know some basics of these languages. What I find interesting is how universities teach them today. In my CS 111 class, we only touched on Java out of these four. My math major roommate learned R for the same reasons the article gives, plotting and analysis of data. Python I only heard of for the first time when I was abroad in New Zealand. I found it very interesting that most of the students there learn Python before anything else. My roommates there studied computer science and engineering, and they both had the language completely down, as it was the first language they learned at the university. They don't expect to learn Java until next year. Hearing that was very surprising, considering it is a language that I have never heard about here in the United States (granted, I'm not a CS major or minor, or going for the certificate).
