Data science, advanced analytics, machine learning, and artificial intelligence are revolutionizing not only our approach to business but also our society and even our personal lives. Think of self-driving cars, smart houses, and virtual assistants: the world as we know it is changing rapidly thanks to new “smart” technologies and the possibilities they create for our future.
For data scientists, smart technologies mean much more than just enjoying new inventions: they mean a chance to actively contribute to creating them. To keep up with and support these drastic changes in the industry, data scientists must continue to build upon their skills and educate themselves on the latest must-have tools in their field.
The Upward and Downward Trends to Monitor in Data Science Technology
Here are a few tools that might vanish within the next three years:
- Excel: Remember when data analysts were attempting statistical modeling and linear regressions in Excel? Well, those days are over. No doubt Excel is a great tool, but it’s just not fit for sustainable data science. Why? It’s too easy to make mistakes, and too hard to find and correct them. Not to mention that Excel cannot keep up with the increasing amount of data. Luckily, with statistical languages like R and Python, data manipulation and predictive modeling can be done just as easily and on a larger scale.
- Hadoop and MapReduce: When the world started talking about Big Data around 10 years ago, Google and Yahoo! were making heavy use of MapReduce and distributed file systems such as HDFS to distribute and scale complex operations on big data. Ten years later, not only is MapReduce no longer synonymous with Big Data, but it has also been abandoned by most major companies. At the same time, the rise of frameworks like Spark and Azure Data Lake Analytics, which bridge the gap between statistical programming and distributed computing, makes specific Hadoop knowledge irrelevant for data scientists.
- IBM SPSS Modeler: There’s no doubt that IBM SPSS Modeler is a great tool and is easy to use, but will it have a place in the future of data science? I don’t believe so. SPSS offers built-in data access capabilities, data-crunching functions, and reporting options, but it is quite pricey. Moreover, the trend is for data scientists to pick the most advanced individual tool for each task rather than trying to do everything with one tool. Additionally, the data science community loves open source, and the R and Python packages available work terrifically for their needs.
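To make the Excel point above concrete, here is a minimal sketch of the kind of linear regression that used to be done in a spreadsheet, written in plain Python with only the standard library. The function names (`fit_line`, `predict`) and the data are hypothetical and chosen purely for illustration; in practice a data scientist would more likely reach for a dedicated statistics package.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of the x/y cross-deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Made-up example data (assumption): the points lie exactly on y = 2x + 1.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(ad_spend, sales)
forecast = predict(6.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
```

Because the whole calculation lives in a script rather than in scattered cell formulas, it can be reviewed, rerun on new data, and corrected in one place.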
So, what upward trends will we be seeing in the coming years?
- TensorFlow: Ever since Google decided to release its deep learning library as open source, it has become increasingly popular in the data science world. TensorFlow allows data scientists to easily define neural networks using data flow graphs. It runs on pretty much any platform and it’s optimized for deep learning. Statistics on Stack Overflow posts clearly indicate that TensorFlow has overtaken every other deep learning library on the market.
- Julia: R still ranks as the number one favorite language for scientific and statistical programming, but Julia is here to challenge its status. Just like R, Julia is open source and has a vast community of developers and supporters who are continuously implementing new packages and extending the available ones. On top of that, it offers performance approaching that of C. Julia is here to stay thanks to its sophisticated compiler, distributed parallel execution, numerical accuracy, and extensive mathematical function library.
- Cognitive APIs: Both IBM and Microsoft have invested in releasing cognitive APIs to support data scientists and developers with text search, computer vision, sentiment analysis, OCR, and text-to-speech. Thanks to those APIs, it’s now possible to build elaborate systems like chatbots, smart traffic cameras, and even personality evaluation tools. Easy to use, cheap, and incredibly accurate, those APIs will surely be one of the building blocks when developing the next complex smart application.
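The idea of a data flow graph mentioned for TensorFlow above can be illustrated with a toy sketch in plain Python. This is not TensorFlow's API: every name here (`Node`, `placeholder`, `constant`, `add`, `multiply`, `relu`) is a hypothetical stand-in, and the example only shows the general principle of first describing a computation as a graph of operations and then feeding data through it.

```python
class Node:
    """One operation in a toy data flow graph (illustrative only)."""

    def __init__(self, op, inputs=()):
        self.op = op                  # function computing this node's value
        self.inputs = tuple(inputs)   # upstream nodes this node depends on

    def evaluate(self, feed):
        # Evaluate the upstream nodes first, then apply this node's operation.
        args = [node.evaluate(feed) for node in self.inputs]
        return self.op(feed, *args)


def placeholder(name):
    """A graph input whose value is supplied at evaluation time."""
    return Node(lambda feed: feed[name])


def constant(value):
    return Node(lambda feed: value)


def add(a, b):
    return Node(lambda feed, x, y: x + y, (a, b))


def multiply(a, b):
    return Node(lambda feed, x, y: x * y, (a, b))


def relu(a):
    return Node(lambda feed, x: max(0.0, x), (a,))


# Describe a single "neuron": relu(w * x + b). Nothing is computed yet.
x = placeholder("x")
w = constant(0.5)
b = constant(-1.0)
neuron = relu(add(multiply(w, x), b))

# Feed different inputs through the same graph.
active = neuron.evaluate({"x": 4.0})
inactive = neuron.evaluate({"x": 1.0})
print(active, inactive)
```

Separating the description of the computation from its execution is what lets a real framework analyze, optimize, and distribute the graph before any data flows through it.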
If you would like to know more, please contact Erica D’Acunto, Senior Data Scientist at ORTEC Consulting, via Erica.dAcunto@ortec.com or via our website: www.ortec-consulting.com.