Month: November 2018

Big Data Analytics – Statistical Methods

When analyzing data, it is possible to take a statistical approach. The basic tools needed to perform basic analysis are correlation analysis, analysis of variance, and hypothesis testing. Working with large datasets doesn’t pose a problem, as these methods aren’t computationally intensive, with the exception of correlation analysis. In this case, … Continue reading Big Data Analytics – Statistical Methods

Can Hadoop Replace a Data Warehouse?

The users I spoke with ranged from seasoned data warehouse professionals to professionals who are better described as application developers who have limited data experience. Given the diversity of users (who come from diverse organizations with diverse requirements), I got diverse ideas about what a warehouse is (and is not), plus whether or not Hadoop … Continue reading Can Hadoop Replace a Data Warehouse?

Data Exploration vs. Data Discovery

Much has been made of the term “data discovery.” It is used profusely in the BI market and describes a fundamental transition in BI tools as emphasis has shifted from reporting to looking for new trends. Companies such as Tableau and Qlik altered the BI landscape with their ability to help business analysts discover new … Continue reading Data Exploration vs. Data Discovery

Big Data Analytics – Data Exploration

Exploratory data analysis is a concept developed by John Tukey (1977) that offers a new perspective on statistics. Tukey’s idea was that in traditional statistics, the data was not being explored graphically; it was just being used to test hypotheses. The first attempt to develop a tool was made at Stanford; the project was … Continue reading Big Data Analytics – Data Exploration

What’s the difference between data lakes and data warehouses?

If you’ve heard the debate among IT professionals about data lakes versus data warehouses, you might be wondering which is better for your organization. You might even be wondering how these two approaches are different at all. When you’re first learning about data lakes, you may initially feel like you’ve been down this path before. … Continue reading What’s the difference between data lakes and data warehouses?

5 Essential Skills Every Big Data Analyst Should Have

In short, anybody can become a Big Data analyst. All they need to do is master the five essential skills every data analyst should know. Essential big data skill #1: Programming. Learning how to code is an essential skill in the Big Data analyst’s arsenal. You need to code to conduct numerical and statistical analysis … Continue reading 5 Essential Skills Every Big Data Analyst Should Have

Big Data Analytics – Data Collection

Data collection plays the most important role in the Big Data cycle. The Internet provides almost unlimited sources of data on a variety of topics. The importance of this area depends on the type of business, but traditional industries can acquire diverse sources of external data and combine them with their transactional data. For … Continue reading Big Data Analytics – Data Collection