MongoDB is an open-source NoSQL DBMS that uses a document-oriented database model. It supports various forms of data. However, data consumption in MongoDB is high due to denormalization. So, here is a curated list of the top 9 MongoDB alternatives. This list includes commercial as well as open-source software with popular features and latest download … Continue reading 9 Best MongoDB alternatives in 2019
Month: May 2019
1. Objective Queuing is one of the traditional models for messaging, and Kafka supports it. So, let’s begin with a brief introduction to Kafka as a messaging system, which will help us understand Kafka queuing well. Moreover, we will see some applications of the Kafka queue to make the concept clearer. So, let’s start … Continue reading Kafka Queuing: Apache Kafka as a Messaging System
No matter how severe, hurricanes and other disasters are a concern for both individuals and businesses that operate in affected areas. For businesses, such disasters can severely threaten their reputation, revenue, and competitiveness. Take Hurricane Sandy, which impacted hundreds of companies. Data recovery firms worked for weeks to try to restore data lost in the … Continue reading Is your data safe during hurricane season?
The story of data management has always been about greater simplicity. Organizations that once kept data siloed in different databases have found ways to connect it. When different types of data became more prevalent, such as social media and Internet of Things (IoT) data, that data was integrated, too. All to easily deliver robust insights. … Continue reading How to control costs and simplify life with IBM Hybrid Data Management Platform
1. Objective We will learn the whole concept of creating DataFrames in SparkR. A distributed collection of data organized into named columns is what we call a SparkDataFrame in SparkR. Also, there are many ways to create SparkDataFrames in SparkR. 2. What is a SparkDataFrame? Data is organized as a distributed collection of … Continue reading Ways to Create SparkDataFrames in SparkR
What is Hadoop? Apache Hadoop is open-source software that lets a network of computers solve problems that require massive datasets and computation power. Hadoop is highly scalable: it is designed to accommodate computation ranging from a single server to a cluster of thousands of machines. While Hadoop is written in Java, you can program … Continue reading Hadoop for Data Science
The Apache Hadoop modules: Hadoop Common: the common utilities that support the other Hadoop modules. HDFS: the Hadoop Distributed File System, which provides high-throughput access to application data. Hadoop YARN: a framework for job scheduling and efficient management of cluster resources. MapReduce: highly efficient methodology for parallel processing of huge … Continue reading The Hadoop Module & High-level Architecture