All you need to know about NoSQL

priya pille

NoSQL databases, short for "not only SQL", are databases that store data in forms other than the tabular, relational structures we know from traditional databases. You may have come across the term RDBMS (relational database management system), where data is structured into related tables and you run a query to retrieve it. NoSQL databases instead store data as documents, graphs, key-value pairs and wide columns.

NoSQL databases offer flexibility, speed and scalability when working with Big Data.

This article is an introduction to what a NoSQL database is and how it works. Data science is the new dot-com, and it is here to stay. If you are looking to kickstart a career in this field, are considering a career change, are returning from a career break and unsure how to proceed, or want a roadmap to becoming a Databricks Certified Associate Developer for Apache Spark, data science is well worth considering.

What is a NoSQL database?

Whether “NoSQL” stands for “No SQL” or “Not Only SQL” is debatable. For our purposes, let us say it covers all databases other than the ones that store data in relational tables.

They began as a response to huge data centres that occupied vast amounts of space, were complex, and were costly at the same time. NoSQL databases aimed to optimise storage by reducing data duplication and offering flexible storage options. As data kept pouring in, storage costs grew, and there was a need to store data on an as-is basis rather than tweaking it to fit a particular schema or table.

Further, as software development adopted agile, applications turned to real-time data streaming, and cloud hosting came into the picture, the need for teams to iterate faster and make dynamic changes to their stack became prominent.

NoSQL database features:

While each NoSQL database offers its own distinct features, conceptually they share the following:

Schema and relations flexibility: In relational databases, data is stored according to a strict schema, meaning a fixed structure of tables, columns and data types defined in advance. The schema dictates where each piece of data goes. On the flip side, you cannot store data that does not adhere to the defined schema. Most NoSQL databases do not enforce a fixed structure. In a document database, for example, data is stored as documents whose fields can vary from one record to the next. This offers flexibility both in storage and in putting different data types together. Similarly, you can put a set of unrelated data into the same document in a NoSQL database.
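To make the contrast concrete, here is a minimal sketch in Python. It assumes the standard-library sqlite3 module as a stand-in for a relational database and a plain dictionary of JSON strings as a stand-in for a document collection. The table, the sample names and the helper functions are all hypothetical, chosen only for illustration, and this is not the API of any particular NoSQL product.

```python
import json
import sqlite3

# Relational side: the table structure is fixed before any data arrives.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Asha"))


def insert_with_extra_column(connection):
    """Try to store a field the schema does not define.

    Returns True if the insert succeeded, False if the database rejected it.
    """
    try:
        connection.execute(
            "INSERT INTO users (id, name, hobbies) VALUES (?, ?, ?)",
            (2, "Ravi", "chess"),
        )
        return True
    except sqlite3.OperationalError:
        # SQLite reports that the table has no column named "hobbies".
        return False


# Document side: a dict of JSON strings stands in for a document collection.
documents = {}


def put_document(collection, doc_id, doc):
    """Store any JSON-serialisable dict; no schema is checked."""
    collection[doc_id] = json.dumps(doc)


def get_document(collection, doc_id):
    """Load a document back into a Python dict."""
    return json.loads(collection[doc_id])
```

With these helpers, insert_with_extra_column(conn) is rejected by the relational table, while put_document(documents, "u2", {"name": "Ravi", "hobbies": ["chess"]}) is accepted alongside a document that has no hobbies field at all.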

Scaling: Let us understand scaling first. By scaling, we mean increasing the number of read and write requests the database can handle at a time, that is, the load your database can take. There are two types of scaling: vertical and horizontal. In vertical scaling, as seen in traditional databases, we scale up by adding computational power to a single server, such as a faster processor or more RAM. In horizontal scaling, we add parallel servers and distribute the database across them, which means the load is distributed. Many NoSQL databases natively support ‘sharding’, where the dataset is divided into shards to improve scalability. You will come across this concept if you look up the essentials for becoming a Databricks Certified Associate Developer for Apache Spark.
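As a rough illustration of sharding, the sketch below assumes the simplest scheme, hash-based sharding, where a key's hash decides which shard holds it. Real databases use more elaborate strategies. The ShardedStore class and the shard_for function are hypothetical names, and each in-memory dictionary stands in for a separate server.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store that spreads keys across several shards."""

    def __init__(self, num_shards: int):
        # Each dict stands in for one server holding part of the dataset.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        # Only the shard that owns the key is consulted.
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Because every read and write touches only one shard, adding shards spreads the load. Note that with this naive modulo scheme, changing the number of shards remaps most keys, which is one reason production systems tend to use other partitioning approaches.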

Speed in querying: NoSQL databases are often faster to query because of the way data is stored: related data is grouped together in one record rather than split across several tables, so a read can often be served without joins. They can store large amounts of data and are flexible with storage. For these access patterns, running a read or write request is therefore often faster on a NoSQL database than on a relational database.
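The difference in access pattern can be sketched as follows. This example assumes sqlite3 for the relational side and a plain dictionary for the document side, with hypothetical table names and sample data. It does not measure speed. It only shows that the relational read needs a join across two tables, while the document read is a single lookup.

```python
import sqlite3

# Relational layout: authors and their posts live in separate tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES authors(id),
        title TEXT
    );
    INSERT INTO authors VALUES (1, 'Asha');
    INSERT INTO posts VALUES (10, 1, 'Intro to NoSQL');
    INSERT INTO posts VALUES (11, 1, 'Sharding basics');
    """
)


def author_with_posts_relational(connection, author_id):
    """Reassemble an author and their post titles with a join."""
    rows = connection.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id "
        "WHERE a.id = ? ORDER BY p.id",
        (author_id,),
    ).fetchall()
    if not rows:
        return None
    return {"name": rows[0][0], "posts": [title for _, title in rows]}


# Document layout: everything about one author is grouped in one record.
author_documents = {
    1: {"name": "Asha", "posts": ["Intro to NoSQL", "Sharding basics"]},
}


def author_with_posts_document(store, author_id):
    """Fetch the whole record with a single key lookup."""
    return store.get(author_id)
```

Both functions return the same dictionary for author 1. The trade-off is that the grouped layout duplicates data if the same post needs to appear under several records.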

NoSQL databases are used in...

NoSQL databases work wonderfully with Big Data. SQL databases are well established and use mature technology. Neither relational nor NoSQL databases are “better” across the board. What we can do, though, is choose based on the deployment and use case. NoSQL databases are widely used and can prove extremely advantageous in:

· Projects and applications using Agile development

· Projects and applications that use large volumes of real-time data

· Architectures that require scaling

· Applications that require speed, such as streaming or real-time dynamic applications

To summarise

NoSQL databases were created out of a need to process Big Data, as existing databases became bogged down and could not offer the query speed the internet giants demanded. They have a dynamic schema, and the data they hold can be used for insights and predictive analysis.

Think Twitter, Facebook and Instagram: there are different types of data, in different forms, sizes and structures. This makes NoSQL a good fit for large content management systems, streaming applications and real-time analytics. But remember that you cannot write off SQL databases altogether. NoSQL is not a replacement. Choosing a database depends on the use and context.
