What is MongoDB?
MongoDB is a document-oriented database that belongs to the NoSQL family of databases.
This type of database has no tables or records; collections and documents are used instead. A collection is roughly equivalent to a table, and a document to a record, in a relational database.
Data in this database does not have a fixed structure: two documents (each similar to a record in a relational database) in the same collection can have completely different fields. Documents are stored in a format called BSON. For example, two documents of the same entity in this database might look like this:
_id – name – age
_id – family
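As a minimal sketch of the idea above, the two documents can be written as Python dictionaries. The field values and the variable names are hypothetical and only illustrate that documents in one collection need not share the same fields.

```python
# Two hypothetical documents of the same entity.
# The values are made up for illustration only.
person_a = {"_id": 1, "name": "Ali", "age": 30}
person_b = {"_id": 2, "family": "Ahmadi"}

# A plain list standing in for a collection.
people = [person_a, person_b]

# Only _id is common to both documents; the other fields differ.
shared_fields = set(person_a) & set(person_b)
print(shared_fields)
```

A relational table would force both rows into the same columns; here each document simply carries its own fields.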
What is BSON?
MongoDB represents data in a JSON-like format. Internally, documents are stored in a binary encoding of this format, which MongoDB calls BSON (Binary JSON). A fragment of a document looks like this:
"Date of Birth": "21/05/85",
"type" : "Home",
"type" : "work",
Two advantages of MongoDB over relational databases (such as MySQL) are the ability to process and search much larger volumes of data at a time, and the ability to store larger volumes of data.
Higher processing speed: On each search or write, a relational database must check many conditions, such as the relationships between tables and the validity of record values, which adds considerable RAM and CPU overhead. Because of its NoSQL structure, MongoDB only stores and searches the data, and as a result the time needed to access and store data is greatly reduced.
Higher data storage: A database can increase the amount of data it can hold in two ways: horizontal scaling and vertical scaling.
1. Vertical scaling: In this method, the data is stored on a single node, and to increase capacity we add RAM, CPU, or hard disk space to that node. MySQL is one of the databases that uses this method.
2. Horizontal scaling: In this method, the data is distributed across several nodes, and each piece of data is stored on one server. The processing load is therefore spread over the servers, and both the speed of data access and the amount of data that can be stored increase. MongoDB and Cassandra are among the databases that use this method.
As mentioned above, MongoDB can increase both the amount of data that can be stored and the speed of data access through horizontal scaling, which it calls sharding. Each partition of the data is called a shard.
Advantages of using sharding:
- As the cluster grows, the number of operations each shard performs decreases (because the operations are spread across the shards), which increases the speed of data access.
- As the number of shards increases, so does the amount of data that can be stored.
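The two points above come down to simple arithmetic. The sketch below assumes a perfectly even spread of documents and queries across shards, which real clusters only approximate; the function name and the figures are hypothetical.

```python
def per_shard_load(total_documents, queries_per_second, shard_count):
    """Average documents and queries per shard, assuming a perfectly even spread."""
    return total_documents / shard_count, queries_per_second / shard_count


# Hypothetical workload: 1,000,000 documents and 8,000 queries per second.
for shards in (1, 2, 4, 8):
    docs, qps = per_shard_load(1_000_000, 8_000, shards)
    print(f"{shards} shard(s): {docs:,.0f} documents and {qps:,.0f} queries/s per shard")
```

Each time the number of shards doubles, the share of data and work per shard halves, as long as the data really is spread evenly.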
MongoDB shards data at the collection level. This means that the data of a collection is distributed across the shards. MongoDB uses the shard key to decide how the data is distributed. The shard key is a single field or a compound of several fields that is present in every document. In general, _id can be used as the shard key.
MongoDB can distribute data by shard key in two ways: range-based partitioning and hash-based partitioning.
1. Range-Based Sharding: In this method, the range of shard key values is divided into sections called chunks. As a result, documents whose shard keys are close together are stored close to each other in the same chunk. The advantage of this method is fast lookup when searching by key or key range. The obvious problem is that the data may not be evenly distributed among the chunks: when documents arrive with sequentially increasing keys, all new data is written to the same chunk.
2. Hash-Based Sharding: In this method, MongoDB computes a hash of the shard key field and then distributes the data among the chunks using these hashes. Because the hash bears no resemblance to the original key, two documents with close keys may end up in two completely different chunks.
Unlike Range-Based Sharding, this method spreads the data evenly among the chunks, so the processing load does not fall on a single node. One of its disadvantages is that, unlike Range-Based Sharding, a range of data cannot be searched quickly, because the matching documents are scattered across chunks.
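The trade-off between the two methods can be sketched with a small simulation. This is not MongoDB's actual implementation: the chunk boundaries, the number of chunks, and the use of MD5 as the hash function are all assumptions made for illustration.

```python
import bisect
import hashlib

# Hypothetical layout: four chunks over an integer shard key.
# chunk 0: key < 250, chunk 1: 250-499, chunk 2: 500-749, chunk 3: key >= 750
RANGE_UPPER_BOUNDS = [250, 500, 750]
NUM_CHUNKS = 4


def range_chunk(key):
    """Chunk index under range-based partitioning."""
    return bisect.bisect_right(RANGE_UPPER_BOUNDS, key)


def hash_chunk(key):
    """Chunk index under hash-based partitioning (MD5 is only a stand-in)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_CHUNKS


# Sequentially increasing keys, as produced by a counter or a timestamp.
sequential_keys = range(900, 1000)
range_hits = {range_chunk(k) for k in sequential_keys}
hash_hits = {hash_chunk(k) for k in sequential_keys}
print("range-based chunks written:", range_hits)  # every insert lands in chunk 3
print("hash-based chunks written:", hash_hits)

# A range query for keys 300..400 touches a single chunk under range-based
# partitioning, while the same keys are scattered under hashing.
query_keys = range(300, 401)
print("range query, range-based:", {range_chunk(k) for k in query_keys})
print("range query, hash-based:", {hash_chunk(k) for k in query_keys})
```

With sequential keys, range-based partitioning sends every write to the last chunk, while hashing spreads the writes out; the price is that a range query can no longer be answered from one chunk.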