Lecture 12: MongoDB

Spatial Database Systems

J Mwaura

MongoDB

Supports scaling and geospatial indexing capabilities

No concept of tables, rows or columns

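Traditionally no relational features such as JOINs or foreign keys; multi-document ACID transactions were only added in version 4.0, and the $lookup aggregation stage offers a join-like operation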

MongoDB stores data as Binary JSON Documents (also known as BSON)

MongoDB is built for scalability, performance and high-availability

MongoDB provides high availability, scalability and partitioning at the cost of some consistency and transactional support

MongoDB is a platform of choice for applications that need a flexible schema, speed and partitioning capabilities; it may be less suited to applications that require strict consistency and atomicity

MongoDB uses a JSON (JavaScript Object Notation) based document store to store the data

Documents have a dynamic schema: documents in the same collection can have different fields or structure, and common fields can hold different types of data

It is implemented in C++

Features of MongoDB

MongoDB supports secondary indexes and lets users query using query documents

It provides support for atomic updates (atomicity) at per document level

It provides replica sets, a form of master-slave (primary-secondary) replication with automated failover, and built-in horizontal scaling via automated range-based partitioning (sharding)
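Range-based partitioning can be pictured with a small sketch. This is a toy illustration in plain JavaScript, not MongoDB's actual routing code; the chunk boundaries and the shard names (shardA, shardB, shardC) are hypothetical.

```javascript
// Hypothetical chunk table: each chunk owns the half-open key range [min, max).
const chunkTable = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardC" }
];

// Return the name of the shard whose range contains the given key.
function shardFor(key) {
  const chunk = chunkTable.find(c => key >= c.min && key < c.max);
  return chunk.shard;
}

console.log(shardFor(42));  // falls in the first range
console.log(shardFor(100)); // the lower bound is inclusive, so second range
console.log(shardFor(999)); // falls in the last range
```

Because neighbouring key values land in the same range, queries over a range of keys only need to touch the shards whose ranges overlap it.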

MongoDB - Data Model

Polymorphic/flexible/dynamic schema - documents within the same collection can have the same or different sets of fields or structure, and even common fields can store different types of values across documents
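A minimal sketch of what a dynamic schema looks like, using plain JavaScript objects as stand-ins for documents in one collection. The field names and values are made up for illustration; no MongoDB server is involved.

```javascript
// Three "documents" that could live in the same collection.
// All field names and values are hypothetical.
const siteDocs = [
  { _id: 1, name: "Site A", population: 1000 },
  { _id: 2, name: "Site B", location: { type: "Point", coordinates: [37.0, -1.1] } },
  { _id: 3, name: "Site C", population: "unknown" } // same field, different type
];

// Collect, for every field name, the set of JavaScript types seen for it.
const fieldTypes = {};
for (const doc of siteDocs) {
  for (const [key, value] of Object.entries(doc)) {
    if (!fieldTypes[key]) fieldTypes[key] = new Set();
    fieldTypes[key].add(typeof value);
  }
}

// Print each field with the types it takes across the documents.
for (const [key, types] of Object.entries(fieldTypes)) {
  console.log(key + ": " + [...types].join(", "));
}
```

The population field appears with two different types and the location field appears in only one document, which a relational table with fixed columns would not allow.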

JSON

JSON stands for JavaScript Object Notation

JSON is easy for humans to read and edit, and easy for computers to store, parse and output

JSON's basic data types are strings, numbers (including floating point), Boolean values, null, arrays and objects (hashes)

JSON Structure

JSON documents begin and end with curly braces

A JSON document is composed of fields; each field has a key and a value, and fields are separated by commas

JSON documents are analogous to objects, structs, maps, or dictionaries, in other languages

All keys are surrounded by double quotes

Keys and values must be separated by colons

Arrays (defined by square brackets []) and objects can themselves be values and nested in any combination

Arrays contain an ordered, comma-separated list of values and are analogous to array lists, vectors, or sequences
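The rules above can be checked with JavaScript's built-in JSON.parse. The document below is a made-up example; it uses every basic data type and nests an array and an object as values.

```javascript
// A hypothetical JSON document: double-quoted keys, colons, commas, nesting.
const jsonText = `{
  "name": "Campus gate",
  "open": true,
  "elevation": 1520.5,
  "notes": null,
  "tags": ["entrance", "landmark"],
  "location": { "type": "Point", "coordinates": [37.0, -1.1] }
}`;

// Parse the text into a JavaScript object and navigate into nested values.
const parsedDoc = JSON.parse(jsonText);
console.log(parsedDoc.tags[1]);                 // second element of the array
console.log(parsedDoc.location.coordinates[0]); // value nested inside an object

// Keys in single quotes break the rules, so the parser rejects the text.
let singleQuotesRejected = false;
try {
  JSON.parse("{'name': 'x'}");
} catch (err) {
  singleQuotesRejected = err instanceof SyntaxError;
}
console.log(singleQuotesRejected);
```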

Binary JSON (BSON)

BSON is an extended form of the JSON data model - it supports embedding of arrays and objects within other arrays and objects

BSON allows navigation into objects, to build indexes and to match objects against query expressions on both top-level and nested BSON keys

The Identifier (_id)

A key is used for querying data from the documents

A key uniquely identifies each document within a collection. This is referred to as _id in MongoDB

This key value is immutable and can be of any data type except arrays
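The _id rules can be mimicked with a toy in-memory collection. This is only a sketch of the behaviour described above, not the MongoDB server or driver; the class name TinyCollection and the sample documents are hypothetical.

```javascript
// Toy collection that enforces the _id rules: present, unique, never an array.
class TinyCollection {
  constructor() {
    this.docs = new Map(); // serialized _id -> document
  }

  insert(doc) {
    if (doc._id === undefined) throw new Error("this sketch requires an explicit _id");
    if (Array.isArray(doc._id)) throw new Error("_id cannot be an array");
    const key = JSON.stringify(doc._id); // keeps 1 and "1" distinct
    if (this.docs.has(key)) throw new Error("duplicate _id: " + key);
    this.docs.set(key, doc);
  }

  findById(id) {
    return this.docs.get(JSON.stringify(id));
  }
}

const idDemo = new TinyCollection();
idDemo.insert({ _id: 1, name: "first" });
idDemo.insert({ _id: "1", name: "second" }); // a string _id, different from the number 1

// A second document with _id 1 is refused.
let duplicateRejected = false;
try {
  idDemo.insert({ _id: 1, name: "clash" });
} catch (err) {
  duplicateRejected = true;
}
console.log(idDemo.findById(1).name, duplicateRejected);
```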

Databases, Collections & Documents

In MongoDB, a database serves as a namespace for collections. Each database and collection combination defines a namespace, e.g. mydb.mycollection

Collections store individual records called documents

This hierarchy allows us to group together records of similar items within collections, and group collections required for the same application within the same database

Capped Collection

MongoDB supports capped (fixed-size) collections; it uses one internally to maintain its replication log

A capped collection stores documents in insertion order; once the collection reaches its size limit, the oldest documents are removed in FIFO (First In, First Out) order

A capped collection guarantees preservation of the data in insertion order, hence

  • Queries retrieving data in insertion order return results quickly and don't need an index
  • Updates that change the document size are not allowed
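The FIFO behaviour can be sketched with a small class. This is a simplification: the limit here is a document count chosen for illustration, and the class name CappedList is hypothetical.

```javascript
// Toy capped collection: keeps at most `max` documents, in insertion order.
class CappedList {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);      // newest document goes to the end
    if (this.docs.length > this.max) {
      this.docs.shift();      // oldest document is dropped first (FIFO)
    }
  }

  find() {
    return [...this.docs];    // already in insertion order, no index needed
  }
}

const cappedLog = new CappedList(3);
for (let i = 1; i <= 5; i++) {
  cappedLog.insert({ seq: i });
}

// Documents 1 and 2 were pushed out; 3, 4 and 5 remain, oldest first.
console.log(cappedLog.find().map(d => d.seq));
```

In the mongo shell, a capped collection is created by passing the capped and size options (and optionally max) to db.createCollection.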

Mongo Installation - Ubuntu

  1. sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 4B7C549A058F8B6B
  2. echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb.list
  3. sudo apt update && sudo apt install mongodb-org
  4. sudo systemctl enable mongod && sudo systemctl start mongod
  5. mongod --version confirms that the core database server (daemon) is installed
  6. mongo starts the database shell

Securing the Deployment

In MongoDB, authentication and authorization are supported at the per-database level

Users are created in the context of a single logical database (their authentication database); since MongoDB 2.6 the user records themselves are stored in the system.users collection of the admin database

system.users - This collection stores the information needed for authentication and authorization: each user's credentials for authentication and the user's privileges for authorization

The built-in roles available in MongoDB include: read, readWrite, dbAdmin, userAdmin, clusterAdmin, readAnyDatabase, readWriteAnyDatabase, userAdminAnyDatabase, dbAdminAnyDatabase

Using Mongo Shell

Create the admin

>use admin

>db.createUser({user: "AdminUser", pwd: "password", roles:["userAdminAnyDatabase"]})

>db.createUser({user:"josh",pwd:"root",roles:["root"]})

...

>db.auth("AdminUser", "password")

More commands >> reference

MongoDB Statements

Syntax >use DATABASE_NAME

>use mydb

A database is only kept once it holds data, so insert at least one document >db.users.insert({ id: 1})

Show databases >show dbs

Find current selected database >db

Delete database >db.dropDatabase()

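Copy database >db.copyDatabase(fromdb, todb, fromhost, username, password, mechanism) (deprecated in MongoDB 4.0 and removed in 4.2; on newer versions use mongodump and mongorestore instead)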

Create collection >db.createCollection(name, options)

Get collections information >db.getCollectionInfos(); e.g. >db.getCollectionInfos({ name: "accounts" });

Insert document >db.COLLECTION_NAME.insert(document)

Query >db.COLLECTION_NAME.find(condition)

More commands >> reference
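The condition passed to find() is itself a document. The sketch below imitates only the simplest case, equality on top-level fields, over an in-memory array; it is not how the server evaluates queries, and the helper names matchesCondition and findDocs and the sample data are hypothetical.

```javascript
// Sample "collection" as a plain array of documents (made-up data).
const accountDocs = [
  { _id: 1, owner: "amina", type: "savings" },
  { _id: 2, owner: "brian", type: "current" },
  { _id: 3, owner: "amina", type: "current" }
];

// A document matches when every field in the condition equals its own value.
function matchesCondition(doc, condition) {
  return Object.entries(condition).every(
    ([field, expected]) => doc[field] === expected
  );
}

// Like find(condition): an empty condition matches every document.
function findDocs(docs, condition = {}) {
  return docs.filter(doc => matchesCondition(doc, condition));
}

console.log(findDocs(accountDocs, { owner: "amina" }).map(d => d._id));
console.log(findDocs(accountDocs, { owner: "amina", type: "current" }).map(d => d._id));
console.log(findDocs(accountDocs).length);
```

Real query documents also accept operators and nested keys, which this sketch leaves out.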

Controlling access over network

Always disable the HTTP status page and the REST configuration in the production environment (both were removed in MongoDB 3.6, so this applies to older versions)

Use firewalls to control access within a network:

  • it can be used to allow access from a specific IP address to specific IP ports
  • it can be used to stop any access from any untrusted hosts

Encrypt data - file-system-level encryption and permissions should be implemented to prevent unauthorized access to the files

Encrypt communication - it is recommended to use TLS/SSL for communication between the server and the client

  • Generate the .pem file, which contains the public key certificate and the private key

CRUD Operations

Aggregation

Data Models

Transactions

Indexes

Security

Change Streams

Replication

Sharding

Administration

Storage

End of Lecture 12

That's it!

Please send queries about this lesson to: jmwaura@jkuat.ac.ke
