MongoDB is a document database (i.e. a NoSQL database).
It is advertised as cross-platform, providing high performance, high availability, and easy scalability.
MongoDB is a schemaless database. Every document can have a different structure, if you want.
MongoDB does not enforce relationships between Collections or Documents; there are no foreign keys.
If you want data to be used together, save it as one document.
Instead of relying on foreign keys, just duplicate the data wherever it is needed.
MongoDB says disk space is cheap.
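A minimal sketch of the duplication idea, in plain JavaScript. The customer and order shapes and every field name here are hypothetical; the point is only that the order carries its own copy of the customer data instead of a key pointing at it.

```javascript
// Hypothetical customer document.
const customer = { _id: "c1", name: "Steve", city: "Springfield" };

// Build an order that carries its own copy of the customer data it needs,
// so reading the order never requires a second lookup.
function buildOrder(orderId, customer, items) {
  return {
    _id: orderId,
    customer: { name: customer.name, city: customer.city }, // duplicated, not referenced
    items: items,
  };
}

const order = buildOrder("o1", customer, [{ sku: "A-1", qty: 2 }]);
console.log(JSON.stringify(order));
```

The cost is that a change to the customer has to be written to every copy.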
A database contains multiple Collections.
Each database on a server has its own section of the file system.
A collection contains multiple Documents.
A collection is analogous to a SQL Table.
MongoDB Collections do not enforce a schema, meaning that documents in the same collection do not have to have the same fields.
Fields with the same Key do not need to contain the same data types.
What to divide into multiple Collections, and what to keep in one Collection?
Put multiple types in one Collection:
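- A find query cannot join Collections. A join needs a $lookup stage in an aggregation pipeline.
- Aggregations across multiple Collections also need $lookup (or $graphLookup), which is more work than reading one Collection.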
One type per Collection:
- To deserialize, you must know what type you are dealing with.
- If mixing types in a Collection, make sure they can be easily differentiated. Maybe by a "type" field.
- Indexes on the Collection will be updated whether or not the new/edited record contains the field being indexed.
- If the Indexed fields are all shared fields, that's an indication these types should be in the same Collection.
- If only part of the data is frequently updated, it should probably be its own Collection. Otherwise you'll get write conflicts.
- Ex: An "auction item" has an array of "bids" under it. Now it's hard for people to all save their bids because every bid writes to the same document and the writers block each other.
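A minimal sketch of telling mixed types apart by a "type" field when deserializing, in plain JavaScript. The document shapes, the type names, and the deserialize helper are all hypothetical.

```javascript
// Hypothetical documents sharing one collection, told apart by a "type" field.
const docs = [
  { _id: "1", type: "customer", name: "Steve" },
  { _id: "2", type: "agency", title: "Parks Department" },
];

// One parser per type; each returns a plain object in the shape the caller expects.
const parsers = new Map([
  ["customer", (d) => ({ kind: "Customer", label: d.name })],
  ["agency", (d) => ({ kind: "Agency", label: d.title })],
]);

function deserialize(doc) {
  const parse = parsers.get(doc.type);
  if (!parse) {
    // Fail loudly on a type we do not know how to read.
    throw new Error("Unknown document type: " + doc.type);
  }
  return parse(doc);
}

const parsed = docs.map(deserialize);
console.log(parsed);
```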
Many suggestions online say not to nest data more than 1 level deep. But that probably depends on how variable the structure is and how precise a programmer you are.
- Ex: Don't have an array of objects which also contain arrays. At least, not when you'll want to search by those values.
count the documents
you can enter any "find" query into the method
db.getCollection("name").countDocuments({"field":"value"});
the raw integer result is printed at the end of the console output
insert multiple documents into collection
var allDocs=
[
{ "_id":"1" },
{ "_id":"2", "parentDocId":"1" },
];
db.collection_name.insertMany(allDocs); //insert() is deprecated in current shells
remove all documents from collection
db.collection_name.deleteMany({}); //remove() is deprecated in current shells
A document contains multiple Key/Value Pairs called Fields.
A document is analogous to a SQL Row/Record.
Documents are displayed, edited, etc. as JSON objects.
Ex:
{
_id: ObjectId(7df78ad8902c),
title: 'Test',
comments: [
{
user: 'Steve',
comment: 'Test comment'
}
]
}
MongoDB provides a default key Field called "_id".
You can specify the _id when inserting a record, or allow MongoDB to generate it.
The default _id is an ObjectId: 12 bytes made up of a 4-byte timestamp, a 5-byte random value (older versions used a machine id and process id here), and a 3-byte incrementing counter.
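Because a generated _id starts with a timestamp, the creation time can be read back out of it. A minimal sketch in plain JavaScript, assuming the id is available as its 24-character hex string; objectIdToDate is a hypothetical helper name, not part of any driver.

```javascript
// Read the creation time out of an ObjectId's 24-character hex string.
// The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("Expected a 24-character hex string");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdToDate("507f1f77bcf86cd799439011");
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```

In the shell, ObjectId("...").getTimestamp() gives the same information.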
Mongo's query language is called MQL.
//comment out a line of MQL
Find will return a list of records.
Where field equals X:
db.getCollection('MyCollection').find({"myField": "X"})
db.getCollection('MyCollection').find({"myObject.myField": "X"})
db.getCollection('MyCollection').find({"myField": UUID("12345678-1234-1234-1234-123456789012")})
Where field exists:
db.getCollection('MyCollection').find({"myObject.myField":{$exists: true}})
If any link in the path to the field does not exist, then the field does not exist.
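The path rule can be pictured with a small in-memory emulation. This is a simplified sketch in plain JavaScript, not how the server evaluates $exists; fieldExists is a hypothetical helper, and it makes no attempt to reproduce how the server treats arrays along the path.

```javascript
// Simplified emulation of {"a.b.c": {$exists: true}} for nested objects.
// Assumption: arrays along the path are not given any special treatment.
function fieldExists(doc, path) {
  let current = doc;
  for (const key of path.split(".")) {
    const isObject = current !== null && typeof current === "object";
    if (!isObject || !Object.prototype.hasOwnProperty.call(current, key)) {
      return false; // a missing link means the field does not exist
    }
    current = current[key];
  }
  return true; // a field explicitly set to null still counts as existing
}

console.log(fieldExists({ myObject: { myField: "X" } }, "myObject.myField")); // true
console.log(fieldExists({ other: 1 }, "myObject.myField")); // false
```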
And
db.getCollection('MyCollection').find({$and: [{"myFieldA": "A"}, {"myFieldB": "B"}]})
Array is at least 1-element long
db.getCollection('MyCollection').find({'myArray.0': {$exists: true}})
(indexing starts at 0)
Array is exactly 2-elements long
db.getCollection('MyCollection').find({'myArray': {$size: 2}})
Sort by age descending:
db.collection.find().sort( { age: -1 } )
Return just the first X records.
db.collection.find().sort( { age: -1 } ).limit(50)
If sort gives you an "exceeded memory limit" then add a limit to the number of results.
db.getCollection('customers').distinct('firstName')
Returns all the distinct values of the field "firstName" from collection "customers".
String minus last two characters:
myField: { $substrCP: [ "$originField", 0, { $subtract: [ { $strLenCP: "$originField" }, 2 ] } ] }
Last two characters of string:
myField: { $substrCP: [ "$originField", { $subtract: [ { $strLenCP: "$originField" }, 2 ] }, 2 ] }
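For checking expected results outside the database, the same two slices can be written in plain JavaScript. This is a sketch only; dropLastTwo and lastTwo are hypothetical helper names. It splits on Unicode code points, which is what $strLenCP counts, instead of on bytes or UTF-16 units.

```javascript
// Array.from splits a string into Unicode code points, so an emoji or an
// accented letter counts as one character.
function dropLastTwo(s) {
  return Array.from(s).slice(0, -2).join("");
}

function lastTwo(s) {
  return Array.from(s).slice(-2).join("");
}

console.log(dropLastTwo("hello")); // hel
console.log(lastTwo("hello")); // lo
```

Strings shorter than two characters come back empty or whole here; the pipeline expressions above do not guard that case.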
String contains
db.getCollection('customers').find({fullName: { $regex: '.*Steve.*' } })
db.getCollection('customers').find({fullName: { $regex: /.*Steve.*/ } })
Matching is case-sensitive. Add $options: 'i' next to $regex to ignore case.
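When the search text comes from a user, it may contain characters that mean something in a regular expression. A minimal sketch in plain JavaScript that builds a "contains" filter with the text escaped and case ignored; escapeRegex and containsFilter are hypothetical helper names.

```javascript
// Escape regex metacharacters so the text is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a find() filter for "field contains text", ignoring case.
function containsFilter(field, text) {
  return { [field]: { $regex: escapeRegex(text), $options: "i" } };
}

const filter = containsFilter("fullName", "steve");
console.log(JSON.stringify(filter));
// {"fullName":{"$regex":"steve","$options":"i"}}
```

The result can be passed straight to find, for example db.getCollection('customers').find(filter).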
Count results (returns integer):
db.getCollection('MyCollection').find({"myField": "X"}).count()
Each stage (array element) in an aggregate pipeline can be mixed up in any order, repeated, etc.
Ex: You can have three "match" expressions, then a "replaceRoot", then another "match".
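Since a pipeline is just an array of stage objects, it can be assembled in code before it is run. A minimal sketch in plain JavaScript; buildPipeline and its options are hypothetical, and the age and address fields are assumed for illustration.

```javascript
// Assemble a pipeline from optional stages. Stages are plain objects in an
// array, so they can be pushed conditionally and repeated.
function buildPipeline(options) {
  const pipeline = [];
  if (options.state) {
    pipeline.push({ $match: { "address.state": options.state } });
  }
  if (options.minAge !== undefined) {
    pipeline.push({ $match: { age: { $gte: options.minAge } } });
  }
  // Raise the nested address to be the root of each result.
  pipeline.push({ $replaceRoot: { newRoot: "$address" } });
  if (options.city) {
    // After $replaceRoot, city sits at the top level.
    pipeline.push({ $match: { city: options.city } });
  }
  return pipeline;
}

const pipeline = buildPipeline({ state: "state", minAge: 30, city: "city" });
console.log(JSON.stringify(pipeline, null, 2));
```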
Just like find:
db.getCollection('Customers').aggregate([
{ $match: { _id: UUID("customer's uuid") } }
])
Find, then raise a nested document to be the new root of each result:
//given this customer format
{
_id: UUID("uuid"),
age: 35,
address: {
street: "street",
city: "city",
state: "state"
}
}
db.getCollection('Customers').aggregate([
{ $match: { _id: UUID("customer's uuid") } },
{ $replaceRoot: { newRoot: "$address" } } //pulls all of address up to be the root
])
//results in
{
street: "street",
city: "city",
state: "state"
}
db.getCollection('Customers').aggregate([
{ $match: { _id: UUID("customer's uuid") } },
{ $replaceRoot: { newRoot: { age: "$age", city: "$address.city" } } } //flattens different levels together
])
//results in
{
age: 35,
city: "city"
}
Group by:
Start with the group by id, then add as many aggregations as you want.
db.getCollection('Customers').aggregate([
{ $group: { _id: "$idField", arrayA: { $addToSet: { "fieldA":"$aValue", "fieldB":"$bValue" } } } }
])
A group with a multi-part key
db.getCollection('Customers').aggregate([
{ $group: { _id: { a: "$a", b: "$b" } } }
])
"addToSet" creates an array of unique values
db.getCollection('Customers').aggregate([
{ $group: { _id: "$idField", arrayA: { $addToSet: "$aField" }, arrayB: { $addToSet: "$bField" } } }
])
Don't look for a DISTINCT operation in aggregate pipelines. There isn't one; group by is the only option.
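A minimal sketch of getting distinct values out of a pipeline by grouping on the field, in plain JavaScript. distinctPipeline is a hypothetical helper name; the $sort stage is optional and only there to make the output order predictable.

```javascript
// Build a pipeline that yields one result per distinct value of `field`.
function distinctPipeline(field) {
  return [
    { $group: { _id: "$" + field } }, // one group per distinct value
    { $sort: { _id: 1 } }, // optional: sorted output
  ];
}

console.log(JSON.stringify(distinctPipeline("firstName")));
// [{"$group":{"_id":"$firstName"}},{"$sort":{"_id":1}}]
```

Passed to aggregate, each result has the shape { _id: <value> }, so the distinct values are read from the _id field.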
AddFields:
Add a new field to the documents
{ $addFields: { <newField>: <expression>, ... } }
Project to filter an array:
db.getCollection('MyCollection').aggregate([
{
"$project" : {
"field_to_keep_as_is": 1,
"my_filtered_array" : {
"$filter" : {
"input" : "$my_array",
"as" : "my_array", /*defaults to "this"*/
"cond" : {
/*$eq: ["$$my_array.some_field", "some_value" ]*/ /*if you want to check for a value*/
$not: ["$$my_array.field_that_might_not_exist"] /*$exists doesn't work in here*/
}
}
}
}
}
]);
Simplify an array
db.getCollection('poc_agencies').aggregate([
{ $project: {
"array_of_strings": {
$reduce: {
input: "$input_array_of_objects",
initialValue: [],
in: { $concatArrays: [ "$$value", ["$$this.keep_just_this_field"] ] }
}
}
}
}
]);
- $$value refers to the current accumulated value
- $$this refers to the next array element being operated on
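The same fold can be written with JavaScript's own Array reduce, which may make the two variables easier to picture. A sketch with made-up sample data; the field names are the placeholders used in the pipeline above.

```javascript
// Sample input standing in for "$input_array_of_objects".
const inputArrayOfObjects = [
  { keep_just_this_field: "a", other: 1 },
  { keep_just_this_field: "b", other: 2 },
];

// value plays the role of $$value (the accumulated result so far),
// item plays the role of $$this (the element being processed).
const arrayOfStrings = inputArrayOfObjects.reduce(
  (value, item) => value.concat([item.keep_just_this_field]),
  [] // same as initialValue: []
);

console.log(arrayOfStrings); // [ 'a', 'b' ]
```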
Unwind, to break an array into individual objects
db.getCollection('MyCollection').aggregate([
{
$unwind: "$my_array"
}
]);
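What $unwind does to each document can be emulated in memory. A simplified sketch in plain JavaScript with made-up sample data; unwind is a hypothetical helper, and it only covers the default behavior (documents whose array is missing, null, or empty produce no output).

```javascript
// Emulate { $unwind: "$my_array" }: one output document per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) => {
    const value = doc[field];
    if (value === undefined || value === null) {
      return []; // missing or null: the document is dropped
    }
    if (!Array.isArray(value)) {
      return [doc]; // a non-array value is treated as a single element
    }
    // An empty array maps to no documents, so the document is dropped too.
    return value.map((element) => ({ ...doc, [field]: element }));
  });
}

const unwound = unwind(
  [{ _id: 1, my_array: ["a", "b"] }, { _id: 2, my_array: [] }, { _id: 3 }],
  "my_array"
);
console.log(JSON.stringify(unwound));
// [{"_id":1,"my_array":"a"},{"_id":1,"my_array":"b"}]
```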
Flatten a recursive lookup
db.getCollection('agencies').aggregate([
{ $graphLookup: {
from: 'agencies', //name of the collection to search
startWith: '$_id', //expression to start from, probably the same as connectFromField but with a $ symbol
connectFromField: '_id', //this field in parent
connectToField: 'parentAgencyId', //connects to this field in child
as: 'descendantAgencyIds', //put all the results into an array named this
maxDepth: 100 //stop recursive lookup at this depth, to avoid infinite loops
}
}
])
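To picture what the recursive lookup collects, here is an in-memory emulation for a single starting agency. This is a sketch in plain JavaScript with made-up sample data, not how the server runs it; findDescendants is a hypothetical helper. As in $graphLookup, a maxDepth of 0 means direct children only.

```javascript
// Collect every descendant of rootId by following parentAgencyId links,
// one level at a time, stopping after maxDepth extra levels.
function findDescendants(agencies, rootId, maxDepth) {
  const found = [];
  const seen = new Set(); // guards against cycles in the data
  let frontier = [rootId];
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const agency of agencies) {
      if (frontier.includes(agency.parentAgencyId) && !seen.has(agency._id)) {
        seen.add(agency._id);
        found.push(agency);
        next.push(agency._id);
      }
    }
    frontier = next;
  }
  return found;
}

const agencies = [
  { _id: 1, parentAgencyId: null },
  { _id: 2, parentAgencyId: 1 },
  { _id: 3, parentAgencyId: 2 },
  { _id: 4, parentAgencyId: 1 },
];
console.log(findDescendants(agencies, 1, 100).map((a) => a._id)); // [ 2, 4, 3 ]
```

Note that the array named by "as" holds whole documents, not just ids.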
Overwrite an entire collection with the results of this pipeline
db.getCollection('input_collection_name').aggregate([
{ $out : { db: "database_name", coll: "output_collection_name" } }
])
$out must be the last step in the pipeline
db.collection.updateOne()
db.collection.updateMany()
db.collection.update()
db.runCommand(
{
update: <collection>,
updates: [
{
q: <query>,
u: <document or pipeline>,
upsert: <boolean>,
multi: <boolean>,
collation: <document>,
arrayFilters: <array>,
hint: <document|string>
},
...
],
ordered: <boolean>,
writeConcern: { <write concern> },
bypassDocumentValidation: <boolean>,
comment: <any>
}
)
comment:
Optional.
A user-provided comment to attach to this command. Once set, this comment appears alongside records of this command in the following locations:
- mongod log messages
- database profiler output
- currentOp output
ordered:
Optional. Defaults to true.
True: when an update statement fails, return without performing the remaining update statements.
False: when an update fails, continue with the remaining update statements, if any.
updates:
An array of one or more update statements to perform on the named collection.
{
q: <query>,
u: <document or pipeline>,
upsert: <boolean>,
multi: <boolean>,
collation: <document>,
arrayFilters: <array>,
hint: <document|string>
}
u:
Any of:
- document containing update operator expressions
- a replacement document
- an aggregation pipeline (MongoDB 4.2 or later)
upsert:
True: if no documents match the query, perform an insert.
Defaults to false.
multi:
True: update all documents that meet the query criteria.
False: update only 1 document.
collation:
For string comparisons.
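A minimal sketch of putting the command document together in plain JavaScript, using only the fields described above. buildUpdateCommand is a hypothetical helper, and the collection and field names in the example are made up.

```javascript
// Build the document that would be passed to db.runCommand() for an update.
function buildUpdateCommand(collection, statements, ordered = true) {
  return {
    update: collection,
    updates: statements.map((s) => ({
      q: s.q, // which documents to update
      u: s.u, // what to change
      upsert: s.upsert === true, // insert when nothing matches; off unless asked for
      multi: s.multi === true, // update every match; off unless asked for
    })),
    ordered: ordered, // true: stop at the first failing statement
  };
}

const command = buildUpdateCommand("customers", [
  { q: { firstName: "Steve" }, u: { $set: { active: true } }, multi: true },
]);
console.log(JSON.stringify(command, null, 2));
```

In the shell the result would be run with db.runCommand(command).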