NEO Share

Group by in Pandas, SQL, and NoSQL

January 30, 2021 by systems

MongoDB (NoSQL database)

NoSQL refers to non-SQL, or non-relational, database design. A NoSQL database still stores data in an organized way, but not in tabular form.

There are several NoSQL databases used in the data science ecosystem. In this article, we will be using MongoDB, which stores data as documents. A document in MongoDB consists of field-value pairs. Documents are organized in a structure called a “collection”. As an analogy, we can think of documents as rows in a table and collections as tables.

The dataset is stored in a collection called marketing. Here is a document in the marketing collection that represents an observation (i.e. a row in a table).

> db.marketing.find().limit(1).pretty()
{
"_id" : ObjectId("6014dc988c628fa57a508088"),
"Age" : "Middle",
"Gender" : "Male",
"OwnHome" : "Rent",
"Married" : "Single",
"Location" : "Close",
"Salary" : 63600,
"Children" : 0,
"History" : "High",
"Catalogs" : 6,
"AmountSpent" : 1318
}

Here, db refers to the current database. The collection name is specified after the dot, so db.marketing is the marketing collection of the current database.

MongoDB provides the aggregation pipeline for data analysis operations such as filtering, transforming, and grouping. For group by operations, we use the “$group” stage in the aggregation pipeline.

The first example calculates the average amount spent for each age group.

> db.marketing.aggregate([
... { $group: { _id: "$Age", avgSpent: { $avg: "$AmountSpent" }}}
... ])
{ "_id" : "Old", "avgSpent" : 1432.1268292682928 }
{ "_id" : "Middle", "avgSpent" : 1501.6909448818897 }
{ "_id" : "Young", "avgSpent" : 558.6236933797909 }
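
To make the grouping logic concrete, here is a minimal plain-JavaScript sketch of what a “$group” stage with “$avg” computes. It runs without a database. The helper name groupAverage and the sample documents are assumptions made up for illustration; they are not part of MongoDB or of the marketing dataset, and the real server implementation will differ.

```javascript
// Illustrative documents only; the values are invented for this sketch.
const docs = [
  { Age: "Middle", AmountSpent: 1318 },
  { Age: "Middle", AmountSpent: 1000 },
  { Age: "Young", AmountSpent: 500 },
  { Age: "Young", AmountSpent: 700 },
  { Age: "Old", AmountSpent: 1200 },
];

// groupAverage is a hypothetical helper, not a MongoDB API.
// It mimics { $group: { _id: "$<keyField>", avgSpent: { $avg: "$<valueField>" } } }.
function groupAverage(documents, keyField, valueField) {
  const totals = new Map(); // group key -> running { sum, count }
  for (const doc of documents) {
    const key = doc[keyField];
    const entry = totals.get(key) ?? { sum: 0, count: 0 };
    entry.sum += doc[valueField];
    entry.count += 1;
    totals.set(key, entry);
  }
  // Emit one output document per group, shaped like the $group result.
  return [...totals.entries()].map(([key, { sum, count }]) => ({
    _id: key,
    avgSpent: sum / count,
  }));
}

const result = groupAverage(docs, "Age", "AmountSpent");
console.log(result);
```

Each distinct value of the grouping field becomes the “_id” of one output document, which matches the shape of the shell output above.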

The fields (i.e. columns in a table) used for grouping are passed to the group stage through the “_id” field. Each aggregation is given a name of our choice (avgSpent above), and its value specifies the aggregation function and the field to be aggregated.
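
As a hedged sketch of how this extends to more than one grouping field, “_id” can be given a document of field references, and several named aggregations can sit side by side. The key names age and gender and the output name count are assumptions chosen for illustration. The operators “$avg” and “$sum” are standard accumulators, but the results for the marketing collection are not shown here because they were not part of the original example.

```javascript
// Sketch of a pipeline that groups on two fields at once.
const pipeline = [
  {
    $group: {
      // Compound grouping key: one output document per (Age, Gender) pair.
      _id: { age: "$Age", gender: "$Gender" },
      // Average of AmountSpent within each group.
      avgSpent: { $avg: "$AmountSpent" },
      // Adding 1 per document yields the group size.
      count: { $sum: 1 },
    },
  },
];

// In the mongo shell this would be run as: db.marketing.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```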

Filed Under: Artificial Intelligence
