Distinct As You Like It: Filtering with Regular Expressions

Introduction to MongoDB in Python

Donny Winston

Instructor

Finding a substring with $regex

db.laureates.find_one({"firstname": "Marie"})
{'born': '1867-11-07',
 'bornCity': 'Warsaw',
 'bornCountry': 'Russian Empire (now Poland)',
 'firstname': 'Marie',
 'surname': 'Curie, née Sklodowska',
 ...}
db.laureates.distinct("bornCountry", 
                      {"bornCountry": {"$regex": "Poland"}})
['Russian Empire (now Poland)',
 'Prussia (now Poland)',
 'Germany (now Poland)',
 'Austria-Hungary (now Poland)',
 'German-occupied Poland (now Poland)',
 'Poland',
 'Poland (now Ukraine)',
 'Poland (now Lithuania)',
 'Poland (now Belarus)',
 'Free City of Danzig (now Poland)']
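
The query above matches "Poland" anywhere in the string, because an unanchored $regex behaves like a substring search. A minimal local sketch of that behavior, assuming Python's re.search is a fair stand-in for the server-side match on a pattern this simple (MongoDB evaluates $regex itself, using Perl-compatible regular expressions). The helper name substring_matches and the two non-matching values are illustrative and are not taken from the output above:

```python
import re

# A few values copied from the distinct() output above, plus two
# illustrative non-matching values that do not appear in that output.
born_countries = [
    "Russian Empire (now Poland)",
    "Poland",
    "Poland (now Ukraine)",
    "Germany",  # illustrative only
    "France",   # illustrative only
]


def substring_matches(values, pattern):
    """Return the values in which `pattern` matches anywhere in the string."""
    return [v for v in values if re.search(pattern, v)]


matches = substring_matches(born_countries, "Poland")
print(matches)
```

Only the strings that contain "Poland" somewhere survive the filter, whether the word appears at the start, the end, or inside parentheses.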

Flag options for regular expressions

case_sensitive = db.laureates.distinct(
    "bornCountry",
    {"bornCountry": {"$regex": "Poland"}})
case_insensitive = db.laureates.distinct(
    "bornCountry",
    {"bornCountry": {"$regex": "poland", "$options": "i"}})

assert set(case_sensitive) == set(case_insensitive)

from bson.regex import Regex

db.laureates.distinct("bornCountry",
                      {"bornCountry": Regex("poland", "i")})
['Russian Empire (now Poland)', ...]
import re

db.laureates.distinct("bornCountry",
                      {"bornCountry": re.compile("poland", re.I)})
['Russian Empire (now Poland)', ...]
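
The effect of the "i" option can be checked locally without a database. This is a sketch using Python's re module on three values from the earlier output, assuming re.I corresponds to the "i" option for a plain pattern like this one; it does not query MongoDB:

```python
import re

# Three values taken from the earlier distinct() output.
values = ["Poland", "Poland (now Ukraine)", "Prussia (now Poland)"]

# Lowercase pattern without a flag: no value contains "poland" in lowercase.
case_sensitive = [v for v in values if re.search("poland", v)]

# Same pattern with re.I (ignore case): every value matches.
case_insensitive = [v for v in values if re.search("poland", v, re.I)]

print(case_sensitive)
print(case_insensitive)
```

Without the flag the lowercase pattern finds nothing, which is why the queries above pair "poland" with a case-insensitive option.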

Beginning and ending (and escaping)

from bson.regex import Regex

db.laureates.distinct("bornCountry", 
                      {"bornCountry": Regex("^Poland")})
['Poland',
 'Poland (now Ukraine)',
 'Poland (now Lithuania)',
 'Poland (now Belarus)']
db.laureates.distinct(
    "bornCountry",
    {"bornCountry": Regex(r"^Poland \(now")})
['Poland (now Ukraine)', 
 'Poland (now Lithuania)', 
 'Poland (now Belarus)']
db.laureates.distinct(
    "bornCountry",
    {"bornCountry": Regex(r"now Poland\)$")})
['Russian Empire (now Poland)',
 'Prussia (now Poland)',
 'Germany (now Poland)',
 'Austria-Hungary (now Poland)',
 'German-occupied Poland (now Poland)',
 'Free City of Danzig (now Poland)']
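
The anchors and the escaped parentheses can be tried out locally as well. This is a sketch with Python's re module on three values from the outputs above, assuming these patterns behave the same way in the server-side engine. The re.escape call is an addition that does not appear in the queries above: it builds the escaped pattern from a literal string, following Python's escaping rules.

```python
import re

# Three values taken from the outputs above.
values = [
    "Poland",
    "Poland (now Ukraine)",
    "Russian Empire (now Poland)",
]

# "^" anchors the match to the beginning of the string.
starts_with = [v for v in values if re.search(r"^Poland", v)]

# "\(" matches a literal parenthesis; the raw string keeps the backslash intact.
starts_with_paren = [v for v in values if re.search(r"^Poland \(now", v)]

# "$" anchors the match to the end of the string.
ends_with = [v for v in values if re.search(r"now Poland\)$", v)]

# re.escape escapes special characters in a literal for Python's re syntax.
escaped_prefix = "^" + re.escape("Poland (now")
via_escape = [v for v in values if re.search(escaped_prefix, v)]

print(starts_with, starts_with_paren, ends_with, via_escape)
```

Writing patterns as raw strings (r"...") avoids Python treating "\(" as an invalid string escape before the pattern ever reaches the regex engine.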

Let's practice!
