Non-tabular NoSQL data stores

Introduction to NoSQL

Jake Roach

Data Engineer

Document databases

Definition: A NoSQL data storage tool that stores data in a flexible, semi-structured format, made up of key-value, key-array, and key-object pairs (similar to JSON).

$$

$$

Postgres JSON logo.

{
    "title": "Python for Data Analysis",
    "price": 53.99,
    "topics": [
        "Data Science",
        "Data Analytics",
        ...
    ],
    "author": {
        "first": "William"
        ...
    }
}
Introduction to NoSQL

Querying JSON data with Postgres JSON

Highlighted JSON data stored in a Postgres table.

SELECT
    books -> 'title' AS title,
    books -> 'price' AS price
FROM data_science_resources
WHERE 
    books -> 'author' ->> 'last' = 'Viafore';

Resulting in the following output:

Output of Postgres JSON query.

Introduction to NoSQL

Connecting to a Postgres database

import sqlalchemy

# Create a connection string, and an engine
connection_string = "postgresql+psycopg2://<user>:<password>@<host>:<port>/<database>"
db_engine = sqlalchemy.create_engine(connection_string)

To create a connection to a Postgres database:

  • Form a connection string
  • Create an engine using sqlalchemy.create_engine
  • db_engine variable will be created, pre-exercise
Introduction to NoSQL

Writing and executing Postgres JSON queries

import pandas as pd

# Build the query
query = """
    SELECT
        books -> 'title' AS title,
        books -> 'price' AS price
    FROM data_science_resources;
"""

# Execute the query
result = pd.read_sql(query, db_engine)
print(result)

To write and execute a query:

  • Build a query in a string
  • Pass query and db_engine to the pd.read_sql() function
  • Print the resulting DataFrame
Introduction to NoSQL

Other non-tabular NoSQL data stores

Key-value
{
    "name": "Jane Doe",
    "age": 25,
    "email": "[email protected]"
}

$$

Redis logo.

Graph

Image of a graph database.

Introduction to NoSQL

Let's practice!

Introduction to NoSQL

Preparing Video For Download...