Introduction to NoSQL
Jake Roach
Data Engineer
Definition: A NoSQL data storage tool that stores data in a flexible, semi-structured format, made up of key-value, key-array, and key-object pairs (similar to JSON).
{
"title": "Python for Data Analysis",
"price": 53.99,
"topics": [
"Data Science",
"Data Analytics",
...
],
"author": {
"first": "William"
...
}
}
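To make the three kinds of pairs concrete, here is a minimal Python sketch that parses a document shaped like the one above with the standard-library json module. It includes only the fields that are actually shown (the elided entries are left out), and the variable names are illustrative.

```python
import json

# Only the fields visible in the example document; elided entries are omitted.
raw_document = """
{
  "title": "Python for Data Analysis",
  "price": 53.99,
  "topics": ["Data Science", "Data Analytics"],
  "author": {"first": "William"}
}
"""

document = json.loads(raw_document)

# Key-value pair: the value is a scalar (string or number).
title = document["title"]
price = document["price"]

# Key-array pair: the value is a list.
first_topic = document["topics"][0]

# Key-object pair: the value is a nested object, reached one key at a time.
author_first = document["author"]["first"]

print(title, price, first_topic, author_first)
```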
SELECT
books -> 'title' AS title,
books -> 'price' AS price
FROM data_science_resources
WHERE
books -> 'author' ->> 'last' = 'Viafore';
Resulting in the following output:
import sqlalchemy
# Create a connection string, and an engine
connection_string = "postgresql+psycopg2://<user>:<password>@<host>:<port>/<database>"
db_engine = sqlalchemy.create_engine(connection_string)
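The placeholders in the connection string have to be filled in with real values. As a hedged sketch, the string can be assembled with the standard library, escaping the user and password with urllib.parse.quote_plus so that characters such as @ or / do not break the URL. All credentials below are hypothetical, and build_connection_string is an illustrative helper, not part of SQLAlchemy.

```python
from urllib.parse import quote_plus


def build_connection_string(user, password, host, port, database):
    """Assemble a postgresql+psycopg2 URL, escaping the user and password."""
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical credentials, for illustration only.
connection_string = build_connection_string(
    user="analyst",
    password="p@ss/word",
    host="localhost",
    port=5432,
    database="example_db",
)
print(connection_string)
```

The resulting string can then be passed to sqlalchemy.create_engine in the same way as the placeholder version.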
To create a connection to a Postgres database, pass the connection string to sqlalchemy.create_engine. In the exercises, the db_engine variable will be created for you beforehand.
import pandas as pd
# Build the query
query = """
SELECT
books -> 'title' AS title,
books -> 'price' AS price
FROM data_science_resources;
"""
# Execute the query
result = pd.read_sql(query, db_engine)
print(result)
To write and execute a query, build the query string, then pass query and db_engine to the pd.read_sql() function.
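In the queries above, -> returns a JSON value while ->> returns text, which is why the WHERE clause ends with ->> before comparing against the string 'Viafore'. The following is a rough Python imitation of that difference, not how Postgres implements it. The helper names json_arrow and json_arrow_text are hypothetical, and the sample uses only fields from the example document.

```python
import json


def json_arrow(json_text, key):
    """Imitate ->: look up a key and return the result as JSON text."""
    return json.dumps(json.loads(json_text)[key])


def json_arrow_text(json_text, key):
    """Imitate ->>: look up a key and return the result as plain text."""
    value = json.loads(json_text)[key]
    return value if isinstance(value, str) else json.dumps(value)


books = '{"title": "Python for Data Analysis", "author": {"first": "William"}}'

# -> keeps the JSON quoting, so the string still carries its double quotes.
print(json_arrow(books, "title"))

# ->> strips the JSON quoting and yields text that can be compared directly.
print(json_arrow_text(books, "title"))

# Chaining: -> to reach the nested object, then ->> to get text out of it.
print(json_arrow_text(json_arrow(books, "author"), "first"))
```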