Databricks with the Python SDK
Avi Steinberg
Senior Software Engineer


Install Python package dependencies:
pip install --upgrade databricks-langchain langchain-community langchain \
    databricks-sql-connector databricks-sqlalchemy
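Before moving on, it can help to confirm that the install actually landed in the active environment. The following is a minimal sketch using only the standard library; the helper name installed_versions is made up for this illustration, and the package list simply mirrors the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the pip command above.
REQUIRED = [
    "databricks-langchain",
    "langchain-community",
    "langchain",
    "databricks-sql-connector",
    "databricks-sqlalchemy",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any line reporting NOT INSTALLED suggests the pip command ran against a different interpreter or virtual environment than the one running the notebook.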
Import libraries:
import os
from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.sql_database import SQLDatabase
from databricks_langchain import ChatDatabricks

Export environment variables:
os.environ["DATABRICKS_TOKEN"] = "<Your-Access-Token>"
os.environ["DATABRICKS_HOST"] = "<your-workspace-id>.cloud.databricks.com"
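Hardcoding a token in a notebook is easy to leak, and a forgotten placeholder only surfaces later as an authentication error. One possible guard, sketched below, reads the variables back and fails early. The helper name require_env is hypothetical and not part of any Databricks or LangChain API.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, failing early if it is unusable."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set")
    # Values such as "<Your-Access-Token>" are template placeholders, not real settings.
    if "<" in value or ">" in value:
        raise RuntimeError(f"{name} still contains a placeholder value")
    return value


# Typical use before creating the database connection:
# token = require_env("DATABRICKS_TOKEN")
# host = require_env("DATABRICKS_HOST")
```

In practice the variables would usually be set outside the notebook (for example in the shell or a secret store) so that the token never appears in source.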
Create the agent. Set verbose=True to have the AI agent show its work:
warehouse_id = "<your-warehouse-id>"  # ID of the SQL warehouse that runs the queries
db = SQLDatabase.from_databricks(
    catalog="samples",
    schema="nyctaxi",
    warehouse_id=warehouse_id)
llm = ChatDatabricks(
    endpoint="databricks-meta-llama-3-3-70b-instruct",
    temperature=0.1,
    max_tokens=100)
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)
# Query the Databricks SQL Agent
result = agent.run("What's the time and distance of the longest trip?")
print(result)
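To make the agent's job concrete: it translates the question into SQL, runs it against the warehouse, and summarizes the rows. The sketch below imitates only the SQL step, locally, with the standard-library sqlite3 module. The three rows are made-up toy data, the column names are assumed to resemble the samples.nyctaxi trips table and may differ, and the query is one plausible translation rather than what the agent will necessarily generate.

```python
import sqlite3

# In-memory stand-in for the trips table; column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips ("
    "tpep_pickup_datetime TEXT, tpep_dropoff_datetime TEXT, trip_distance REAL)"
)
# Toy rows invented for this illustration only.
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [
        ("2024-01-01 10:00:00", "2024-01-01 10:24:00", 4.94),
        ("2024-01-01 11:00:00", "2024-01-01 11:02:00", 0.28),
        ("2024-01-01 12:00:00", "2024-01-01 12:04:00", 0.70),
    ],
)

# "Longest" is read here as greatest distance; the duration is derived from timestamps.
query = """
SELECT trip_distance,
       ROUND((julianday(tpep_dropoff_datetime)
              - julianday(tpep_pickup_datetime)) * 24 * 60, 1) AS duration_minutes
FROM trips
ORDER BY trip_distance DESC
LIMIT 1
"""
longest_distance, longest_minutes = conn.execute(query).fetchone()
print(f"Longest trip: {longest_distance} miles in {longest_minutes} minutes")
```

The question is ambiguous between longest by distance and longest by duration, so with verbose=True it is worth checking which interpretation the agent's generated SQL actually uses.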
The same steps one at a time. Connect to the sample data through a SQL warehouse:
db = SQLDatabase.from_databricks(
    catalog="samples",
    schema="nyctaxi",
    warehouse_id="<your-warehouse-id>"
)
Configure the chat model:
llm = ChatDatabricks(
    endpoint="databricks-meta-llama-3-3-70b-instruct",
    temperature=0.1,
    max_tokens=100,
)
temperature: a float between 0 and 1 that controls the randomness of model responses
max_tokens: the maximum number of tokens in the response
Create the toolkit and agent, then run the query:
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)
result = agent.run("What's the time and distance of the longest trip?")
display(result)
The average trip takes approximately 15 minutes.
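For intuition on the temperature setting used above, here is a small sketch of the common softmax-with-temperature scheme over made-up token scores. It illustrates the general idea only. How the serving endpoint applies temperature internally is an assumption, and the function name and toy logits are invented for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
low = softmax_with_temperature(toy_logits, 0.1)
high = softmax_with_temperature(toy_logits, 1.0)
print("temperature=0.1:", [round(p, 4) for p in low])
print("temperature=1.0:", [round(p, 4) for p in high])
```

At a low temperature almost all probability goes to the top-scoring token, which is why a value like 0.1 suits SQL generation, where repeatable output matters more than variety.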