Input data

Introduction to TensorFlow in Python

Isaiah Hull

Visiting Associate Professor of Finance, BI Norwegian Business School

The slide shows a diagram of three different data types: text data, image data, and numeric data.

Importing data for use in TensorFlow

  • Data can be imported using TensorFlow
    • Useful for managing complex pipelines
    • Not necessary for this chapter
  • Simpler option used in this chapter
    • Import data using pandas
    • Convert data to a NumPy array
    • Use in TensorFlow without modification

How to import and convert data

# Import numpy and pandas
import numpy as np
import pandas as pd

# Load data from csv
housing = pd.read_csv('kc_housing.csv')

# Convert to numpy array
housing = np.array(housing)
  • We will focus on data stored in csv format in this chapter
  • Pandas also has methods for handling data in other formats
    • E.g. read_json(), read_html(), read_excel()
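
As a rough illustration of the other readers, the sketch below loads JSON with read_json() and then follows the same convert-to-array step. The two records are made-up stand-ins for housing data, not rows from kc_housing.csv, and the in-memory buffer replaces a real file only so the example is self-contained.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical JSON records standing in for a real file on disk
json_text = '[{"price": 221900.5, "bedrooms": 3}, {"price": 538000.25, "bedrooms": 4}]'

# read_json() returns a DataFrame, just like read_csv()
housing = pd.read_json(io.StringIO(json_text))

# The conversion step is the same regardless of the source format
housing_array = np.array(housing)

print(list(housing.columns))
print(housing_array.shape)
```

Whichever reader is used, the result is a DataFrame, so the rest of the workflow does not change.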

Parameters of read_csv()

Parameter            Description                                      Default
filepath_or_buffer   Accepts a file path or a URL.                    (required)
sep                  Delimiter between columns.                       ,
delim_whitespace     Boolean for whether whitespace is the delimiter. False
encoding             Encoding to use when reading the file, if any.   None
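
A small sketch of how these parameters might be used. The file contents are hypothetical and held in memory so the example runs on its own. Recent pandas releases deprecate delim_whitespace in favour of the regular-expression separator sep=r'\s+', so that form is shown here as the whitespace option.

```python
import io

import pandas as pd

# Hypothetical in-memory files standing in for real CSVs on disk
semicolon_text = "price;bedrooms\n221900.0;3\n538000.0;4\n"
whitespace_text = "price bedrooms\n221900.0 3\n538000.0    4\n"
latin1_bytes = "city;price\nMalmö;221900.0\n".encode("latin-1")

# sep: columns are separated by semicolons instead of commas
by_semicolon = pd.read_csv(io.StringIO(semicolon_text), sep=';')

# sep as a regular expression: any run of whitespace separates columns
by_whitespace = pd.read_csv(io.StringIO(whitespace_text), sep=r'\s+')

# encoding: decode raw bytes that are not stored as UTF-8
by_encoding = pd.read_csv(io.BytesIO(latin1_bytes), sep=';', encoding='latin-1')

print(by_semicolon)
print(by_whitespace)
print(by_encoding)
```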

Using mixed type datasets

This image shows data taken from the King County housing dataset with the house price column highlighted.

This image shows data taken from the King County housing dataset with the waterfront column highlighted.
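
To see why mixed types matter, the sketch below converts a whole DataFrame to one array. The three rows are invented for illustration, and only the column names price and waterfront come from the slides. NumPy picks a single dtype for the entire array, so the 0/1 waterfront flags end up stored as floats.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the real dataset has many more rows and columns
housing = pd.DataFrame({
    'price': [221900.0, 538000.0, 180000.0],
    'waterfront': [0, 0, 1],
})

# Each column has its own dtype inside the DataFrame
print(housing.dtypes)

# Converting the whole frame forces one shared dtype on every column
combined = np.array(housing)
print(combined.dtype)  # float64
```

Converting columns one at a time, each with an explicit data type, avoids this silent promotion.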

Setting the data type

# Load KC dataset
housing = pd.read_csv('kc_housing.csv')

# Convert price column to float32
price = np.array(housing['price'], np.float32)

# Convert waterfront column to Boolean
# (np.bool_ is used because the plain np.bool alias was removed in NumPy 1.24)
waterfront = np.array(housing['waterfront'], np.bool_)
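
A quick check of what the conversion produces, using a tiny made-up DataFrame in place of kc_housing.csv. It also shows one possible use of the Boolean array, as a mask that selects waterfront prices.

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for the King County data
housing = pd.DataFrame({
    'price': [221900.0, 538000.0, 180000.0],
    'waterfront': [0, 0, 1],
})

# Per-column conversion with an explicit dtype
price = np.array(housing['price'], np.float32)
waterfront = np.array(housing['waterfront'], np.bool_)

print(price.dtype, waterfront.dtype)  # float32 bool
print(waterfront.tolist())            # [False, False, True]

# A Boolean array can index another array of the same length
waterfront_prices = price[waterfront]
print(waterfront_prices)
```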

Setting the data type

# Import pandas and tensorflow
import pandas as pd
import tensorflow as tf

# Load KC dataset
housing = pd.read_csv('kc_housing.csv')

# Convert price column to float32
price = tf.cast(housing['price'], tf.float32)

# Convert waterfront column to Boolean
waterfront = tf.cast(housing['waterfront'], tf.bool)
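
The same idea with tf.cast(), sketched on invented rows instead of kc_housing.csv and assuming TensorFlow 2.x with eager execution. Here the columns are passed as NumPy arrays via to_numpy(), a cautious choice for this example and not a requirement stated in the slides. The results are tensors, not NumPy arrays.

```python
import pandas as pd
import tensorflow as tf

# Hypothetical rows standing in for the King County data
housing = pd.DataFrame({
    'price': [221900.0, 538000.0, 180000.0],
    'waterfront': [0, 0, 1],
})

# tf.cast() converts its input to a tensor of the requested dtype
price = tf.cast(housing['price'].to_numpy(), tf.float32)
waterfront = tf.cast(housing['waterfront'].to_numpy(), tf.bool)

print(price.dtype, waterfront.dtype)

# The Boolean tensor can act as a mask over the price tensor
waterfront_prices = tf.boolean_mask(price, waterfront)
print(waterfront_prices.numpy())
```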

Let's practice!
