Leveraging Classes

Software Engineering Principles in Python

Adam Spannbauer

Machine Learning Engineer at Eastman

Extending Document class

class Document:
    def __init__(self, text):
        self.text = text

Tokenize Document

Software Engineering Principles in Python

Current document class

class Document:
    def __init__(self, text):
        self.text = text

Tokenize Document init

Software Engineering Principles in Python

Revising __init__

class Document:
    def __init__(self, text):
        self.text = text
        self.tokens = self._tokenize()

doc = Document('test doc') print(doc.tokens)
['test', 'doc']
Software Engineering Principles in Python

Adding _tokenize() method

# Import function to perform tokenization
from .token_utils import tokenize

class Document: def __init__(self, text, token_regex=r'[a-zA-Z]+'): self.text = text self.tokens = self._tokenize()
def _tokenize(self): return tokenize(self.text)
Software Engineering Principles in Python

Non-public methods

Tokenize Document init

Private Property Sign

Software Engineering Principles in Python

The risks of non-public methods

  • Lack of documentation

  • Unpredictability

Private Property Sign

Software Engineering Principles in Python

Let's Practice

Software Engineering Principles in Python

Preparing Video For Download...