Crash Course X

Web Scraping in Python

Thomas Laetsch

Data Scientist, NYU

Another Slasher Video?

xpath = '/html/body/div[2]'

Simple XPath:

  • A single forward slash / moves forward one generation, from an element to its children.
  • Tag names between slashes specify which element(s) to move to.
  • Brackets [] after a tag name tell us which of the selected siblings to choose; counting starts at 1, so div[2] is the second div.
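
The rules above can be tried out with a minimal sketch. It assumes Python's standard-library xml.etree.ElementTree, which supports only a subset of XPath and evaluates paths relative to the root element, so '/html/body/div[2]' is written here as './body/div[2]' starting from the html root. The HTML snippet is hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed HTML used only for illustration.
html = """
<html>
  <body>
    <div><p>First div</p></div>
    <div><p>Second div</p></div>
    <div><p>Third div</p></div>
  </body>
</html>
"""

# fromstring returns the root element, here <html>.
root = ET.fromstring(html)

# Stand-in for '/html/body/div[2]': from <html>, step to <body>,
# then to the 2nd <div> child (positions start at 1).
second_div = root.find('./body/div[2]')
second_div_text = second_div.find('p').text

print(second_div_text)
```

A full XPath engine would accept the absolute path '/html/body/div[2]' directly; the relative form is only needed because of the ElementTree subset assumed here.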

[Figure: highlight_div.png]

Slasher Double Feature?

  • Select all table elements anywhere in the HTML document:
xpath = '//table'
  • Select all table elements that are descendants of the 2nd div child of the body element:
xpath = '/html/body/div[2]//table'
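
The difference between the two expressions can be sketched as follows, again assuming the standard-library xml.etree.ElementTree and its XPath subset, with paths written relative to the html root ('.//table' in place of '//table', './body/div[2]//table' in place of '/html/body/div[2]//table'). The HTML and the id values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed HTML used only for illustration.
html = """
<html>
  <body>
    <div><table id="t1"></table></div>
    <div>
      <table id="t2"></table>
      <div><table id="t3"></table></div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(html)

# Stand-in for '//table': every table at any depth in the document.
all_tables = [t.get('id') for t in root.findall('.//table')]

# Stand-in for '/html/body/div[2]//table': only tables that are
# descendants (at any depth) of the 2nd div child of body.
div2_tables = [t.get('id') for t in root.findall('./body/div[2]//table')]

print(all_tables)
print(div2_tables)
```

Note that the double slash reaches the table with id t3 even though it sits one level deeper, inside a nested div.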

Ex(path)celent