Off the Beaten XPath

Web Scraping in Python

Thomas Laetsch

Data Scientist, NYU

(At)tribute

  • @ represents "attribute"
    • @class
    • @id
    • @href

Brackets and Attributes

xpathattr.png


Brackets and Attributes

xpathattr_div_p1.png

xpath = '//p[@class="class-1"]'
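
The bracketed attribute test above can be tried out with a minimal sketch. It assumes Python's standard-library ElementTree rather than any particular scraping framework, and the sample markup is hypothetical (it is not the page shown in the slide image). ElementTree only accepts relative paths, so the leading `//` is written as `.//`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (not taken from the slides).
html = """
<html>
  <body>
    <div id="uid">
      <p class="class-1">First paragraph</p>
      <p class="class-2">Second paragraph</p>
    </div>
    <p class="class-1">Third paragraph</p>
  </body>
</html>
"""

root = ET.fromstring(html)

# Slide XPath: //p[@class="class-1"]
# ElementTree needs a relative path, hence the leading dot.
matches = root.findall('.//p[@class="class-1"]')

# Collect the text of every matching paragraph, in document order.
texts = [p.text for p in matches]
print(texts)
```

Only the p elements whose class attribute is exactly "class-1" are selected, wherever they sit in the document.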

Brackets and Attributes

xpathattr_div.png

xpath = '//*[@id="uid"]'

Brackets and Attributes

xpathattr_div_astc2.png

xpath = '//div[@id="uid"]/p[2]'
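
As a rough illustration of how an id test combines with a positional index, the following sketch again assumes the standard-library ElementTree and a hypothetical sample document. The wildcard query picks any element with id "uid"; the second query narrows that to the second p child of that div.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup (not taken from the slides).
html = """
<html>
  <body>
    <div id="uid">
      <p class="class-1">First paragraph</p>
      <p class="class-2">Second paragraph</p>
    </div>
    <div id="other">
      <p>Unrelated paragraph</p>
    </div>
  </body>
</html>
"""

root = ET.fromstring(html)

# Slide XPath: //*[@id="uid"]  (any element type with this id)
by_id = root.findall('.//*[@id="uid"]')
id_tags = [el.tag for el in by_id]

# Slide XPath: //div[@id="uid"]/p[2]  (second p child of that div)
second_p = root.findall('.//div[@id="uid"]/p[2]')
second_texts = [p.text for p in second_p]

print(id_tags, second_texts)
```

XPath positions start at 1, so p[2] is the second paragraph, not the third.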

Content with Contains

XPath contains notation:

contains( @attribute-name, "string-expr" )


Contain this

xpath = '//*[contains(@class,"class-1")]'

ClassSelection-Xpath-contains.png


Contain this

xpath = '//*[@class="class-1"]'

ClassSelection-Xpath-eq.png
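
The difference between the two expressions can be sketched as follows. One loud assumption: the standard-library ElementTree used here does not implement the XPath contains() function, so the contains case is emulated with a plain Python substring test on the class attribute. The sample markup is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup (not taken from the slides).
html = """
<html>
  <body>
    <p class="class-1">exact</p>
    <p class="class-1 class-2">two classes</p>
    <p class="class-12">longer name</p>
    <p class="class-2">no match</p>
  </body>
</html>
"""

root = ET.fromstring(html)

# Slide XPath: //*[@class="class-1"]
# Equality: the whole attribute value must be exactly "class-1".
exact = [el.text for el in root.findall('.//*[@class="class-1"]')]

# Slide XPath: //*[contains(@class,"class-1")]
# Emulated here with a substring test, since ElementTree has no contains().
contains = [
    el.text
    for el in root.iter()
    if "class-1" in el.get("class", "")
]

print(exact)
print(contains)
```

contains is a substring match, so it also picks up elements with several classes and elements such as "class-12" whose class name merely starts with the search string.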


Get Classy

xpathattr_div_astc2.png

xpath = '/html/body/div/p[2]'

Get Classy

xpathattr_div_p2-class.png

xpath = '/html/body/div/p[2]/@class'
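
Ending a path in /@class selects the attribute value rather than the element. A minimal sketch of the same idea, assuming the standard-library ElementTree and hypothetical sample markup: ElementTree cannot select attributes inside a path, so the elements are located first and the class attribute is then read with get(). Because the parsed root is already the html element, the absolute path is written relative to it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup (not taken from the slides).
html = """
<html>
  <body>
    <div>
      <p class="class-1">First</p>
      <p class="class-2">Second</p>
    </div>
    <div>
      <p class="class-1">Third</p>
      <p class="class-3">Fourth</p>
    </div>
  </body>
</html>
"""

root = ET.fromstring(html)  # root is the <html> element

# Slide XPath: /html/body/div/p[2]
# Relative to <html>, this is ./body/div/p[2].
elements = root.findall('./body/div/p[2]')

# Slide XPath: /html/body/div/p[2]/@class
# Emulated by reading the class attribute of each selected element.
classes = [el.get("class") for el in elements]

print(classes)
```

The path matches the second p child of every div under body, so the result holds one class value per matching element.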

End of the Path
