Off the Beaten XPath

Web Scraping in Python

Thomas Laetsch

Data Scientist, NYU

(At)tribute

  • @ stands for "attribute"
    • @class
    • @id
    • @href

Brackets and Attributes

xpathattr.png

Brackets and Attributes

xpathattr_div_p1.png

xpath = '//p[@class="class-1"]'
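A minimal sketch of how this expression behaves, assuming the third-party lxml library is installed. The slides do not name a parsing library, and the HTML below is a hypothetical page invented for illustration, not the one in the slide image. The bracket filter keeps only `p` elements whose `class` attribute is exactly "class-1".

```python
from lxml import html  # third-party library; assumed to be installed

# Hypothetical page, invented for illustration
page = """
<html>
  <body>
    <div id="uid">
      <p class="class-1">first</p>
      <p class="class-2">second</p>
    </div>
    <div>
      <p class="class-1">third</p>
      <span class="class-1">not a paragraph</span>
    </div>
  </body>
</html>
"""

tree = html.fromstring(page)

# Select every p element, anywhere in the document, with class="class-1"
xpath = '//p[@class="class-1"]'
texts = [el.text for el in tree.xpath(xpath)]
print(texts)
```

The `span` shares the class but is skipped, because the expression names the `p` tag explicitly.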

Brackets and Attributes

xpathattr_div.png

xpath = '//*[@id="uid"]'
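As a rough illustration, the wildcard `*` lets the attribute filter do all the work: the element type no longer matters, only the `id`. The sketch assumes lxml is installed and uses a hypothetical two-div page.

```python
from lxml import html  # third-party library; assumed to be installed

# Hypothetical page, invented for illustration
page = """<html><body>
  <div id="uid"><p>inside</p></div>
  <div id="other"><p>elsewhere</p></div>
</body></html>"""

tree = html.fromstring(page)

# Select elements of any type whose id attribute equals "uid"
xpath = '//*[@id="uid"]'
matches = tree.xpath(xpath)
tags = [el.tag for el in matches]
print(tags)
```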

Brackets and Attributes

xpathattr_div_astc2.png

xpath = '//div[@id="uid"]/p[2]'
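The two kinds of brackets can be combined in one path: an attribute filter picks the `div`, then a positional index picks one of its children. A small sketch under the same assumptions (lxml installed, hypothetical HTML); note that XPath positions start at 1, not 0.

```python
from lxml import html  # third-party library; assumed to be installed

# Hypothetical page, invented for illustration
page = """<html><body>
  <div id="uid">
    <p class="class-1">one</p>
    <p class="class-2">two</p>
    <p class="class-1">three</p>
  </div>
  <div>
    <p>a</p>
    <p>b</p>
  </div>
</body></html>"""

tree = html.fromstring(page)

# Second p child of the div whose id is "uid"
xpath = '//div[@id="uid"]/p[2]'
second = [el.text for el in tree.xpath(xpath)]
print(second)
```

The second `div` also has a second paragraph, but it is excluded because it fails the `@id` filter.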

Content with Contains

XPath contains notation:

contains( @attribute-name, "string-expr" )

Contain This

xpath = '//*[contains(@class,"class-1")]'

ClassSelection-Xpath-contains.png

Contain This

xpath = '//*[@class="class-1"]'

ClassSelection-Xpath-eq.png
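The difference between the two expressions can be sketched as follows, again assuming lxml and a hypothetical page. `contains` does a plain substring test on the attribute value, so it also matches values such as "class-12", while `=` requires the whole attribute string to match.

```python
from lxml import html  # third-party library; assumed to be installed

# Hypothetical page, invented for illustration
page = """<html><body>
  <p class="class-1">exact</p>
  <p class="class-1 class-2">multi</p>
  <p class="class-12">substring</p>
  <p class="class-2">none</p>
</body></html>"""

tree = html.fromstring(page)

# Substring match: any class value that contains "class-1"
by_contains = [el.text for el in tree.xpath('//*[contains(@class,"class-1")]')]

# Exact match: the class value must be exactly "class-1"
by_equals = [el.text for el in tree.xpath('//*[@class="class-1"]')]

print(by_contains)
print(by_equals)
```

Which form is appropriate depends on the page: `contains` is more forgiving with multi-class elements but may pick up unintended near-matches.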

Get Classy

xpathattr_div_astc2.png

xpath = '/html/body/div/p[2]'

Get Classy

xpathattr_div_p2-class.png

xpath = '/html/body/div/p[2]/@class'
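Ending a path with `/@class` selects the attribute itself rather than the element that carries it. A minimal sketch, assuming lxml is installed and using a hypothetical page; with lxml, an attribute path returns the attribute values as strings.

```python
from lxml import html  # third-party library; assumed to be installed

# Hypothetical page, invented for illustration
page = """<html><body>
  <div>
    <p class="class-1">one</p>
    <p class="class-2">two</p>
  </div>
</body></html>"""

tree = html.fromstring(page)

# Path to the element: yields the p element itself
elements = tree.xpath('/html/body/div/p[2]')
element_tags = [el.tag for el in elements]

# Path to the attribute: yields the value of its class attribute
classes = tree.xpath('/html/body/div/p[2]/@class')

print(element_tags)
print(classes)
```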

End of the Path
