Web Scraping in R
Timo Grossenbacher
Instructor
...
<ol>
<li>First element.</li>
<li>Second element.</li>
<li>Third element.</li>
<li>Fourth element.</li>
<li>Fifth element.</li>
</ol>
...
html %>%
  html_elements(xpath = '//ol/li[position() = 2]')
# Equivalent CSS selector:
# ol > li:nth-child(2)
{xml_nodeset (1)}
[1] <li>Second element.</li>
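The snippets on these slides assume an object `html` that already holds the parsed page. A minimal sketch of how it might be built, assuming the rvest package and its `minimal_html()` helper (the names `second_xpath` and `second_css` are illustrative, not from the course); it also checks that the XPath and the CSS selector pick the same node:

```r
library(rvest)

# Assumption: a small stand-in document matching the list on the slide
html <- minimal_html("
<ol>
  <li>First element.</li>
  <li>Second element.</li>
  <li>Third element.</li>
  <li>Fourth element.</li>
  <li>Fifth element.</li>
</ol>")

# Select the second <li> by its position among its siblings
second_xpath <- html %>%
  html_elements(xpath = "//ol/li[position() = 2]")

# The CSS selector from the slide, for comparison
second_css <- html %>%
  html_elements(css = "ol > li:nth-child(2)")

html_text(second_xpath)
html_text(second_css)
```

Note that the two selectors agree here only because every child of the `<ol>` is an `<li>`: `position()` counts among the selected `li` nodes, while `:nth-child()` counts among all child elements.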
...
<ol>
<li>First element.</li>
<li>Second element.</li>
<li>Third element.</li>
<li>Fourth element.</li>
<li>Fifth element.</li>
</ol>
...
html %>%
  html_elements(xpath = '//ol/li[position() < 3]')
{xml_nodeset (2)}
[1] <li>First element.</li>
[2] <li>Second element.</li>
...
<ol>
<li>First element.</li>
<li>Second element.</li>
<li>Third element.</li>
<li>Fourth element.</li>
<li>Fifth element.</li>
</ol>
...
html %>%
  html_elements(xpath = '//ol/li[position() != 3]')
{xml_nodeset (4)}
[1] <li>First element.</li>
[2] <li>Second element.</li>
[3] <li>Fourth element.</li>
[4] <li>Fifth element.</li>
...
<ol>
<li class = 'blue'>First element.</li>
<li>Second element.</li>
<li class = 'blue'>Third element.</li>
<li>Fourth element.</li>
<li class = 'blue'>Fifth element.</li>
</ol>
...
html %>%
  html_elements(xpath = '//ol/li[position() != 3 and @class = "blue"]')
{xml_nodeset (2)}
[1] <li class="blue">First element.</li>
[2] <li class="blue">Fifth element.</li>
html %>%
  html_elements(xpath = '//ol/li[position() != 3 or @class = "blue"]')
{xml_nodeset (5)}
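With `or`, a node passes when at least one condition holds. The only item at position 3 also carries the class `blue`, so all five items match. A minimal sketch comparing the two predicates, assuming rvest's `minimal_html()` to rebuild the list (the names `both` and `either` are illustrative):

```r
library(rvest)

# Assumption: stand-in document matching the list on the slide
html <- minimal_html('
<ol>
  <li class="blue">First element.</li>
  <li>Second element.</li>
  <li class="blue">Third element.</li>
  <li>Fourth element.</li>
  <li class="blue">Fifth element.</li>
</ol>')

# and: both conditions must hold (not third AND blue)
both <- html %>%
  html_elements(xpath = '//ol/li[position() != 3 and @class = "blue"]')

# or: one condition is enough (not third OR blue)
either <- html %>%
  html_elements(xpath = '//ol/li[position() != 3 or @class = "blue"]')

length(both)
length(either)
html_text(both)
```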
...
...
<ol>
<li class = 'blue'>First element.</li>
<li>Second element.</li>
<li class = 'blue'>Third element.</li>
</ol>
<ol>
<li class = 'red'>First element.</li>
<li>Second element.</li>
</ol>
...
html %>%
html_elements(xpath = '//ol[count(li) = 2]')
{xml_nodeset (1)}
[1] <ol>\n<li class="red">...
html %>%
html_elements(xpath = '//ol[count(li) > 2]')
{xml_nodeset (1)}
[1] <ol>\n<li class="blue">...
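`count(li)` is evaluated once per `<ol>` and counts its direct `li` children, so the predicate filters parent nodes by how many children they have. A minimal sketch, assuming rvest's `minimal_html()` for the two lists (the names `short_list` and `long_list` are illustrative), that selects each list and then reads its items:

```r
library(rvest)

# Assumption: stand-in document with the two lists from the slide
html <- minimal_html('
<ol>
  <li class="blue">First element.</li>
  <li>Second element.</li>
  <li class="blue">Third element.</li>
</ol>
<ol>
  <li class="red">First element.</li>
  <li>Second element.</li>
</ol>')

# Lists with exactly two <li> children
short_list <- html %>%
  html_elements(xpath = "//ol[count(li) = 2]")

# Lists with more than two <li> children
long_list <- html %>%
  html_elements(xpath = "//ol[count(li) > 2]")

# Read the items of each selected list
short_list %>% html_elements("li") %>% html_text()
long_list %>% html_elements("li") %>% html_attr("class")
```

`html_attr()` returns `NA` for items without a `class` attribute, which makes it easy to see which list was selected.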