Web Scraping in R
Timo Grossenbacher
Instructor
A request is sent to the web server
Typical status codes: 200 (OK), 404 (NOT FOUND), 3xx (redirects), 5xx (server errors)
A response is received from the web server
GET /index.html
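The status code classes listed above can be told apart by their leading digit. The following is a minimal sketch in base R; `describe_status()` is a hypothetical helper name introduced only for this illustration, not part of httr.

```r
# Hypothetical helper: map an HTTP status code to its broad class.
describe_status <- function(code) {
  if (code >= 200 && code < 300) {
    'success'        # e.g. 200 (OK)
  } else if (code >= 300 && code < 400) {
    'redirect'       # 3xx
  } else if (code >= 400 && code < 500) {
    'client error'   # e.g. 404 (NOT FOUND)
  } else if (code >= 500 && code < 600) {
    'server error'   # 5xx
  } else {
    'other'
  }
}

describe_status(200)
describe_status(404)
```

For a response object returned by httr, `status_code(response)` yields the numeric code that could be passed to such a helper.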
POST /test HTTP/1.1
Host: foo.example
Content-Type: application/x-www-form-urlencoded
Content-Length: 27
field1=value1&field2=value2
POST requests are also answered with a response!
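To see where `Content-Length: 27` in the raw request comes from, the form body can be rebuilt by hand. This is a sketch in base R that assumes the field names and values contain no characters that would need escaping; the variable names are illustrative.

```r
# Form fields taken from the raw POST request shown above
fields <- c(field1 = 'value1', field2 = 'value2')

# Join each name/value pair with '=' and the pairs with '&'
# (assumes nothing needs URL-escaping)
body <- paste(paste0(names(fields), '=', fields), collapse = '&')
body

# Content-Length is the size of the body in bytes
nchar(body, type = 'bytes')
```

With httr, a comparable request could be sent with `POST(url, body = list(field1 = 'value1', field2 = 'value2'), encode = 'form')`, which takes care of the encoding; the target `url` is left open here.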
library(httr)
# Send a GET request; printing the response shows its metadata and the start of the body
GET('https://httpbin.org')
Response [https://httpbin.org/]
Date: 2020-09-19 13:02
Status: 200
Content-Type: text/html; charset=utf-8
Size: 9.59 kB
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
...
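library(httr)
# Store the response so it can be inspected further
response <- GET('https://httpbin.org')
# Extract the body; for an HTML response, content() returns a parsed document
content(response)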
{html_document}
<html lang="en">
[1] <head>\n<meta http-equiv="Content-Type" content="text/html; charset=UTF ...
[2] <body>\n <a href="https://github.com/requests/httpbin" class="github ...
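Beyond the parsed document, the same response object also carries the status code and content type shown in the printed summary. The sketch below assumes network access and that httpbin.org is reachable; `raw_html` is an illustrative variable name.

```r
library(httr)

# Same request as above (requires network access)
response <- GET('https://httpbin.org')

# Numeric status code of the response
status_code(response)

# Media type taken from the Content-Type header
http_type(response)

# Body as an unparsed character string instead of a parsed document
raw_html <- content(response, as = 'text', encoding = 'UTF-8')
substr(raw_html, 1, 15)
```

Requesting the body with `as = 'text'` can be useful when the text should be saved or handed to a different parser.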