Wednesday, 19 March 2014

Installing RSelenium on Win 8.1

I was asked whether it is difficult to install RSelenium and get it up and running on the Windows platform.
It isn't, and I made a quick Screenr screencast to show you how.


So in summary:



install.packages('RSelenium')
library(RSelenium)
checkForServer()   # download the Selenium server jar if it is missing
startServer()      # start the Selenium server
remDr <- remoteDriver()
remDr$open()
remDr$navigate("http://www.google.com/ncr")
# keep the element that findElement() returns so we can work with it
webElem <- remDr$findElement(using = "name", value = "q")
webElem$highlightElement()
webElem$sendKeysToElement(list("R Cran"))
webElem$clearElement()
webElem$sendKeysToElement(list("R Cran", key = "enter"))
remDr$close()
remDr$closeServer()
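
If remDr$open() fails straight after startServer(), the server may simply not have finished starting yet. Here is a minimal sketch of a wait loop in base R. It assumes the server listens on localhost at the default Selenium port 4444; wait_for_port is a hypothetical helper name, not part of RSelenium, and an open port only suggests (it does not prove) that the server is ready.

```r
# Hypothetical helper (not part of RSelenium): poll until something accepts
# TCP connections on host:port, or give up after `timeout` seconds.
wait_for_port <- function(host = "localhost", port = 4444, timeout = 30) {
  deadline <- Sys.time() + timeout
  repeat {
    # socketConnection() raises an error (plus a warning) if nothing listens
    con <- suppressWarnings(tryCatch(
      socketConnection(host, port, open = "r+", blocking = TRUE, timeout = 1),
      error = function(e) NULL
    ))
    if (!is.null(con)) {
      close(con)
      return(TRUE)                             # the port is open
    }
    if (Sys.time() >= deadline) return(FALSE)  # timed out
    Sys.sleep(0.5)
  }
}
```

You would call wait_for_port() between startServer() and remDr$open() and only continue if it returns TRUE.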

13 comments:

  1. When I call startServer(), it opens a Java window, but it is empty.
    What do I have to do?

    Replies
    1. Hi darannu. Yes, the latest version of RSelenium writes the log to a file in the /bin directory of the RSelenium package (see https://github.com/johndharrison/RSelenium/blob/master/R/util.R#L61), so the Java window appears blank but the server is still working.
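
      To check whether anything is being written, you can list what is in that directory. A minimal sketch, assuming RSelenium is installed in the current library; selenium_bin_files is a hypothetical helper name, and the exact log file name is not assumed here:

      ```r
      # Hypothetical helper: list files in the bin directory of the installed
      # RSelenium package, newest first, so a recently written log is easy to spot.
      selenium_bin_files <- function() {
        bin_dir <- system.file("bin", package = "RSelenium")
        if (!nzchar(bin_dir)) return(character(0))  # package or directory not found
        files <- list.files(bin_dir, full.names = TRUE)
        if (length(files) == 0) return(files)
        files[order(file.info(files)$mtime, decreasing = TRUE)]
      }

      selenium_bin_files()
      ```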

    2. I can't connect phantomjs to R.
      I type:
      remDr <- remoteDriver(browserName = "phantomjs")
      remDr$open()
      but it shows:
      info: driver.version: unknown
      I also tried:
      remDr <- remoteDriver(browserName = "C:/Users/annunzid/Desktop/phantomjs-1.9.7-windows/phantomjs.exe")
      but with the same result.
      What do I have to do?

    3. Hi darannu,

      phantomjs needs to be on your PATH, so you would add C:/Users/annunzid/Desktop/phantomjs-1.9.7-windows/ to the PATH environment variable. See the explanation at http://stackoverflow.com/questions/9546324/adding-directory-to-path-environment-variable-in-windows.

      Alternatively you can give the path to the phantomjs binary as an argument. In your case that would be:

      remDr <- remoteDriver(browserName = "phantomjs", extraCapabilities = list(phantomjs.binary.path = "C:/Users/annunzid/Desktop/phantomjs-1.9.7-windows/phantomjs.exe"))
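
      If you would rather not change the system-wide setting, the first option can also be applied to the current R session only. A minimal sketch in base R; add_to_path is a hypothetical helper name, and the directory is the one from your example. It assumes you call it before startServer(), so that a server launched from this R session inherits the changed PATH:

      ```r
      # Hypothetical helper: prepend a directory to PATH for this R session only.
      add_to_path <- function(dir) {
        sep <- .Platform$path.sep                 # ";" on Windows, ":" elsewhere
        parts <- strsplit(Sys.getenv("PATH"), sep, fixed = TRUE)[[1]]
        if (!(dir %in% parts)) {
          Sys.setenv(PATH = paste(c(dir, parts), collapse = sep))
        }
        invisible(Sys.getenv("PATH"))
      }

      add_to_path("C:/Users/annunzid/Desktop/phantomjs-1.9.7-windows")
      Sys.which("phantomjs")  # non-empty if the binary can now be found
      ```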

    4. John, sorry, but I have another question.

      I'm trying to scrape flight ticket data using R and PhantomJS. I have seen your code and it works for this webpage:
      http://www.finn.no/reise/flybilletter/resultat?numberOfChildren=0&tripType=roundtrip&requestedDestination=PEK.AIRPORT&requestedReturnDate=15.09.2014&requestedOrigin=OSL.AIRPORT&requestedDepartureDate=01.09.2014&numberOfAdults=1

      but why doesn't the same code work for this one:

      http://www.skyscanner.it/trasporti/voli/rome/it/voli-piu-economici-da-roma-per-italia.html?rtn=1&oym=1405&iym=1405

      I'm not able to parse the HTML.

      Thanks in advance

    5. Hi Darannu,

      This site is checking the user agent. You can pass extra phantomjs settings using
      phantomjs.page.settings.SETTING = VALUE, where SETTING is taken from https://github.com/ariya/phantomjs/wiki/API-Reference-WebPage#webpage-settings. In this case we can set a user agent so the site sees us as Firefox 29:

      library(RSelenium)
      library(XML)  # provides htmlParse() and readHTMLTable()
      appURL <- "http://www.skyscanner.it/trasporti/voli/rome/it/voli-piu-economici-da-roma-per-italia.html?rtn=1&oym=1405&iym=1405"
      # present ourselves as Firefox 29 via the PhantomJS userAgent setting
      addCap <- list(phantomjs.page.settings.userAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20120101 Firefox/29.0")
      remDr <- remoteDriver(browserName = "phantomjs", extraCapabilities = addCap)
      remDr$open()
      remDr$navigate(appURL)
      tableElem <- remDr$findElement("id", "browse-data-table")
      xData <- tableElem$getElementAttribute("outerHTML")[[1]]
      xData <- htmlParse(xData, encoding = "UTF-8")
      readHTMLTable(xData)
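
      The same naming pattern should work for the other page settings as well. A minimal sketch, where phantomjs_settings is a hypothetical helper (not part of RSelenium) that only adds the prefix; the setting names themselves (here userAgent and loadImages) come from the PhantomJS page linked above:

      ```r
      # Hypothetical helper: turn settings such as userAgent = "..." into the
      # phantomjs.page.settings.* capability names described above.
      phantomjs_settings <- function(...) {
        settings <- list(...)
        if (length(settings) == 0) return(list())
        # every setting must be passed by name
        stopifnot(!is.null(names(settings)), all(nzchar(names(settings))))
        names(settings) <- paste0("phantomjs.page.settings.", names(settings))
        settings
      }

      addCap <- phantomjs_settings(
        userAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20120101 Firefox/29.0",
        loadImages = FALSE
      )
      names(addCap)
      # Pass the result on as before:
      # remDr <- remoteDriver(browserName = "phantomjs", extraCapabilities = addCap)
      ```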

    6. This comment has been removed by the author.

    7. This comment has been removed by the author.

  2. This comment has been removed by the author.

  3. Hi John,
    I am having trouble connecting to PhantomJS using RSelenium on Windows 7. The commands I typed, followed by the output:

    require(RSelenium)
    RSelenium::startServer()
    remDr <- remoteDriver(browserName = "phantomjs", extraCapabilities = list(phantomjs.binary.path = "C:/Users/home/Desktop/phantomjs-1.9.8-windows/phantomjs.exe"))
    remDr$open()

    $class
    [1] "org.openqa.selenium.UnsupportedCommandException"

    $additionalInformation
    [1] "\nDriver info: driver.version: unknown"

    Replies
    1. Hi Alex,

      Can you file an issue at https://github.com/ropensci/RSelenium? I'll look into it.

      Best Regards

      John Harrison

  4. There is a connection error.

    ```{r}
    #'devtools::install_github("ropensci/RSelenium")
    #'system('cmd cd Documents')
    #'system('cmd java -jar selenium-server-standalone-2.44.0.jar')

    library(RSelenium)
    library(rJava)
    #'RSelenium::checkForServer(update=TRUE)
    RSelenium::startServer()
    Sys.sleep(5)

    webDr <- remoteDriver(browserName='chrome')
    #'webDr <- remoteDriver(browserName="chrome", extraCapabilities = list(chrome.binary.path = "C:/Users/Scibrokes Trading/Documents/chromedriver.exe"))
    #'webDr <- remoteDriver(browserName='firefox')
    webDr$open(silent=TRUE)
    webDr$navigate('https://www.google.com/webhp')
    ```

    Thanks in advance.
