3

After looking around, it seems that if you log in to a website through Scrapy, the authenticated session doesn't carry over when you use Selenium within the spider. Is there a way to transfer that session to Selenium, or would I have to log in to the website all over again with Selenium?

Thanks!

John Doe

2 Answers

1

The session is most likely just your cookie. So to carry your session over to the Selenium webdriver, you need to copy the cookies from your Scrapy requests into Selenium.

Scrapy keeps track of cookies by itself; the cookies set by the current response are in response.headers (the Set-Cookie header).
You can then set those cookies on your webdriver:

driver.add_cookie({'name': 'foo', 'value': 'bar'})

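You can convert response.headers['Set-Cookie'] to a dictionary using a dict comprehension, then add each entry as a cookie:

import re
# Scrapy returns header values as bytes; the extra ';' lets the last pair match the pattern
foo = response.headers['Set-Cookie'].decode('utf-8') + ';'
values = {k.strip():v for k,v in re.findall(r'(.*?)=(.*?);', foo)}
# note: attribute pairs such as Path=/ are matched too, so filter them out if needed
# add_cookie() expects one {'name': ..., 'value': ...} dict per cookie
for name, value in values.items():
    driver.add_cookie({'name': name, 'value': value})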
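
As an alternative to the regular expression, here is a minimal sketch that parses a raw Set-Cookie header with the standard library's http.cookies module and returns dicts in the shape driver.add_cookie() accepts. The function name set_cookie_to_selenium is made up for this illustration, and the sketch assumes one header value per call (for example, each item from response.headers.getlist('Set-Cookie')).

```python
from http.cookies import SimpleCookie


def set_cookie_to_selenium(header):
    """Turn one raw Set-Cookie header into dicts shaped for driver.add_cookie().

    Hypothetical helper for illustration; not part of Scrapy or Selenium.
    """
    # Scrapy hands header values back as bytes, so accept both bytes and str.
    if isinstance(header, bytes):
        header = header.decode("utf-8")

    jar = SimpleCookie()
    jar.load(header)

    cookies = []
    for name, morsel in jar.items():
        # Selenium needs at least 'name' and 'value'.
        cookie = {"name": name, "value": morsel.value}
        # Keep the path when the server sent one; other attributes are skipped here.
        if morsel["path"]:
            cookie["path"] = morsel["path"]
        cookies.append(cookie)
    return cookies
```

Each returned dict can then be passed to driver.add_cookie() once the driver has loaded a page on the same domain.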

Note: some websites use more complicated sessions that also require other headers to match, but you can replicate that too by copying the headers from your Scrapy request or response to your Selenium webdriver.
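
As far as I know, Selenium has no general call for setting arbitrary request headers, so the sketch below only covers the most common case, the User-Agent, which Chrome accepts as a command-line argument. The helper name user_agent_argument is hypothetical, and it assumes you pass it a mapping such as response.request.headers.

```python
def user_agent_argument(headers):
    """Build a Chrome '--user-agent=...' argument from a headers mapping.

    Hypothetical helper for illustration; 'headers' is assumed to behave
    like a dict (Scrapy's Headers object does) and may hold bytes values.
    """
    value = headers.get("User-Agent")
    if value is None:
        # Nothing to copy over.
        return None
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    return "--user-agent=" + value
```

With Chrome, the result could be applied through options = webdriver.ChromeOptions() followed by options.add_argument(...) before the driver is created.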

Granitosaurus
  • Hi, thanks so much for that, I'll try it out. But I'm new to Python and I'm a little confused about the `values = {k.strip():v for k,v in re.findall(r'(.*?)=(.*?);', foo)}` line. Are you defining a function within the line? I would just like to know so I can look up whatever related tutorials I need to understand that line. – John Doe Jul 15 '16 at 15:57
  • This is called a [dictionary comprehension](http://stackoverflow.com/questions/1747817/create-a-dictionary-with-list-comprehension-in-python). It's a somewhat more advanced technique, but basically it converts a header string such as `"cookie1=value1;cookie2=value2;"` into a dictionary `{"cookie1":"value1","cookie2":"value2"}` – Granitosaurus Jul 15 '16 at 21:51
  • Thanks so much. It didn't end up working, but thanks anyway! I resorted to just logging in again manually with Selenium and navigating back to where I was. – John Doe Jul 15 '16 at 21:53
0

See also a similar question: scrapy selenium authentication

Log in with the Scrapy API:

# submit the login form with a Scrapy POST request, using browse_files as the callback
return FormRequest.from_response(
    response,
    # formxpath=formxpath,
    formdata=formdata,
    callback=self.browse_files
)

Pass the session to the Selenium driver:

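# logged in previously with the Scrapy API
# partial solution: cookie2 is assumed to hold the raw "name=value; name2=value2" cookie string
# note: Selenium only accepts cookies for the domain of the page it currently has open,
# so the driver may need to load the site once before this loop
cookies = [e.strip() for e in cookie2.split(";") if e.strip()]

for cookie in cookies:
    # split each "name=value" pair; the original snippet left this step out
    name, _, value = cookie.partition("=")
    cookie_map = {"name": name, "value": value}
    print("adding cookie")
    print(cookie_map)
    self.driver.add_cookie(cookie_map)

self.driver.get(response.url)

# wait_for_elements_to_be_present is the answerer's own helper (not shown here)
files = self.wait_for_elements_to_be_present(By.XPATH, "//*[@id='files']", response)
print(files)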
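
One likely reason a cookie transfer like this fails is ordering: Selenium rejects cookies for a domain other than the one the browser currently has open. A minimal sketch of the usual order (open the site, add the cookies, reload) follows. The function name transfer_cookies is hypothetical, and driver is assumed to be any object with Selenium's get() and add_cookie() methods.

```python
def transfer_cookies(driver, url, cookies):
    """Load url, install the given cookies, then reload so they are sent.

    Hypothetical helper for illustration. 'cookies' is assumed to be an
    iterable of dicts with at least 'name' and 'value' keys.
    """
    # First visit: puts the browser on the right domain so add_cookie() is accepted.
    driver.get(url)
    for cookie in cookies:
        driver.add_cookie(cookie)
    # Second visit: this request carries the session cookies.
    driver.get(url)
```

Inside the spider this could be called as transfer_cookies(self.driver, response.url, cookie_dicts), where cookie_dicts stands for whatever list of cookie dicts you built from the Scrapy session.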
cipri.l