How can I scrape data from a website when the data is only visible after logging in?

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome('c://chromedriver.exe')

driver.get("http://www.gevolution.co.kr/rank/history.asp")


soup = BeautifulSoup(driver.page_source, 'html.parser')
blocks = soup.find_all('div', class_='grp')  # every <div class="grp"> on the page
bodys = [block.get_text().strip() for block in blocks]  # stripped text of each block
print(bodys)

Result: []

This is part of the code I use to collect data from the site. However, when Selenium opens Chrome automatically, nothing is collected because the new browser session is not logged in. I want to know how to solve this login problem.

Seo
  • Are you sure the elements with the class `grp` actually exist in the page source? The content you're trying to collect may be **dynamic**, in which case BeautifulSoup alone will **not** solve your problem. Also, what exactly do you mean by "if I turn on Chrome automatically"? – Lafa Sep 27 '18 at 07:19
  • @LAFA - The website and the `grp` class certainly exist. When I visit the website, the login screen appears, and I must log in to view the page. I'm wondering how to log in automatically. – Seo Sep 27 '18 at 07:30
  • If the content you're trying to get only appears after you log in, your script must log in first. In this case, you will need to interact with forms, press buttons, etc. You can use **selenium** for this task. By the way, do you have an account on this website that you can use for this purpose? – Lafa Sep 27 '18 at 07:38
  • @LAFA - yes, I have an account. – Seo Sep 27 '18 at 07:55

1 Answer

Solution 1

You can copy the session cookies from a browser session that is already logged in and load them into the Selenium driver. This will not solve the problem permanently, because the cookies expire after a while.
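The cookie approach could be sketched roughly as follows. This is only an illustration, not a tested recipe for this particular site: the helper names `save_cookies` and `load_cookies` and the JSON file are made up for the example, while `get_cookies()`, `add_cookie()` and `get()` are standard Selenium WebDriver methods.

```python
import json
from pathlib import Path


def save_cookies(driver, path):
    """Write the cookies of the driver's current session to a JSON file."""
    Path(path).write_text(json.dumps(driver.get_cookies()), encoding="utf-8")


def load_cookies(driver, path, url):
    """Open url, attach the saved cookies, then request the page again."""
    driver.get(url)  # a cookie can only be added for the domain currently loaded
    for cookie in json.loads(Path(path).read_text(encoding="utf-8")):
        driver.add_cookie(cookie)
    driver.get(url)  # same page again, this time with the cookies attached
```

One possible flow: log in by hand once in a Selenium-driven window and call `save_cookies(driver, 'cookies.json')`; on later runs call `load_cookies(driver, 'cookies.json', url)` before reading `driver.page_source`. Once the site invalidates the session, the file has to be regenerated.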

Solution 2

You can log in to the website by simulating the login behaviour. With Selenium you can locate the username and password fields, type into them with `element.send_keys('value')`, and submit the form with `element.click()`, among the many other methods Selenium provides.
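A minimal sketch of such a simulated login is shown below. The login URL, the credentials, and every locator in `example_usage` are placeholders, since the real form of the site is not shown here; inspect the login page to find the actual field names. The helper `log_in` is likewise a hypothetical name, and a real page may need explicit waits before the form is ready.

```python
def log_in(driver, login_url, username, password,
           user_locator, password_locator, submit_locator):
    """Fill in a login form and submit it.

    Each locator is a (by, value) pair such as (By.NAME, "username").
    """
    driver.get(login_url)
    driver.find_element(*user_locator).send_keys(username)
    driver.find_element(*password_locator).send_keys(password)
    driver.find_element(*submit_locator).click()


def example_usage():
    """Hypothetical usage; the URL, credentials and locators are placeholders."""
    # Imported here so that log_in itself does not depend on a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes chromedriver can be found
    log_in(
        driver,
        "https://example.com/login",               # placeholder login page
        "my_username",                             # placeholder credentials
        "my_password",
        (By.NAME, "username"),                     # placeholder locators
        (By.NAME, "password"),
        (By.CSS_SELECTOR, "button[type=submit]"),
    )
    return driver
```

After `log_in` returns, the same `driver` can load the protected page and hand `driver.page_source` to BeautifulSoup as in the question.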

This gets harder if the website requires a captcha to log in. In that case, you can type in the captcha manually, use an algorithm to recognize it, or fall back on Solution 1.
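For the manual-captcha route, one simple option is to let the script pause while a person completes the login in the browser window that Selenium opened. The sketch below assumes that approach; `log_in_manually` is a hypothetical helper name, and the `prompt` parameter exists only so the pause can be replaced, for example in tests.

```python
def log_in_manually(driver, login_url, prompt=input):
    """Open the login page and block until a person has finished logging in."""
    driver.get(login_url)
    prompt("Log in and solve the captcha in the browser window, then press Enter: ")
    # The session cookies are returned so they can be saved and reused (Solution 1).
    return driver.get_cookies()
```

Because the browser stays open, the script can continue scraping with the same `driver` as soon as Enter is pressed.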

zxch3n