First, please excuse my naivety with this subject. I'm a retired programmer who started before DOS was around, and I'm not an expert on ASP.NET. Part of what I need to know is what I need to know. (If you follow me...)
So I want to log into a web site and scrape some content. After looking at the HTML source with Notepad and Fiddler2, it's clear to me that the site is implemented with ASP.NET technologies.
I started by doing a lot of Googling and reading everything I could find about writing screen scrapers in C#. After some investigation and many attempts, I think I've come to the conclusion that it isn't easy.
The crux of the problem (as I see it now) is that ASP.NET provides lots of ways for a programmer to maintain state: cookies, viewstate, session vars, page vars, GET and POST params, etc. Plus the programmer can divide the work up between server and client scripting. A rich web client such as IE, Safari, Chrome or Firefox knows how to handle whatever the programmer writes (and whatever the ASP.NET framework implements under the covers).
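For instance, here's roughly how I've been pulling the hidden state fields out of a login page in my experiments. The URL is made up, and the regex assumes the `id="..." value="..."` attribute order that WebForms seems to render; from what I can tell, any POST back to the page has to echo these values or the server rejects the post-back:

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

static class ScrapeHelpers
{
    // Pulls the value of a hidden input (e.g. __VIEWSTATE) out of raw HTML.
    // Assumes the id="..." value="..." attribute order WebForms renders.
    public static string ExtractHidden(string html, string name)
    {
        Match m = Regex.Match(html, "id=\"" + name + "\" value=\"([^\"]*)\"");
        return m.Success ? m.Groups[1].Value : "";
    }
}

class HiddenFieldDemo
{
    static void Main()
    {
        // Hypothetical URL -- substitute the real login page.
        string html = new WebClient().DownloadString("https://example.com/Login.aspx");
        Console.WriteLine(ScrapeHelpers.ExtractHidden(html, "__VIEWSTATE"));
        Console.WriteLine(ScrapeHelpers.ExtractHidden(html, "__EVENTVALIDATION"));
    }
}
```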
WebClient isn't a rich web client. Out of the box it doesn't even persist cookies from one request to the next.
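From what I've read, the usual workaround is to subclass WebClient and hang a CookieContainer on every request it creates, something like:

```csharp
using System;
using System.Net;

// WebClient subclass that carries one CookieContainer across requests,
// so the ASP.NET session cookie set at login survives to later pages.
public class CookieAwareWebClient : WebClient
{
    private readonly CookieContainer cookies = new CookieContainer();

    public CookieContainer Cookies
    {
        get { return cookies; }
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        HttpWebRequest http = request as HttpWebRequest;
        if (http != null)
            http.CookieContainer = cookies;
        return request;
    }
}
```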
So I'm at an impasse. One way to go is to try to reverse engineer all the features of the rich client that the ASP.NET application is expecting, and write a "WebClient on steroids" class that mimics a rich client well enough to get logged in.
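Here's my rough sketch of that approach, reusing the CookieAwareWebClient and ScrapeHelpers.ExtractHidden pieces above. The URL and form field names are made up; Fiddler2 would show the real `ctl00$...`-style names the page actually posts:

```csharp
using System;
using System.Collections.Specialized;
using System.Text;

class LoginScrape
{
    static void Main()
    {
        var client = new CookieAwareWebClient();            // from the sketch above
        string loginUrl = "https://example.com/Login.aspx"; // hypothetical URL

        // GET first: picks up the session cookie and the current state blobs.
        string page = client.DownloadString(loginUrl);

        var form = new NameValueCollection();
        form["__VIEWSTATE"] = ScrapeHelpers.ExtractHidden(page, "__VIEWSTATE");
        form["__EVENTVALIDATION"] = ScrapeHelpers.ExtractHidden(page, "__EVENTVALIDATION");
        form["ctl00$Main$UserName"] = "me";       // field names are hypothetical;
        form["ctl00$Main$Password"] = "secret";   // Fiddler2 shows the real ones
        form["ctl00$Main$LoginButton"] = "Log In";

        // POST back with the same cookies and the echoed state fields.
        byte[] raw = client.UploadValues(loginUrl, "POST", form);
        string html = Encoding.UTF8.GetString(raw);
        Console.WriteLine(html.Contains("Logout")); // crude "am I logged in?" check
    }
}
```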
Or I could try embedding IE (or some other rich client) into my app and hope the exposed interface is rich enough that I can programmatically fill a username and password field and POST the form back. (And access the response stream so I can parse the HTML to scrape out the data I'm after...)
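Something like this WinForms sketch is what I have in mind for that route. The element ids are made up; I'd get the real ones from the page source:

```csharp
using System;
using System.Windows.Forms;

class BrowserScrapeForm : Form
{
    private readonly WebBrowser browser = new WebBrowser();
    private bool loginSubmitted;

    public BrowserScrapeForm()
    {
        browser.Dock = DockStyle.Fill;
        browser.DocumentCompleted += OnDocumentCompleted;
        Controls.Add(browser);
        browser.Navigate("https://example.com/Login.aspx"); // hypothetical URL
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Note: this event fires once per frame on framed pages.
        if (!loginSubmitted)
        {
            // Element ids are hypothetical; use the ones from "view source".
            browser.Document.GetElementById("UserName").SetAttribute("value", "me");
            browser.Document.GetElementById("Password").SetAttribute("value", "secret");
            loginSubmitted = true;
            browser.Document.GetElementById("LoginButton").InvokeMember("click");
        }
        else
        {
            // The post-login page; its HTML is now available for parsing.
            string html = browser.Document.Body.InnerHtml;
            Console.WriteLine(html.Length);
        }
    }

    [STAThread]
    static void Main()
    {
        Application.Run(new BrowserScrapeForm());
    }
}
```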
Or I could look for some 3rd party control that would be a lot richer than WebClient.
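Selenium WebDriver seems to be one example of that route: it drives a real browser, so cookies, viewstate and client script are all handled for me. Again, the ids and URL are made up:

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

class SeleniumScrape
{
    static void Main()
    {
        // Drives a real Firefox instance, so the browser itself handles
        // cookies, viewstate, redirects and any client-side script.
        using (IWebDriver driver = new FirefoxDriver())
        {
            driver.Navigate().GoToUrl("https://example.com/Login.aspx"); // hypothetical
            driver.FindElement(By.Id("UserName")).SendKeys("me");        // hypothetical ids
            driver.FindElement(By.Id("Password")).SendKeys("secret");
            driver.FindElement(By.Id("LoginButton")).Click();

            string html = driver.PageSource; // logged-in page, ready to parse
            Console.WriteLine(html.Length);
        }
    }
}
```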
Can anyone offer some keen insight into where I should focus my attention?
This is as much a learning experience as a project. That said, I really want to automate login and information retrieval from the target site.