I am trying to use WebClient.DownloadString() to scrape JSON data from a URL.

The problem is that requesting the URL "secure.somesite.com.au/api/products/getprice?productName=Cornmeal" programmatically results in the site forcibly closing the connection. I believe this happens because the auth cookie is not set.

How does one set a cookie? I've spent some time reading Stack Overflow and CodeProject, and no one is setting actual cookies; they are all setting usernames and passwords. I need to set the cookie so the site knows I should have access.

    using (var client = new CookieAwareWebClient())
    {
        // Attach the auth cookie to the container before making the request.
        Cookie cookie = new Cookie();
        cookie.Name = "SWI";
        cookie.Value = "kjuujj7kxPvEC-4fBt5yyzWOJnjhriuoOtZ6Z0Ww";
        cookie.Domain = ".secure.somesite.com.au";
        client.CookieContainer.Add(cookie);

        string r = client.DownloadString("https://secure.somesite.com.au/api/products/getprice?productName=Cornmeal");
    }

CookieAwareWebClient Class:

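    using System;
    using System.Net;

    // WebClient subclass that attaches one shared CookieContainer to every HTTP request.
    public class CookieAwareWebClient : WebClient
    {
        public CookieAwareWebClient()
        {
            CookieContainer = new CookieContainer();
        }

        public CookieContainer CookieContainer { get; private set; }

        protected override WebRequest GetWebRequest(Uri address)
        {
            var request = base.GetWebRequest(address);

            // Only HTTP(S) requests carry cookies; other schemes are left untouched
            // instead of failing on an invalid cast.
            if (request is HttpWebRequest httpRequest)
            {
                httpRequest.CookieContainer = CookieContainer;
            }

            return request;
        }
    }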

PS. I have attempted to log in with WebClient, and all I get is the connection being forcibly closed. I think this is because, if you are not already logged in, requesting a protected resource causes an error in WebClient rather than simply returning a string such as "null".

PPS. I have done this in Python, but now need it working in C#.

    client.get(
        'https://secure.somesite.com.au/api/products/getprice',
        params={'productCode': '{}'.format(code)},
        headers=headers,
        timeout=60,
    )

In the Python version, the cookie is sent in a request header.
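For comparison, here is a minimal sketch of how the cookie might travel as a plain `Cookie` request header on the Python side. The contents of the `headers` dict are not shown in the snippet above, so the single `name=value` pair used here is an assumption. The cookie name and value are copied from the C# code, and `build_request` is a hypothetical helper. The sketch uses only the standard library and builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://secure.somesite.com.au/api/products/getprice"


def build_request(code, cookie_name, cookie_value):
    """Build (but do not send) a GET request that carries the cookie as a header."""
    # Assumption: the site only needs one name=value pair in the Cookie header.
    headers = {"Cookie": "{}={}".format(cookie_name, cookie_value)}
    query = urlencode({"productCode": str(code)})
    return Request("{}?{}".format(BASE_URL, query), headers=headers)


req = build_request("Cornmeal", "SWI", "kjuujj7kxPvEC-4fBt5yyzWOJnjhriuoOtZ6Z0Ww")
print(req.full_url)              # URL with the encoded query string
print(req.get_header("Cookie"))  # the raw Cookie header that would be sent
```

Printing the header this way makes it easy to compare against what the C# client actually sends. Whether the site accepts the request still depends on the cookie being valid and on any other headers it expects.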

  • I assume it isn't my misuse of WebClient because I can substitute the URL for "https://openexchangerates.org/api/latest.json?app_id=4be3cf28d6954df2b87bf1bb7c2ba47b" and it works a treat. – alex_1995_henry Mar 11 '21 at 08:32
  • Is `CookieAwareWebClient` the same as the one presented [here](https://stackoverflow.com/questions/1777221/using-cookiecontainer-with-webclient-class)? – ProgrammingLlama Mar 11 '21 at 08:33
  • @John I've edited the question, thanks for pointing that out, forgot it wasn't inbuilt. – alex_1995_henry Mar 11 '21 at 09:18
  • You should set your cookies in the cookie container, rather than doing it manually as a header. You'll probably need to make a `public` property to expose the cookie container. – ProgrammingLlama Mar 11 '21 at 09:19
  • @John I'll give it a go, thanks. – alex_1995_henry Mar 11 '21 at 09:24
  • I've edited my post with the suggestion in @John's comment. Still get ```IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.``` and ```SocketException: An existing connection was forcibly closed by the remote host``` – alex_1995_henry Mar 11 '21 at 09:44
