4

I just graduated and I'm applying for data science jobs. One of the projects on my resume involved scraping a website's data. This definitely violated the terms of service, as I was required to sign into the website to retrieve the data. I then analyzed the data to find interesting trends; however, I didn't sell it or use it for profit.

Is it a bad idea to include this project on my resume?

Dan

2 Answers

9

I don't see much wrong with listing it.

For all I know (as an employer) you may have gotten permission to do it, and I'm certainly not going to read through the ToS of the website and make assumptions about what terms you may have broken.

I think most employers are just going to look at it as a project and not think about potential legal aspects. Maybe I'm wrong.

Like the other answer says, please don't engage in unethical behaviour in the future.

flexi
  • Interesting point, the data is technically public as there is a public API to retrieve the data but it has a low rate limit. – Dan Nov 22 '20 at 22:16
  • 3
    I agree with this answer very much. I care more about your experience with scraping XYZ and how you managed it correctly than about whether the company you worked for had permission to gather this data. Even more so because just because someone puts something in a TOS doesn't make it so; this matter is substantially more complicated than that (usually scraping is fine, but how you handle copyrighted data can be problematic). For the most part no one will care, unless you go into the interview shouting "oh how I went into the illegal scraping yo". Focus on the tech. – Aida Paul Nov 22 '20 at 22:19
  • Upvoting both answers up to this point, but I like this one if you decline to name the website, and refuse to give up anything that might be considered the website's intellectual property. – Michael McFarlane Nov 22 '20 at 22:34
  • @Dan, If it has a low rate limit, it may just mean that you scraped the web site over a very long period of time. It doesn't necessarily mean that you did anything wrong. Include the project on your resume. – Stephan Branczyk Nov 23 '20 at 07:19
  • @TymoteuszPaul that's weird, since I was told not to put everything on the resume. Things like political activism and helping websites promote homosexuality in Arab countries I shouldn't put on there... What makes this actual gray-area stuff less of a problem than doing things that are morally good? – paul23 Nov 23 '20 at 18:46
5

Think about this:

You're asking if you should tell a potential employer, who is going to have all sorts of concerns about contracts, liability, and ethical behavior, that you violated a TOS contract, which, had you done it under their employ, would have exposed them to liability.

No, don't advertise it, and don't engage in unethical behavior in the future.

ETA: Also, many of us have been around since the days when web scraping was a big no-no, as bandwidth was still very expensive.

While that's changed, the term "scraping" or "web scraping" can still set off red flags, so it's best not to mention it at all until you have a job.

Old_Lamplighter
  • 1
    Thanks for your answer – Dan Nov 22 '20 at 22:08
  • 2
    This is grossly oversimplifying the issue. Often enough scraping is a perfectly fine thing to do, even if the TOS may try to claim otherwise. And unless the OP is the lawyer of his old company (or of his own project, really), I wouldn't expect him to be able to make a call on whether what he did violated anything, as that TOS may very well not be worth the electricity it took to display. But that's really getting into the legalities of it, where details like this matter, and no one will sue the OP over it. – Aida Paul Nov 22 '20 at 22:26
  • @TymoteuszPaul you're mistaking simplifying for oversimplifying. If I'm looking at a resume, I'm not going to run it past legal to see whether or not web scraping violated anything. Most STILL see it as unethical, and will take a very dim view of it. You also know better. Provide an answer of your own instead of over-complicating mine. – Old_Lamplighter Nov 22 '20 at 22:58
  • 1
    The issue here is that the question is flawed. IF there was some violation, this answer would be 10000% correct, of course. – Fattie Nov 23 '20 at 14:37
  • This question is in the category "You're not in an episode of Perry Mason". The OP has heard some sexy legal phrase and thinks they are in "violation!" of something. There's another similar question today from a coder thinking they are in "violation!" of something for having utility routines. Fortunately for the OPs in these cases, it's kind of a delusion of significance; a case of "your concerns are so far from the mark that, happily, it's silly and you can forget this issue". – Fattie Nov 23 '20 at 14:54
  • @Fattie "Scraping" is a term that can get companies in a panic. It's better not to risk it. – Old_Lamplighter Nov 23 '20 at 14:57
  • That's definitely a reasonable point. – Fattie Nov 23 '20 at 15:16