
I have the following code that basically logs in and navigates to a page where a file is listed that I want to download:

const getNextcloudDownloadUrl = async (): Promise<string> => {
  
  const downloadUrl = `https://${BASEURL}${lastFile}`;
  const fileName = downloadUrl.substring(downloadUrl.lastIndexOf('/') + 1);

  const download = await page.evaluate((downloadUrl, fileName) => {
    https.get(downloadUrl, res =>
    {
      const file = fs.createWriteStream(`/tmp/${fileName}`);
      res.pipe(file);
      file.on('finish', () => {
        file.close();
        console.log('done');
      });
    })
  }, downloadUrl, fileName);

  return downloadUrl;
};

I cannot get it to work. It fails with `Error: Evaluation failed: ReferenceError: https is not defined`. I want to download a 500 MB file. I have looked through everything I could find, and I tried `fetch`, but that supposedly does not work with streams.

I have tried the following resources, but I cannot solve it:

Here is the request as copied from Chrome DevTools (I have since found out that this does not work because of streams):

fetch(downloadUrl, {
  "headers": {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "accept-language": "de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7,fr;q=0.6,es;q=0.5,sv;q=0.4,ru;q=0.3",
    "sec-ch-ua": "\" Not;A Brand\";v=\"99\", \"Google Chrome\";v=\"97\", \"Chromium\";v=\"97\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "same-origin",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "cookie": "oc_sessionPassphrase=......."
  },
  "referrerPolicy": "no-referrer",
  "body": null,
  "method": "GET"
});
Spurious
  • When you evaluate a function with puppeteer, it is executed in the browser context and has no access to external resources (variables) from your Node.js script. Even if you forward Node modules like `fs` and `http`, they will have only as much permission as your browser has. – Giancarl021 Jan 30 '22 at 15:20
  • I assumed something like this to be the case. How can I get around this? How can I forward the browser permissions outside the `puppeteer` context? – Spurious Jan 30 '22 at 16:21

1 Answer

I have gotten it working the following way:

  // Download file
  const fileName = downloadUrl.substring(downloadUrl.lastIndexOf('/') + 1);
  const cookies = await page.cookies();

  // Serialize the browser session cookies into a single Cookie header value
  const cStr = cookies.map((c: any) => `${c.name}=${c.value}`).join('; ');
  const fRes = fetch(downloadUrl, {
    headers: {
      accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
      'accept-language': 'de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7,fr;q=0.6,es;q=0.5,sv;q=0.4,ru;q=0.3',
      'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="97", "Chromium";v="97"',
      'sec-ch-ua-mobile': '?0',
      'sec-ch-ua-platform': '"Windows"',
      'sec-fetch-dest': 'document',
      'sec-fetch-mode': 'navigate',
      'sec-fetch-site': 'same-origin',
      'sec-fetch-user': '?1',
      'upgrade-insecure-requests': '1',
      cookie: cStr,
    },
    referrerPolicy: 'no-referrer',
    body: null,
    method: 'GET',
  });
  
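  return await fRes
    .then(async (res) => {
      const gcsFile = await uploadNextcloudFileToGoogleCloudStorage(fileName);
      const dest = gcsFile.createWriteStream();
      return new Promise((resolve, reject) => {
        // Resolve once the destination has flushed everything, not when the source ends
        dest.on('finish', () => resolve('it worked'));
        dest.on('error', reject);
        // @ts-ignore (assumes node-fetch, where res.body is a Node.js readable stream)
        res.body.on('error', reject);
        // @ts-ignore
        res.body.pipe(dest);
      });
    })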
    .then((x) => {
      return {
        status: 200,
        downloadUrl,
        fileName,
      };
    })
    .catch((e) => {
      return {
        status: 400,
        error: e,
      };
    });

This streams the file straight into Google Cloud Storage without first storing it in /tmp.
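
As a variation on the `.pipe()` call above, Node's built-in `pipeline` from `stream/promises` can do the same chunk-by-chunk copy while forwarding errors from either side and resolving only after the destination has finished. This is a minimal sketch, not what I originally ran: `streamTo` and `ByteCounter` are hypothetical names, and `ByteCounter` merely stands in for a real destination such as the GCS write stream.

```typescript
import { Readable, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: copies `source` into `dest` chunk by chunk.
// Resolves after `dest` has finished; rejects if either stream errors.
async function streamTo(source: Readable, dest: Writable): Promise<void> {
  await pipeline(source, dest);
}

// Stand-in destination for illustration only: it discards the data
// and just counts how many bytes passed through.
class ByteCounter extends Writable {
  bytes = 0;

  _write(
    chunk: Buffer,
    _encoding: BufferEncoding,
    callback: (error?: Error | null) => void,
  ): void {
    this.bytes += chunk.length;
    callback();
  }
}
```

Assuming `fetch` comes from node-fetch (so that `res.body` is a Node.js readable stream), the promise wrapper in the answer could probably be reduced to `await streamTo(res.body, gcsFile.createWriteStream())`. Because data is forwarded as it arrives, memory use should stay roughly flat regardless of the file size.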

Spurious