0

The problem I have is the following:

This is the email I am receiving from this sender (format is always the same, amount of rows differs greatly (4 to 100) (Table inside the email): I have modified this code to suit my needs

**I use " | " to signify cell values **

Example Email Content:

"Good morning

TEST Emergency has the following CMO vacancies – hope you can help

Date | Shift | Department | Position | Hospital | Rate per hr

Monday, 16 November 2020 | 0800-1800 | EMERGENCY | CMO | TEST |$130

Monday, 16 November 2020 | 1400-2400 | EMERGENCY | CMO | TEST |$145

Tuesday, 17 November 2020 | 1400-2400 | EMERGENCY | CMO | TEST |$145

Sunday, 22 November 2020 | 1200-2200 | EMERGENCY | CMO | TEST |$145

....Signature"

I am using this code to extract it:

 var SEARCH_QUERY = 'label: Testing';

function appendData_(sheet, array2d) {
    sheet.getRange(sheet.getLastRow() + 1, 1, array2d.length, array2d[0].length).setValues(array2d);
}
// Main function, the one that you must select before run
function saveEmails() {
    var array2d = getEmails_(SEARCH_QUERY);
    if (array2d) {
        appendData_(SpreadsheetApp.getActiveSheet(), array2d);
    }
}

// Will get the emails and return a list
function getEmails_(q) {
    var emails = [];
    // Sheet header columns 
    emails.push(["Date","Shift","Department","Hospital","Rate"]);

    var threads = GmailApp.search(q);

    for (var i in threads) {
        var thread=threads[i];

        //var data = threads[i].getPlainBody();
        var msgs = threads[i].getMessages();
        for (var j in msgs) {
            var msg = msgs[j];

            // A different attempt at extraction
          var data = msg.getPlainBody().replace(/(Kee(.|\n|\r)*)/ig,'').replace(/(\r\n\r\n)/g,',').replace(/(.)*help/g,'').replace(/((\w)*day)/g,'').split('\\W+');//.match(/(16 November 2020)/ig);
          
          //Extracting values and assigning to arrays 
          var shiftdate = msg.getPlainBody().match(/(\d{1,2}\s(Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|Aug|August|Sep|Sept|September|Oct|October|Nov|November|Dec|December)\s\d{4})/ig);
          var shifttime = msg.getPlainBody().match(/(\d{4}-\d{4})/ig);
          var department = msg.getPlainBody().match(/(EMERGENCY)/g);
          var hospital = msg.getPlainBody().match(/(TEST)/g);                     
          var rate = msg.getPlainBody().match(/(\$\d{3})/g);
        
 // for (let i=0 i < shiftdate.length; i++){
            emails.push([shiftdate,shifttime,department,hospital,rate]);
         //  }
    }
    return emails;
   console.log(emails)
}

//Will clear the sheet and add all emails there
function appendData_(sheet, array2d) {
  sheet.clear();
  sheet.getRange(sheet.getLastRow() + 1, 1, array2d.length, array2d[0].length).setValues(array2d);
}

Now the above code nets me this:

[ 'Date', 'Shift', 'Department', 'Hospital', 'Rate' ], [ [ '16 November 2020', '16 November 2020', '17 November 2020', '22 November 2020' ], [ '0800-1800', '1400-2400', '1400-2400', '1200-2200' ], [ 'EMERGENCY', 'EMERGENCY', 'EMERGENCY', 'EMERGENCY' ], [ 'TEST', 'TEST', 'TEST', 'TEST' ], [ '$130', '$145', '$145', '$145' ] ] ]

When inserting the data into the spreadsheet, The problem I see with the above is that after execution is that A,B,C,D,E,F all contain their respective array and only display the first value, so I'm thinking of adding in a for loop to repeat the .setvalues...

Date | Shift | Department | Position | Hospital | Rate per hr

16 November 2020 | 0800-1800 | EMERGENCY | CMO | TEST |$130

I am not sure how to proceed, as my end goal is the following:

In the end I want to further process the data so that a Row shows:

Start Date (Currently marked as Date) | End Date | Shift Start | Shift End | Department | Position | Hospital | Rate per hr

Meaning the script would need to go through the dates and compare an array of Date + TIME and for consecutive dates and then push each last consecutive date to Array "End Date". This would need to be compared against the Time (I can split that array later on).

Now, I cannot figure out: What would be the best way to proceed?

  1. Should I try to insert the data into the columns now and push each array to it's own column, and then create a separate code that will first sort the sheet and then read each row and compare the date, and time to find consecutive values? In which case:

1.1. When running a for loop on the currently logged array, it seems to skip repeating values - These are crucial though, as it signifies different start times.

  1. Should I create new arrays with the first, second etc value from each and then put them in a for loop with an if statement? Matching start times and then reassigning the [0] value from the last consecutive array into a new array? Or would it be better to remove all consecutive array values (except first and last) and then add another step? Again, I'm worried about the code skipping repeating values when using .length and assigning?

Would the below code snippet be at all adequate? :

firstArray [16 November 2020],[0800-1800],[EMERGENCY],[CMO],[TEST],[$130]
secondArray[17 November 2020],[1400-2400],[EMERGENCY],[CMO],[TEST],[$145]
third Array[17 November 2020],[0800-1800],[EMERGENCY],[CMO],[TEST],[$145]


for(var line in firstArray) 
{
 var isMatched = secondArray.indexOf(firstArray[1]); 
 if (isMatched !== -1) 
 {
   var matchedValFromArray2 = secondArray.indexof[1][isMatched]
 };

}


I'm at a loss of how to proceed with further data extraction. Hope the Question is clear enough and that someone would have some advise...

HK LD
  • 21
  • 3
  • I would split the body with `split(/\r/)` and then split each line with `split(' | ') and filter out the unwanted lines and load them directly into the spreadsheet as a 2d array. – Cooper Dec 02 '20 at 04:19
  • I use the '|' to represent different cells.. Sorry for not making that clear :/ – HK LD Dec 02 '20 at 04:23
  • so what's the problem? – Cooper Dec 02 '20 at 04:28
  • As stated in question 1, 1.1 and 2. I don't seem to grasp how best to proceed with the continuity in processing here. – HK LD Dec 02 '20 at 04:33
  • Could you post a sample sheet and a sample of what is how do you want to end up. I find your question kind of confusing. I don't seem to find your end goal and on which point are you getting the problem. – Raserhin Dec 02 '20 at 10:23

1 Answers1

0

I think this is similiar to what you are trying to do:

function example1() {
  const s='one|two|three|four|five\r1|2|3|4|5\rA|B|C|D|E\ra|b|c|d|e';//simpified version of your data after removing unnecessary lines
  let a=s.split(/\r/);//split on carriage returns to get each line in a row
  a.splice(0,0,'Col1|Col2|Col3|Col4|Col5|Col6|Col7');//add titles
  let dt=Utilities.formatDate(new Date(),Session.getScriptTimeZone(),"E MMM dd, yyyy HH:mm:ss");
  let b=a.map((r,i) =>{
              let t=r.split('|');//split for columns
  if(i>0) {
    t.splice(0,0,dt);//insert date at beginning
    t.splice(3,0,'whatever');//insert whatever in column4
  }
  return t;
});
const ss=SpreadsheetApp.getActive();
const sh=ss.getActiveSheet();
sh.clearContents();//just did this to clear out previous data while developing
sh.getRange(1,1,b.length,b[0].length).setValues(b);
}
Cooper
  • 59,616
  • 6
  • 23
  • 54