
I have written SQL code that I'm going to use in Spark. The code works fine as T-SQL on MS SQL Server; however, when I run it on the Spark platform I get the error: Undefined function: 'EOMONTH'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'. I believe the problem is that I need to register the T-SQL function EOMONTH, but I'm not sure how to do that.

I have tried registering the function with the following:

sqlContext.sql("""CREATE TEMPORARY FUNCTION function_name AS 'EOMONTH'""")

but I get the error:

Can not load class 'EOMONTH' when registering the function 'function_name', please make sure it is on the classpath;

The full code is as follows:

SELECT     MakeName, SUM(Cost) AS TotalCost
FROM       Make AS MK INNER JOIN Model AS MD 
           ON MK.MakeID = MD.MakeID
INNER JOIN Stock AS ST ON ST.ModelID = MD.ModelID
WHERE      DateBought BETWEEN 
           CAST(YEAR(DATE_ADD(m, -1, CURRENT_DATE())) AS CHAR(4)) 
           + RIGHT('0' + CAST(MONTH(DATE_ADD(m, -1, CURRENT_DATE())) 
           AS VARCHAR(2)),2) + '01'
           AND EOMONTH(DATE_ADD(m, -1, CURRENT_DATE()))
GROUP BY   MakeName

I just expect the query to return its result.

ThanhPhanLe
Carltonp

2 Answers


Use last_day instead of eomonth. Note that T-SQL's DATE_ADD(m, -1, ...) has no Spark equivalent with that signature either: in Spark SQL you shift back a month with ADD_MONTHS(..., -1), and you can take the first day of that month with TRUNC(..., 'MM') instead of building the date string by hand, i.e.

SELECT     MakeName, SUM(Cost) AS TotalCost
FROM       Make AS MK INNER JOIN Model AS MD 
           ON MK.MakeID = MD.MakeID
INNER JOIN Stock AS ST ON ST.ModelID = MD.ModelID
WHERE      DateBought BETWEEN 
           TRUNC(ADD_MONTHS(CURRENT_DATE(), -1), 'MM')
           AND LAST_DAY(ADD_MONTHS(CURRENT_DATE(), -1))
GROUP BY   MakeName
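The WHERE clause bounds the lookup to the first and last day of the previous month. That date logic can be checked against a plain-Python sketch (standard library only, not PySpark; the function name is mine):

```python
from datetime import date
import calendar

def previous_month_bounds(today):
    """Return (first_day, last_day) of the month before `today`."""
    # Step back one month, wrapping January into the previous December.
    if today.month > 1:
        year, month = today.year, today.month - 1
    else:
        year, month = today.year - 1, 12
    first = date(year, month, 1)
    # monthrange gives (weekday_of_first_day, number_of_days) for the month.
    last = date(year, month, calendar.monthrange(year, month)[1])
    return first, last
```

For example, `previous_month_bounds(date(2019, 8, 11))` gives the bounds 2019-07-01 and 2019-07-31, matching what the SQL above should select.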
Richard Nemeth

Spark SQL is not the same as SQL Server T-SQL. There is no EOMONTH function in Spark SQL. However, there is a last_day() function that you can use.

> SELECT last_day('2019-08-11');
 2019-08-31

This is accessible via PySpark as well (see Stack Overflow answer here).
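As for the registration attempt in the question: CREATE TEMPORARY FUNCTION ... AS '<name>' expects a fully qualified JVM class implementing a Hive UDF on the classpath, and 'EOMONTH' is not such a class, hence the "Can not load class" error. If you really wanted an EOMONTH name in Spark, you could register a Python function as a UDF. A minimal sketch of what that function would compute (the registration lines are commented out since they assume a live SparkSession named `spark`):

```python
from datetime import date
import calendar

def eomonth(d):
    """Return the last day of the month containing d (mirrors T-SQL EOMONTH)."""
    # monthrange gives (weekday_of_first_day, number_of_days) for the month.
    last = calendar.monthrange(d.year, d.month)[1]
    return date(d.year, d.month, last)

# Hypothetical registration for use in SQL strings (requires a SparkSession `spark`):
# from pyspark.sql.types import DateType
# spark.udf.register("EOMONTH", eomonth, DateType())
```

That said, since last_day() is built in, registering a UDF for this is unnecessary.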

David Makogon
  • OK, great, I'm going to try your suggestion and let you know how I get on. – Carltonp Aug 11 '19 at 15:27
  • Entered last_day, however, I'm now getting the error 'Invalid number of arguments for function add_months. Expected: 2; Found: 1;'. However, if you take a look at the code I have definitely entered two arguments '(DATE_ADD(CURRENT_DATE(),-1))'. Can someone let me know if I'm missing something ```test5 = spark.sql("""SELECT TRX_EVENT_END AS TRX_EVENT ,SALES_QUANTITY FROM Transactions WHERE TRX_EVENT_END BETWEEN CAST(YEAR(DATE_ADD(CURRENT_DATE(),-1)) AS CHAR(4))+ RIGHT('0' + CAST(ADD_MONTHS(DATE_ADD(CURRENT_DATE(),-1))AS VARCHAR(2)),2) + '01'AND LAST_DAY(DATE_ADD(CURRENT_DATE(),-1))""")``` – Carltonp Aug 11 '19 at 16:49
  • @Carltonp - please *edit your question* to include the code you're including here. As a comment, i'm afraid such code is unreadable (all on one line, no indentation). You can edit your question to show what you've tried, additional errors, etc. – David Makogon Aug 11 '19 at 17:53
  • Hi David, the code is as follows, `test5 = spark.sql("""SELECT TRX_EVENT_END AS TRX_EVENT ,SALES_QUANTITY FROM Transactions WHERE TRX_EVENT_END BETWEEN CAST(YEAR(DATE_ADD(CURRENT_DATE(),-1)) AS CHAR(4))+ RIGHT('0' + CAST(ADD_MONTHS(DATE_ADD(CURRENT_DATE(),-1))AS VARCHAR(2)),2) + '01'AND LAST_DAY(DATE_ADD(CURRENT_DATE(),-1))""")` Is this ok? – Carltonp Aug 13 '19 at 17:24
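A note on the error quoted in the comments above: Spark SQL's add_months takes two arguments, a start date and a month count, so ADD_MONTHS(DATE_ADD(CURRENT_DATE(),-1)) fails with the arity error — the two arguments the commenter counted belong to the inner DATE_ADD. And DATE_ADD(CURRENT_DATE(),-1) itself steps back one day, not one month. The difference between the two shifts, sketched with Python's standard library (dates are illustrative):

```python
from datetime import date, timedelta
import calendar

today = date(2019, 8, 11)

# date_add(d, -1) in Spark SQL shifts by one DAY:
one_day_back = today - timedelta(days=1)

# add_months(d, -1) shifts by one MONTH and always takes TWO arguments
# (a start date and a month count); equivalent stdlib arithmetic:
if today.month > 1:
    year, month = today.year, today.month - 1
else:
    year, month = today.year - 1, 12
# Clamp the day in case the previous month is shorter (e.g. March 31 -> Feb 28).
day = min(today.day, calendar.monthrange(year, month)[1])
one_month_back = date(year, month, day)
```

With today = 2019-08-11, the day shift gives 2019-08-10 while the month shift gives 2019-07-11, which is why the two Spark functions are not interchangeable in the query above.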