I am using Databricks with PySpark and running some tests to connect to on-premise databases, and I have hit a problem: I can connect with plain Python but not with PySpark. For example, this Python code with pymssql works and I can fetch data afterwards:
import pymssql

try:
    # Plain Python connection to the on-premise SQL Server
    conn = pymssql.connect(
        host=host,
        user=user,
        password=password,
        database=db
    )
except Exception as e:
    print("An error occurred while connecting to SQL Server:", e)
But this code does not work:
df = spark.read \
    .format("jdbc") \
    .option("url", f"jdbc:sqlserver://{host}:1433;database={db};user={user};password={password}") \
    .option("dbtable", db) \
    .load()
The credentials are the same. I have tried many approaches, with different drivers, JARs, etc. Has anyone had something similar happen? Could it be a networking error? The error is always "com.microsoft.sqlserver.jdbc.SQLServerException: Login failed for user ...". Thanks.
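As an aside, a "Login failed for user" error means the driver did reach the server, so TCP connectivity is fine and the problem is authentication rather than networking. If in doubt, a quick reachability check from the cluster can rule networking out; this is a minimal sketch assuming the same host variable and port 1433:

import socket

# Minimal reachability check from the Databricks driver node.
# If this succeeds but JDBC still fails, the problem is authentication, not networking.
try:
    with socket.create_connection((host, 1433), timeout=5):
        print("TCP connection to SQL Server port succeeded")
except OSError as e:
    print("Network-level failure:", e)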
SOLUTION
The problem was the authentication scheme, not networking: the server expects Windows (NTLM) authentication for this account, so the JDBC connection has to set integratedSecurity and authenticationScheme=NTLM and pass the user together with its domain. In the code below, nombre is the database name, user_solo is the username without the domain, and clave is the password:
integrated_security = 'true'
authentication_scheme = "NTLM"

# nombre = database name, user_solo = username without the domain, clave = password
url = f"jdbc:sqlserver://{host}:{port};database={nombre};integratedSecurity={integrated_security};authenticationScheme={authentication_scheme}"
query = f"(SELECT TOP 10 * FROM {table}) AS subquery"

jdbcDF = spark.read \
    .format("jdbc") \
    .option("url", url) \
    .option("dbtable", query) \
    .option("user", f'{user_solo}@{domain}') \
    .option("password", clave) \
    .load()
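As a side note, on Spark 2.4+ the subquery-as-dbtable trick can be replaced with the built-in query option of the JDBC data source, which reads a bit more cleanly; a minimal sketch using the same connection URL and variables:

# Same NTLM connection, but using Spark's "query" option
# instead of wrapping the SELECT in a derived table.
jdbcDF = spark.read \
    .format("jdbc") \
    .option("url", url) \
    .option("query", f"SELECT TOP 10 * FROM {table}") \
    .option("user", f'{user_solo}@{domain}') \
    .option("password", clave) \
    .load()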