I have a stored procedure GetCombinedRequestInfosByEmail that retrieves hierarchical data for employees. The procedure performs well for lower-level employees but takes around 55 seconds to execute for higher-level employees (e.g., CEO with 30,000 employees). For managers with fewer employees (10-7), it executes in milliseconds.
I am using another stored procedure GetRequestInfosByUser within this procedure, which dynamically constructs and executes a SQL query.
My current version of the stored procedure:
ALTER PROCEDURE GetCombinedRequestInfosByEmail
@Email NVARCHAR(50)
AS
BEGIN
SET NOCOUNT ON;
CREATE TABLE #EmployeeHierarchy
(
employee_id NVARCHAR(50) PRIMARY KEY,
employee_email NVARCHAR(255),
level INT
);
WITH EmployeeHierarchy AS
(
SELECT
u.userid AS employee_id,
u.Email AS employee_email,
u.manageruserid AS manager_id,
0 AS level
FROM
UserTable u
WHERE
u.Email = @Email
UNION ALL
SELECT
u.userid AS employee_id,
u.Email AS employee_email,
u.manageruserid AS manager_id,
eh.level + 1
FROM
UserTable u
JOIN
EmployeeHierarchy eh ON u.manageruserid = eh.employee_id
)
INSERT INTO #EmployeeHierarchy (employee_id, employee_email, level)
SELECT DISTINCT
employee_id, employee_email, level
FROM
EmployeeHierarchy
WHERE
level > 0
ORDER BY
level, employee_id;
CREATE TABLE #RequestInfodata
(
RequestID INT,
RequestCreatedByEmailID NVARCHAR(200),
ApplicationName NVARCHAR(200),
ITowner NVARCHAR(100),
Businessowner NVARCHAR(100),
RiskScore INT,
RequestCreatedBy NVARCHAR(24)
);
DECLARE @employee_id NVARCHAR(50);
DECLARE @employee_email NVARCHAR(255);
DECLARE employee_cursor CURSOR FOR
SELECT employee_id, employee_email
FROM #EmployeeHierarchy;
OPEN employee_cursor;
FETCH NEXT FROM employee_cursor INTO @employee_id, @employee_email;
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT INTO #RequestInfodata
EXEC GetRequestInfosByUser
@UserId = @employee_id,
@UserEmail = @employee_email,
@IsBusinessOrItOwner = 1;
FETCH NEXT FROM employee_cursor INTO @employee_id, @employee_email;
END
CLOSE employee_cursor;
DEALLOCATE employee_cursor;
SELECT DISTINCT *
FROM #RequestInfodata;
-- Clean up temporary tables
DROP TABLE #EmployeeHierarchy;
DROP TABLE #RequestInfodata;
END;
Stored procedure GetRequestInfosByUser:
ALTER PROCEDURE [dbo].[GetRequestInfosByUser]
@UserId VARCHAR(50) = NULL,
@UserEmail VARCHAR(50) = NULL,
@IsBusinessOrItOwner BIT = 0
AS
BEGIN
SET NOCOUNT ON;
-- Base SQL query
DECLARE @SQL NVARCHAR(MAX) = '
SELECT
ri.RequestID,
ri.RequestCreatedByEmailID,
ri.NameOftheTool AS ApplicationName,
ri.ITOwners AS ITowner,
ri.BusinessOwners AS Businessowner,
ri.RiskSummaryScore AS RiskScore,
ri.RequestCreatedBy,
FROM
RequestInfo ri
LEFT JOIN
(SELECT
*,
ROW_NUMBER() OVER (PARTITION BY RequestID ORDER BY DateModified DESC) AS RowNumber
FROM
WorkflowTable) AS wf ON ri.RequestID = wf.RequestID
AND wf.RowNumber = 1
WHERE 1 = 1 ';
-- Append conditions to the WHERE clause based on input parameters
IF @UserId IS NOT NULL AND @UserId <> 'ALL' AND @IsBusinessOrItOwner = 0
BEGIN
SET @SQL = @SQL + ' AND ri.RequestCreatedBy = @UserId';
END
ELSE IF @UserId IS NOT NULL AND @UserEmail IS NOT NULL AND @IsBusinessOrItOwner = 1
BEGIN
SET @SQL = @SQL + ' AND (ri.RequestCreatedBy = @UserId OR ri.ITOwners = @UserEmail OR
ri.BusinessOwners = @UserEmail)';
END
-- Execute the SQL query
EXEC sp_executesql @SQL, N'@UserId VARCHAR(50), @UserEmail VARCHAR(50)', @UserId, @UserEmail;
END;
For the CEO (with 30,000 employees), the GetCombinedRequestInfosByEmail stored procedure takes around 55 seconds, whereas for managers with 10 or 7 employees, it executes in milliseconds. I am looking to optimize this stored procedure for better performance, particularly for high-level employees with large hierarchies.
User table:
| UserID | ManagerUserID | |
|---|---|---|
| 1 | [email protected] | NULL |
| 2 | [email protected] | 1 |
| 3 | [email protected] | 1 |
| 4 | [email protected] | 2 |
| 5 | [email protected] | 2 |
| 6 | [email protected] | 3 |
RequestInfo:
| RequestID | RequestCreatedBy | ITOwners | BusinessOwners | NameOfModule | RiskScore | RequestSubmitted |
|---|---|---|---|---|---|---|
| 1001 | 4 | [email protected] | [email protected] | Module1 | 5 | 2024-05-01 08:00:00.000 |
| 1002 | 5 | [email protected] | [email protected] | Module2 | 3 | 2024-05-02 08:00:00.000 |
| 1003 | 6 | [email protected] | [email protected] | Module3 | 4 | 2024-05-03 08:00:00.000 |
When executing the GetCombinedRequestInfosByEmail stored procedure for [email protected], I expect to get the combined request information for all employees under the CEO (including managers and their subordinates).
Here is the expected output:
| RequestID | RequestCreatedByEmailID | ApplicationName | ITowner | Businessowner | RiskScore | RequestCreatedBy |
|---|---|---|---|---|---|---|
| 1001 | [email protected] | Module1 | [email protected] | [email protected] | 5 | 4 |
| 1002 | [email protected] | Module2 | [email protected] | [email protected] | 3 | 5 |
| 1003 | [email protected] | Module3 | [email protected] | [email protected] | 4 | 6 |
The main issue is that for the CEO, who has 30,000 employees, the execution time is around 55 seconds, while for managers with fewer employees (10-7), it executes in milliseconds.
How can I optimize the GetCombinedRequestInfosByEmail stored procedure to reduce the execution time for high-level employees (e.g., CEO with 30,000 employees)?
GetRequestInfosByUserprocedure outputs a single resultset per user. If you execute it 30k times you get 30k resultsets (not 30k rows) and each of those need to be inserted into the temp table. Why not a single resultset containing all users, with a single stored procedure getting all the data?