-1

I've got a column that stores text and I'm trying to find a way to pull the datetime from the field. I've tried to use a combination of SUBSTRING,CHARINDEX, and LEN to pull the datetime from the field - but not always.

And I tried to use the function I found in this SO thread - but it only seems to work to pull the date itself and not the time.

Below is some sample data and the code I'm using to pull the datetime from the field.

DROP TABLE IF EXISTS #temptable;

CREATE TABLE #temptable
(
    ImportedBy VARCHAR(8000) NOT NULL
);

INSERT INTO #temptable
(
    ImportedBy
)
VALUES
('Tttttt Tooooon 2/3/2022 8:54:16 AM'),
('Pyyyyyy Vyyyyyy RN 1/24/2022 2:42:30 PM'),
('Jul 18 2022 11:34AM'),
('Jul 13 2022 10:32AM'),
('Meeeen Tooooo LPN 6/20/2022 3:57:15 PM'),
('Aaaaa Leeers RN 1/23/2023 9:48:21 AM');

SELECT t.ImportedBy,
       CASE
           WHEN TRY_CAST(t.ImportedBy AS DATETIME) IS NOT NULL THEN
               t.ImportedBy
           ELSE
               SUBSTRING(
                            TRIM(RIGHT(t.ImportedBy, 25)),
                            CHARINDEX(' ', TRIM(RIGHT(t.ImportedBy, 25))),
                            (LEN(TRIM(RIGHT(t.ImportedBy, 25))) - (CHARINDEX(' ', TRIM(RIGHT(t.ImportedBy, 25))))) + 1
                        )
       END AS [Scanned Date And Time],
        TRIM(RIGHT(t.ImportedBy, 25)) AS Step1,
        CHARINDEX(' ', TRIM(RIGHT(t.ImportedBy, 25))) AS Step2,
        (LEN(TRIM(RIGHT(t.ImportedBy, 25))) - (CHARINDEX(' ', TRIM(RIGHT(t.ImportedBy, 25))))) + 1 AS Step3,
       TRY_CAST(TRIM(ps.Item) AS DATE) AS ScannedDate
FROM #temptable AS t
    CROSS APPLY dbo.PatternSplitCM(t.ImportedBy, '[0-9/]') AS ps
WHERE ps.Matched = 1
      AND ps.Item LIKE '%[0-9]/[0-9]%';

I'm not sure what the solution might be - if the user doesn't enter a title behind their name - like in row 1 - it appears to work. But if the user has "RN" after this name, it isn't working correctly but "LPN" doesn't seem to cause an issue. And I don't understand why that would be. And the PatternSplitCM function does work - but it is required to also include the time stamp so that doesn't appear to be a resolution.

Query Results

Looking at the results when I added in each part of the SUBSTRING -

  1. The RIGHT, 25 takes the rightmost 25 characters of the string
  2. I'm not not sure what this CHARINDEX is doing
  3. I'm not sure what this is doing either - LEN - CHARINDEX(RIGHT)

The function is in the linked thread but I'll post it here as well:

-- Function by Chris Morris, read more here: http://www.sqlservercentral.com/articles/String+Manipulation/94365/
    CREATE FUNCTION dbo.PatternSplitCM
(
       @List                VARCHAR(8000) = NULL
       ,@Pattern            VARCHAR(50)
) RETURNS TABLE WITH SCHEMABINDING 
AS RETURN
    WITH numbers AS (
      SELECT TOP(ISNULL(DATALENGTH(@List), 0))
       n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
      FROM
      (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n),
      (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n),
      (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n),
      (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) g (n))

    SELECT
      ItemNumber = ROW_NUMBER() OVER(ORDER BY MIN(n)),
      Item = SUBSTRING(@List,MIN(n),1+MAX(n)-MIN(n)),
      [Matched]
     FROM (
      SELECT n, y.[Matched], Grouper = n - ROW_NUMBER() OVER(ORDER BY y.[Matched],n)
      FROM numbers
      CROSS APPLY (
          SELECT [Matched] = CASE WHEN SUBSTRING(@List,n,1) LIKE @Pattern THEN 1 ELSE 0 END
      ) y
     ) d
     GROUP BY [Matched], Grouper;
3
  • Try modifying your select to return ps.* and each of the expressions in the case expression. It may help to add characters to bracket the string values so that you can see any extra whitespace. If you still can't debug the code then please edit your post to include the PatternSplitCM function definition. Commented Jun 27 at 19:55
  • That SUBSTRING() code is returning everything after the first space in the last 25 characters of the column. Commented Jun 27 at 20:42
  • Since there's no fixed size for the portion of the string containing the date and time, I suggest you use a regexp with PATINDEX(). If you were using a newer version of SQL-Server you could use REGEXP_SUBSTR(). Commented Jun 27 at 20:46

1 Answer 1

2

Here is a solution that satisfies your data.

The challenging part of your requirements is that the date-time format in your data is not consistent. What this really means is that this solution may not work properly for all variations of date-time entered by the users.

Also, please be aware that this will fail miserably if you need to accomodate different locales that represent dates differently. Specifically, if one users enters 1/2/2025 with the expectation that this means Jan 2, 2025 and another user enters the same date but expects it to be Feb 1, 2025, you will have problems. The code I present will always interpret the date using the regional settings of the person logged in to SQL Server while running the query.

This solution uses a table filled with patterns to match on, and then joins to the target table using that pattern. This means that you can accomodate varying date-time formats simply by adding to the Matches table. For example, some people may use hypen (-) instead of forward slash (/) to separate the day month and year values. You can easily accomodate this by adding additional patterns to the matches table.

Also challenging is that some (most) people will enter 1 number when that is all they need. Ex: 1/1/2025 vs 1/01/2025 vs 01/01/2025. There is a similar challenge with hours and possibly minutes. Some users will enter seconds, some won't. Sometimes there may be a space before the AM/PM, and sometimes not. I suspect that some of your data may be represented by military time (14:23 instead of 2:23PM).

Regardless, this approach allows you to easily add additional rows to your matches table to accomodate the various styles.

Also, please be aware this may return inacurate results if there are multiple dates within the data. Basically, consider this to be a good starting point, but not necessarily a final solution.

DROP TABLE IF EXISTS #temptable;
DROP TABLE IF EXISTS #Match;

CREATE TABLE #temptable
(
    ImportedBy VARCHAR(8000) NOT NULL
);

INSERT INTO #temptable
(
    ImportedBy
)
VALUES
('Tttttt Tooooon 2/3/2022 8:54:16 AM'),
('Pyyyyyy Vyyyyyy RN 1/24/2022 2:42:30 PM'),
('Jul 18 2022 11:34AMxxx'),
('Jul 13 2022 10:32AM'),
('Meeeen Tooooo LPN 6/20/2022 3:57:15 PM'),
('Aaaaa Leeers RN 1/23/2023 9:48:21 AM');

Create Table #Match (
    Pattern VarChar(200),
    Length int
)

Insert
Into #Match(Pattern, Length)
Values  ('[0-9][.-/][0-9][.-/][0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9]:[0-9][0-9] [AP]M', 19),
        ('[0-9][0-9][.-/][0-9][.-/][0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9]:[0-9][0-9] [AP]M', 20),
        ('[0-9][.-/][0-9][0-9][.-/][0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9]:[0-9][0-9] [AP]M', 20),
        ('[0-9][0-9][.-/][0-9][0-9][.-/][0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9]:[0-9][0-9][ ][AP]M', 21),
        ('[a-z][a-z][a-z][ ][0-9][ ][0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9][ap][m]', 18),
        ('[a-z][a-z][a-z][ ][0-9][0-9][ ][0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9][ap][m]', 19),
        ('[a-z][a-z][a-z][ ][0-9][ ][0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9][ap][m]', 17),
        ('[a-z][a-z][a-z][ ][0-9][0-9][ ][0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9][ap][m]', 18)

Select  *, Convert(DateTime, SubString(ImportedBy, PatIndex('%' + #Match.Pattern + '%', #temptable.ImportedBy), #Match.Length))
From    #temptable
        Inner Join #Match
            On PatIndex('%' + #Match.Pattern + '%', #temptable.ImportedBy) > 0
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks George Mastros - I'll check this out asap.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.