2
\$\begingroup\$

I have a repetitive data reduction where I have a list of ID's that I want to find in a longer list. I have a method, but it is slower than I'd like. How can can I speed this up?

The process: After each contiguous section of ID's I want to insert a blank row (I'll do other things with the blank row but that's not important).If I have a unique list containing 10, 20, 30, 40. I want to insert a row below each of the contiguous sections of these numbers for example.

As in,

ID --
10
10
10 *Insert below here
15
15 *DO NOT INSERT HERE
20
20
20 *Insert below here
30
30 *Insert below here
10
10
10 *Insert below here
40 *Insert below here

I have created a way to do this, but for >40,000 lines of data and about 20 "ID's of interest" which can go for a thousand rows a piece it is not particularly efficient. Would auto filtering potentially improve efficiency or what about other methods? Storing data in array rather than using Find?

Here is an example of the module that will run on data formatted like the screenshot and workbook attached.

Option Explicit

Public Sub SummaryDataPopulate()
    Dim DataWs              As Worksheet: Set DataWs = ActiveSheet
    Dim IDNum               As Long
    Dim IDCurRow            As Long
    Dim IDofInterestRows    As Long
    Dim StartRow            As Long
    Dim EndRow              As Long
    Dim Counter             As Long: Counter = 0
    Const IDCol             As Long = 3

    Application.ScreenUpdating = False

    With DataWs
        IDofInterestRows = .Columns("A").Cells.Find("*", SearchOrder:=xlByRows, LookIn:=xlValues, SearchDirection:=xlPrevious).Row

        For IDCurRow = 3 To IDofInterestRows
        
            IDNum = DataWs.Range("A" & IDCurRow).Value
            
            StartRow = .Columns(IDCol).Find(IDNum).Row 'could update this in future to handle repeated ID's, currently finds first ID
            EndRow = find_end_row(IDNum, StartRow, IDCol, DataWs)
            
            'insert row & color
            .Rows(EndRow + 1).Insert
            .Rows(EndRow + 1).Interior.ThemeColor = xlThemeColorAccent6
            
            'check now if there is another section
            Do While ID_rerun_check(IDNum, EndRow, IDCol, DataWs) 'true if another section of same ID
                
                StartRow = .Columns(IDCol).Find(What:=IDNum, After:=.Cells(EndRow + 2, IDCol)).Row
                EndRow = find_end_row(IDNum, StartRow, IDCol, DataWs) 'new end row
                
                'insert row & Color
                .Rows(EndRow + 1).Insert
                .Rows(EndRow + 1).Interior.ThemeColor = xlThemeColorAccent6
            
            Loop
    
        Next IDCurRow
    
    End With

    Application.ScreenUpdating = True

End Sub

Private Function ID_rerun_check(ByVal IDNum As Long, ByVal PreviousEndRow As Long, ByVal IDCol As Long, ByRef data_ws As Worksheet) As Boolean

    Dim FoundRow As Long

    FoundRow = data_ws.Columns(IDCol).Find(What:=IDNum, After:=data_ws.Cells(PreviousEndRow + 1, IDCol)).Row

    ID_rerun_check = Not (FoundRow <= PreviousEndRow) 'false if the find has looped, 'true if find finds another section

End Function

Private Function find_end_row(ByVal IDNum As Long, ByVal StartRow As Long, ByVal IDCol As Long, ByRef data_ws As Worksheet) As Long

    Dim ii          As Long
    Dim EndRow      As Long
    Dim DataRows    As Long

    DataRows = data_ws.Columns("C").Cells.Find("*", SearchOrder:=xlByRows, LookIn:=xlValues, SearchDirection:=xlPrevious).Row

    For ii = StartRow To DataRows
        EndRow = data_ws.Columns(IDCol).Find(What:=IDNum, After:=data_ws.Cells(ii, IDCol)).Row
        If EndRow - 1 <> ii Then                 'check if cycle back to top or if jump in value
            Exit For                             'exits for loop
        End If
    Next ii

    find_end_row = ii

End Function

DATA Format Example

Here is a datafile: Sheet1 has the simple data, Sheet3 has data more realistic https://docs.google.com/spreadsheets/d/1ftCcHfiw7Sujg5L791gyn877bqmhaENy/edit?usp=sharing&ouid=111738563125877088842&rtpof=true&sd=true

\$\endgroup\$
6
  • 3
    \$\begingroup\$ This can be accomplished through sorting. For example: Consider IDs 10, 20 and 30. Append10.1, 20.1, and 30.1 to the end of the dataset and make any adjustments to the rows (e.g. formatting) sort the columns and then replace the .1 values. \$\endgroup\$ Commented Dec 11, 2022 at 13:21
  • 1
    \$\begingroup\$ Thanks, I'll give that a thought. How would sorting differentiate between two sections of "10" ? \$\endgroup\$ Commented Dec 11, 2022 at 14:20
  • 1
    \$\begingroup\$ Use a collection or scripting dictionary to lookup the ids. Add a helper column, use auto fill to number each row. Load the data into an array. Loop over the array. When you find a place that a row should be inserted, take that row number add 0.1 to it. This will be the new number to be appended. After you added all the new rows, sort the helper column and then delete it. IMO, it would take <4 seconds to process 40k rows of data. \$\endgroup\$ Commented Dec 12, 2022 at 9:50
  • 1
    \$\begingroup\$ You can use the DM_INSERT2 UDF function in my array repo. Or, if not needed as an UDF then use the InsertRowsAtValChange that works in any application supporting VBA. These methods also handle top and bottom rows besides the rows where value changes. There's a Demo workbook that showcases the UDF. As they are they won't handle special cases like in your example with 15 but you can add a helper column to the array and run the method on that instead. E.g. round down to multiple of ten and then just insert blank rows at value change \$\endgroup\$ Commented Dec 12, 2022 at 10:08
  • 1
    \$\begingroup\$ @CristianBuse, Thank you for pointing me to your array repo. InsertRowsAtValChange is FAST. The only change I had to make on my end is where you check If currentKey <> previousKey then I added a nested If statement: If IsNumeric(Application.Match(arr(i-1,columnIndex), inRng, 0)) Then where inRng is a range variable storing the ID's of interest. They already exist printed to a sheet in a known location and it seems to work well with the Match function which is fast enough. I don't love the nest but it helps to avoid arr(i-1) when i<LBound(arr) \$\endgroup\$ Commented Dec 13, 2022 at 17:11

1 Answer 1

2
\$\begingroup\$

If I understand you correctly....

Before & after running the sub
enter image description here ===> enter image description here

The list is 10,20,30,40,50,60,70,80,90,100.
The yellow fill just to show that it won't insert row because the value is not consecutive to the next row (only one cell).

I test the time spent by multiplying the data in "before running the sub" until cell A155649 (around 155k rows) with 10 item to compare (10 to 100) it takes 7 second something, quite a very long time to wait which I hope it's still OK for you to wait 7 second something.

Sub test()
Dim arrList: Dim arr: Dim itm: Dim el
Dim rg As Range: Dim addr As String

mulai = Timer
Application.ScreenUpdating = False

arrList = Array(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
Set rg = Range("A2", Range("A2").End(xlDown))

For Each el In arrList
    With rg
    .Replace el, True, xlWhole, , False, , False, False
    arr = Split(.SpecialCells(xlConstants, xlLogical).Address, ",")
    .Replace True, el, xlWhole, , False, , False, False
    End With

    For Each itm In arr
    If InStr(itm, ":") Then addr = addr & "," & Split(itm, ":")(1)
    Next

    If addr <> "" Then
        addr = Right(addr, Len(addr) - 1)
        Range(addr).Offset(1, 0).EntireRow.Insert Shift:=xlDown
        addr = ""
    End If
Next

Debug.Print Timer - mulai '(7.375 / 7.4375 / 7.242188)
End Sub

Basically it just collect all the last consecutive address of each looped element (10,20,30 etc), offset one row below then insert entire row.

With the data until A155649 I thought I can just collect all the looped element in one string address, offset it one row below and insert entire row all at once, but I don't know why it gave me error 1004 at the insert code line. So that's why I break the insert row process to each looped element.

Please note that the code assume each list (10,20,30...100) exist in column A. It will throw error if one of the list does not exist in column A. Need to add checking code to avoid the error, something like if not .find(el) is nothing then

Not sure though if the code can be applied to your situation or not.

I want to insert a row below each of the contiguous sections of these numbers

I wonder why in your sample ID 40, you put "*Insert below here". Please CMIIW.

\$\endgroup\$

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.