After spending a considerable amount of time poking at the internals of the Collection class, I have discovered 2 methods to iterate a private internal array within a VBA class using a For Each loop on the class instance.
Both methods:
- work without the need for assembly code or external DLL references
- make use of the already available
IEnumVariantreturned by aCollectionvia it's_NewEnummethod - are safe (no Application crashes)
I must note that I was already aware of the MEnumerator approach. This approach can lead to crashes, epecially within the 3 custom methods for IUnknown. There is absolutely no need to redo what IUnknown does - on the contrary this approach leads to crashes when state is lost and I found it unstable when updated for x64.
Prerequisites
Since we need to read/write some memory, I will use LibMemory. However, all calls to LibMemory can be easily replaced with calls to RtlMoveMemory (Win) / memmove (Mac).
The internal structure of a Collection's IEnumVariant
After some testing, this is how the structure looks like:
Const LONG_SIZE As Long = 4
Private Type CollEnumVariant
vTblPtr As LongPtr
refCount As Long
nextItem As LongPtr
unkPtr As LongPtr
collPtr As LongPtr
End Type
Private Enum CollEnumVariantOffsets
vTblPtrOffset = 0
refCountOffset = PTR_SIZE
nextItemOffset = refCountOffset + PTR_SIZE
unkPtrOffset = nextItemOffset + PTR_SIZE
collPtrOffset = unkPtrOffset + PTR_SIZE
End Enum
Let's test. Create a EnumHelper class with the following code. Note that you either need Rubberduck to syncronize the @Enumerator annotation or you have to copy paste the code into a text editor (e.g. Notepad) and then import the saved text file.
VERSION 1.0 CLASS
BEGIN
MultiUse = -1 'True
END
Attribute VB_Name = "EnumHelper"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = False
Attribute VB_Exposed = False
Option Explicit
Private m_enum As IEnumVARIANT
Public Property Set EnumVariant(newEnum_ As IEnumVARIANT)
Set m_enum = newEnum_
End Property
'@Enumerator
Public Property Get EnumVariant() As IEnumVARIANT
Attribute EnumVariant.VB_UserMemId = -4
Set EnumVariant = m_enum
End Property
EnumHelper can also be found here.
Then, run TestCollEnumVariantStructure method from a standard module:
Option Explicit
Private Enum CollEnumVariantOffsets
vTblPtrOffset = 0
refCountOffset = PTR_SIZE
nextItemOffset = refCountOffset + PTR_SIZE
unkPtrOffset = nextItemOffset + PTR_SIZE
collPtrOffset = unkPtrOffset + PTR_SIZE
End Enum
Sub TestCollEnumVariantStructure()
Dim c As New Collection
'
c.Add 111
c.Add 222
'
Dim i As stdole.IEnumVARIANT: Set i = c.[_NewEnum]
Dim e As New EnumHelper: Set e.EnumVariant = i
Dim iPtr As LongPtr: iPtr = ObjPtr(i)
Dim v As Variant
Dim addr As LongPtr
Dim j As Long
'
Debug.Print "Virtual table: " & MemLongPtr(iPtr + vTblPtrOffset)
Debug.Print "Reference count: " & MemLong(iPtr + refCountOffset)
Debug.Print "Next item ptr: " & MemLongPtr(iPtr + nextItemOffset)
Debug.Print "Unknown ptr: " & MemLongPtr(iPtr + unkPtrOffset)
Debug.Print "Collection ptr: " & MemLongPtr(iPtr + collPtrOffset) & " (same as: " & ObjPtr(c) & ")"
Debug.Print
'
j = 0
For Each v In e
j = j + 1
Debug.Print "Item value for index " & j & ": " & v
If j = 1 Then
addr = MemLongPtr(iPtr + nextItemOffset)
Debug.Print "Address of second item is: " & addr
MemLong(addr + 8) = 555 '+8 as offset within the Variant
End If
Next v
Debug.Print
'
Debug.Print "Next item ptr (after loop): " & MemLongPtr(iPtr + nextItemOffset)
i.Reset
Debug.Print "Next item ptr (after Reset): " & MemLongPtr(iPtr + nextItemOffset)
Set c = Nothing
Debug.Print "Next item ptr (after Collection destroyed): " & MemLongPtr(iPtr + nextItemOffset)
Debug.Print "Collection ptr (after Collection destroyed): " & MemLongPtr(iPtr + collPtrOffset)
End Sub
I get the following result in the Immediate window:

As you can see we can even change the items inside the collection as there is a pointer to the next item in the enumeration.
The more interesting part is that once we destroy the collection, it does the cleanup for us i.e. it sets to zero both the pointer for the next item as well as the pointer for the collection. This is the very safe mechanism that Collections have in place to avoid crashes when you destroy the collection while still inside the For Each loop.
I've tested the offsets on both x32 and x64 VBA7 as well as VBA6. Would be grateful if someone can test this on VB6 which I don't have.
#1 Hijack calls to IEnumVARIANT::Next
We cannot swap the function pointer for the IEnumVARIANT::Next method with our own for 2 reasons:
- memory is write-protected in the original virtual table
- it would affect all instances of the class which we surely do not want
However, by creating our own virtual table we solve both problems. We will continue using the original functions except the one we actually need replacing. In other words, all the memory management is left to the original implementation while we make use of a custom IEnumVARIANT::Next function.
We will also make use of the 'nextItem' bytes so that we call call back into our desired class.
There are really 2 main ways to implement this approach:
a) Use a direct approach i.e. the regular Public Function NewEnum() As IEnumVariant with the Attribute NewEnum.VB_UserMemId = -4 in each class that we want to iterate
b) Use an indirect approach, like the EnumHelper class we used in TestCollEnumVariantStructure which requires Public Function NewEnum() As EnumHelper in any class that we want to iterate
#1a Direct approach - Public Function NewEnum() As IEnumVariant
The most widespread solution on the internet, to enumerate a private collection within a class, is to just add a method like this:
'@Enumerator
Public Function NewEnum() As IEnumVARIANT
Attribute NewEnum.VB_UserMemId = -4
Set NewEnum = internalCollection.[_NewEnum]
End Function
For our exercise - since we want to iterate an array (not a Collection), this implies that we have to call into our array class using late-binding or an interface. I will not implement late-binding because it's slower and it provides no guarantee that the class has the desired method. For example, the MEnumerator (line 152) approach, that I mentioned at the start, expects a default property that can receive an index and return a value. I will instead use an interface to make things explicit.
Create the following IEnumerable_1a interface:
Option Explicit
Public Function ItemByIndex(ByVal iIndex As Long _
, ByRef outVar As Variant) As Boolean
Err.Raise 5, TypeName(Me), "This class should be used as an interface only"
End Function
Create a standard module called LibEnumerable_1a:
Option Explicit
Private vtbl(0 To 6) As LongPtr
Private Type CollEnumVariant
vTblPtr As LongPtr
refCount As Long
enumInterface As IEnumerable_1a
unkPtr As LongPtr
coll As Collection
End Type
Private Enum CollEnumVariantOffsets
vTblPtrOffset = 0
refCountOffset = PTR_SIZE
enumInterfaceOffset = refCountOffset + PTR_SIZE
unkPtrOffset = enumInterfaceOffset + PTR_SIZE
collOffset = unkPtrOffset + PTR_SIZE
End Enum
Private m_ptrToIndex As New Collection
Private m_index() As Long
'*******************************************************************************
'We 'inject' the instance of a class implementing 'IEnumerable_1a' into the
' 'nextItem' member' (3rd member after vTbl and refCount) of the IEnumVariant
' instance returned by the relevant collection
'*******************************************************************************
Public Function NewEnumerable(ByRef c As Collection _
, ByRef i As IEnumerable_1a) As IEnumVARIANT
Set NewEnumerable = c.[_NewEnum]
Dim ptr As LongPtr: ptr = ObjPtr(NewEnumerable)
If vtbl(0) = 0 Then 'We swap the 'Next' and 'Reset' method with our own
'We keep the 3 IUnknown methods
'We replace Next
'Skip cannot be called from VBA anyway due to the unsupported type so can be left as is
'We replace Reset just in case, to avoid crashes when coercing to 'IEnumerable_1a'
'Clone can be left as is
'
MemCopy VarPtr(vtbl(0)), MemLongPtr(ptr), PTR_SIZE * 7 'Copy entire original virtual function table
vtbl(3) = VBA.Int(AddressOf IEnumVARIANT_Next) 'Replace 'Next'
vtbl(5) = VBA.Int(AddressOf IEnumVARIANT_Reset) 'Replace 'Reset'
End If
'
'Swap original vTable with our own (5 out of 7 are still original functions)
MemLongPtr(ptr) = VarPtr(vtbl(0))
'
'We overwrite the 'nextItem' value (3rd member after vTbl and refCount) with
' the pointer to the class instance implementing 'IEnumerable_1a'
MemLongPtr(ptr + enumInterfaceOffset) = ObjPtr(i)
'
'We store the instance pointer as a collection key so we can quickly point
' to an array postition where we store the index of the next item to retrieve
'This is not the fastest approach but rather a quick demo
Dim s As String: s = CStr(ptr)
Dim p As Long
On Error Resume Next
p = UBound(m_index) + 1
ReDim Preserve m_index(0 To p)
m_ptrToIndex.Remove s
m_ptrToIndex.Add p, s
On Error GoTo 0
End Function
'*******************************************************************************
'When IEnumVariant.Next is called, we coerce the first argument, which is the
' pointer to the IEnumVariant class instance, into our custom type so we can
' take advantage of having access to all values at the instance address
' without the need to copy memory ourselfs i.e. VB does the copying for us
'*******************************************************************************
Private Function IEnumVARIANT_Next(ByRef thisEnum As CollEnumVariant _
, ByVal celt As Long _
, ByRef rgVar As Variant _
, ByVal pceltFetched As LongPtr) As Long
If thisEnum.enumInterface Is Nothing Then GoTo CloseLoop
'
Dim mapIndex As Long
'
'We map the 'IEnumVARIANT' pointer to the index in the 'm_index' array
' where we store the nextItem positin
mapIndex = m_ptrToIndex(CStr(VarPtr(thisEnum)))
'
Dim nextIndex As Long: nextIndex = m_index(mapIndex)
'
'We retrive the next item and write it to 'rgVar' By Ref
If Not thisEnum.enumInterface.ItemByIndex(nextIndex, rgVar) Then GoTo CloseLoop
'
'We increment the index for the next item to be read
m_index(mapIndex) = nextIndex + 1
Exit Function
CloseLoop:
Const S_FALSE = 1
IEnumVARIANT_Next = S_FALSE
End Function
'*******************************************************************************
'Should not be called. This is just to prevent the user calling the 'Reset'
' method in the IEenumVariant on purpose, which would overwrite the 'nextItem'
' member i.e. 'enumInterface' in our representation
'*******************************************************************************
Private Function IEnumVARIANT_Reset(ByVal ptr As Long) As Long
Const E_NOTIMPL = &H80004001
IEnumVARIANT_Reset = E_NOTIMPL
End Function
Create the following DemoClass_1a class:
VERSION 1.0 CLASS
BEGIN
MultiUse = -1 'True
END
Attribute VB_Name = "DemoClass_1a"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = False
Attribute VB_Exposed = False
Option Explicit
Implements IEnumerable_1a
Private m_coll As New Collection
Private m_items() As Variant
Private m_nextIndex As Long
Private m_bufferUBound As Long
Public Sub Add(ByRef Item As Variant)
If m_nextIndex >= m_bufferUBound Then
m_bufferUBound = m_bufferUBound * 2 + 1
ReDim Preserve m_items(0 To m_bufferUBound)
End If
If IsObject(Item) Then
Set m_items(m_nextIndex) = Item
Else
m_items(m_nextIndex) = Item
End If
m_nextIndex = m_nextIndex + 1
End Sub
Public Property Get Count() As Long
Count = m_nextIndex
End Property
'@Enumerator
Public Function NewEnum() As IEnumVARIANT
Attribute NewEnum.VB_UserMemId = -4
If m_bufferUBound > m_nextIndex - 1 Then
m_bufferUBound = m_nextIndex - 1
ReDim Preserve m_items(0 To m_bufferUBound)
End If
'
'We 'inject' the instance of this class into the 'nextItem' member
' of the IEnumVariant instance returned by the internal collection
'This is to enable the IEnumVariant to call back the 'ItemByIndex'
' method via the 'IEnumerable_1a' interface
Set NewEnum = NewEnumerable(m_coll, Me)
End Function
Private Function IEnumerable_1a_ItemByIndex(ByVal iIndex As Long, ByRef outVar As Variant) As Boolean
If m_nextIndex = 0 Then Exit Function
If iIndex < LBound(m_items) Or iIndex > UBound(m_items) Then Exit Function
If IsObject(m_items(iIndex)) Then
Set outVar = m_items(iIndex)
Else
outVar = m_items(iIndex)
End If
IEnumerable_1a_ItemByIndex = True
End Function
Finally, run TestArrayEnumerator_1a in a standard module:
Option Explicit
Sub TestArrayEnumerator_1a()
Dim d As New DemoClass_1a
Dim i As Long
Dim v As Variant
'
For i = 1 To 3
d.Add i
Next i
'
For Each v In d
Debug.Print v
Next v
End Sub
This will print 1, 2, 3 to the Immediate window.
It works, but there are a few serious disadvantages with this approch:
- the class is exposed to this x64 bug . For me this was a major issue over the years, until I completely switched to the indirect approach (e.g.
EnumHelper). - the 'nextIndex' for each instance of
IEnumVarianthas to be stored somewhere. In the above example I used them_ptrToIndexcollection andm_indexarray inside theLibEnumerable_1a. The alternative would be to store and manage this inside the demo class itself which does not scale well. Moreover, we cannot use the existing memory space underCollEnumVariantbecause if we overwrite any of thevTblPtrorunkPtrorcollpointers we can have crashes if state is lost or the user hasExit Forwithin theFor Eachloop. In other words, extra management just to store aLongvalue - overall this is slow. Try this:
Sub TestEnumeratorSpeed_1a()
Dim d As New DemoClass_1a
Dim i As Long
Dim v As Variant
Const size As Long = 1000000
Dim arr() As Variant: ReDim arr(1 To size)
Dim t As Double
'
For i = 1 To size
arr(i) = i
d.Add i
Next i
'
Debug.Print "Running 'For Each' on " & Format$(size, "#,##0") & " elements"
Debug.Print String$(40, "-")
'
t = Timer
For Each v In arr
Next v
Debug.Print "Array: " & Round(Timer - t, 3) & " (seconds)"
'
t = Timer
For Each v In d
Next v
Debug.Print "Class: " & Round(Timer - t, 3) & " (seconds)"
End Sub
I get something like:
Running 'For Each' on 1,000,000 elements
----------------------------------------
Array: 0.02 (seconds)
Class: 1.031 (seconds)
which is a x50 factor. Surely, it is faster to just make a copy of the internal array and iterate that copy.
#1b Indirect approach - Public Function NewEnum() As EnumHelper
No need for late binding or an interface. Create a standard module called LibEnumerable_1b:
Option Explicit
Private vtbl(0 To 6) As LongPtr
Private Type CollEnumVariant
vTblPtr As LongPtr
refCount As Long
enumH As EnumHelper_1b
unkPtr As LongPtr
coll As Collection
End Type
Private Enum CollEnumVariantOffsets
vTblPtrOffset = 0
refCountOffset = PTR_SIZE
enumHelperOffset = refCountOffset + PTR_SIZE
unkPtrOffset = enumHelperOffset + PTR_SIZE
collOffset = unkPtrOffset + PTR_SIZE
End Enum
'*******************************************************************************
'We add the instance of the 'IEnumVariant' returned by the collection as
' well as a reference to the desired array, both to the 'EnumHelper_1b' instance
'*******************************************************************************
Public Function NewEnumHelper(ByRef c As Collection _
, ByRef arr As Variant) As EnumHelper_1b
With New EnumHelper_1b
.Init c.[_NewEnum], arr
Set NewEnumHelper = .Self
Dim ptr As LongPtr: ptr = ObjPtr(.EnumVariant)
End With
If vtbl(0) = 0 Then 'We swap the 'Next' and 'Reset' method with our own
'We keep the 3 IUnknown methods
'We replace Next
'Skip cannot be called from VBA anyway due to the unsupported type so can be left as is
'We replace Reset just in case, to avoid crashes when coercing to 'EnumHelper_1b'
'Clone can be left as is
'
MemCopy VarPtr(vtbl(0)), MemLongPtr(ptr), PTR_SIZE * 7 'Copy entire original virtual function table
vtbl(3) = VBA.Int(AddressOf IEnumVARIANT_Next) 'Replace 'Next'
vtbl(5) = VBA.Int(AddressOf IEnumVARIANT_Reset) 'Replace 'Reset'
End If
'
'Swap original vTable with our own (5 out of 7 are still original functions)
MemLongPtr(ptr) = VarPtr(vtbl(0))
'
'We overwrite the 'nextItem' value (3rd member after vTbl and refCount) with
' the pointer to the instance of 'EnumHelper_1b'
MemLongPtr(ptr + enumHelperOffset) = ObjPtr(NewEnumHelper)
End Function
'*******************************************************************************
'When IEnumVariant.Next is called, we coerce the first argument, which is the
' pointer to the IEnumVariant class instance, into our custom type so we can
' take advantage of having access to all values at the instance address
' without the need to copy memory ourselfs i.e. VB does the copying for us
'*******************************************************************************
Private Function IEnumVARIANT_Next(ByRef thisEnum As CollEnumVariant _
, ByVal celt As Long _
, ByRef rgVar As Variant _
, ByVal pceltFetched As LongPtr) As Long
Const S_FALSE = 1
If thisEnum.enumH Is Nothing Then
IEnumVARIANT_Next = S_FALSE
ElseIf Not thisEnum.enumH.GetNext(rgVar) Then
IEnumVARIANT_Next = S_FALSE
End If
End Function
'*******************************************************************************
'Should not be called. This is just to prevent the user calling the 'Reset'
' method in the IEenumVariant on purpose, which would overwrite the 'nextItem'
' member i.e. 'enumInterface' in our representation
'*******************************************************************************
Private Function IEnumVARIANT_Reset(ByVal ptr As Long) As Long
Const E_NOTIMPL = &H80004001
IEnumVARIANT_Reset = E_NOTIMPL
End Function
Create the updated EnumHelper_1b class:
VERSION 1.0 CLASS
BEGIN
MultiUse = -1 'True
END
Attribute VB_Name = "EnumHelper_1b"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = False
Attribute VB_Exposed = False
Option Explicit
Private m_enum As IEnumVARIANT
Private m_arrByRef As Variant
Private m_currIndex As Long
'@Enumerator
Public Property Get EnumVariant() As IEnumVARIANT
Attribute EnumVariant.VB_UserMemId = -4
Set EnumVariant = m_enum
End Property
Public Property Get Self() As EnumHelper_1b
Set Self = Me
End Property
Public Sub Init(ByRef newEnum_ As IEnumVARIANT _
, ByRef arr As Variant)
Set m_enum = newEnum_
m_arrByRef = GetArrayByRef(arr)
m_currIndex = LBound(arr)
'
'We overwrite the 'nextItem' value (3rd member after vTbl and refCount)
' with a pointer to this instance so that 'enumH' member in
' 'CollEnumVariant' representation is coerced to this class
MemLongPtr(ObjPtr(newEnum_) + PTR_SIZE * 2) = ObjPtr(Me)
End Sub
Public Function GetNext(ByRef v As Variant) As Boolean
If m_currIndex < LBound(m_arrByRef) Or _
m_currIndex > UBound(m_arrByRef) Then Exit Function
If IsObject(m_arrByRef(m_currIndex)) Then
Set v = m_arrByRef(m_currIndex)
Else
v = m_arrByRef(m_currIndex)
End If
m_currIndex = m_currIndex + 1
GetNext = True
End Function
Create DemoClass_1b class:
Option Explicit
Private m_coll As New Collection
Private m_items() As Variant
Private m_nextIndex As Long
Private m_bufferUBound As Long
Public Sub Add(ByRef Item As Variant)
If m_nextIndex >= m_bufferUBound Then
m_bufferUBound = m_bufferUBound * 2 + 1
ReDim Preserve m_items(0 To m_bufferUBound)
End If
If IsObject(Item) Then
Set m_items(m_nextIndex) = Item
Else
m_items(m_nextIndex) = Item
End If
m_nextIndex = m_nextIndex + 1
End Sub
Public Property Get Count() As Long
Count = m_nextIndex
End Property
Public Function NewEnum() As EnumHelper_1b
If m_bufferUBound > m_nextIndex - 1 Then
m_bufferUBound = m_nextIndex - 1
ReDim Preserve m_items(0 To m_bufferUBound)
End If
'
'We push the items array (By Ref) into the 'EnumHelper_1b' instance
' which will also store the 'IEnumVariant'
'Basically all calls coming from 'For Each...' will go into
' the 'EnumHelper_1b' instance and not this one (DemoClass_1b)
Set NewEnum = NewEnumHelper(m_coll, m_items)
End Function
Finally, run the TestArrayEnumerator from a standard module:
Option Explicit
Sub TestArrayEnumerator_1b()
Dim d As New DemoClass_1b
Dim i As Long
Dim v As Variant
'
For i = 1 To 3
d.Add i
Next i
'
For Each v In d.NewEnum
Debug.Print v
Next v
End Sub
This will print 1, 2, 3 to the Immediate window.
This approach is better than #1a for the following reasons:
- the class is bypassing this x64 bug
- no class needs the
VB_UserMemId = -4attribute. In fact a class can now expose multipleEnumHelperfunctions each pointing to a different array - the 'currentIndex' for each instance of
IEnumVariantis stored within it's corresponding 'parent'EnumHelperclass under them_currIndexvariable. This scales well - the array is stored by reference in a Variant which points to the actual array variable. The address does not change for the
m_itemsvariable itself even if the array is resized or destroyed.m_arrByRefwill always point to the current array whatever that is. We do NOT have to call into the demo class be that with late-binding or via interface. Of course, we would not need to store the array by reference if we were to implement an interface or use late-binding but that is much slower - this is faster than #1a: Try this:
Sub TestEnumeratorSpeed_1b()
Dim d As New DemoClass_1b
Dim i As Long
Dim v As Variant
Const size As Long = 1000000
Dim arr() As Variant: ReDim arr(1 To size)
Dim t As Double
'
For i = 1 To size
arr(i) = i
d.Add i
Next i
'
Debug.Print "Running 'For Each' on " & Format$(size, "#,##0") & " elements"
Debug.Print String$(40, "-")
'
t = Timer
For Each v In arr
Next v
Debug.Print "Array: " & Round(Timer - t, 3) & " (seconds)"
'
t = Timer
For Each v In d.NewEnum
Next v
Debug.Print "Class: " & Round(Timer - t, 3) & " (seconds)"
End Sub
I get this:
Running 'For Each' on 1,000,000 elements
----------------------------------------
Array: 0.02 (seconds)
Class: 0.305 (seconds)
which is a x15 factor compared to x50 (#1a).
Both #1a and #1b are safe for the following reasons:
- If the
DemoClassinstance is destroyed, then so is the internal collection destroyed which in turn clears the pointers in theIEnumVariantstructure thus stopping the enumeration (thanks to theIf thisEnum.enumInterface Is Nothing Then/If thisEnum.enumH Is Nothing Thenlines from our swap methods). - There are no crashes when state is lost even if the 'Reset' button is pressed in the IDE while code is running regardless which method is the last in the stack
- Cleanup is performed by the original implementation that handles all the Collection instances. We don't have to manage memory storage (e.g. CoTaskMemAlloc/CoTaskMemFree) nor handle reference count or internal interfaces
Overall, I think #1b scales well. However, this is not as fast as I would like it to be and this is mainly because of the extra function calls to IEnumVARIANT_Next and then the follow up to get the actual value from the array. There is a faster way, see section #2.
#2 Hijack the nextItemPtr within the IEnumVARIANT storage
As well detailed on VB Forums or here, there is a specific offset between items in the underlying structures making up a collection (linked-list). If we offset our items carefully, we can make use of the original speed of iterating a Collection.
For the purpose of this example, I will skip using the EnumHelper class so the below example is vulnerable to the previously mentioned x64 bug. We can easily implement the most basic EnumHelper (like the one under the 'The internal structure' section).
Create a class called DemoClass_2:
VERSION 1.0 CLASS
BEGIN
MultiUse = -1 'True
END
Attribute VB_Name = "DemoClass_2"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = False
Attribute VB_Exposed = False
Option Explicit
Private Type VariantInfo
IsObject As Boolean
DummyOffset1 As Integer
DummyOffset2 As Long
#If Win64 Then
DummyOffset3 As LongLong
#End If
End Type
'We are mimicking the structure of an 'VbCollectionItem' as detailed here:
'https://gist.github.com/wqweto/39822f4fb7090fa086aeff1e2e06e630
'or on VB Forums.
'We only need the item value which is the 1st member (Variant) and the
' 4th member which is the address of the next item structure.
'Since we don't need the pointer to the key (2nd member) or the pointer
' to the previous item (3rd member), we replace with 'VariantInfo'
' which has the same byte size as the 2 pointers but allows us to store
' some useful metadata like 'IsObject' bool for the Variant member
Private Type EnumerableVariant
Value As Variant
Info As VariantInfo
NextPtr As LongPtr
End Type
Private m_coll As New Collection
Private m_items() As EnumerableVariant
Private m_nextIndex As Long
Private m_bufferUBound As Long
Public Sub Add(ByRef Item As Variant)
If m_nextIndex >= m_bufferUBound Then
m_bufferUBound = m_bufferUBound * 2 + 1
ReDim Preserve m_items(0 To m_bufferUBound)
End If
With m_items(m_nextIndex)
.Info.IsObject = IsObject(Item) 'This can be used later to speed up item retrieval
If .Info.IsObject Then Set .Value = Item Else .Value = Item
End With
m_nextIndex = m_nextIndex + 1
End Sub
Public Property Get Count() As Long
Count = m_nextIndex
End Property
'@Enumerator
Public Function NewEnum() As IEnumVARIANT
Attribute NewEnum.VB_UserMemId = -4
Set NewEnum = m_coll.[_NewEnum]
Dim ptr As LongPtr: ptr = VarPtr(m_items(0))
'
'We overwrite the 'nextItem' pointer in the 'IEnumVARIANT' so that
' it points to our desired 'EnumerableVariant' member which will allow
' the collection to iterate the values in the array
MemLongPtr(ObjPtr(NewEnum) + PTR_SIZE * 2) = ptr
'
'We link each 'EnumerableVariant' member to the next
' one via the 'NextPtr' member
If m_nextIndex > 1 Then
#If Win64 Then
Const size As Long = 48
#Else
Const size As Long = 28
#End If
Dim i As Long
For i = 0 To m_nextIndex - 2
ptr = ptr + size
m_items(i).NextPtr = ptr
Next i
End If
End Function
Finally, run TestArrayEnumerator_2 in a standard module:
Option Explicit
Sub TestArrayEnumerator_2()
Dim d As New DemoClass_2
Dim i As Long
Dim v As Variant
'
For i = 1 To 3
d.Add i
Next i
'
For Each v In d
Debug.Print v
Next v
End Sub
This will print 1, 2, 3 to the Immediate window.
How does this perform? Run TestEnumeratorSpeed_2 from a standard module:
Sub TestEnumeratorSpeed_2()
Dim d As New DemoClass_2
Dim i As Long
Dim v As Variant
Const size As Long = 1000000
Dim arr() As Variant: ReDim arr(1 To size)
Dim t As Double
'
For i = 1 To size
arr(i) = i
d.Add i
Next i
'
Debug.Print "Running 'For Each' on " & Format$(size, "#,##0") & " elements"
Debug.Print String$(40, "-")
'
t = Timer
For Each v In arr
Next v
Debug.Print "Array: " & Round(Timer - t, 3) & " (seconds)"
'
t = Timer
For Each v In d
Next v
Debug.Print "Class: " & Round(Timer - t, 3) & " (seconds)"
End Sub
I get this:
Running 'For Each' on 1,000,000 elements
----------------------------------------
Array: 0.02 (seconds)
Class: 0.047 (seconds)
which is way faster that both approaches in #1.
However, this approach has 2 main drawbacks:
- The array needs to be locked somehow to prevent reallocation while the loop is running. I haven't implemented this for the purpose of this example, but it is definitely achievable even if it proves to be complex.
- The internal array is not a contiguous array anymore. If I want to have a function that exposes a copy of the array, then I will have to copy each item one by one. I must mention that I already tried a 'workaround' that makes the array copy and then increases the 'cbElements' member of the underlying SAFEARRAY structure (of the returned copy) to accommodate the extra members in the struct. However, this does not work well in 3 out of 4 cases (
For Eachloop onVariant(),For Eachloop onVariantandFor.. To..loop onVariant()). It only works for aFor.. To..loop onVariant. I can provide more details if needed but wanted to say it's a dead-end. I also already tried to use a fakeParamArraythat points to each element by Reference which then gets copied with the build-in functionality (for param arrays) into an array by Value - however, this is much slower than iterating elements one by one. Also, I've tried referencing elements via ByRef Variants but that does not work well because in theFor Eachloop the iterator Variant then needs to be set to a value (likeEmpty) before getting the next value - so, not a good solution either. Here is a quick test:
Option Explicit
Private Type TagVariant
vt As Integer
wReserved1 As Integer
wReserved2 As Integer
wReserved3 As Integer
ptr1 As LongPtr
ptr2 As LongPtr
End Type
Private Type nextItemAddr
itemByRef As TagVariant
dummy1 As LongPtr
dummy2 As LongPtr
nextPtr As LongPtr
End Type
#If Win64 Then
Const sizeOfNext = 48
#Else
Const sizeOfNext = 28
#End If
Sub TestByRef()
Const size As Long = 10
'
Dim c As Collection: Set c = New Collection
Dim e As IEnumVARIANT: Set e = c.[_NewEnum]
Dim arr() As Variant: ReDim arr(0 To size - 1)
Dim arr2() As nextItemAddr: ReDim arr2(0 To size - 1)
Dim h As New EnumHelper: Set h.EnumVariant = e
Dim i As Long
Dim v As Variant
Dim ptr1 As LongPtr
Dim ptr2 As LongPtr
Dim vt As Integer
'
ptr1 = VarPtr(arr(0))
ptr2 = VarPtr(arr2(1))
vt = vbVariant + VT_BYREF
For i = 0 To UBound(arr2)
arr(i) = i 'doesn't matter what values
arr2(i).itemByRef.ptr1 = ptr1
arr2(i).itemByRef.vt = vt 'could use the actual var type here but then the ptr needs updated with +8
arr2(i).nextPtr = ptr2
ptr1 = ptr1 + VARIANT_SIZE
ptr2 = ptr2 + sizeOfNext
Next i
arr2(UBound(arr2)).nextPtr = 0
'
MemLongPtr(ObjPtr(e) + PTR_SIZE * 2) = VarPtr(arr2(0))
'
For Each v In h
Debug.Print v
v = Empty 'Required to avoid 'Type Mismatch' error that gets raised on the 'Next v' line
Next v
End Sub
Of course, any feedback on this would be great.
I do have some questions though:
Q1) Is any of the presented approaches viable (in your opinion, of course)? Would you use any of them? Why (both for yes or no)?
Q2) For approach #2a, I would be very interested if there is a way to be efficient in copying the internal Private m_items() As EnumerableVariant array into and array of Variant() without iterating one by one.
Q3) I would be interested in any improvements to any of the presented approaches
Thank you for your time!
For Eachloop to work and be efficient while also returning theItemsandKeysarrays. \$\endgroup\$