First off, I am new to ML.NET (and ML as a whole). I am trying to set up a model using a SQL Server table as my data source. I am selecting one label and 18 features from the same table, which contains a little more than 3 million records. When I finish selecting my label/features and click the Train button, I get a prompt telling me that VS will download 1.1 GB of data from the SQL Server (hosted on the same machine), which I acknowledge. I then get feedback indicating that the download is in progress, which lasts for 30 to 60 seconds. Then I get the following error:

Error retrieving SQL data: "Exception of type 'System.OutOfMemoryException' was thrown."
   at Microsoft.ML.ModelBuilder.ToolWindows.ModelBuilderDataContext.<DownloadSqlFileAsync>b__88_0()
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.ToolWindows.ModelBuilderDataContext.<DownloadSqlFileAsync>d__88.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.ToolWindows.ModelBuilderDataContext.<<OnDataChanged>b__77_1>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.ToolWindows.TrainTabDataContext.<BuildTrainModelParametersAsync>d__138.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.ToolWindows.TrainTabDataContext.<StartTrainingAsync>d__130.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.ML.ModelBuilder.ToolWindows.TrainTabControl.<<StartTraining_Click>b__5_0>d.MoveNext()

Some fun facts:

  • I've watched RAM usage on the machine while the training attempt runs, and it never gets above 65% of total available RAM.

  • In the same VS solution, I have another app where I routinely read the entirety of the table in question into memory via EF.

  • I am using VS Community and SQL Express

  • I see RAM usage increase by roughly 3 GB before the error occurs. It smells very much like the process is running as 32-bit (which would explain all of this), but if there's a setting for this, I can't find it. I've checked the Build properties for my ML project and made sure the target is set to 64-bit, but I'm not sure that's even what is used when training the model.

  • Are you using the DatabaseLoader from ML.NET to get the data? Commented Mar 31, 2020 at 13:53
  • If you are using the .net framework, try unchecking "Enable the Visual Studio hosting process" Commented Mar 31, 2020 at 13:59
  • @Jon I'm not sure... =/ I am following this guide, which uses a UI clearly geared toward the chronically ignorant (like me): dotnet.microsoft.com/learn/ml-dotnet/get-started-tutorial/intro Commented Mar 31, 2020 at 14:09
  • @ShakHam I am using .NET Core 2.1. Perhaps I should try rolling that forward? Commented Mar 31, 2020 at 14:10
  • I think using the DatabaseLoader may be more efficient. Here's a sample for it - github.com/dotnet/machinelearning-samples/tree/master/samples/… Commented Mar 31, 2020 at 17:15

1 Answer

The ModelBuilder is (necessarily) a 32-bit extension, so it cannot hold as much data as I was trying to push into it. I've opened a bug / feature request asking either to move the data ingestion into 64-bit code or to change the way the data is ingested:

https://github.com/dotnet/machinelearning-modelbuilder/issues/647
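
Until that is addressed, the DatabaseLoader route suggested in the comments is one way to keep the data out of the extension entirely. Below is a minimal sketch of that approach, not what Model Builder does internally. It assumes a 64-bit console app with the Microsoft.ML (1.4 or later) and System.Data.SqlClient packages; the `TableRow` class, the connection string, the table and column names, and the choice of an SDCA regression trainer are all hypothetical placeholders to be replaced with your own schema and task.

```csharp
using System;
using System.Data.SqlClient;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row type: property names must match the column names
// returned by the query. Replace with your own label and 18 features.
public class TableRow
{
    public float Label { get; set; }
    public float Feature1 { get; set; }
    public float Feature2 { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Confirms the bitness of the process that actually does the work.
        Console.WriteLine($"64-bit process: {Environment.Is64BitProcess}");

        var mlContext = new MLContext();

        // Placeholder connection string and query (assumptions, not from the question).
        // Numeric columns that are not REAL are cast to REAL so they map to float.
        string connectionString =
            @"Server=.\SQLEXPRESS;Database=MyDb;Integrated Security=True";
        string query =
            "SELECT CAST(Label AS REAL) AS Label, " +
            "CAST(Feature1 AS REAL) AS Feature1, " +
            "CAST(Feature2 AS REAL) AS Feature2 " +
            "FROM dbo.MyTable";

        // The loader is lazy: rows are read from SQL Server when the
        // pipeline requests them, not copied to a local file up front.
        DatabaseLoader loader = mlContext.Data.CreateDatabaseLoader<TableRow>();
        var source = new DatabaseSource(SqlClientFactory.Instance, connectionString, query);
        IDataView data = loader.Load(source);

        // Assumed task: regression. Swap in the trainer that fits your label.
        var pipeline = mlContext.Transforms
            .Concatenate("Features", nameof(TableRow.Feature1), nameof(TableRow.Feature2))
            .Append(mlContext.Regression.Trainers.Sdca(
                labelColumnName: "Label", featureColumnName: "Features"));

        ITransformer model = pipeline.Fit(data);
        mlContext.Model.Save(model, data.Schema, "model.zip");
    }
}
```

Because this runs in your own process rather than inside Visual Studio, the 64-bit build setting mentioned in the question should actually apply here. Note that some trainers cache their input in memory, so peak memory use still depends on the trainer you choose.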
