I have a cron job that queries a database table and gets about half a million records back. I need to loop through all of that data and send a POST request to a third-party API for each record. In general this works fine, but processing takes far too long (about 10 hours), and I need a way to speed it up. I've been trying to use a list of Task with SemaphoreSlim, but I'm running into issues (it doesn't like that my API call returns a Task). Does anyone have a solution that won't exhaust the VM's memory?
Current code looks something like:
    foreach (var data in dataList)
    {
        try
        {
            var response = await _apiService.PostData(data);
            _logger.Trace(response.Message);
        }
        catch
        {
            // error handling omitted
        }
    }
But I'm trying to do this and getting the syntax wrong:
    var tasks = new List<Task<DataObj>>();
    var throttler = new SemaphoreSlim(10);
    foreach (var data in dataList)
    {
        await throttler.WaitAsync();
        tasks.Add(Task.Run(async () => {
            try
            {
                var response = await _apiService.PostData(data);
                _logger.Trace(response.Message);
            }
            finally
            {
                throttler.Release();
            }
        }));
    }
Your list is of type Task<DataObj>, but your async lambda doesn't return anything, so its return type is Task. To fix the syntax, just return the value:
    var response = await _apiService.PostData(data);
    _logger.Trace(response.Message);
    return response;
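To see the type difference in isolation, here is a minimal sketch that uses int in place of DataObj and Task.Delay in place of the API call (both are stand-ins, not part of the original code). It only illustrates how the presence of a return statement changes the inferred task type:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var tasks = new List<Task<int>>();

// No return statement: the lambda is treated as Func<Task>, so Task.Run yields a plain Task.
Task noResult = Task.Run(async () => { await Task.Delay(10); });
// tasks.Add(noResult);   // would not compile: cannot convert Task to Task<int>

// Returning a value: the lambda is treated as Func<Task<int>>, so Task.Run yields Task<int>.
Task<int> withResult = Task.Run(async () => { await Task.Delay(10); return 42; });
tasks.Add(withResult);

await noResult;
int[] values = await Task.WhenAll(tasks);
Console.WriteLine(values[0]);
```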
As others have noted in the comments, I also recommend not using Task.Run here. A local async method would work fine:
    var tasks = new List<Task<DataObj>>();
    var throttler = new SemaphoreSlim(10);
    foreach (var data in dataList)
    {
        tasks.Add(ThrottledPostData(data));
    }
    var results = await Task.WhenAll(tasks);

    async Task<DataObj> ThrottledPostData(Data data)
    {
        await throttler.WaitAsync();
        try
        {
            var response = await _apiService.PostData(data);
            _logger.Trace(response.Message);
            return response;
        }
        finally
        {
            throttler.Release();
        }
    }
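One thing to keep in mind with the approach above is that it creates one Task per record up front, so with half a million records that is half a million Task objects held in the list at once. If memory is a concern and you are on .NET 6 or later, Parallel.ForEachAsync may be worth a look as an alternative, since it pulls items from the source as workers become free. The sketch below is not the code from the question: PostDataAsync is a hypothetical stand-in for _apiService.PostData, the int range stands in for the database rows, and the limit of 10 simply mirrors the SemaphoreSlim(10) above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for _apiService.PostData: simulates a slow HTTP POST.
static async Task<string> PostDataAsync(int data, CancellationToken ct)
{
    await Task.Delay(10, ct);
    return $"posted {data}";
}

var dataList = Enumerable.Range(1, 200);      // stand-in for the database rows
var results = new ConcurrentBag<string>();    // thread-safe collection for responses
var options = new ParallelOptions { MaxDegreeOfParallelism = 10 };

// At most 10 bodies run concurrently; items are taken from dataList on demand.
await Parallel.ForEachAsync(dataList, options, async (data, ct) =>
{
    try
    {
        var message = await PostDataAsync(data, ct);
        results.Add(message);
    }
    catch (Exception ex)
    {
        // Handle per-item failures here so one bad record doesn't stop the whole run.
        Console.Error.WriteLine($"Item {data} failed: {ex.Message}");
    }
});

Console.WriteLine($"Completed {results.Count} requests");
```

If you don't need the responses afterwards (for example, if logging each one is enough), you can drop the results collection entirely, which keeps memory use roughly constant regardless of how many records there are.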