
When I build a query on a DbSet with too many Concat or Where clauses, I get a stack overflow error.

Essentially, I have a list of thousands of AND clauses, all joined together by OR clauses. It looks something like this:

(A AND B) OR (C AND D) OR ...

The clauses are generated from a list, so the number of AND clauses joined by OR is dynamic and can range from zero to several thousand.

I tried building a separate query for each AND clause and combining them with Concat in Entity Framework, but that throws a stack overflow exception.

I feel like there should be a better way to write this, but I'm not sure what it is. I've included the error and some example code in the hope that someone knows how to do this without falling back to inline SQL, which goes against the Entity Framework paradigm.

The exact error is as follows:

Stack overflow.
Repeat 798 times:
--------------------------------
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.ObjectModel.ReadOnlyCollection`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>, StateType ByRef, State[] ByRef, Boolean)
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Collections.ObjectModel.ReadOnlyCollection`1<System.Linq.Expressions.Expression>, StateType ByRef, State[] ByRef, Boolean)
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.VisitMethodCall(System.Linq.Expressions.MethodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Linq.Expressions.Expression)
--------------------------------
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.ObjectModel.ReadOnlyCollection`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>, StateType ByRef, State[] ByRef, Boolean)
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Collections.ObjectModel.ReadOnlyCollection`1<System.Linq.Expressions.Expression>, StateType ByRef, State[] ByRef, Boolean)

Libraries:

  • EFCore.BulkExtensions.PostgreSql Version="8.1.2"
  • Microsoft.EntityFrameworkCore.Design Version="9.0.0"
  • Npgsql.EntityFrameworkCore.PostgreSQL Version="9.0.2"
  • Npgsql.EntityFrameworkCore.PostgreSQL.Design Version="1.1.0"

Here's an example of how to reproduce the issue:

private class SimpleDbContext : DbContext
{
    // Stores the values 0, 1, 2, ..., 100000
    public virtual DbSet<SequencePoint> SequencePoints { get; set; }
}

private class SequencePoint
{
    public int SequenceNumber { get; set; }
}
    
private void ConcatErrorTest()
{
    SimpleDbContext simpleDbContext = new();
        
    List<Tuple<int, int>> selectRanges = new(); // (0, 0), (10, 11), (20, 22), etc...
        
    for (int i = 0; i < 7500; i++)
    {
        int startRange = i * 10;
        int endRange = startRange + (i % 5);
        selectRanges.Add(new Tuple<int, int>(startRange, endRange));
    }

    IQueryable<SequencePoint> queryable = null;

    foreach (Tuple<int,int> selectRange in selectRanges)
    {
        IQueryable<SequencePoint> whereQueryable = simpleDbContext.SequencePoints.AsQueryable().Where(point =>
            (point.SequenceNumber >= selectRange.Item1) &&
            (point.SequenceNumber <= selectRange.Item2)
        );
            
        queryable = queryable == null ? whereQueryable : queryable.Concat(whereQueryable);
    }

    // Throws Stack overflow.
    List<int> result = queryable.Select(sequenceNumber => sequenceNumber.SequenceNumber).ToList();
        
    _logger.LogInformation("result = {result}", result);
}
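
For context, a likely reason the funcletizer recurses so deeply: every Concat call takes the expression of the query built so far as its first argument, so the loop above produces a tree nested one level per range. The sketch below reproduces only that shape with an in-memory IQueryable<int> (no database or EF Core involved). ConcatDepth, RangeQuery and BuildChain are hypothetical helper names, and the code only inspects the expression tree without executing the query.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Count how many Concat calls are nested along the first-argument chain.
static int ConcatDepth(Expression expression)
{
    int depth = 0;
    while (expression is MethodCallExpression call &&
           call.Method.Name == nameof(Queryable.Concat))
    {
        depth++;
        expression = call.Arguments[0]; // the query built so far
    }
    return depth;
}

// One filtered query per range, mirroring the Where in the repro.
static IQueryable<int> RangeQuery(IQueryable<int> source, int i)
{
    int lower = i * 10;
    int upper = lower + (i % 5);
    return source.Where(n => n >= lower && n <= upper);
}

// Chain the per-range queries with Concat, as the foreach loop does.
static IQueryable<int> BuildChain(int rangeCount)
{
    IQueryable<int> source = Enumerable.Range(0, 100).AsQueryable();
    IQueryable<int> queryable = RangeQuery(source, 0);
    for (int i = 1; i < rangeCount; i++)
    {
        queryable = queryable.Concat(RangeQuery(source, i));
    }
    return queryable;
}

Console.WriteLine(ConcatDepth(BuildChain(7500).Expression)); // prints 7499
```

A visitor that recurses once per nested call therefore needs thousands of stack frames for 7500 ranges, which is consistent with the repeated VisitMethodCall frames in the trace.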
  • What is the size of selectRanges? The obvious solution is to find the threshold at which the stack overflow occurs and then, rather than building a single queryable, run the queries in batches and append to result. For example, split selectRanges into batches of 100, run multiple queries with 200 parameters each, one at a time, and call AddRange on result after each one. This will likely also let you avoid stackoverflow.com/questions/1009706/… . Commented Feb 7 at 7:20
  • I suppose simpleDbContext.SequencePoints.AsQueryable().Where(point => selectRanges.Any(sr => point.SequenceNumber >= sr.Item1 && point.SequenceNumber <= sr.Item2)) or something along these lines should remove the need for the for loop and the Concat. That still leaves open the possibility that the database provider will choke when selectRanges is big. Commented Feb 7 at 7:23
  • What is the actual use case? An excessive number of conditions does not seem like a great idea. Perhaps there are better ways to organize your data? Commented Feb 7 at 7:52
  • I would bypass EF for this and maybe use Dapper and plain SQL. Commented Feb 7 at 8:47
  • Don't expect EF to be able to process this. It is better to use a third-party extension, create a temporary table, and join against that table. Commented Feb 7 at 9:14

2 Answers


Not only is EF Core going to struggle with this, the database itself will struggle. You need to rethink your whole approach.

Use an array or JSON parameter to pass in your data, and do a single .Where on your table.

private async Task ConcatErrorTest()
{
    using SimpleDbContext simpleDbContext = new();
        
    List<int> selectRanges = new(); // Range starts only: 0, 10, 20, etc...
        
    for (int i = 0; i < 7500; i++)
    {
        int startRange = i * 10;
        selectRanges.Add(startRange);
    }
    var array = selectRanges.ToArray();
    
    IQueryable<SequencePoint> queryable = simpleDbContext.SequencePoints
        .Where(point =>
            array.Any(i =>
                point.SequenceNumber >= i &&
                point.SequenceNumber <= i + (i / 10) % 5 // upper bound = start + (index % 5)
            )
        );

    List<int> result = await queryable
        .Select(sequenceNumber => sequenceNumber.SequenceNumber)
        .ToListAsync();
        
    _logger.LogInformation("result = {result}", result);
}

Note also the addition of using (so the context is disposed) and await (with ToListAsync, so the query does not block the calling thread).
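
If the ranges have to stay as individual conditions in one Where, another option (not what this answer does) is to build the predicate by hand and combine the per-range conditions with OrElse as a balanced tree, so that nesting depth grows roughly with log2 of the range count instead of linearly. The following is a minimal sketch under that assumption. BuildRangePredicate is a hypothetical helper, and the demo runs against an in-memory sequence of integers rather than a DbSet, so how EF Core and PostgreSQL cope with thousands of OR terms is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Builds x => (v >= lo0 && v <= hi0) || (v >= lo1 && v <= hi1) || ...
// where v is the value picked by selector, using a balanced OrElse tree.
static Expression<Func<T, bool>> BuildRangePredicate<T>(
    Expression<Func<T, int>> selector,
    IReadOnlyList<(int Lower, int Upper)> ranges)
{
    Expression number = selector.Body;

    // OR together the ranges in [start, end), splitting the span in half each time.
    Expression Combine(int start, int end)
    {
        if (end - start == 1)
        {
            (int lower, int upper) = ranges[start];
            // The bounds are embedded in the tree as constants.
            return Expression.AndAlso(
                Expression.GreaterThanOrEqual(number, Expression.Constant(lower)),
                Expression.LessThanOrEqual(number, Expression.Constant(upper)));
        }
        int middle = start + (end - start) / 2;
        return Expression.OrElse(Combine(start, middle), Combine(middle, end));
    }

    // An empty list matches nothing.
    Expression body = ranges.Count == 0
        ? Expression.Constant(false)
        : Combine(0, ranges.Count);
    return Expression.Lambda<Func<T, bool>>(body, selector.Parameters);
}

// Demo: ranges (0, 0), (10, 11), (20, 22), ... as in the question, ten of them.
var demoRanges = Enumerable.Range(0, 10)
    .Select(i => (Lower: i * 10, Upper: i * 10 + i % 5))
    .ToList();
int matches = Enumerable.Range(0, 100).AsQueryable()
    .Where(BuildRangePredicate<int>(n => n, demoRanges))
    .Count();
Console.WriteLine(matches); // prints 30
```

With the entity from the question, the selector would be point => point.SequenceNumber and the resulting predicate would be passed to simpleDbContext.SequencePoints.Where(...).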


Thanks for the responses.

In my case the example problem here was a simplification of my actual problem, but the responses helped me get on the right track.

The most Entity Framework-friendly way of approaching this problem was to store the ranges in another table and query both tables with multiple from clauses. Note that a join would not work here: the LINQ join operator only supports equijoins, and this is a range comparison.

Here's what the code would look like:

private class SimpleDbContext : DbContext
{
    // Stores the values 0, 1, 2, ..., 100000
    public virtual DbSet<SequencePoint> SequencePoints { get; set; }
    
    // [0, 0], [10, 11], [20, 22], etc...
    public virtual DbSet<SelectionRange> SelectionRanges { get; set; }
}

private class SequencePoint
{
    public int SequenceNumber { get; set; }
}

private class SelectionRange
{
    public int LowerRange { get; set; }
    public int UpperRange { get; set; }
}

private void ConcatReworked()
{
    using SimpleDbContext simpleDbContext = new();
    
    IQueryable<SequencePoint> queryable = from sequencePoint in simpleDbContext.SequencePoints
        from selectionRange in simpleDbContext.SelectionRanges
        where sequencePoint.SequenceNumber >= selectionRange.LowerRange
        where sequencePoint.SequenceNumber <= selectionRange.UpperRange
        orderby sequencePoint.SequenceNumber
        select sequencePoint;

    List<int> result = queryable.Select(sequenceNumber => sequenceNumber.SequenceNumber).ToList();
    
    _logger.LogInformation("result = {result}", result);
}

The other approach I tried was to write the query manually using a StringBuilder, which looked something along the lines of:

private async Task LoadSelection(List<SelectionRange> selectionRange)
{
    await using NpgsqlConnection connection = await npgsqlDataSource.OpenConnectionAsync();
    StringBuilder commandSqlBuilder = new();
    commandSqlBuilder.Append("SELECT ");
    // ... More SQL here
    for (int i = 0; i < selectionRange.Count; i++)
    {
        SelectionRange range = selectionRange[i];
        // Each range becomes one "lower AND upper" group, joined to the others by OR
        commandSqlBuilder.Append(" OR (");
        commandSqlBuilder.Append($"value >= (@range_lower_param_{i})");
        commandSqlBuilder.Append(" AND ");
        commandSqlBuilder.Append($"value <= (@range_upper_param_{i})");
        // ... More SQL here
    }
    // ... More SQL here
    await using NpgsqlCommand command = new NpgsqlCommand(commandSqlBuilder.ToString(), connection);
    command.Parameters.AddWithValue(...);
    await using NpgsqlDataReader reader = await command.ExecuteReaderAsync();
    while (await reader.ReadAsync())
    {
        // ... Read the current row here
    }
}

While this does work, and runs about twice as fast as the join, it has new limitations, such as the error "A statement cannot have more than 65535 parameters". So I needed to split the query into smaller statements and combine the results in code afterwards. Also, when table or column names change in a migration, I will need to come back and update this StringBuilder code to reflect the changes.
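
The splitting step can be sketched roughly as follows, assuming two parameters per range and the 65535-parameter limit quoted above. BuildBatches is a hypothetical helper that only produces the OR-joined condition text and the parameter name/value pairs for each statement; the surrounding SQL and the Npgsql calls are left out, and Enumerable.Chunk requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Splits the ranges into batches that each stay under the parameter limit and
// builds, per batch, the OR-joined conditions plus the matching parameters.
static List<(string Conditions, List<(string Name, int Value)> Parameters)> BuildBatches(
    IEnumerable<(int Lower, int Upper)> ranges)
{
    const int maxParametersPerStatement = 65535; // limit quoted in the error message
    const int parametersPerRange = 2;            // one lower and one upper bound
    const int rangesPerBatch = maxParametersPerStatement / parametersPerRange; // 32767

    var batches = new List<(string Conditions, List<(string Name, int Value)> Parameters)>();
    foreach ((int Lower, int Upper)[] batch in ranges.Chunk(rangesPerBatch))
    {
        StringBuilder conditions = new();
        var parameters = new List<(string Name, int Value)>(batch.Length * parametersPerRange);
        for (int i = 0; i < batch.Length; i++)
        {
            if (i > 0) conditions.Append(" OR ");
            conditions.Append($"(value >= @range_lower_param_{i} AND value <= @range_upper_param_{i})");
            parameters.Add(($"range_lower_param_{i}", batch[i].Lower));
            parameters.Add(($"range_upper_param_{i}", batch[i].Upper));
        }
        batches.Add((conditions.ToString(), parameters));
    }
    return batches;
}

// Demo: 40000 ranges need two statements.
var demoRanges = Enumerable.Range(0, 40000).Select(i => (Lower: i * 10, Upper: i * 10 + i % 5));
var batches = BuildBatches(demoRanges);
Console.WriteLine(batches.Count);               // prints 2
Console.WriteLine(batches[0].Parameters.Count); // prints 65534
```

Each batch would then be executed as its own command, with the rows from all batches appended to one result list.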
