3

I am trying to build a LINQ query to query against a large SQL table (7M+ entries) of Documents.

Each document has many DocumentFields :

Simplified UML

My goal is to apply successive filters (from 0 to ~10 filters) on the value field of DocumentField:

Here is an example of the filters I want to apply:

[ {fieldId: 32, value: "CET20533"}, {fieldId: 16, value: "882341"}, {fieldId: 12, value: "101746"} ] 

What I want is to retrieve every document in my database that matches all of the filters. For the previous example, I want all documents that have a value of CET20533 for the field with the Id "32", the value 882341 for the field with the Id 16, and so on.

I had a first approach :

    List<MyFilter> filters = ... // Json deserialization

    db.Documents.Where(document =>
        filters.All(filter =>
            document.DocumentFields.Any(documentField =>
                documentField.Id == filter.Id && documentField.Value == filter.Value)));

This approach doesn't work: my filters list isn't a collection of primitive types, and therefore cannot be used inside a LINQ query.

I had a second approach, which didn't throw an error at me, but only applied 1 filter :

    var result = db.Documents.Select(d => d);
    foreach (var filter in filters)
    {
        var id = filter.Id;
        var value = filter.Value;
        result = result.Where(document =>
            document.DocumentFields.Any(documentField =>
                documentField.Id == id && documentField.Value == value));
    }

The problem with this approach is, I believe, some sort of concurrency problem. To test this, I added a simple pause, Thread.Sleep(2000), in each iteration of the foreach, and with the pause it seems to work.

Questions :

  • How can I remove the pause and still not have concurrency problems?
  • Is there a better way to build my query?

EDIT :

For more clarity, here is an actual example of a document that matches the previous filters example :

Example document

  • Are you sure that filters.All doesn't change your IQueryable to IEnumerable? Check the result with the profiler. I would suggest parsing the filters and adding them to the query as expressions. Commented Aug 7, 2017 at 8:23
  • As long as db.Documents is IQueryable<T>, the second approach with chaining multiple Where calls should work. I don't see any concurrency issues there, since it's just building a query. Of course, I'm assuming you use == in both places (the = being a typo). Commented Aug 7, 2017 at 9:06
  • The problem with the multiple Where is that I don't know in advance how many filters there will be : it can be 0, it can also be 10. So I can't write .Where(filter1).Where(filter2).Where(filter3). The foreach actually only builds the query, as you said, so I don't see either why it doesn't work without the pause ... EDIT : It seems to work without the Sleep now, I guess I made a mistake earlier ? Commented Aug 7, 2017 at 9:17
  • You can build your IQueryable in a loop, like foreach (var filter in filters) query = query.Where(GetExpr(filter)); Commented Aug 7, 2017 at 9:26
  • @ASpirin — Nice trick! I'll have to keep that one in mind. Commented Aug 7, 2017 at 9:27
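The loop suggested in the comments above could look roughly like the sketch below. GetExpr is only named in the comment, so its body here is an assumption, as are the MyFilter, Document and DocumentField shapes (taken from the question's snippets) and the FilterQueries/ApplyFilters names. It runs against an in-memory AsQueryable() source; whether a given LINQ provider translates it to SQL in the same way is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Shapes assumed from the question's snippets.
public sealed class MyFilter
{
    public int Id { get; set; }
    public string Value { get; set; }
}

public sealed class DocumentField
{
    public int Id { get; set; }
    public string Value { get; set; }
}

public sealed class Document
{
    public int Id { get; set; }
    public ICollection<DocumentField> DocumentFields { get; set; }
}

public static class FilterQueries
{
    // Hypothetical helper standing in for the GetExpr mentioned in the comment.
    public static Expression<Func<Document, bool>> GetExpr(MyFilter filter)
    {
        // Copy to locals so the expression tree captures only an int and a string,
        // not the MyFilter object itself.
        int id = filter.Id;
        string value = filter.Value;
        return document => document.DocumentFields.Any(
            field => field.Id == id && field.Value == value);
    }

    // Appends one Where per filter; nothing is executed until the query is enumerated.
    public static IQueryable<Document> ApplyFilters(
        IQueryable<Document> query, IEnumerable<MyFilter> filters)
    {
        foreach (var filter in filters)
            query = query.Where(GetExpr(filter));
        return query;
    }
}
```

With zero filters the loop body never runs and the original query is returned unchanged, which covers the "0 to ~10 filters" case from the question.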

3 Answers

1

You have to build an expression from each of your filters and append each one in a separate Where (or combine them into a single predicate, if you can manage it):

db.Documents.Where(ex1).Where(ex2)... 

see e.g and MSDN

Or, for the simple case: start from DocumentFields and retrieve the related Documents. The Contains operation works for simple types. That would also be simpler when building the expression.
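One possible reading of the "start from DocumentFields" idea is sketched below: collect the document IDs that satisfy each filter and intersect them. The FieldRow type (a field row carrying a DocumentId), the FieldFirstQueries class and the use of KeyValuePair for a filter are all assumptions made for illustration. The sketch is exercised only against in-memory data; how a specific provider translates Intersect is not covered.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat view of a DocumentField row, including its owning document.
public sealed class FieldRow
{
    public int DocumentId { get; set; }
    public int Id { get; set; }
    public string Value { get; set; }
}

public static class FieldFirstQueries
{
    // Returns the IDs of documents that have a matching field for every (fieldId, value) filter.
    public static IQueryable<int> MatchingDocumentIds(
        IQueryable<FieldRow> fields,
        IEnumerable<KeyValuePair<int, string>> filters)
    {
        IQueryable<int> ids = null;
        foreach (var filter in filters)
        {
            // Locals keep the captured values primitive.
            int id = filter.Key;
            string value = filter.Value;

            var matching = fields
                .Where(f => f.Id == id && f.Value == value)
                .Select(f => f.DocumentId);

            // The first filter seeds the set; each later filter narrows it.
            ids = ids == null ? matching : ids.Intersect(matching);
        }

        // With no filters, every document that owns at least one field qualifies.
        return (ids ?? fields.Select(f => f.DocumentId)).Distinct();
    }
}
```

The resulting sequence of int IDs is a simple type, so it could then be used with Contains against the documents set, for example documents.Where(d => ids.Contains(d.Id)).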

1

Quite convinced that your data model is too generic. It will hurt you in terms of program clarity and performance.

But let's go with it for this answer, which I took as a challenge in expression building. The goal is to get a nice queryable that honors the filters on the data server side.

Here's the data model I used, which I think closely matches yours:

    public sealed class Document
    {
        public int Id { get; set; }
        // ...
        public ICollection<DocumentField> Fields { get; set; }
    }

    public sealed class DocumentField
    {
        public int Id { get; set; }
        public int DocumentId { get; set; }
        public string StringValue { get; set; }
        public float? FloatValue { get; set; }
        // more typed values here
    }

First, I implement convenience functions to create predicates for individual fields of individual field types:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Linq.Expressions;
    using System.Reflection;

    public static class DocumentExtensions
    {
        // PropertyInfo for DocumentField.Id, extracted once from a lambda.
        private static readonly PropertyInfo _piFieldId =
            (PropertyInfo)((MemberExpression)((Expression<Func<DocumentField, int>>)(f => f.Id)).Body).Member;

        // Builds f => f.Id == fieldId && <fieldAccessor body> == value, reusing the accessor's parameter.
        private static Expression<Func<DocumentField, bool>> FieldPredicate<T>(
            int fieldId, T value, Expression<Func<DocumentField, T>> fieldAccessor)
        {
            var pField = fieldAccessor.Parameters[0];
            var xEqualId = Expression.Equal(Expression.Property(pField, _piFieldId), Expression.Constant(fieldId));
            var xEqualValue = Expression.Equal(fieldAccessor.Body, Expression.Constant(value, typeof(T)));
            return Expression.Lambda<Func<DocumentField, bool>>(Expression.AndAlso(xEqualId, xEqualValue), pField);
        }

        /// <summary>
        /// f => f.<see cref="DocumentField.Id"/> == <paramref name="fieldId"/> && f.<see cref="DocumentField.StringValue"/> == <paramref name="value"/>.
        /// </summary>
        public static Expression<Func<DocumentField, bool>> FieldPredicate(int fieldId, string value)
            => FieldPredicate(fieldId, value, f => f.StringValue);

        /// <summary>
        /// f => f.<see cref="DocumentField.Id"/> == <paramref name="fieldId"/> && f.<see cref="DocumentField.FloatValue"/> == <paramref name="value"/>.
        /// </summary>
        public static Expression<Func<DocumentField, bool>> FieldPredicate(int fieldId, float? value)
            => FieldPredicate(fieldId, value, f => f.FloatValue);

        // more overloads here
    }

Usage:

    var fieldPredicates = new[]
    {
        DocumentExtensions.FieldPredicate(32, "CET20533"), // f => f.Id == 32 && f.StringValue == "CET20533"
        DocumentExtensions.FieldPredicate(16, "882341"),
        DocumentExtensions.FieldPredicate(12, 101746F) // f => f.Id == 12 && f.FloatValue == 101746F
    };

Second, I implement an extension method HavingAllFields (also in DocumentExtensions) that creates an IQueryable<Document> in which each of the field predicates is satisfied by at least one field of the document:

    private static readonly MethodInfo _miAnyWhere =
        ((MethodCallExpression)((Expression<Func<IEnumerable<DocumentField>, bool>>)(fields => fields.Any(f => false))).Body).Method;

    private static readonly Expression<Func<Document, IEnumerable<DocumentField>>> _fieldsAccessor = doc => doc.Fields;

    /// <summary>
    /// <paramref name="documents"/>.Where(doc => doc.Fields.Any(<paramref name="fieldPredicates"/>[0]) && ... )
    /// </summary>
    public static IQueryable<Document> HavingAllFields(
        this IQueryable<Document> documents,
        IEnumerable<Expression<Func<DocumentField, bool>>> fieldPredicates)
    {
        using (var e = fieldPredicates.GetEnumerator())
        {
            if (!e.MoveNext()) return documents;

            Expression predicateBody = Expression.Call(_miAnyWhere, _fieldsAccessor.Body, e.Current);
            while (e.MoveNext())
                predicateBody = Expression.AndAlso(predicateBody, Expression.Call(_miAnyWhere, _fieldsAccessor.Body, e.Current));

            var predicate = Expression.Lambda<Func<Document, bool>>(predicateBody, _fieldsAccessor.Parameters);
            return documents.Where(predicate);
        }
    }

Test:

    var documents = (new[]
    {
        new Document
        {
            Id = 1,
            Fields = new[]
            {
                new DocumentField { Id = 32, StringValue = "CET20533" },
                new DocumentField { Id = 16, StringValue = "882341" },
                new DocumentField { Id = 12, FloatValue = 101746F },
            }
        },
        new Document
        {
            Id = 2,
            Fields = new[]
            {
                new DocumentField { Id = 32, StringValue = "Bla" },
                new DocumentField { Id = 16, StringValue = "882341" },
                new DocumentField { Id = 12, FloatValue = 101746F },
            }
        }
    }).AsQueryable();

    var matches = documents.HavingAllFields(fieldPredicates).ToList();

Matches document 1, but not 2.

0

I usually do something like this: put all the desired IDs for your filter into a list, then use Contains.

    List<int> myDesiredIds = new List<int> { 1, 2, 3, 4, 5 };
    db.documents.Where(x => myDesiredIds.Contains(x.DocumentId));

11 Comments

Actually, every document has every field id. What is important is the value associated with the filter id. That was not clear in my question.
@ThomasSauvajon, expanding on Robert's answer: db.documents.Where(x => myDesiredIds.Contains(x.DocumentId) && myDesiredValues.Contains(x.DocumentValue));
@InteXX: That would lose some of the information. E.g. the OP does not want an element with fieldid 32 and value 882341; which your suggestion will still include because it omits the pairing.
The question is more complex than your answer though. It's not just about a list of ints, it's about a list of combined criteria. If the OP was working with a list of ints, he would not have received the error that he mentions ("my filters List isn't a primitive type, and therefore cannot be used in a LINQ query")
@ThomasSauvajon — It looks like you've found your answer with ASpirin. Good enough; I like his solution as well. Just a quick reminder: I only saw your reply by accident, as it didn't include an @ notifier to make sure the note made it to my inbox.
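The pairing issue raised in the comments above can be seen in a small in-memory sketch. All names here (FieldValue, PairingDemo, UnpairedMatch, PairedMatch) are hypothetical, and the two lists simply split the question's example filters into separate ID and value lists, as the two-Contains suggestion would.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal field shape for the demonstration.
public sealed class FieldValue
{
    public int Id { get; set; }
    public string Value { get; set; }
}

public static class PairingDemo
{
    // Filters {32: "CET20533"} and {16: "882341"}, split into two parallel lists.
    private static readonly List<int> Ids = new List<int> { 32, 16 };
    private static readonly List<string> Values = new List<string> { "CET20533", "882341" };

    // Two independent Contains checks: any listed ID combined with any listed value passes.
    public static bool UnpairedMatch(FieldValue f) =>
        Ids.Contains(f.Id) && Values.Contains(f.Value);

    // Pair-aware check: the ID and the value must come from the same filter.
    public static bool PairedMatch(FieldValue f) =>
        Enumerable.Range(0, Ids.Count).Any(i => Ids[i] == f.Id && Values[i] == f.Value);
}
```

A field with Id 32 and Value "882341" passes UnpairedMatch but fails PairedMatch, which is the information loss described in the comments.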