13

Which is the easiest way to remove items that match some condition from a list and then, get those items.

I can think in a few ways, I don't know which is the best one:

var subList = list.Where(x => x.Condition);
list.RemoveAll(x => x.Condition);

or

var subList = list.Where(x => x.Condition);
list.RemoveAll(x => subList.Contains(x));

Is any of this one of the best ways? If it is, which one? If it's not, how should I do it?

BrokenGlass
  • 158,293
  • 28
  • 286
  • 335
Diego
  • 16,436
  • 26
  • 84
  • 136

4 Answers4

8

I like to use a functional programming approach (only make new things, don't modify existing things). One advantage of ToLookup is that you can handle more than a two-way split of the items.

ILookup<bool, Customer> lookup = list.ToLookup(x => x.Condition);
List<Customer> sublist = lookup[true].ToList();
list = lookup[false].ToList();

Or if you need to modify the original instance...

list.Clear();
list.AddRange(lookup[false]);
Amy B
  • 108,202
  • 21
  • 135
  • 185
  • I think it is much complex (and almost without knowledge I think its not really performance). Does this have any advantage? – Diego Apr 16 '12 at 18:36
  • 1
    Condition is evaluated exactly once per item. List instance is not modified, which can be a big advantage if that list instance is shared among threads. – Amy B Apr 16 '12 at 18:47
  • This is quite a genius idea. – Timo Apr 24 '18 at 15:48
6

I would go with the first option for readability purposes, with the note that you should materialize the list first, or you'll lose the very items you're trying to select on the next line:

var sublist = list.Where(x => x.Condition).ToArray();
list.RemoveAll(x => x.Condition);

The second example is O(n^2) for no reason and the last is perfectly fine, but less readable.

Edit: now that I reread your last example, note that as it's written right now will take out every other item. You're missing the condition check and the remove line should actually be list.RemoveAt(i--); because the i+1th element becomes the ith element after the removal, and when you increment i you're skipping over it.

Blindy
  • 65,249
  • 10
  • 91
  • 131
  • It's actually O(n^3), but I'm assuming the lack of materialization just slipped your mind ;) – Blindy Apr 16 '12 at 17:10
  • Would the items (as I wrote it) be removed from subList with the second instruction? :O – Diego Apr 16 '12 at 17:11
  • Er you never remove from `sublist`, nor do you intend to if I read it correctly. – Blindy Apr 16 '12 at 17:13
  • Your edit is right, I'll remove the third options because its really wrong and make it right would be very unreadable – Diego Apr 16 '12 at 17:14
  • No, no.. its not the idea to remove them from subList. I think I misunderstood the materialization. Whats the difference? – Diego Apr 16 '12 at 17:15
  • 3
    When you run a linq query over a collection, you don't get back an array, you get back an object that *when you iterate over it* you run your actual selection. So take your first example. `subList` will be just an object, you remove the items from the main array, and then when you do `foreach(var item in subList)` you get back nothing because the condition will always return false. – Blindy Apr 16 '12 at 17:17
  • So what you do instead is read the items out of your linq query and put them in an array immediately. You can do it manually or using `.ToArray()` or `.ToList()` (or a few others depending on how you want your data to look). – Blindy Apr 16 '12 at 17:18
  • "*you should materialize the list first, or you'll lose the very items you're trying to select*" Maybe true if you're using, say, a lazy-loading recordset from an ORM, but in the general case, I don't *think* this is correct. Check [this code](https://paste.ee/r/Soct1). `iterate` true or false, you'll get `2,2.1` every time. No `ToArray()` needed. – ruffin Jan 21 '17 at 18:51
  • I'll be honest with you, this is a few good years old and looking at it I don't understand the need for the first line at all. `sublist` isn't used by `RemoveAll` at all. – Blindy Jan 23 '17 at 15:44
5

The first option is good, but it makes two runs over collection. You can do it in one run by executing the additional logic inside predicate:

        var removedItems = new List<Example>();
        list.RemoveAll(x =>
        {
            if (x.Condition)
            {
                removedItems.Add(x);
                return true;
            }

            return false;
        });

You can also wrap it into extension for convinience:

public static class ListExtensions
{
    public static int RemoveAll<T>(this List<T> list, Predicate<T> predicate, Action<T> action)
    {
        return list.RemoveAll(item =>
        {
            if (predicate(item))
            {
                action(item);
                return true;
            }

            return false;
        });
    }
}

And use like this:

        var removedItems = new List<Example>();
        list.RemoveAll(x => x.Condition, x => removedItems.Add(x));
Eduard Grinberg
  • 129
  • 2
  • 8
  • This is honestly the best answer. It minimises runs over the collection and leverages built in functions. All the other answers end up calling the match function twice on every element in the list, generate multiple lists, or use heavier structures like lookups. – matt.rothmeyer Oct 05 '19 at 17:40
1

I was looking for the same thing. I wanted to do some error logging on the items that I was removing. Since I was adding a whole bunch of validation rules, the remove-and-log call should be as concise as possible.

I've made the following extension methods:

public static class ListExtensions
{
    /// <summary>
    /// Modifies the list by removing all items that match the predicate. Outputs the removed items.
    /// </summary>
    public static void RemoveWhere<T>(this List<T> input, Predicate<T> predicate, out List<T> removedItems)
    {
        removedItems = input.Where(item => predicate(item)).ToList();
        input.RemoveAll(predicate);
    }

    /// <summary>
    /// Modifies the list by removing all items that match the predicate. Calls the given action for each removed item.
    /// </summary>
    public static void RemoveWhere<T>(this List<T> input, Predicate<T> predicate, Action<T> actionOnRemovedItem)
    {
        RemoveWhere(input, predicate, out var removedItems);
        foreach (var removedItem in removedItems) actionOnRemovedItem(removedItem);
    }
}

Example usage:

items.RemoveWhere(item => item.IsWrong, removedItem =>
    errorLog.AppendLine($"{removedItem} was wrong."));
Timo
  • 7,992
  • 4
  • 49
  • 67