I came across this question: Update all objects in a collection using LINQ.

class

public class A
{
    public int B { get; set; }
}

sample data

var arr1 = new List<A>()
{
    new A() {B = 10},
    new A() {B = 20}
};


var arr2 = new List<A>()
{
    new A() {B = 10},
    new A() {B = 20}
};

arr1.Select(x=>{x.B=0;return x;}).ToList();
arr2.Select(x=>{x.B=0;return x;});
arr2.ToList();

Result

arr1 <===============>
0
0
arr2 <===============>
10
20

My question

Why does arr1.Select(x=>{x.B=0;return x;}).ToList(); change the values in arr1 without assigning the result back to arr1, as in

arr1 = arr1.Select(x=>{x.B=0;return x;}).ToList();

while the following code does not do the same for arr2?

arr2.Select(x=>{x.B=0;return x;});
arr2.ToList();

I suspect this is due to lazy evaluation.

Is there any official documentation or a more detailed explanation of this behavior?

Heretic Monkey
D-Shih
  • Your `arr2` example would do the same thing if you called `ToList()` on the result of `arr2.Select(...)` (currently you discard that result and call ToList on arr2 instead). Note that the lambda expression in the `Select` clause will only be evaluated (executed) by the `ToList` method in your example, not by the Select clause itself. And by the way, neither of your examples modifies arr1 or arr2. Both lists remain unchanged. What changes is only the content of the elements of the lists, not the content of the lists themselves... –  May 02 '19 at 16:15
  • Until you iterate the source, the `Select` will not execute. – Cleptus May 02 '19 at 16:16
  • It's like calling `someString.Replace('a', 'b');` without assigning it to anything. `Replace` returns a new `string` but it goes nowhere unless it's assigned, as in `var newString = someString.Replace('a', 'b');` – Scott Hannen May 02 '19 at 16:19
  • Thanks for the reply, but I want to know why `arr1.Select(x=>{x.B=0;return x;}).ToList();` alone is enough to modify the `B` property values in `arr1`. – D-Shih May 02 '19 at 16:44
  • You should **never use this technique**. A select should **never** mutate anything. If you want to mutate a collection, use *foreach*. – Eric Lippert May 02 '19 at 17:08
  • @EricLippert I know, and I won't use this in a real project. But I am curious how it works. – D-Shih May 03 '19 at 00:08

3 Answers

Your title says "...without reassigning to itself...": Because A is a reference type, Select() doesn't create copies of the items in the source collection. Your lambda gets each actual object in turn. When it sets x.B = 0, it's acting on the original item that's still in the collection.
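To make the reference-type point concrete, here is a minimal sketch. The names ReferenceDemo and SameInstances are hypothetical, introduced only for this illustration. It checks that the list produced by Select(...).ToList() is a new list object, yet it holds the very same A instances as the source list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A
{
    public int B { get; set; }
}

// Hypothetical helper class, used only for this illustration.
public static class ReferenceDemo
{
    // Returns true when the projected list is a different list object
    // that nevertheless contains the same A instances as the source.
    public static bool SameInstances()
    {
        var arr1 = new List<A> { new A { B = 10 }, new A { B = 20 } };
        var result = arr1.Select(x => x).ToList();

        return !object.ReferenceEquals(arr1, result)        // a new list...
            && object.ReferenceEquals(arr1[0], result[0])   // ...with the same
            && object.ReferenceEquals(arr1[1], result[1]);  // element objects
    }

    public static void Main()
    {
        Console.WriteLine(SameInstances()); // prints True
    }
}
```

Because both lists point at the same objects, a change made through the lambda's x is visible through arr1 as well.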

The more interesting question is why the arr1 and the arr2 code behave differently.

Let's take a look at what Select() returns:

var z = arr2.Select(x => { x.B = 0; return x; });
arr2.ToList();

Set a breakpoint on the second line there, and we find that the following is the type of z, the object returned by arr2.Select(x => { x.B = 0; return x; }). It's the same type of object you're calling ToList() on in the arr1 line:

System.Linq.Enumerable.SelectListIterator<ConsoleApp3.A, ConsoleApp3.A>

Select() doesn't do much. It returns an object that's prepared to iterate over each item in arr2 in turn, set the B property of each item, and then return each item in turn.

It's ready to do that. But it hasn't done it, and it won't do it until you ask it to (lazy evaluation, as you suggest). Let's try this:

var a = z.First();

That tells the SelectListIterator to evaluate the Select() lambda for just the first item in arr2. That's all it does. Now the first item in arr2 has B == 0, but the rest don't, because you didn't touch them yet. So let's touch all of them:

var b = z.ToList();

Now the ToList() call will force the SelectListIterator to go through and execute your Select() lambda expression for every item in arr2. You did that right away for arr1, which is why B was zero for every item in arr1. You never did it for arr2 in your code at all. The thing that does the work is not the Select(), but the object returned by Select(). For arr2, you discarded that object without enumerating it, so it never did the work.
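The three steps above can be put together in one small program. This is only a sketch: DeferredDemo and Run are hypothetical names, and the log list exists purely to record the B values after each step.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A
{
    public int B { get; set; }
}

// Hypothetical helper class, used only for this illustration.
public static class DeferredDemo
{
    // Records the B values of arr2 after each step.
    public static List<string> Run()
    {
        var log = new List<string>();
        var arr2 = new List<A> { new A { B = 10 }, new A { B = 20 } };

        // Building the query does not call the lambda.
        var z = arr2.Select(x => { x.B = 0; return x; });
        log.Add(string.Join(",", arr2.Select(a => a.B)));  // 10,20

        // First() runs the lambda for the first element only.
        z.First();
        log.Add(string.Join(",", arr2.Select(a => a.B)));  // 0,20

        // ToList() enumerates the whole query, so every element is visited.
        z.ToList();
        log.Add(string.Join(",", arr2.Select(a => a.B)));  // 0,0

        return log;
    }

    public static void Main()
    {
        foreach (var line in Run())
        {
            Console.WriteLine(line);
        }
    }
}
```

Note that z.ToList() enumerates the query again from the start, so the lambda runs a second time for the first element; here that is harmless because it just sets B to 0 again.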

And we now understand that arr2.ToList() didn't do anything: In the case of arr1, it was the act of calling ToList() on the result of the Select() that applied the Select()'s changes to the items in arr1. If you had called arr1.ToList(); instead, that would have had no effect either. It would just create an exact copy of arr1, and if you didn't assign that to anything, it would just be discarded.

All of this is one reason why we never put side effects in LINQ expressions: You create effects which are baffling even in a minimal, highly simplified example created for a StackOverflow question. You don't need that in production code.

Another reason is we never need to.

  • Thanks for your reply. I think the key point is the `Enumerable.WhereSelectArrayIterator` class; the `MoveNext` method in its source code answers my question :) – D-Shih May 03 '19 at 01:04
1
arr1.Select(x=>{x.B=0;return x;}).ToList(); //Enumerates the Select, so it is executed
arr2.Select(x=>{x.B=0;return x;}); //Creates the query, it is not executed
arr2.ToList(); //Enumerates the list you already have
Yuriy Faktorovich

What you are doing is causing side effects in your select; that is, instead of selecting data, you are assigning data "within" the select.

When you do this, the code that changes the data gets executed whenever the sequence returned by the select gets enumerated. However, neither arr1 nor arr2 is the collection being enumerated here:

// Here arr1 itself is not being enumerated; what is being enumerated is what
// is returned by the Select, and it is enumerated immediately because
// ToList materializes it. This means that while the collection arr1 is
// unchanged, you are changing the values of its members, which is why the
// change shows up in your later Console.WriteLine calls
arr1.Select(x=>{x.B=0;return x;}).ToList(); 

// Here you have the same Select, but you discard its result! A Select doesn't
// affect the collection at all; arr2 is the SAME before and after the Select.
// You would have to call ToList on what was RETURNED by the Select, which is
// why it worked on arr1 (because you chained the ToList, so it was applied to
// what was returned by the Select)
arr2.Select(x=>{x.B=0;return x;});

// This does strictly nothing: you create a new list from arr2, which you do
// not store
arr2.ToList();

Basically, if you wanted to split the query for arr2 into two statements, you would have to write it like this:

var tmp = arr2.Select(x=>{x.B=0;return x;});
tmp.ToList(); // call it on tmp, not on arr2! arr2 was NOT changed; tmp is what was returned by the Select

Also note that, overall, you should NEVER do any of this. If you want to change every element of a collection, use a foreach; LINQ is there to shape and select data, not to modify it.
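For comparison, a minimal sketch of the foreach approach might look like the following. ForeachDemo and ResetAll are hypothetical names chosen for this example, not part of any library.

```csharp
using System;
using System.Collections.Generic;

public class A
{
    public int B { get; set; }
}

// Hypothetical helper class, used only for this illustration.
public static class ForeachDemo
{
    // Sets B to 0 on every item; the loop runs immediately, with no deferred query.
    public static void ResetAll(IEnumerable<A> items)
    {
        foreach (var item in items)
        {
            item.B = 0;
        }
    }

    public static void Main()
    {
        var arr = new List<A> { new A { B = 10 }, new A { B = 20 } };
        ResetAll(arr);
        Console.WriteLine(string.Join(",", arr.ConvertAll(a => a.B))); // prints 0,0
    }
}
```

The loop states the intent to mutate directly, and nothing depends on whether or when a query result happens to be enumerated.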

Ronan Thibaudau