C# is a marvelous language. It’s powerful, fast, and easy to learn. It also exposes advanced features that make doing otherwise difficult things simple. Even more impressively, it manages to make reading the code to do those complicated things easy, too. Many programming languages struggle with this last part. You might be able to do something really cool, but the code is opaque for other developers to read.
One example of a feature where C# makes the difficult easy is Language Integrated Query (LINQ). If you’ve been working with a C# code base for a little while, you might have run up against some LINQ code. The first time you see it, it can feel a little confusing. Some variations of LINQ code look like an entirely different programming language from the C# that you’re used to. Even for variations of LINQ that don’t use an entirely different syntax, it can be difficult to parse just what the code you’re looking at is doing.
In this post, we’ll walk through the basics of LINQ and talk about how they work, so the next time you see some LINQ code, it’ll be because you’re writing it yourself.
Two Types of LINQ
The first thing to understand about LINQ is that it comes in two flavors. There’s the “query” syntax, and the “method” syntax. Method syntax will look familiar to anyone who’s written C# before. Writing the method syntax involves starting with a collection object that implements the IEnumerable interface. The function calls define how data from that collection should be dissected, ordered, and presented. Finally, the functions return the results in a new collection of data.
Method Syntax and Lambdas
One important thing to understand about writing method-style LINQ is the lambda function. Lambda functions are a type of anonymous function that you pass as parameters to LINQ functions. So, for instance, if you’re calling the Where function on a collection of data, you pass in a single argument. That argument is a function which returns a Boolean value. That function is then run once for every element of the collection, with each element passed in as a parameter, one by one. Each element which causes the function to return true is retained for the next step of the code. Elements which cause the function to return false are not evaluated in subsequent steps in the code.
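Here’s a minimal sketch of that Where call. The names list and the length cutoff are made up for illustration; the point is only that the lambda takes one element and returns a bool.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, invented for this example.
var names = new List<string> { "Ada", "Charlotte", "Bo", "Benjamin" };

// The lambda is the single argument to Where. It receives one element
// at a time and returns true (keep it) or false (drop it).
IEnumerable<string> longNames = names.Where(name => name.Length > 3);

foreach (var name in longNames)
{
    Console.WriteLine(name); // prints Charlotte, then Benjamin
}
```

"Ada" and "Bo" make the lambda return false, so they never reach the loop.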
A large part of understanding method-style LINQ is understanding lambda functions. Different LINQ methods require different return types for their lambda functions. As you experiment with LINQ functions, you’ll need to understand how to make those lambda functions do what you want. The best way to do this is through practice! Try out new LINQ methods and new lambda functions regularly and you’ll be a pro in no time.
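To make the point about return types concrete, here’s a small sketch with four common LINQ methods side by side. The scores array is invented sample data; the methods themselves (Where, Select, OrderByDescending, Any) are standard System.Linq calls.

```csharp
using System;
using System.Linq;

int[] scores = { 72, 95, 58, 88 }; // hypothetical sample data

// Where: the lambda returns a bool that decides whether to keep the element.
var passing = scores.Where(s => s >= 60);

// Select: the lambda returns whatever the new element should be (a string here).
var labels = scores.Select(s => $"score:{s}");

// OrderByDescending: the lambda returns the key to sort on.
var ranked = scores.OrderByDescending(s => s);

// Any: the lambda returns a bool, and the whole call collapses to one bool.
bool hasPerfect = scores.Any(s => s == 100);

Console.WriteLine(string.Join(", ", passing)); // 72, 95, 88
Console.WriteLine(string.Join(", ", labels));  // score:72, score:95, score:58, score:88
Console.WriteLine(string.Join(", ", ranked));  // 95, 88, 72, 58
Console.WriteLine(hasPerfect);                 // False
```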
The query syntax for LINQ can be a bit of a shock to the system compared to method-style. Instead of starting with a function called on a collection object, query-style LINQ starts with the from keyword. After from comes a name for the current element (the range variable), then the in keyword, then the collection the query operates on. From there, additional filters or data-shaping operations run on elements of the collection, just like the method syntax.
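As a rough illustration, a query-style call might look like the sketch below. The temperatures list and the cutoff of 30 are assumptions made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data.
var temperatures = new List<int> { 18, 31, 25, 35, 12 };

IEnumerable<int> hotDays =
    from t in temperatures   // "t" names each element; the collection follows "in"
    where t > 30             // filter
    orderby t descending     // sort
    select t;                // shape of each result

Console.WriteLine(string.Join(", ", hotDays)); // 35, 31
```

If you squint, it reads like a SQL statement with the clauses in a different order.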
Many developers prefer query-style syntax because it reads very similarly to SQL code. A common use case of LINQ is querying data in a database. Being able to use SQL-style queries within C# provides powerful benefits for developers working with those data sources.
Two Ways to Do the Same Thing
The critical thing to remember about method and query syntax in LINQ is that they’re two sides of the same coin. Neither is better than the other. Some complicated operations are better expressed through query syntax, while simpler queries are often more readable in method syntax. To the compiler, though, they’re equivalent: it translates query syntax into the same method calls you would write by hand. Often, companies will set a standard where one flavor of LINQ is used to the exclusion of the other. This usually isn’t for performance reasons, but rather for consistency within the code base.
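One way to convince yourself of this is to write the same query both ways and compare the results. This is only a sketch with invented data, but it shows the two styles lining up clause for clause.

```csharp
using System;
using System.Linq;

var words = new[] { "pear", "fig", "banana", "kiwi" }; // hypothetical sample data

// Query syntax.
var viaQuery =
    from w in words
    where w.Length == 4
    select w.ToUpperInvariant();

// Method syntax: the same filter and the same projection.
var viaMethods = words
    .Where(w => w.Length == 4)
    .Select(w => w.ToUpperInvariant());

Console.WriteLine(string.Join(", ", viaQuery));        // PEAR, KIWI
Console.WriteLine(viaQuery.SequenceEqual(viaMethods)); // True
```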
C# Select and Where: LINQ Cornerstones
LINQ exposes a variety of APIs for working with collection data. The two building blocks that are nearly universal are the Where and Select clauses. At first blush, it can be difficult to understand how these clauses work together and how they’re different. The distinction is actually very simple: a Where clause filters your data, deciding which elements are kept. The C# Select clause defines the shape the data takes, transforming each kept element into whatever form you ask for.
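Here’s a small sketch of the two working together. The products array and its fields are hypothetical; tuples are used just to keep the example short.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: (Name, Price, InStock) tuples.
var products = new[]
{
    (Name: "Keyboard", Price: 49.99m, InStock: true),
    (Name: "Monitor", Price: 199.00m, InStock: false),
    (Name: "Mouse", Price: 19.50m, InStock: true),
};

// Where filters: the element type stays the same, but there are fewer elements.
var available = products.Where(p => p.InStock);

// Select shapes: each remaining tuple becomes just its name, a plain string.
var availableNames = available.Select(p => p.Name);

Console.WriteLine(string.Join(", ", availableNames)); // Keyboard, Mouse
```

Where never changes what an element looks like, and Select never changes how many elements there are.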
One important thing to understand about both of these clauses is that they don’t modify the data itself. When you run a Where or Select clause on your data, you’re not changing the contents or shape of the collection you started with. In fact, if you’re using LINQ to query a database, your code may not be holding any data in memory at all when you write the query. Instead, Where and Select return new sequences that contain the results of their operations. This often throws new developers for a loop. They expect that when they use LINQ on a collection, the collection itself changes. It doesn’t.
This is a powerful trait of LINQ. It means that you can interact with a single data set in multiple ways without needing to do any extra work.
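A quick sketch, using an invented list of numbers, shows several different views built from one collection while the original stays exactly as it was.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 5, 3, 8, 1 }; // hypothetical sample data

// Three independent views over the same source list.
var sorted = numbers.OrderBy(n => n);
var doubled = numbers.Select(n => n * 2);
var big = numbers.Where(n => n > 4);

Console.WriteLine(string.Join(", ", sorted));  // 1, 3, 5, 8
Console.WriteLine(string.Join(", ", doubled)); // 10, 6, 16, 2
Console.WriteLine(string.Join(", ", big));     // 5, 8

// The source list still has its original contents and order.
Console.WriteLine(string.Join(", ", numbers)); // 5, 3, 8, 1
```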
C# Select and Where Don’t Return Data by Themselves
Another common mistake I see beginners make when it comes to dealing with LINQ: they expect that calling C# Select or Where will return data. This isn’t true. Instead, those functions return an IEnumerable object which represents that data. It isn’t until your code tries to do something with that collection that the code actually tries to retrieve results.
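You can watch this happen by counting how often the lambda runs. The sketch below uses a made-up list and a counter variable (predicateCalls is just an illustrative name); ToList is one of the standard calls that forces the query to actually run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4 }; // hypothetical sample data
int predicateCalls = 0;

// Defining the query does not run the lambda.
var large = numbers.Where(n =>
{
    predicateCalls++;
    return n > 2;
});

Console.WriteLine(predicateCalls); // 0: nothing has been evaluated yet

// Asking for the results is what triggers the work.
List<int> materialized = large.ToList();

Console.WriteLine(predicateCalls);                  // 4: once per element
Console.WriteLine(string.Join(", ", materialized)); // 3, 4
```

One side effect worth knowing: enumerating large a second time would run the lambda four more times, which is why results are often captured once with ToList or ToArray.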
This is important because it allows LINQ to use what’s called deferred execution. Basically, if you have an IEnumerable representing a collection of data, C# won’t try to fill out that collection until you ask for the data it contains. What’s more, if you ask for items one at a time, C# will only retrieve those items one at a time. There are significant caveats to this statement. When you’re retrieving rows from a database, for instance, the rows won’t be fetched one by one. But imagine that you have an IEnumerable which represents the Fibonacci sequence. This is an infinite sequence of numbers. If you had to materialize all the results before you could use them, your program would never complete. Instead, C# computes items only as you ask for them. If you only ever use the first ten numbers in the collection, that’s all C# computes.
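The Fibonacci idea can be sketched with a C# iterator. The Fibonacci helper below is a hypothetical function written for this example, not part of LINQ; Take is the standard LINQ method that stops asking for items after a given count.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: yields Fibonacci numbers forever, one per request.
// (A long would eventually overflow, but we never get anywhere near that here.)
static IEnumerable<long> Fibonacci()
{
    long current = 0, next = 1;
    while (true)
    {
        yield return current;
        (current, next) = (next, current + next);
    }
}

// Take(10) stops pulling after ten items, so the infinite loop above
// only ever advances ten times.
List<long> firstTen = Fibonacci().Take(10).ToList();

Console.WriteLine(string.Join(", ", firstTen));
// 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
```

Calling ToList directly on Fibonacci() without Take would never finish, which is exactly the problem deferred execution avoids.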
LINQ: Easy to Learn, Lots to Unlock
This is a pretty basic overview of LINQ. It contains a lot of tips that I wish I’d had someone tell me when I first started learning to use it. There are a lot of links to documentation in this article. I highly recommend spending some time reading Microsoft’s C# documentation about LINQ. It comes with piles of easy-to-understand code samples. Those samples go into greater detail than we have the space for here, but they’re still very straightforward.
One of my favorite attributes of LINQ is that it’s very simple to pick up, and it’s useful even for basic operations. If you have a list of integers and you want to pluck the even numbers, LINQ makes that trivial. You’ll wind up writing fewer lines of code and it’ll be more readable than if you tried to do it yourself. If that’s all you take away from this post, that’s a great start.
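For comparison, here’s a sketch of that even-numbers task written by hand and then with LINQ. The list contents are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4, 5, 6 }; // hypothetical sample data

// Doing it yourself: a result list, a loop, and a condition.
var evensByHand = new List<int>();
foreach (var n in numbers)
{
    if (n % 2 == 0)
    {
        evensByHand.Add(n);
    }
}

// With LINQ: one line that says what you want.
var evensWithLinq = numbers.Where(n => n % 2 == 0).ToList();

Console.WriteLine(string.Join(", ", evensByHand));   // 2, 4, 6
Console.WriteLine(string.Join(", ", evensWithLinq)); // 2, 4, 6
```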
As you get more familiar with C# Select and Where clauses, you’ll discover new abilities LINQ has to offer. Each time you’ll get a little better and your code will get a little cleaner. So the next time you find yourself looking to filter or shape a collection of data, keep LINQ in mind. The code you improve will be your own!