The .NET Stacks #16: App trimming and more on ML.NET

We discuss app trimming in .NET 5, more on ML.NET with Luis Quintanilla, and more!

Dave Brock

Welcome to 2020, where Thomas Running is finally getting the credit he deserves.

With the last .NET 5 preview hitting last week, you’d think we wouldn’t have much to talk about! Oh, no—there is always something to talk about.

This week:

  • App trimming in .NET 5
  • More with Luis Quintanilla on ML.NET
  • Community roundup

App trimming in .NET 5

With .NET 5 preview 8 shipped last week, Microsoft has been pushing a lot of articles about its performance improvements. This week was no exception, as Sam Spencer discussed how app trimming will work in .NET 5. It’s not as sexy as Blazor, but it’s crucially important.

A huge driver for .NET Core over .NET Framework, among several other things, is self-contained deployments. There’s no dependency on a framework installation, so setup is easier—but the apps themselves are much larger. With .NET Core 3, Microsoft introduced app trimming (also called assembly linking), which optimizes your deployment size. Long story short, it only packages the assemblies your app actually uses.

That’s great if you forget to remove dependencies you’re no longer using, but the real value comes from looking inside those assemblies. That’s what’s coming with .NET 5: app trimming has been expanded to also remove types and members your application doesn’t use. This seems both exciting and scary: it’s risky enough to require extensive testing, and as a result Spencer notes it’s an experimental feature not yet ready for broad adoption. With that in mind, the default trimming in .NET 5 remains assembly-level, but you can use a <TrimMode>Link</TrimMode> setting in your project file to enable member-level trimming.
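As a quick sketch, opting in looks something like this in your project file—`PublishTrimmed` turns on trimming during `dotnet publish`, and `TrimMode` switches from the assembly-level default to member-level (link) trimming:

```xml
<!-- .csproj sketch: enable trimming on publish, then opt into
     member-level ("link") trimming instead of the assembly-level default -->
<PropertyGroup>
  <PublishTrimmed>true</PublishTrimmed>
  <TrimMode>Link</TrimMode>
</PropertyGroup>
```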

This is not set-and-forget, though: the trimmer only performs a static analysis of your code and, as you likely know, .NET Core leans heavily on dynamic code patterns like reflection—especially for dependency injection and object serialization. Because the trimmer can’t discover types that are only resolved at runtime, the results could be disastrous. What if you dynamically discover an interface at runtime, the analyzer doesn’t find it, and the types are trimmed out? Instead of trying to resolve every problem for you—which would never be a perfect process—the .NET 5 approach is twofold: the trimmer gives you feedback when it isn’t sure about something, and you get annotations to help the trimmer, especially when dealing with these dynamic behaviors.
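To get a feel for the annotations, here’s a minimal sketch using the `DynamicallyAccessedMembers` attribute that .NET 5 introduces for exactly this purpose (the `PluginLoader` class and `pluginType` parameter are hypothetical names for illustration):

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

public static class PluginLoader
{
    // The attribute tells the trimmer to preserve the public constructors
    // of whatever type flows into this parameter—even though the concrete
    // type is only known at runtime via reflection.
    public static object CreatePlugin(
        [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)]
        Type pluginType)
    {
        return Activator.CreateInstance(pluginType);
    }
}
```

Without the annotation, a static analysis has no way to know which constructors `Activator.CreateInstance` will need, so member-level trimming could remove them.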

Speaking of reflection, remember when we talked about source generators last week? The .NET team is looking at using source generators to move functionality from runtime reflection to build time. This improves performance, but it also lets the trimmer analyze your code with a higher level of accuracy.

A big winner here is Blazor: with the release in May, Blazor now utilizes .NET 5 instead of Mono (and, with it, an increased size). Blazor WebAssembly is still a little heavy, and this will go a long way toward making its size more manageable.

Until native AOT comes to .NET—and boy, are we impatiently waiting—this will hopefully clear the path for its success. I’m cautiously optimistic. App trimming has been a bumpy road thus far, but the annotations should make for a better experience—and allow us to confidently trim our apps without sacrificing reliability.

Take a look at the introductory article as well as a deep dive on customization.

Dev Discussions: More with Luis Quintanilla

Last week, we began a conversation with Luis Quintanilla about ML.NET. This week, we discuss when to use ML.NET over something like Azure Cognitive Services, practical use cases, and more.

Luis Quintanilla

Where is the dividing line for when I should use machine learning, or use Azure Cognitive Services?

This is a really tough one to answer because there are so many ways you can make the comparison. Both are great products, and in some areas the lines can blur. Here are a few comparisons that I think might be helpful.

Custom Training vs Consumption

If you’re looking to add machine learning into your application to solve a fairly generic problem, such as language translation or identifying popular landmarks, Azure Cognitive Services is an excellent option. The only knowledge you need to have is how to call an API over HTTP.

Azure Cognitive Services provides a set of robust, state-of-the-art, pretrained models for a wide variety of scenarios. However, there are always edge cases. Suppose you have a set of documents you want to classify, and the terminology in your industry is rare or niche. In that scenario, the performance of Azure Cognitive Services may vary, because the pretrained models have most likely never encountered some of your industry terms. At that point, training your own model may be the better option—which, for this particular scenario, Azure Cognitive Services does not allow you to do.

That’s not to say Azure Cognitive Services never allows you to train custom models. Some scenarios, like language understanding and image classification, have training capabilities. The difference is that training custom models is not the default for Azure Cognitive Services; instead, you’re encouraged to consume the pretrained models. Conversely, with ML.NET you’re always training custom models on your own data.

Online vs Offline

By default, Azure Cognitive Services requires some form of internet connectivity. In scenarios with strong network connectivity, this is not a problem. However, for offline or air-gapped environments, it’s not an option. Although in some scenarios you can deploy your Azure Cognitive Services model as a container—reducing the number of requests made over the network—you still need some form of internet connectivity for billing purposes. Additionally, not all scenarios support container deployments, so the types of models you can deploy are limited.

While Azure in general takes care to responsibly protect the privacy and security of users’ data in the cloud, in some cases—whether it’s a regulatory or organizational decision—putting your data in the cloud may not be an option. In those cases, you can leverage Azure Cognitive Services container deployments. As with the offline scenario, the types of models you can deploy are limited. Additionally, you most likely wouldn’t be training custom models, since you wouldn’t want to send your data to the cloud.

ML.NET is offline first, which means you can train and deploy your models locally without ever interacting with a cloud environment. That being said, you always have the option to scale your training and consumption by provisioning cloud resources. Another benefit of being offline first is that you don’t have to containerize your application to run it locally. You can take your model and deploy it as part of a WPF application without the additional overhead of a container or an internet connection.

Do you have some practical use cases for using ML.NET in business apps today?

Definitely! If you have a machine learning problem, the framework you use is fairly agnostic as long as the scenario is supported. Since ML.NET supports both classical machine learning and deep learning scenarios, it can help solve a wide variety of problems. Some examples include:

  • Sentiment analysis
  • Sales forecasting
  • Image classification
  • Spam detection
  • Predictive maintenance
  • Fraud detection
  • Web ranking
  • Product recommendations
  • Document classification
  • Demand prediction
  • And many others…

For users interested in seeing how companies are using ML.NET in production today, I’d suggest visiting the ML.NET customer showcase.
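As a rough sketch of what one of the scenarios above can look like in code, here’s a minimal ML.NET sentiment-analysis pipeline. The file name, column layout, and the `SentimentData`/`SentimentPrediction` types are all hypothetical—this is just the shape of a typical train-and-predict flow, not a production recipe:

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

public class SentimentData
{
    [LoadColumn(0)] public string Text;
    [LoadColumn(1)] public bool Label;   // true = positive sentiment
}

public class SentimentPrediction
{
    [ColumnName("PredictedLabel")] public bool IsPositive;
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext();

        // Load labeled examples from a hypothetical tab-separated file.
        var data = mlContext.Data.LoadFromTextFile<SentimentData>(
            "sentiment.tsv", hasHeader: true);

        // Featurize the raw text, then train a binary classifier.
        var pipeline = mlContext.Transforms.Text
            .FeaturizeText("Features", nameof(SentimentData.Text))
            .Append(mlContext.BinaryClassification.Trainers
                .SdcaLogisticRegression());

        var model = pipeline.Fit(data);

        // Score a single example with a prediction engine.
        var engine = mlContext.Model
            .CreatePredictionEngine<SentimentData, SentimentPrediction>(model);

        var prediction = engine.Predict(
            new SentimentData { Text = "I love this product!" });
    }
}
```

Swapping in a different trainer (forecasting, image classification, recommendation, and so on) follows the same pattern of composing transforms and calling `Fit`.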

What kind of stuff do you work on over at your Twitch stream? When is it?

Currently, I stream on Monday and Wednesday mornings at 10 AM EST. My stream is a live learning session where folks sometimes drop in to learn together. On stream, I focus on building data and machine learning solutions using .NET and other technologies. The language of choice on stream is F#, though I don’t strictly abide by that and will use whatever works best for getting the solution working.

Most recently, I built a deep learning model for malware detection using ML.NET. On stream, I’ve also been trying to build .NET pipelines with Kubeflow, a machine learning toolkit for Kubernetes. I’ve had some trouble with that, but that’s what the stream is about: learning from mistakes.

Inspired by a Reddit post asking how to get started with data analytics in F#, I’ve started using F# and various .NET tools like .NET Interactive to analyze the arXiv dataset.

If any of that sounds remotely interesting, feel free to check out and follow the channel on Twitch. You can also catch the recordings from previous streams on YouTube.

What is your one piece of programming advice?

Just do it! Everyone has different learning styles, but I strongly believe no amount of videos, books, or blog posts compares to actually getting your hands dirty. It can definitely be daunting at first, but no matter how small or basic the application is, building things is always a good learning experience. A lot of work goes into producing content, so end users typically get the polished product and the happy path. When you’re not sure where to start, or would like to go more in depth, these resources are excellent. However, once you stray from that guided environment, you start making mistakes. Embrace those mistakes—they’re a learning experience.

Check out the full interview with Luis at my site.

🌎 Last week in the .NET world

🔥 The Top 3

📢 Announcements

📅 Community and events

😎 Blazor

🚀 .NET Core

⛅ The cloud

📔 Languages

🔧 Tools

📱 Xamarin

🎤 Podcasts

🎥 Videos

.NET Stacks