Designing an object with a long running constructor

Question

I have a class that is designed to provide quick access to some metadata for some specific files that are used multiple times throughout my application. Unfortunately some of the metadata can only be extracted with a very long running method call.

I have another class that provides an async wrapper for the long running method (can be 5 or more minutes depending on the size of the file), but I'm trying to figure out how to call this async method and if it would be appropriate to put it into the constructor, or if there is a better design pattern for this scenario.

Here's some pseudo code to try and illustrate my question:

public class MetaData
{
    public string Data1 { get; private set; }
    public string Data2 { get; private set; }

    public MetaData(String filepath)
    {
        var extractor = new ExtractMetaData(filepath);  //create instance of class that fetches the metadata

        this.Data1 = extractor.GetData1(); // short running method

        extractor.Data2Received += Data2Received;  
        extractor.GetData2Async();  // long running method, called with via async method

    }        

    private void Data2Received(object sender, MetaDataEventArgs args)
    {
        this.Data2 = args.Data;  // finally set Data2 property
    }
}

class ExtractMetaData
{

    public event Data2ReceivedEventHandler Data2Received;

    public ExtractMetaData (string filePath) { }

    public String GetData1();  // very fast method to get Data1
    public void GetData2Async();  // very slow method to get Data2

}

What I'm trying to figure out is if there is better way to accomplish this?

With my code now, there is virtually no wait for MetaData to be constructed, but if someone were to try to access the MetaData.Data2 property before the GetData2Async() method returns and fires the Data2Received event, they are going to get a null response. But if they call if after it is returned, it will contain the correct information. Since there is really no way to notify the user that this method has completed, I'm worried this will turn into a bad user experience as they don't havr to wait for the constructor, but have to wait for all of the properties to be set.

I think it depends on the context, as in what type of API you are providing, but I would probably make the call available in either an asynchronous callback form, or a blocking standard call. I don't think the wrapper is really going to gain anything in this case, and unless you might be able to return a partial answer, I don't think you can really design your way out of it. Just my opinion, but I would leave the option to the caller, and focus my efforts on optimizing that routine. — Tim, Jan 26 '13 at 02:13
@Tim the wrapper is actually part of a standalone library I wrote for other purposes, it just so happens to provide the async functionality to avoid blocking. It it was just a matter of 10 or 15 seconds, I would probably opt for a blocking method, but when I realized it could be 5 minutes, I started thinking of an alternative (which was calling the async method in the constructor). — psubsee2003, Jan 26 '13 at 02:16
Sorry, I guess what I sort of meant to say was that I don't think there are any tricks you could apply to the wrapper to gain anything in terms of client clarity or performance (unless partial answers have some use). The wrapper itself still allows you to separate the 'ugh this method is slow so I need to encapsulate it' logic from the parsing logic, so I'm sure it's worthwhile. — Tim, Jan 26 '13 at 02:19
@Tim as for optimizing the routine, that is impossible... the actual long running method is an external library that I have no control over. My options are really only, live without the data obtained in the long running method, or live with the long execution time. I opted for the latter since the extra data is extremely useful for some parts of my API. — psubsee2003, Jan 26 '13 at 02:29

score 2 · Accepted Answer · answered Jan 26 '13 at 03:48

First, you said that there is no way to notify the user that getting Data2 completed. That's not true, you can use any number of ways to notify the user of that, for example an event or Task.

But I think you should actually restructure your class. You said that getting Data2 takes very long, which most likely means it uses a lot of resources. Because of that, I think you shouldn't even try to initialize Data2 unless you have to. And how do you know that? The user will have to tell you. And ideally, if the user doesn't want Data2, he shouldn't even be able to access it, which means splitting MetaData into two classes: something like BasicMetaData and ExtendedMetaData, which inherits from BasicMetaData.

In ExtendedMetaData, either you would have some way of notifying the user that the initialization completed (most likely using an event), or you could make the constructor wait until the initialization is complete (you could use Monitor.Wait() and Monitor.Pulse() to do that).

Personally, I think the best option would be if you had a static factory method that would return Task<ExtendedMetaData>. That way, the user can wait for the result synchronously (using Result) or asynchronously (using ContinueWith() or await where it's available). This would be especially useful in .Net 4.5 (because of await), but also on .Net 4.0. Unfortunately, the tag on your question indicates that you're using .Net 3.5, which doesn't have Task. I would recommend you to upgrade, if possible.

I'm actually using .Net 3.5 for other reasons (mostly backward compatibility with other applications that are fixed to .Net 3.5 or I would be targeting .Net 4). But I like your suggestion about the Base and Extended classes. It complicates matters slightly as I have a class that already inherits from my `MetaData` class, but I can split that up into a Base and Extended type as well. — psubsee2003, Jan 26 '13 at 08:20

score 1 · Answer 2 · answered Jan 26 '13 at 02:08

1

I think you need to spend your attention on the following patterns: Lazy loading (to call 'long' methods only when you really need it) and Proxy (to implement caching layer if needed, hide internal realization, maybe multiple various different patterns on the bottom of your object). If you decide to use several objects to ensure whole functionality - then Facade also could be reasonable choice.

answered Jan 26 '13 at 02:08

Ph0en1x

9,943
8
48
97

I thought about Lazy loading, but as I commented on DarthVader's answer, since I am dealing with access a file on a local or network drive, I'd rather deal with any IO Exceptions when I call the constructor, and not when I first need the data since my code will be in a better position to handle the exception at that time. I'm afraid I don't know much about the Proxy and Facade patterns, but I'll look into them and see if they will suit my needs. – psubsee2003 Jan 26 '13 at 08:32

score 0 · Answer 3 · edited May 23 '17 at 10:34

0

You will get several different answers for this question. Here is my take on it.

IMO, you shouldn't call any operation in your constructor, such as what you are doing right now. All that stuff in your MetaData constructor shouldn't be there to start with.

While you are instantiating an object, it might throw an exception, which is fine but your object won't be constructed. Some of the best practices. constructor should be short running and should ensure that object graph will be created after constructor.

Look at this question also: How much code should one put in a constructor?

Alternatively, you should inject your dependencies and create methods to populate the data.

It would be more helpful, if you could describe your problem little more.

You really need to simplify your process and design.

edited May 23 '17 at 10:34

Community

1
1

answered Jan 26 '13 at 03:49

DarthVader

52,984
76
209
300

On the other hand, constructor should return a fully initialized usable object, not something that might start working sometime in the future. – svick Jan 26 '13 at 03:52
No!. ie: You can have a class for DB ORM and for any reason, this class might fail due to many reasons. That s why you have exceptions. But constructor is no way a place to do all these operations. – DarthVader Jan 26 '13 at 03:53
I think constructor is exactly the right place for initialization, that's what it's there for. And I'm not sure I follow your DB example, but if class is going to fail, I want to fail it as soon as possible. And I don't think requiring every public method of that class to call something like `InitializeIfNecessary()` (which is required by your approach) is a good idea. – svick Jan 26 '13 at 04:04
While I think your point has a lot of merit, I agree with svick on the initialization, especially for this scenario. What good is an object if it is not ready to be used once the constructor returns? Since this object is accessing a file (which could be on a local or network drive), I'd rather get the IO exception right away when I can best handle the exception rather than some time in the future when the code block has no knowledge of what to do about it. – psubsee2003 Jan 26 '13 at 08:26

Designing an object with a long running constructor

3 Answers3