1

When this command is executed while connected to a secure cluster:

Register-ServiceFabricApplicationType -ApplicationPathInImageStore 'MyType' -TimeoutSec 600 -Debug -Verbose 

...it throws a timeout exception. I can run Copy-ServiceFabricApplicationPackage without problems, so it's clearly possible to connect to the cluster.

Registering an app type shouldn't be a heavy operation, so I suspect there is some underlying problem.

Stack trace:

VERBOSE: System.TimeoutException: Operation timed out. ---> System.Runtime.InteropServices.COMException: Exception from HRESULT: 0x80071BFF
   at System.Fabric.Interop.NativeClient.IFabricApplicationManagementClient6.EndProvisionApplicationType(IFabricAsyncOperationContext context)
   at System.Fabric.Interop.Utility.<>c__DisplayClassa.<WrapNativeAsyncInvoke>b__9(IFabricAsyncOperationContext context)
   at System.Fabric.Interop.AsyncCallOutAdapter2`1.Finish(IFabricAsyncOperationContext context, Boolean expectedCompletedSynchronously)
   --- End of inner exception stack trace ---
Register-ServiceFabricApplicationType : Operation timed out.
At line:1 char:1
+ Register-ServiceFabricApplicationType -ApplicationPathInImageStore 'M ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OperationTimeout: (Microsoft.Servi...usterConnection:ClusterConnection) [Register-ServiceFabricApplicationType], TimeoutException
    + FullyQualifiedErrorId : RegisterApplicationTypeErrorId,Microsoft.ServiceFabric.Powershell.RegisterApplicationType

Any help greatly appreciated!

  • How many files are in your code package? We ran into this when we had >1,000 files in a package (not an exact limit). – Haukman Jul 01 '16 at 06:32
  • @Haukman The "pkg" folder has ~6500, but each Service has ~700. The services are not particularly big either, the bulk of the files are in "approot/packages". – Anders Olofsson Jul 01 '16 at 11:38
  • I suspect there is something wrong with the cluster setup: I tried a manual publish (I had only published through VS before) to a non-secure cluster and it worked. So it's either a bad cluster config, certificate problems, or both. – Anders Olofsson Jul 01 '16 at 11:45
  • We had the exact same issue, with about 6500 files (javascript packages). It worked most of the time (not always) to our non-secure cluster, but the production cluster always timed out. We received a workaround from Microsoft, I'll post it here for reference. – Haukman Jul 01 '16 at 17:58
  • node_modules is often responsible. I'm working on not publishing it and running npm on the server at first run instead. – user1496062 Oct 06 '16 at 00:57
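
Since the comments above point at file count as the trigger, it can help to see where the files in a package actually live. The following is a minimal sketch, not part of any Service Fabric tooling: it walks a package folder and counts files per top-level subfolder. The function name and the `pkg` path are illustrative assumptions.

```python
import os
from collections import Counter


def count_files_per_top_level_dir(package_root):
    """Return (total_files, Counter of file counts per top-level folder).

    Files that sit directly in package_root are grouped under ".".
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(package_root):
        rel = os.path.relpath(dirpath, package_root)
        top = rel.split(os.sep)[0]  # first path component below the root
        counts[top] += len(filenames)
    return sum(counts.values()), counts


if __name__ == "__main__":
    # "pkg" is a hypothetical path to the application package folder.
    total, per_dir = count_files_per_top_level_dir("pkg")
    print(f"total files: {total}")
    for name, n in per_dir.most_common():
        print(f"{n:>8}  {name}")
```

If one folder such as node_modules or approot/packages dominates the count, that is the part of the package to look at first.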

2 Answers

7

If this is a cluster hosted in Azure, these settings will be reverted to their default values the next time a cluster code or configuration upgrade is performed. Please change them by setting the value on the Service Fabric resource via Azure Resource Explorer or PowerShell. This ensures that the settings are preserved.

"fabricSettings": [
  {
    "name": "EseStore",
    "parameters": [
      {
        "name": "MaxCursors",
        "value": "32768"
      }
    ]
  }
],
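
If the resource already carries other entries in `fabricSettings`, the EseStore entry has to be merged in rather than pasted over them. As a rough sketch of that merge, assuming the list has the same name/parameters shape as the fragment above, the hypothetical helper below adds or updates a single parameter in place. It only manipulates the JSON structure; it does not talk to Azure.

```python
import json


def upsert_fabric_setting(fabric_settings, section, name, value):
    """Add or update one parameter in a fabricSettings-style list (in place)."""
    # Find the section entry, or create it if it does not exist yet.
    for entry in fabric_settings:
        if entry.get("name") == section:
            params = entry.setdefault("parameters", [])
            break
    else:
        params = []
        fabric_settings.append({"name": section, "parameters": params})

    # Update the parameter if present, otherwise append it.
    for param in params:
        if param.get("name") == name:
            param["value"] = value
            return fabric_settings
    params.append({"name": name, "value": value})
    return fabric_settings


if __name__ == "__main__":
    # The "Security" entry is made-up example data standing in for existing settings.
    settings = [{"name": "Security", "parameters": []}]
    upsert_fabric_setting(settings, "EseStore", "MaxCursors", "32768")
    print(json.dumps({"fabricSettings": settings}, indent=2))
```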

On the local development cluster you can modify the default cluster manifest XML files located in C:\Program Files\Microsoft SDKs\Service Fabric\ClusterSetup\ by adding the following section, and then recreate the cluster.

<Section Name="EseStore">
  <Parameter Name="MaxCursors" Value="32768" />
</Section>
VipulM-MSFT
2

Here's what worked for us in a very similar situation:

Please apply the following mitigation steps (one node at a time) on all nodes:

  1. Go to the data root of the cluster. By default it is “D:\SvcFab\”, or C:\ProgramData\Microsoft\SF if you didn’t specify the datapath attribute in the deployment. Look for folder by name . Inside that folder you will see the file Fabric.Package.current.xml. Create a backup of it.
  2. Open the Fabric.Package.current.xml file and look for the Fabric.Config version.
  3. Go to the config folder for that version and look for the Settings.xml file.
  4. Take a backup of the Settings.xml file, then open it.
  5. Look for the section named “EseStore”.
<Section Name="EseStore">

</Section>
  6. If the section is present, add this parameter to it (type the text into the Settings.xml file rather than copy-pasting, as copy-pasting sometimes appends invalid characters at the end).
<Section Name="EseStore">
  <Parameter Name="MaxCursors" Value="32768" />
</Section>
  7. If the section named “EseStore” is not present, add the whole section with the parameter as shown above, and save the file.
  8. Kill FileStoreService.exe on the node.
  9. Wait for FileStoreService.exe to come back up, then proceed to the next node.

After you have applied the mitigation to all nodes and the cluster is back up and healthy, retry deploying the package. Please make sure to test this on a test machine first before using it in production.
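
The Settings.xml edit in the steps above (find or create the “EseStore” section, then add the MaxCursors parameter) can also be scripted to avoid the copy-paste problem. This is only a sketch under assumptions: it assumes the `Section` elements are direct children of the root element, reuses whatever XML namespace the root declares, and works on the file contents as a string. The function name is made up, and the backup, the FileStoreService.exe restart, and the node-by-node rollout are still manual. Note that ElementTree drops XML comments and the XML declaration when re-serializing.

```python
import xml.etree.ElementTree as ET


def ensure_max_cursors(settings_xml_text, value="32768"):
    """Return Settings.xml text with EseStore/MaxCursors set to `value`.

    Creates the section and/or the parameter if missing; safe to run twice.
    """
    root = ET.fromstring(settings_xml_text)

    # Reuse the root's namespace (e.g. "{uri}") so new elements match it.
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""

    section = next(
        (s for s in root.findall(f"{ns}Section") if s.get("Name") == "EseStore"),
        None,
    )
    if section is None:
        section = ET.SubElement(root, f"{ns}Section", {"Name": "EseStore"})

    param = next(
        (p for p in section.findall(f"{ns}Parameter") if p.get("Name") == "MaxCursors"),
        None,
    )
    if param is None:
        param = ET.SubElement(section, f"{ns}Parameter", {"Name": "MaxCursors"})
    param.set("Value", value)

    if ns:
        # Serialize with a default xmlns instead of an "ns0:" prefix.
        ET.register_namespace("", ns[1:-1])
    return ET.tostring(root, encoding="unicode")
```

A caller would read the backed-up Settings.xml, pass its text through `ensure_max_cursors`, and write the result back before restarting FileStoreService.exe on that node.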

Haukman