Usage of SOSS with NuGet

#1
Hi,

We have a cluster with 4 SOSS nodes. In the past we referenced the DLL directly, and we recently moved to the NuGet package DLL.
After we upgraded our code to work with the NuGet package, we started getting the following error once every 2-3 days
(from the log):
21/07/08 03:11:06 - Service: internal consistency check: 364 qha579
21/07/08 03:11:06 - Tr [sys-ls 0] apm 9253 alcm 4580 nalc 1608376 nt 1191
21/07/08 03:11:06 - Tr [sys-ls 1] oi 51 alcm 6 nalc 187
21/07/08 03:11:06 - Tr [sys-ls 1] oi 53 alcm 400 nalc 498177
21/07/08 03:11:06 - Tr [sys-ls 1] oi 55 alcm 526 nalc 454722
21/07/08 03:11:06 - Tr [sys-ls 1] oi 128 alcm 2946 nalc 49225
21/07/08 03:11:06 - Tr [sys-ls 1] oi 132 alcm 611 nalc 605355
21/07/08 03:11:06 - Tr [sys-ls 1] oi 136 alcm 56 nalc 512
21/07/08 03:11:06 - Event logging message 40000014:
21/07/08 03:11:06 - done; now spawning restarter process
21/07/08 03:11:06 - Tr [stl-ini] 30 2 0 4 - 30 2 0 4
21/07/08 03:11:06 - SOSS: restarting service.
21/07/08 03:11:06 - SCM stop signal received
21/07/08 03:11:06 - Event logging message 40000002: (stop requested)
21/07/08 03:11:06 - Immediate service stop was enabled.
21/07/08 03:11:06 - Authorization: stopped.
21/07/08 03:11:06 - Tr [rep-rns 0] 0.0.0.0
21/07/08 03:11:06 - Stop notification sent to all active hosts.
21/07/08 03:11:07 - Tr [pro-phi 8]
21/07/08 03:11:07 - Tr [pro-psr 0] 54
21/07/08 03:11:07 - Tr [pro-psr 1] 54 10
21/07/08 03:11:07 - Server: AIO completion unexpectedly closing connection to host 10.0.62.219, hdl 54 54, code -1
21/07/08 03:11:07 - Tr [pro-phi 8]
21/07/08 03:11:07 - Tr [pro-psr 0] 76
21/07/08 03:11:07 - Tr [pro-psr 1] 76 10
21/07/08 03:11:07 - Server: AIO completion unexpectedly closing connection to host 10.0.62.219, hdl 76 76, code -1
21/07/08 03:11:07 - Tr [pro-phi 8]
21/07/08 03:11:07 - Successfully disconnected from group
21/07/08 03:11:07 - Tr [pro-psr 0] 66066
21/07/08 03:11:07 - Stop notification delivered to all active hosts.
21/07/08 03:11:07 - Tr [pro-psr 1] 66066 10
21/07/08 03:11:07 - Server: AIO completion unexpectedly closing connection to host 10.0.62.139, hdl 66066 66066, code -1
21/07/08 03:11:07 - Tr [pro-phi 8]
21/07/08 03:11:07 - Tr [pro-psr 0] 66040
21/07/08 03:11:07 - Tr [pro-psr 1] 66040 10
21/07/08 03:11:07 - Server: AIO completion unexpectedly closing connection to host 10.0.62.220, hdl 66040 66040, code -1
21/07/08 03:11:07 - Tr [com-ae 0] 10038
21/07/08 03:11:07 - Network error: server temporarily cannot accept incoming service connection.
21/07/08 03:11:07 - Closing management connection from local client (7ff6537bce20 s 10 r 33050).
21/07/08 03:11:07 - SOSS client lib: event handling cannot access the store for read (0 -1 0).
21/07/08 03:11:08 - Tr [stl-ini] 30 2 0 4 - 30 2 0 4

This caused the service to restart, so all the keys etc. were lost. In the event viewer, we're seeing the following message:
An internal error has occurred. The local service will be restarted. Please contact technical support at ScaleOut Software ([email protected]) for assistance. 364 qha579 5 828


Any idea what could cause this? We're running the latest SOSS server (5.10.9) on every node in the cluster, and we don't see the issue when we use the old DLL (5.0.0).

Thank you.
 

markw

Administrator
Staff member
#2
Did your app’s usage of the [SossIndex] attribute change when you migrated to the Scaleout.Client NuGet package?

The log indicates that the server encountered an error as an object was being added to the cache--the service was indexing properties marked with the [SossIndex] attribute and found an inconsistency, but I cannot determine the exact cause from the log.

Can you share the class definition(s) of the type that’s being stored in your cache (that is, the TValue in the Cache<TKey, TValue> that you’re using)? I’d like to see how the [SossIndex] attribute is used and whether there’s anything unusual.
 
#3
Hi,

Thank you for your answer.
When we migrated to the NuGet package, we added both of the following attributes, [NugetSossIndex, LegacySossIndex], which are defined as:
using NugetSossIndex = Scaleout.Client.QuerySupport.SossIndexAttribute;
using LegacySossIndex = Soss.Client.SossIndexAttribute;

We're using them in many classes, so it's hard for me to know which class is responsible for this.
Is it a problem to use both the old and the new index attribute?

Thank you.
 

markw

Administrator
Staff member
#4
The two libraries ignore each other's [SossIndex] attributes, so using both should be fine. However, for every indexed property, I would check to make sure that the same HashIndexPriority value is passed into the constructor of both versions of the attribute.

But it's interesting that you left the old Soss.Client.SossIndexAttributes in your code--is there any particular reason for this? Do you run a mix of clients that use both the legacy and new libraries?

In any case, I’ve run tests and haven’t been able to reproduce this issue, so any additional information you could provide about your attribute usage would be helpful. Or, better yet, an example that reproduces the issue would be ideal. Some questions:
  • Do you store multiple types in the same cache? If so, do you use inheritance such that properties on the base class are indexed?
  • Do any of your classes use more than eight indexed properties marked with HashIndexPriority.HighPriorityHashable?
  • Is it possible that you have different versions of the same class with different indexed properties in a cache? (For example, perhaps you have multiple versions of your app running side-by-side?)
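
One way to check the first two points across many classes is a small reflection audit. The following is only a sketch under stated assumptions, not part of either ScaleOut library: the class name SossIndexAudit and its methods are hypothetical, the attribute type names come from the using-aliases in post #3, and the limit of eight comes from the question above. It reads attribute metadata by name, so it does not depend on the exact constructor signature of either attribute. It only looks at public instance properties, and it reports a mismatch when one attribute specifies a HashIndexPriority and the other relies on its default, which may be a false positive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper: reports properties whose two [SossIndex] attributes disagree.
public static class SossIndexAudit
{
    // Full type names taken from the using-aliases quoted in post #3.
    const string NugetAttr = "Scaleout.Client.QuerySupport.SossIndexAttribute";
    const string LegacyAttr = "Soss.Client.SossIndexAttribute";

    // Finds an attribute on the property by its full type name, or null.
    static CustomAttributeData Find(PropertyInfo p, string fullName) =>
        p.GetCustomAttributesData()
         .FirstOrDefault(a => a.AttributeType.FullName == fullName);

    // Name of the first HashIndexPriority argument (positional or named),
    // or null when the attribute is absent or does not specify one.
    static string PriorityOf(CustomAttributeData attr)
    {
        if (attr == null) return null;
        var args = attr.ConstructorArguments
            .Concat(attr.NamedArguments.Select(n => n.TypedValue));
        foreach (var arg in args)
        {
            if (arg.ArgumentType.IsEnum && arg.ArgumentType.Name == "HashIndexPriority")
                return Enum.GetName(arg.ArgumentType, arg.Value);
        }
        return null;
    }

    public static List<string> FindMismatches(IEnumerable<Type> types)
    {
        var report = new List<string>();
        foreach (var type in types)
        {
            // Public instance properties, including those inherited from base classes.
            var props = type.GetProperties(BindingFlags.Public | BindingFlags.Instance);
            foreach (var p in props)
            {
                var nuget = Find(p, NugetAttr);
                var legacy = Find(p, LegacyAttr);
                if (nuget == null && legacy == null) continue;

                if (nuget == null || legacy == null)
                    report.Add($"{type.FullName}.{p.Name}: only one of the two attributes is present");
                else if (PriorityOf(nuget) != PriorityOf(legacy))
                    report.Add($"{type.FullName}.{p.Name}: priorities differ " +
                               $"({PriorityOf(nuget) ?? "unspecified"} vs {PriorityOf(legacy) ?? "unspecified"})");
            }

            // Second question above: more than eight high-priority indexed properties.
            int high = props.Count(p => PriorityOf(Find(p, NugetAttr)) == "HighPriorityHashable");
            if (high > 8)
                report.Add($"{type.FullName}: {high} properties marked HighPriorityHashable");
        }
        return report;
    }

    public static void Main(string[] args)
    {
        // Scan the assembly given on the command line, or this program itself.
        var asm = args.Length > 0 ? Assembly.LoadFrom(args[0]) : Assembly.GetExecutingAssembly();
        foreach (var line in FindMismatches(asm.GetTypes()))
            Console.WriteLine(line);
    }
}
```

Running it against the assembly that contains your cached types prints one line per suspicious property or type. An empty result does not rule out the third point (different versions of the same class in one cache), which this check cannot see.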
 

markw

Administrator
Staff member
#5
Update: We're changing the server to return an error to the client instead of tripping a consistency check and restarting. The exception thrown back to the Cache caller will help track down which type is causing the problem.
 