A few months ago we migrated to a new VOB server (we have a Solaris/Windows inter-op, Samba-based environment). The transition went smoothly and all seemed well.
Suddenly, after a month or so, some of our users started having problems accessing some files. It all seemed very random, and it was slowly spreading.
In the view log we saw numerous cleartext-related errors:
2012-05-01T12:37:15+03:00 view_server(3124): Error: view_server.exe(3124): Error: Unable to construct cleartext for object "0x96E3D" in VOB "atlas:/disk02/algotec_vobs/3rdparty.vbs ": error detected by ClearCase subsystem
2012-05-01T12:37:15+03:00 view_server(3124): Error: view_server.exe(3124): Error: Type manager "z_whole_copy" failed construct_version operation.
2012-05-01T12:37:15+03:00 view_server(3124): Warning: z_whole_copy: Error: Can't open input file "\\atlas\disk02\algotec_vobs\3rdparty.vbs\s/sdft\e/28/3-6c7f92d47de9400a948cc3a7b52 3a9f9-fb" - Invalid argument
In the samba log we saw the following errors:
[2012/05/01 15:29:15, 1] libads/kerberos_verify.c:442 (ads_verify_ticket) ads_verify_ticket: krb5_init_context failed (Too many open files)
[2012/05/01 15:29:15, 1] smbd/sesssetup.c:342(reply_spnego_kerberos) Failed to verify incoming ticket with error NT_STATUS_LOGON_FAILURE!
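To get a feel for how often each error shows up, the two logs can be scanned for the message fragments quoted above. The following is only a rough Python sketch: the signature names and the count_signatures helper are made up for illustration, and the patterns are simply the strings from our own logs, not an official list of ClearCase or Samba error codes.

```python
import re
from collections import Counter

# Message fragments taken from the log excerpts above.
# The dictionary keys are hypothetical labels, chosen for this sketch only.
SIGNATURES = {
    "cleartext": re.compile(r"Unable to construct cleartext"),
    "open_files": re.compile(r"Too many open files"),
    "logon_failure": re.compile(r"NT_STATUS_LOGON_FAILURE"),
}


def count_signatures(lines):
    """Count how many lines contain each known error fragment."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python scan_logs.py view.log samba.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            print(path, dict(count_signatures(handle)))
```

Running it against the view log and the Samba log from the same day would show whether the cleartext failures and the "Too many open files" messages rise together.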
Since I couldn’t find anything useful on the IBM Rational Support website or via Google, I contacted IBM Technical Support. After several back-and-forth emails, we nailed the problem. It turned out it wasn’t server-side at all!
The affected users were all using relatively new PCs, all installed from the same image. There were two things wrong with this image:
- The CLEARCASE_PRIMARY_GROUP environment variable was not defined.
- The ‘maximum mnodes’ value in the MVFS Performance configuration was set to 800 (suitable for 64-bit Samba) instead of 200 (suitable for 32-bit Samba, which we use).
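Both settings are easy to check before a new image goes out. Below is a minimal Python sketch of such a check. The audit_client function is hypothetical, the expected value of 200 is just what worked in our 32-bit Samba environment, and the mnodes value is passed in by hand (read from the MVFS Performance configuration) because I am not assuming any particular way of querying it programmatically.

```python
import os

# Value that suited our 32-bit Samba setup; an assumption, adjust for your site.
EXPECTED_MNODES = 200


def audit_client(env, max_mnodes, expected_mnodes=EXPECTED_MNODES):
    """Return a list of human-readable problems found in a client's settings.

    env        -- mapping of environment variables (for example os.environ)
    max_mnodes -- the 'maximum mnodes' value, entered manually
    """
    problems = []
    # An unset or empty variable counts as "not defined".
    if not env.get("CLEARCASE_PRIMARY_GROUP"):
        problems.append("CLEARCASE_PRIMARY_GROUP is not defined")
    if max_mnodes != expected_mnodes:
        problems.append(
            f"maximum mnodes is {max_mnodes}, expected {expected_mnodes}"
        )
    return problems


if __name__ == "__main__":
    # Example run using the value found on the faulty image.
    for problem in audit_client(os.environ, max_mnodes=800):
        print(problem)
```

An empty list means the client matches the two settings described above; it says nothing about any other part of the ClearCase configuration.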
After fixing this configuration, the problem was gone and everyone was happy again. Quite a challenge, it was…