Upgrade to v 1.7.7 -> daily or sometimes hourly client freezes

Since upgrading all clients and the CMS to v 1.7.7, we’re
getting daily or sometimes hourly client freezes on multiple (10) clients
(there are no error messages on screen at that time). The campaign will get stuck
on a certain layout, or a layout will be missing a region, and the client will not
switch to the next scheduled layout.

CMS shows the following errors (and maybe others, but I
haven’t correlated them with the crash times yet). They involve various layouts
without a recognizable pattern and don’t happen every time with a particular
layout; it will run fine for a while and then break:

[UI Thread] MainForm - ChangeToNextLayout

Prepare Layout Failed. Exception raised was: The process
cannot access the file ‘C:\Users\Xibo\Documents\Xibo Library\247.xlf’ because
it is being used by another process.

Change to C:\Users\Xibo\Documents\Xibo Library\247.xlf
failed. Exception raised was: The process cannot access the file
’C:\Users\Xibo\Documents\Xibo Library\247.xlf’ because it is being used by
another process.

System is Minix Z64 with Windows 10; Intel drivers, .NET
Framework and Windows Update are all up to date.

Any ideas?

So if all your drivers are up to date, perhaps check your library location in CMS settings - in 1.7.7 it must be a fully qualified path.

Please see ‘Library location’ section of this post Xibo CMS Post-Installation Setup Guide

Library path is ok, it already was fully qualified

There wasn’t a massive number of changes between 1.7.6 and 1.7.7 - Can you revert back to the 1.7.6 player on one device and see if the problem goes away?

The 1.7.6 actually used to crash the client with the
"Object reference not set to an instance of an object…" error on
screen, so although more frequent, this is sort of better :slight_smile: Everything up until
1.7.6 ran pretty much perfectly.

Well, in the meantime we tried about a million things
with this setup: fiddled with a bunch of settings, added exclusions in
Windows Defender, installed new graphics drivers that came out in the meantime,
several Windows updates, Xibo watchdog, auditing the screen, examining the event
log… but to no avail. The above client error is about the only thing that
comes up related to the client “stalling”.

We have discovered that simply going into schedule and
"re-saving" any of the scheduled layouts or campaigns for that
screen, will in fact make it continue running normally without the need to
restart the client.

Since none of this is happening with our identically
configured non-Minix devices I’m assuming the problem is somewhere with the
Minix and Windows 10 drivers, but the error still appears after 2 intel HD
driver upgrades.

We’re now attempting to use a repeating priority slide
every few hours to see if it will automatically “kick” the player out
of it once it freezes, but this has not been running long enough to draw any
conclusions.

Is there anything else you can suggest we try, or should
we just wait it out until 1.8?

Dear All, the above problem haunted me for over 2 weeks,
and today it was discovered that the thumbnail_db cache was hanging the layout change and giving the error below:

Error: 5/9/2016 12:25:17 PM | UI Thread | MainForm - ChangeToNextLayout | Prepare Layout Failed. Exception raised was: The process cannot access the file 'C:\Users\ScalaPlayer\Documents\Xibo Library\30.xlf' because it is being used by another process.
Error: 5/9/2016 12:25:17 PM | UI Thread | MainForm - ChangeToNextLayout | Layout Change to C:\Users\ScalaPlayer\Documents\Xibo Library\30.xlf failed. Exception raised was: The process cannot access the file 'C:\Users\ScalaPlayer\Documents\Xibo Library\30.xlf' because it is being used by another process.

However, since this morning I haven’t encountered this issue on any of my players yet… over 6 hours of smooth running…

I had disabled the thumbnail cache with the registry file below:
copy, paste and save as Disable_Thumbnail_Cache.reg

Windows Registry Editor Version 5.00

; Created by: Osman Irfan

[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer]
"NoThumbnailCache"=dword:00000001
"DisableThumbnailCache"=dword:00000001

[HKEY_CURRENT_USER\Software\Policies\Microsoft\Windows\Explorer]
"DisableThumbsDBOnNetworkFolders"=dword:00000001

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer]
"NoThumbnailCache"=dword:00000001
"DisableThumbnailCache"=dword:00000001

[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\Advanced]
"DisableThumbnailCache"=dword:00000001
"NoThumbnailCache"=dword:00000001

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\Advanced]
"DisableThumbnailCache"=dword:00000001
"NoThumbnailCache"=dword:00000001


After this I was monitoring the process with Sysinternals Process Monitor (Procmon.exe) with the filter [RESULT is “SHARING”]

so far no errors encountered…

Cheers…

Osman Irfan

Thank you for the advice. Your registry tweak, however, did not seem to help us. Are you running your clients on Windows 10?
As you suggested we’re now running process monitor on the test client in order to catch any sharing events and will go from there. Thanks again.

Yes, it’s on Win 10 and Win 7

Good Luck…

Well I am pleased we fixed that one at least.

You might consider looking at this tool: Handle - Sysinternals | Microsoft Learn

That is interesting - can you report back when you are sure there are no further problems? If so we can include the registry changes in the manual and perhaps installer.

Although the registry tweak has not fixed the problem entirely, the problem does seem to occur less frequently since we applied it.

It’s worth noting that we have also disabled the “Display file icon on thumbnails” and enabled “Always show icons, never thumbnails” options in file explorer option settings (Win10).

Also, the scheduled priority slide workaround that I mentioned earlier does actually work as far as “restarting” client, or rather making it continue showing content normally (so I guess the issue could be somehow handled entirely from the client side?).

But this is not really elegant as far as schedule goes, and depending on crash vs priority slide timing, stalled client could still end up showing broken content for a while until next priority slide.

PS: Of course the test client that has the process monitor running so we can pinpoint the cause, has not crashed in several days now :smiley:

Haha, always the case!!

It is possible that Xibo itself is responsible for locking the file we are trying to read, which would explain how moving to a different layout and then back again clears the problem.

Unfortunately I think the root cause we need is the process locking the file - without that we are just guessing. An odd thing to say but I hope your test player freezes soon so we can analyse the results.

Yes Jasmina, I agree, after this it has reduced a lot… however the above error still happens, mostly in the mornings… I believe it happens after the dataset is edited… and when it tries to upload to the player…

i even gave the exclusive rights to the folder as below

As Dan said, Xibo itself is locking the file and not releasing it, and the worst part is that after that the player hangs / freezes… and if you then look at the logs and the stats, it is constantly communicating with the CMS,

and once you go to the layout in editing and save the layout, the CMS automatically pushes the layout to the player… and it works normally.

I have even configured XiboClientWatchdog, but it does not help either; as I said, the player is working normally.

Dan, if the watchdog configuration file could contain a timed check to update the layout, this would solve our major issue…

If you could advise the command to update the layout to be put in the XiboClientWatchdog.exe.config, I would highly appreciate it.

Thank you in advance…

Warm Regards…

Osman Irfan

We really need to prove this as at the moment we are supposing that is the case. This tool should show what has locked the file if you can catch it in the locked position (which sounds like it should be easy).

Unfortunately that isn’t going to work - as you said the player is actually working “normally” as far as it knows. The watchdog doesn’t have any CMS communication abilities and it would be better to keep it that way. The more logic that ends up in the watchdog makes the watchdog more likely to have a problem - and you end up with the watchdogwatchdog, etc :slight_smile:

This does make it sound like it is Xibo itself that has the file locked - BUT I would have thought if there was the potential for this to happen - we’d be seeing it a lot. Running the tool and getting proof would be helpful.

Meanwhile I will look for file operations that might not be disposed correctly. I can’t find any, I’m afraid, so we will need to see the results of running the lock detection.

Well Murphy’s law in effect, the test client did stall, but the process monitor was not running for some reason; possibly Windows update rebooted the machine and the problem event occurred after reboot (which did restart Xibo but didn’t restart process monitor :D).

Anyhow, we are now running process monitor (GUI version of Handle) on ALL our 10 Minix devices so it should not be long until we catch the culprit. Although problems have definitely decreased significantly after previously discussed actions, they still occur on at least one or two clients daily.

However, I’m a bit confused as to what exactly we should be looking for with process monitor (since I can’t really locate any comprehensive Sysinternals manuals online).

Recording ALL processes, even if only for Xibo Library folder, for several days would probably be problematic for Minix machines. Irfan suggested filtering by “Result is SHARING” but I’ve only been able to find “SHARING VIOLATION” as a similar result state. So we are currently using “Result CONTAINS SHARING” filter. Any ideas on this?

Sorry if I am missing the point - but you don’t need to leave process monitor (or handle) running? Do you not just need to, for the want of a better word, “notice” a frozen player and then task switch out of Xibo to process monitor and see what has the file in the error message locked?

Tried that earlier with the process explorer and there wasn’t anything suspicious going on with the affected files. Seems to me that whatever locks the file doesn’t keep it locked: the event happens, the Xibo client freezes the display on a certain layout, and everything continues running normally in the background (schedule updates and everything).

As we mentioned earlier, any content updates on CMS side or even previously scheduled priority layouts (once they come about in the schedule) will make the client continue running normally again, no manual intervention on the client side is necessary.

So I’m guessing we really need to catch it as it happens?

Dan, as I said earlier, the file gets locked as the log indicates, but if you run Handle or Process Explorer there is no indication of it. The lock is in Xibo itself, on the xx.xlf file; the only thing that happens after that lock is that the player shows the layout but not the content of the layout that is populated from the Datasets.

This is the reason why I suggested some kind of mechanism in the player itself to refresh the layout periodically, as it has been proved that if you go to the layout in design mode on the CMS and save it without any changes, it gets refreshed on the player, which means the player is in sync with the CMS.

While going through various sites I came across the example below:

Ways to avoid

Don’t forget I/O operations can always fail, a common example is this:

if (File.Exists(path))
    File.Delete(path);

If someone deletes the file after File.Exists() but before File.Delete(), then it’ll throw an IOException in a place where you may wrongly feel safe.

Whenever it’s possible, apply a retry pattern, and if you’re using FileSystemWatcher, consider postponing action (because you’ll get notified, but an application may still be working exclusively with that file).
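
To make the two points in the quoted advice concrete, here is a minimal sketch in Python (not the player's own C#, and not code taken from Xibo). It assumes a sharing violation surfaces as PermissionError, which is how Python reports it on Windows; the function names and the attempts, delay and opener parameters are made up for illustration.

```python
import os
import time


def delete_if_present(path):
    """Delete path without a separate existence check; return True if deleted."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        # Someone else removed the file first, so there is nothing left to do.
        return False


def read_with_retry(path, attempts=5, delay=0.2, opener=open):
    """Read a text file, retrying while another process still holds it.

    attempts, delay and opener are illustrative parameters, not Xibo settings.
    """
    for attempt in range(1, attempts + 1):
        try:
            with opener(path, "r", encoding="utf-8") as handle:
                return handle.read()
        except PermissionError:
            # Assumption: the lock is temporary, so wait briefly and try again.
            if attempt == attempts:
                raise  # Give up and let the caller see the original error.
            time.sleep(delay)
```

If the lock never clears, the last attempt re-raises the error, so the caller still finds out about it instead of hanging silently.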


Also, as I mentioned earlier, this happens only after data gets edited in the dataset, which might point to something. I believe the dataset field is live: whatever you edit on the CMS portal gets shown, as there is no update button there. That is where the lock might be happening, with the player not getting refreshed after that point.

thanks…

Grrrr, turns out catching something as it happens is not that straightforward.

Process monitor apparently populates the “results” (and only the “results”: http://forum.sysinternals.com/topic8791_post37368.html#37368) column asynchronously, so you can’t really drop filtered events “realtime” by filtering the “result” column. If we’re not dropping filtered events, the sheer amount of logs will kill the client before the Xibo client crashes.

So, we’re back to only the test client and filtering by “path contains”: “Xibo Library”, and “xlf”, and excluding: “path ends” with: jpg, png and xml, which produces logs at a more moderate (and hopefully manageable) rate. We’re going to leave it like that and see what happens since I currently see no other way of going about this.
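
Since the result column cannot be used to drop events while capturing, one option is to apply that part of the filter afterwards to a log saved as CSV. The sketch below is only an illustration: it assumes the export has "Path" and "Result" columns, and the function name is made up. It repeats the path rules described above and keeps only rows whose result mentions SHARING.

```python
import csv
import io

# Same exclusions as the capture filter described above.
EXCLUDED_SUFFIXES = (".jpg", ".png", ".xml")


def sharing_violations(csv_text):
    """Return rows for .xlf paths in the Xibo Library whose result mentions SHARING.

    Assumes the CSV has "Path" and "Result" columns; adjust if the export differs.
    """
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = (row.get("Path") or "").lower()
        result = (row.get("Result") or "").upper()
        if "xibo library" not in path or "xlf" not in path:
            continue  # Not one of the layout files we care about.
        if path.endswith(EXCLUDED_SUFFIXES):
            continue  # Media and XML files are excluded, as in the capture filter.
        if "SHARING" in result:
            hits.append(row)
    return hits
```

When reading a saved log from disk, pass the file contents as text, for example open(path, encoding="utf-8-sig").read(), so that a leading byte order mark does not end up in the first column name.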

Are you running the maintenance script?
I’m trying to figure out if I can run it from the command line…

I tried it from Scheduled Tasks as per the document, but it keeps running.
But when I run it from the command line:
c:\xampp\php\php-win -f c:\xampp\htdocs\xibo\maintenance.php ik

it does not give me any confirmation, but it seems to have completed running the script, and the statistics page shows no info… (which means the script line is working)

Since I did that yesterday and today, I have not yet encountered the error…

That is very different from what I understand to be the problem - my understanding was that you received the XLF locked message and the player reverted to the default layout or splash screen. This is the first mention of it actually showing the layout without data? I don’t think Jasmina has the same issue.

It isn’t safe to assume that the file was not updated on the player during this operation. In fact it is likely that it was. As you say, by this point it is clear that the file is not locked any more.

It will already do that - the layout it ends up showing (default / splash screen) will have a duration associated with it. The layout will be shown for that duration and then the player will go back to the schedule and try again.

You have the player and the CMS running on the same PC? I didn’t realise that… are you 100% sure that your CMS library and player library locations are different?

I don’t think this can be a CMS problem - it must be something player side.

Sure, I am aware of that already - there will always be race conditions with files. However this is a regular, repeatable occurrence which rules out a race condition. In addition we are only opening the file with exclusive locks in 1 place - the download thread - everywhere else has a shared lock for reading only.

In short I don’t think Xibo is locking the file.

If you have handle.exe installed I might be able to give you a modified player exe which can get the results of handle when the error occurs.

Allow me a little while to look into that.
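
For anyone who wants to try the same idea outside the player in the meantime, the sketch below shows roughly what capturing Handle output at the moment of the error could look like. It is a Python illustration only, not the modified player: it assumes handle.exe is on the PATH, the function names and the runner parameter are made up, and the only Handle arguments used are -accepteula and a name fragment to search for.

```python
import subprocess


def handle_command(path_fragment, exe="handle.exe"):
    """Build a Handle command line that searches open handles by name fragment."""
    # -accepteula avoids the first-run licence prompt blocking an unattended run.
    return [exe, "-accepteula", path_fragment]


def report_lock_holders(path_fragment, runner=subprocess.run):
    """Run Handle and return its text output, or a note if it could not be run.

    runner is an illustrative hook so the call can be replaced in a dry run.
    """
    try:
        done = runner(
            handle_command(path_fragment),
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"handle.exe could not be run: {exc}"
    return done.stdout
```

Calling report_lock_holders("247.xlf") from the code path that catches the "being used by another process" error and writing the result to the log would record which process, if any, held the file at that instant. Handle may need to run elevated to see handles owned by other users.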