
Proposal to Change Online Sources URLExtract Logic


jhb50

DLNA master

Posts: 2843

Joined: Thu Jun 30, 2011 9:32 pm

Post Thu Jan 10, 2013 4:46 am

Proposal to Change Online Sources URLExtract Logic

Based on my experience using Serviio's support of online media and developing many plugins offering many different feed items to Serviio users, I would like to propose a major change in how online media is handled. Fortunately this change would be transparent to users and existing plugins, yet it would significantly reduce the resources currently wasted extracting URLs that will never be used, which is the major downside of Serviio's online support compared to its competitors.

While I agree that pre-generating metadata for local files is appropriate (why have the file if you do not intend to play it, so playback should be as efficient and fast as possible), I do not agree that it is appropriate for online media.

The goal of an online media plugin is to offer a set of streams to the user, the majority of which will never be played, so it is wasteful to predetermine the stream URLs and generate the metadata for all of them, particularly when a stream may go offline between the point of predetermination and actual playback. All that is really required is to present the user with a list of possible stream items, and then to extract the links and obtain the metadata at playback time.

Currently Serviio uses the extractUrl method at every refresh to:
1. Run the plugin code to determine the stream URL.
2. Run the plugin code to determine the cache_key.
3. If the cache_key is not in the cache, use ffmpeg to get the metadata for the URL and store it in the cache.

And it seems that on playback the logic sequence is:
4. Get the metadata from the cache.
5. Use it to determine whether the file needs to be transcoded.
6. If the stream expires immediately, run extractUrl and determine the stream URL per step 1 above, but skip steps 2 and 3.
7. Transcode the (new) stream URL based on the original metadata.

This results in the first set of metadata being used at playback time, regardless of whether the metadata (other than the stream URL itself) changed before playback, as a new cache_key would have indicated.

It seems it would be a simple change to skip steps 1 through 3 at refresh time and, at playback time, to execute steps 1 to 3 followed by steps 4, 5 and 7. That way, the cycles required to determine online stream metadata would be spent only at playback time, and only for the currently available URL.
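The reordering described above can be sketched in miniature; this is an illustrative Java example under invented names (DeferredExtraction, extract, play are not Serviio's actual API), showing that the eager scheme pays one extraction per listed item on every refresh, while the deferred scheme pays one extraction only for the item actually played:

```java
import java.util.HashMap;
import java.util.Map;

public class DeferredExtraction {
    static Map<String, String> metadataCache = new HashMap<>();
    static int extractions = 0;

    // Steps 1-3 combined: resolve the stream URL and cache its metadata.
    static String extract(String item) {
        extractions++;
        metadataCache.putIfAbsent(item, "metadata-for-" + item);
        return "http://stream.example/" + item;
    }

    // Current behaviour: extract every item on every refresh.
    static void refreshEager(String[] items) {
        for (String item : items) extract(item);
    }

    // Proposed behaviour: refresh only lists the items; extraction is deferred.
    static void refreshDeferred(String[] items) {
        // nothing to do per item - just present the list to the user
    }

    // Playback pays one extraction, for the item actually chosen.
    static String play(String item) {
        String url = extract(item);               // steps 1-3 at playback
        String meta = metadataCache.get(item);    // steps 4-5
        return url + " / " + meta;                // step 7: transcode with fresh data
    }

    public static void main(String[] args) {
        String[] feed = {"a", "b", "c", "d"};
        refreshEager(feed);
        int eagerCost = extractions;
        extractions = 0;
        refreshDeferred(feed);
        play("b");                                // only one item is ever watched
        System.out.println(eagerCost + " vs " + extractions);  // prints: 4 vs 1
    }
}
```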

The advantage of this change is that it would eliminate the time-consuming extraction of all the URLs on each refresh, when the vast majority of that information will never be used, defer that work to playback time, and eliminate the major disadvantage of Serviio when accessing online sources.

Proof of concept is already available in the beeg groovy, where all the URLs share common metadata. Following every refresh, the URLs are generated from a single URL and cache_key rather than extracted, and a generated flag is set along with expires-immediately. Then, when a file is played, it is extracted again because of expires-immediately, and the generated flag causes the real URL to be extracted and played using the common metadata. Unfortunately, because Serviio keeps the existing metadata and does not update it when the extractUrl forced by expires-immediately runs at playback time, it was not possible to also get the actual metadata for the file, as suggested above.

The result of this change has been to eliminate the 22 minutes previously required to extract the 400 beeg URLs with every refresh (they are now generated in under one second), in exchange for less than 10 seconds to extract the true current URL and read its metadata each time one of the beeg videos is actually played.
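That generate-then-resolve trick can be sketched as follows; FeedItem and resolveRealUrl are invented names and the real beeg groovy differs, but the control flow is the same: cheap placeholder URLs at refresh, one real extraction at playback.

```java
import java.util.ArrayList;
import java.util.List;

public class GenerateThenResolve {
    static class FeedItem {
        String id;
        String url;
        boolean generated;
        boolean expiresImmediately;
    }

    // Refresh: fabricate placeholder URLs with plain string work - no network,
    // no ffmpeg - and rely on one shared cache key for the common metadata.
    static List<FeedItem> refresh(List<String> ids) {
        List<FeedItem> items = new ArrayList<>();
        for (String id : ids) {
            FeedItem it = new FeedItem();
            it.id = id;
            it.url = "http://site.example/video/" + id;  // placeholder
            it.generated = true;
            it.expiresImmediately = true;  // forces re-extraction at playback
            items.add(it);
        }
        return items;
    }

    // Playback: expiresImmediately triggers extraction again; the generated
    // flag tells us to do the real (expensive) resolution now.
    static String play(FeedItem it) {
        if (it.generated) {
            it.url = resolveRealUrl(it.id);  // the one real extraction
            it.generated = false;
        }
        return it.url;
    }

    // Stand-in for scraping the site for the true stream URL.
    static String resolveRealUrl(String id) {
        return "http://cdn.example/stream/" + id + ".mp4";
    }
}
```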

Given the lack of value in pre-extracting online streams, and what appears to be a transparent and simple reordering of the logic, I hope this proposal can be given serious consideration.

mqojdn

Streaming enthusiast

Posts: 44

Joined: Thu Jan 03, 2013 6:39 pm

Post Thu Jan 10, 2013 7:38 am

Re: Proposal to Change Online Sources URLExtract Logic

I'd clearly vote against this, because what you say may be true for Beeg, where there are 300+ useless porn vids that no one would ever need, but not for normal feeds, where almost all files are relevant. And I don't want those to take any longer to load.

And right now we have the best of both worlds:
- Preloading for normal feeds,
- The setExpiresImmediately() trick for 100+ items feeds.
And everyone is happy.

And it's actually up to users to decide: either they put limits on the number of feed items and create reasonable requests to filter the results and get what they need in time, or they just add 10 pages of Beeg and wait forever...

jhb50

DLNA master

Posts: 2843

Joined: Thu Jun 30, 2011 9:32 pm

Post Thu Jan 10, 2013 12:43 pm

Re: Proposal to Change Online Sources URLExtract Logic

It really has nothing to do with beeg. The point is that the user should not have to manage anything to have access to the available online feeds via Serviio, just as he does not need to manage anything to access the same feeds on his PC via web pages. Nor should the user need to expend extensive resources on his PC to maintain a list of the currently available feed items on his DLNA client, nor should the source web servers be subjected to a never-ending stream of requests to extract stream URLs that will never be used.

I don't know what you mean by "but not for normal feeds, where almost all files are relevant. And I don't want for these to load any longer.", since it is not true when a plugin offers a list of all currently available sports streams, which change hourly and need constant refreshing, or news feeds that need the same, when only a few would actually be viewed. It makes no sense to extract (your "preload"?) the same URL many times just to have it available should the user choose to play it once, particularly when that single extract at playback time only takes a few seconds.

Simply said, Serviio users should not be limited, as they are now, in the number of feeds they are able to enable or in the frequency of updating the current list of available items, because of the load imposed by extracting URLs during either the initial list population or each refresh.

mqojdn

Streaming enthusiast

Posts: 44

Joined: Thu Jan 03, 2013 6:39 pm

Post Thu Jan 10, 2013 1:32 pm

Re: Proposal to Change Online Sources URLExtract Logic

So where is the problem? As I've said, now we have both:
1. Preloading for normal feeds,
2. The setExpiresImmediately() trick for 100+ items feeds.

And you literally want to leave only the second option...

jhb50

DLNA master

Posts: 2843

Joined: Thu Jun 30, 2011 9:32 pm

Post Thu Jan 10, 2013 4:20 pm

Re: Proposal to Change Online Sources URLExtract Logic

You don't seem to understand that Serviio does not just extract a URL once when it first starts or when a feed is added. The problem is that it regularly refreshes each feed, as set in the console or as dictated by the plugin, and not only refreshes the list but also rebuilds (extracts) the URLs. It is the load imposed by re-extracting the same URLs over and over each time the list is refreshed that is the problem. News plugins need to refresh frequently in order to keep up with the new items that are constantly added, and my sports plugins need to refresh frequently before the start and after the end of each scheduled event in order to maintain the list of currently available event streams. I've already spelled out the consequences for your server, the feed servers and you of continuing with your first option.

mqojdn

Streaming enthusiast

Posts: 44

Joined: Thu Jan 03, 2013 6:39 pm

Post Thu Jan 10, 2013 6:27 pm

Re: Proposal to Change Online Sources URLExtract Logic

Well, you might ask zip to add a flag to extractUrl(), something along the lines of:
  Code:
mandatoryUpdate = true, for playback
mandatoryUpdate = false, for refreshes

So you'd skip refreshes for frequently updated feeds.
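As a sketch, the suggested flag might behave like this; the names here are hypothetical and the real extractUrl signature is different, but the idea is that refreshes reuse an existing result while playback always re-resolves:

```java
public class MandatoryUpdateFlag {
    static int expensiveExtractions = 0;
    static String cachedUrl = null;

    // mandatoryUpdate=true for playback, false for refreshes (per the
    // suggestion above); only the true case pays the extraction cost
    // once a URL has been resolved.
    static String extractUrl(String item, boolean mandatoryUpdate) {
        if (!mandatoryUpdate && cachedUrl != null) {
            return cachedUrl;            // refresh: reuse what we already have
        }
        expensiveExtractions++;          // playback (or first run): re-resolve
        cachedUrl = "http://stream.example/" + item;
        return cachedUrl;
    }
}
```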

That would be ok, instead of proposing to make everyone's experience with Serviio abysmal...


jhb50 wrote:Simply said, Serviio users should not be limited as they are now in the number of feeds they are able to enable or in the frequency of updating the current list of available items because of the load imposed by extracting urls with either the initial list population or each refresh.

Nor should they stare at a blank screen for 15-20 seconds each time a video is started. I'd say online video playback is quite slow as it is, so the effort should go into making it faster, not slowing it down even more.

zip

User avatar

Serviio developer / Site Admin

Posts: 17212

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Sat Jan 12, 2013 2:05 pm

Re: Proposal to Change Online Sources URLExtract Logic

Let me write down what we have at the moment, and we can take it from there.

Currently, there are 2 caches for processing online feeds -

1) a temporary in-memory cache of online feeds and their items, as they are parsed into Java objects. These include descriptive metadata, like title, date, feed URL. The key is the feed URL.

2) a persisted cache of technical metadata for processed items (result of ffmpeg -i). The key is either the item URL or the cacheKey (if provided).

The flow that you described is correct. When the feed expires (or one of its items does), Serviio will try to parse it again (this is the process you describe as update/refresh).
Serviio parses the feed again and calls ffmpeg -i only when the item is not found in cache 2.

To achieve something like what you want, I'd have to not expire the whole cached feed (cache 1), but check whether the item is already in place, and if it is, not run the URL extraction again (if expiresImmediately=true). Obviously, for this to work there would have to be a cache key for each item. In the case of RSS feeds/plugins, it could be the item URL (although this might be difficult to get for some feeds, as each feed includes different elements in its XML body). For web resource plugins I'd have to introduce a new attribute on the WebResourceItem (a cacheKey possibly).

With these changes, Serviio would always run the feed extract methods, but would run the content URL methods only if the item is not found in cache 1 (by the new key). Then, if expiresImmediately=true, it'd run the URL extraction again when playback is requested.
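The per-item check outlined above could look roughly like this; CachedEntry, refreshItem and the one-hour lifetime are illustrative assumptions, not Serviio's real types:

```java
import java.util.HashMap;
import java.util.Map;

public class PerItemRefresh {
    static class CachedEntry {
        String url;
        long expiresAtMillis;
    }

    // Stand-in for "cache 1", keyed by the proposed per-item cache key.
    static Map<String, CachedEntry> cache1 = new HashMap<>();
    static int urlExtractions = 0;

    // On refresh the feed is re-parsed, but URL extraction is skipped for
    // items whose key is already cached and not yet expired.
    static String refreshItem(String cacheKey, long nowMillis) {
        CachedEntry hit = cache1.get(cacheKey);
        if (hit != null && hit.expiresAtMillis > nowMillis) {
            return hit.url;                        // still valid: no extraction
        }
        urlExtractions++;                          // new or expired: extract
        CachedEntry entry = new CachedEntry();
        entry.url = "http://stream.example/" + cacheKey;
        entry.expiresAtMillis = nowMillis + 3_600_000;  // assumed 1h lifetime
        cache1.put(cacheKey, entry);
        return entry.url;
    }
}
```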

Will this achieve the effect you're after?

jhb50

DLNA master

Posts: 2843

Joined: Thu Jun 30, 2011 9:32 pm

Post Mon Jan 14, 2013 7:07 am

Re: Proposal to Change Online Sources URLExtract Logic

Bingo! With one addition: if the expiry date set in extractUrl has passed, the item must be extracted again.

I have in fact implemented this logic in one of my plugins, and what a difference. I can now refresh to see if there are new items, and the regeneration of all the URLs takes less than 1 second!

I simply created 2 global feed item lists and added some elements to WebResourceItem to control and contain the data required to regenerate the URLs.

List1 is populated at the end of extractUrl with the expanded WebResourceItem, so it contains the current list of extracted URLs.
List2 is copied from List1 at the start of extractItems. extractItems creates an itemKey for each item and, if that key is present in List2 and before the expiry date in List2, sets a generate flag and the List2 index on the WebResourceItem. extractUrl can then either regenerate the ContentURLContainer using the pointer to, and the data in, List2, or extract the URL and save the URL data in List1 along with the expanded WebResourceItem, where it is available to copy into List2 when extractItems runs again on a refresh.

Since these changes amount to 5 code blocks within the groovy, I will probably add them to most of mine, but it would be better if this were all done by Serviio under the covers at the start and end of each method.

Add the global list declarations
Add the List1-to-List2 copy at the start of extractItems
Add the itemKey to each WebResourceItem, and add the block following it
Add the generate switch at the start of extractUrl
Add the block at the end of extractUrl to generate the URL and save List1
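The five blocks above boil down to logic along these lines; this is a rough Java sketch with invented names (SavedItem, startRefresh), whereas the actual implementation lives in Groovy inside the plugin:

```java
import java.util.HashMap;
import java.util.Map;

public class QuickRefresh {
    static class SavedItem {
        String url;
        long expiresAt;
    }

    static Map<String, SavedItem> list1 = new HashMap<>();  // built this refresh
    static Map<String, SavedItem> list2 = new HashMap<>();  // snapshot of last one
    static int realExtractions = 0;

    // Start of extractItems: snapshot List1 into List2, then rebuild List1.
    static void startRefresh() {
        list2 = new HashMap<>(list1);
        list1 = new HashMap<>();
    }

    // extractUrl: regenerate from List2 when the item is known and unexpired,
    // otherwise genuinely extract; either way, save into List1 for next time.
    static String extractUrl(String itemKey, long now) {
        SavedItem prev = list2.get(itemKey);
        SavedItem item;
        if (prev != null && prev.expiresAt > now) {
            item = prev;                           // regenerate, no network cost
        } else {
            realExtractions++;                     // real extraction
            item = new SavedItem();
            item.url = "http://stream.example/" + itemKey;
            item.expiresAt = now + 3_600_000;      // assumed 1h lifetime
        }
        list1.put(itemKey, item);
        return item.url;
    }
}
```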

zip

User avatar

Serviio developer / Site Admin

Posts: 17212

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Mon Jan 14, 2013 10:55 am

Re: Proposal to Change Online Sources URLExtract Logic

So my proposal is OK? I didn't get whether you want me to add your 5 blocks or whether that is just a workaround.

jhb50

DLNA master

Posts: 2843

Joined: Thu Jun 30, 2011 9:32 pm

Post Mon Jan 14, 2013 3:27 pm

Re: Proposal to Change Online Sources URLExtract Logic

Well, right now it's a workaround for me until you decide to add the logic to Serviio. I think you should, along with the ability to extend the 30-second timeout on extractItems, and probably the ability to shrink it on extractUrl depending on the plugin author's logic. For example, last night the Coolsports site was very slow and it took over 12 minutes for the 25 feed items to individually time out in extractUrl. 10 seconds would be more than enough for it.

I'll add my 5 blocks to my ilive plugin and send it to you so you can test it and use it as a template.

zip

User avatar

Serviio developer / Site Admin

Posts: 17212

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Mon Jan 14, 2013 4:11 pm

Re: Proposal to Change Online Sources URLExtract Logic


mqojdn

Streaming enthusiast

Posts: 44

Joined: Thu Jan 03, 2013 6:39 pm

Post Mon Jan 14, 2013 7:06 pm

Re: Proposal to Change Online Sources URLExtract Logic

Wouldn't it be simpler and more convenient to add a per-source feed expiry interval?

zip

User avatar

Serviio developer / Site Admin

Posts: 17212

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Mon Jan 14, 2013 8:22 pm

Re: Proposal to Change Online Sources URLExtract Logic

Don't think it's very user-friendly.

jhb50

DLNA master

Posts: 2843

Joined: Thu Jun 30, 2011 9:32 pm

Post Mon Jan 14, 2013 11:03 pm

Re: Proposal to Change Online Sources URLExtract Logic

The goal is to not extract the same URLs again every time we refresh, rather than to control when we refresh to update the list of available feed items. My refreshes for scheduled sports events are controlled by the plugin, not by what is set in the console, and occur frequently to keep up with streams as they become live or are banned afterwards. They should not be limited by the redundant re-extraction of existing URLs.

mqojdn

Streaming enthusiast

Posts: 44

Joined: Thu Jan 03, 2013 6:39 pm

Post Tue Jan 15, 2013 10:14 am

Re: Proposal to Change Online Sources URLExtract Logic

jhb50 wrote:They should not be limited by the redundant reextract of existing urls.


I'm repeating myself here, but these aren't redundant. Extracting that info prior to playback allows skipping it at playback time.
Fast startup of online streams IS the advantage of Serviio over PS3 Media Server and the like, and I believe it should stay that way.

And if some plugin is dealing with frequently updated items, and the plugin's author prefers to prioritize updates over startup time, that logic may as well be coded into the plugin rather than into Serviio, which would enforce those priorities on all online streams.

However, I'm with you on the point that the current Serviio online stream handling is somewhat confusing, but that needs some rethinking and careful planning, rather than eliminating the background stream extraction altogether.

zip

User avatar

Serviio developer / Site Admin

Posts: 17212

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Tue Jan 15, 2013 1:06 pm

Re: Proposal to Change Online Sources URLExtract Logic

mqojdn wrote:I'm repeating myself here, but these aren't redundant. Extracting that info prior to playback does allow for skipping doing that at playback time.
Fast startup of online streams IS the advantage of Serviio over PS3 Media Server and the like. And I believe that it should stay that way.

What we're aiming for is to have the original scan done the same way. It's just that the refresh (as set in the console or per plugin) would not run the extraction logic for feed items that are already there and have not expired, because by definition they are still valid. All the other behaviour will stay the same, i.e. if the URL doesn't expire immediately, fetch it during the refresh; if it does, fetch it at the time of playing.

mqojdn

Streaming enthusiast

Posts: 44

Joined: Thu Jan 03, 2013 6:39 pm

Post Tue Jan 15, 2013 1:17 pm

Re: Proposal to Change Online Sources URLExtract Logic

That should be easy: just add setCacheKey() and setExpiresOn() to WebResourceItem, the same way it's done with ContentURLContainer, and you should be set.

The problems are:
- it's not always possible to determine the lifetime of an item,
- not every plugin developer is aware of the need to set these carefully, or to set them at all.
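A hypothetical shape for that extension, for illustration only; the real WebResourceItem lives in Serviio's plugin API and this sketch merely mirrors the suggested setters and the fallback when they are left unset:

```java
import java.util.Date;

public class WebResourceItemSketch {
    private String cacheKey;
    private Date expiresOn;

    // The two setters proposed above, mirroring ContentURLContainer.
    public void setCacheKey(String cacheKey) { this.cacheKey = cacheKey; }
    public void setExpiresOn(Date expiresOn) { this.expiresOn = expiresOn; }

    // Server-side check: an item with no expiry info falls back to the old
    // behaviour (expire with the feed), so plugins that never set these
    // fields keep working unchanged.
    public boolean isStillValid(Date now) {
        return cacheKey != null && expiresOn != null && expiresOn.after(now);
    }
}
```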

zip

User avatar

Serviio developer / Site Admin

Posts: 17212

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Tue Jan 15, 2013 5:08 pm

Re: Proposal to Change Online Sources URLExtract Logic

If they are not set, I'll expire them the same way as now... so it's up to the plugin dev to use this functionality to enhance their plugins. For most plugins it should not be a big deal, as they usually follow the console setting of max items, but if someone wants to return 400 items, then it should help.

jhb50

DLNA master

Posts: 2843

Joined: Thu Jun 30, 2011 9:32 pm

Post Wed Jan 16, 2013 1:02 am

Re: Proposal to Change Online Sources URLExtract Logic

I'm really pleased with how well this is working.

An additional consideration is that a plugin may be used to provide multiple feeds. Hahasport, for example, may be used to provide both football and tennis feeds, and both may be enabled simultaneously. I had assumed that Serviio would run these as separate instances, so that the football data would be separate from the tennis data, but this does not appear to be the case.

In my implementation, the list of saved items is rebuilt with each refresh and is available for generation when the items are extracted again. I am finding, however, that when tennis refreshes after football, the saved tennis data within the plugin has been replaced by the football data, meaning the plugin has only one instance and the separation of the resulting menus is external to the plugin.

My solution is now to maintain a single list of all items extracted by the plugin, keyed by itemKey regardless of feed, and to purge it of any expired items before each refresh.
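That workaround amounts to something like the following; all names are illustrative (the real code is Groovy inside the plugin), the point being one shared store per plugin in which football and tennis entries coexist under distinct keys and expired entries are dropped before each refresh:

```java
import java.util.HashMap;
import java.util.Map;

public class SharedItemStore {
    static class Saved {
        String url;
        long expiresAt;
    }

    // One store per plugin, shared across all feeds it serves.
    static final Map<String, Saved> savedItems = new HashMap<>();

    // Run before each refresh: drop anything whose expiry has passed.
    static void purgeExpired(long now) {
        savedItems.values().removeIf(s -> s.expiresAt <= now);
    }

    // Keys include the feed, so one feed's refresh cannot clobber another's.
    static void save(String itemKey, String url, long expiresAt) {
        Saved s = new Saved();
        s.url = url;
        s.expiresAt = expiresAt;
        savedItems.put(itemKey, s);
    }
}
```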

Thoughts?

jhb50

DLNA master

Posts: 2843

Joined: Thu Jun 30, 2011 9:32 pm

Post Sun Feb 10, 2013 6:58 am

Re: Proposal to Change Online Sources URLExtract Logic

I have completed my implementation of the proposed Quick Refresh logic in a number of my plugins that require frequent refreshes, and I have documented the process at http://wiki.serviio.org/doku.php?id=quick_refresh

The Bitbucket request to implement Quick Refresh within Serviio for expires-immediately URLs may be found at https://bitbucket.org/xnejp03/serviio/i ... k-per-item.