Utility to unescape HTML character entities?

<<

ottomatic

DLNA master

Posts: 224

Joined: Fri Nov 09, 2012 10:15 am

Post Mon Nov 19, 2012 10:54 am

Utility to unescape HTML character entities?

Hi,

I am building a WebResourceExtractor for Svtplay.se.

When parsing HTML documents to extract the title of a web resource, I (naturally) get the titles with HTML character entities still escaped.

So, a show titled "Räksmörgås & annat" would be described as "R&auml;ksm&ouml;rg&aring;s &amp; annat".

Now, I am wondering if there is a utility in any of the Serviio / Groovy namespaces which could help me unescape the character entities?

(It is my understanding that the common way to do this in Java is to use org.apache.commons.lang.StringEscapeUtils.unescapeHtml(). But the org.apache.commons.lang package is not included in a standard Serviio install, as far as I know.)
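
For illustration, here is a rough stopgap that needs nothing beyond the JDK. It is only a sketch: the class name HtmlUnescaper and its entity table are hypothetical (not part of Serviio or Commons Lang), and the table covers just a handful of named entities plus numeric references, so it is no substitute for a full HTML 4.0 entity list.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Serviio or Commons Lang.
final class HtmlUnescaper {
    // Matches named (&auml;), decimal (&#228;) and hexadecimal (&#xE4;) references.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    // Assumption: only a small subset of the named entities is needed.
    private static final Map<String, String> NAMED = new HashMap<String, String>();
    static {
        NAMED.put("amp", "&");
        NAMED.put("lt", "<");
        NAMED.put("gt", ">");
        NAMED.put("quot", "\"");
        NAMED.put("apos", "'");
        NAMED.put("nbsp", "\u00A0");
        NAMED.put("aring", "\u00E5");
        NAMED.put("Aring", "\u00C5");
        NAMED.put("auml", "\u00E4");
        NAMED.put("Auml", "\u00C4");
        NAMED.put("ouml", "\u00F6");
        NAMED.put("Ouml", "\u00D6");
        NAMED.put("eacute", "\u00E9");
    }

    private HtmlUnescaper() {}

    // Replaces every recognised reference in a single pass, so "&amp;lt;"
    // becomes "&lt;" rather than "<". Unknown references are left untouched.
    static String unescape(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = decode(m.group(1));
            if (replacement == null) {
                replacement = m.group(); // keep the original text
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns the decoded text, or null if the reference is not recognised.
    private static String decode(String body) {
        if (body.charAt(0) != '#') {
            return NAMED.get(body);
        }
        boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
        try {
            int codePoint = hex
                    ? Integer.parseInt(body.substring(2), 16)
                    : Integer.parseInt(body.substring(1), 10);
            if (!Character.isValidCodePoint(codePoint)) {
                return null;
            }
            return new String(Character.toChars(codePoint));
        } catch (NumberFormatException e) {
            return null; // number too large to be a code point
        }
    }
}
```

Calling HtmlUnescaper.unescape("R&auml;ksm&ouml;rg&aring;s &amp; annat") should give back "Räksmörgås & annat". Doing the replacement in one pass is deliberate: running several replaceAll calls in sequence risks unescaping text twice.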
<<

zip

Serviio developer / Site Admin

Posts: 17212

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Mon Nov 19, 2012 1:05 pm

Re: Utility to unescape HTML character entities?

I was looking into this recently too and didn't find anything. Does the Apache library actually handle these extended entities, or does it only understand the basic special characters, like the ampersand?
<<

ottomatic

DLNA master

Posts: 224

Joined: Fri Nov 09, 2012 10:15 am

Post Mon Nov 19, 2012 1:56 pm

Re: Utility to unescape HTML character entities?

From the source code found at
http://www.docjar.com/html/api/org/apac ... .java.html
and
http://www.docjar.com/html/api/org/apac ... .java.html

it seems that Entities.HTML40.unescape covers just about everything.
<<

zip

Serviio developer / Site Admin

Posts: 17212

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Mon Nov 19, 2012 1:57 pm

Re: Utility to unescape HTML character entities?

Ok, I'll add the package to 1.1.
<<

ottomatic

DLNA master

Posts: 224

Joined: Fri Nov 09, 2012 10:15 am

Post Mon Nov 19, 2012 2:02 pm

Re: Utility to unescape HTML character entities?

Cool.

So, if I want to reference it in my plugin before the release of 1.1, I presume it is all right to attach a jar file together with the plugin and a proposed FFmpeg wrapper (which will also be necessary for the plugin to work properly until the comma-escape bug is fixed)?
<<

zip

Serviio developer / Site Admin

Posts: 17212

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Mon Nov 19, 2012 2:19 pm

Re: Utility to unescape HTML character entities?

Yes. Obviously it'd be nice if you could try it all in the upcoming beta.
<<

ottomatic

DLNA master

Posts: 224

Joined: Fri Nov 09, 2012 10:15 am

Post Mon Nov 19, 2012 2:27 pm

Re: Utility to unescape HTML character entities?

Will do.

I'll apply for the beta tester section ASAP.
<<

ottomatic

DLNA master

Posts: 224

Joined: Fri Nov 09, 2012 10:15 am

Post Mon Nov 19, 2012 3:01 pm

Re: Utility to unescape HTML character entities?

Petr,

I have now announced the new plugin in this forum thread:
viewtopic.php?f=20&t=8062

You may mark the old plugin as obsolete if you wish.

Regards

/ O
