
Subtitle file charset on the fly conversion

PostPosted: Mon Jan 24, 2011 9:54 pm
by sefo
zip, could you please add support for on-the-fly subtitle charset conversion?
I have all my SK/CZ subtitles in windows-1250 encoding on my DLNA server, and I'm using Serviio with two TVs. The Samsung TV supports only windows-1250, and the Panasonic Viera supports only the Latin2 (ISO-8859-2) character set for Central European characters.
So I cannot convert the subtitle files directly on the server, because they would still not display correctly on both TVs.
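
To illustrate why a single file cannot serve both TVs: windows-1250 and ISO-8859-2 share the byte values of many accented letters, but some common Slovak/Czech letters such as š, ž and ť are encoded differently. The snippet below is only a small demonstration; the class and method names are made up for this example.

```java
import java.nio.charset.Charset;

final class CharsetDifferenceDemo {
    /** Encodes the text in the named charset and returns the bytes as hex pairs. */
    static String hex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Unicode escapes for the letters s-caron, z-caron and t-caron.
        String sample = "\u0161\u017E\u0165";
        System.out.println("windows-1250: " + hex(sample, "windows-1250"));
        System.out.println("ISO-8859-2:   " + hex(sample, "ISO-8859-2"));
    }
}
```

A TV that decodes the file with the wrong one of the two charsets shows the wrong glyphs for exactly these letters, while the rest of the text looks fine.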

Currently I'm using the following ugly hardcoded patch for on-the-fly conversion for the Panasonic TV. But I think a better way would be to add a new parameter for that to the TV profile.
For example:
<SubtitlesCharsetConversion sourceCharset="windows-1250" destinationCharset="ISO-8859-2" />
If the element is defined, Serviio automatically converts the charset from sourceCharset to destinationCharset on the fly.
If it is not defined for the TV, the subtitles are not converted.

  Code:
package org.serviio.delivery;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

import org.serviio.dlna.MediaFormatProfile;
import org.serviio.dlna.UnsupportedDLNAMediaFileFormatException;
import org.serviio.library.service.SubtitlesService;
import org.serviio.util.FileUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SubtitlesRetrievalStrategy
  implements ResourceRetrievalStrategy
{
  private static final Logger log = LoggerFactory.getLogger(SubtitlesRetrievalStrategy.class);

  public DeliveryContainer retrieveResource(Long mediaItemId, MediaFormatProfile selectedVersion, Client client, boolean markAsRead)
    throws FileNotFoundException, IOException
  {
    File subtitleFile = SubtitlesService.findSubtitleFile(mediaItemId);
    if (subtitleFile == null) {
      throw new FileNotFoundException(String.format("Subtitle file for media item %s cannot be found", new Object[] { mediaItemId }));
    }

    log.debug(String.format("Retrieving Subtitles for media item with id %s", new Object[] { mediaItemId }));

    ResourceInfo resourceInfo = retrieveResourceInfo(mediaItemId, subtitleFile, client);
    DeliveryContainer container;
    if (client.getRendererProfile().getId().equals("12")) {
       InputStreamReader sr = null;
       OutputStreamWriter sw = null;
       try {
          sr = new InputStreamReader(new ByteArrayInputStream(FileUtils.readFileBytes(subtitleFile)), "windows-1250");
       
          StringBuffer sb = new StringBuffer();
          char[] chars = new char[2048];

          int numRead = 0;
             while ((numRead = sr.read(chars)) > 0) {
                sb.append(chars, 0, numRead);
             }
             
          sr.close();
          String s = sb.toString();
          s = s.replace("<b>", "");
          s = s.replace("<i>", "");
          s = s.replace("<u>", "");
          s = s.replace("</b>", "");
          s = s.replace("</i>", "");
          s = s.replace("</u>", "");
       
          ByteArrayOutputStream os = new ByteArrayOutputStream();
          sw = new OutputStreamWriter(os, "ISO-8859-2");
          sw.write(s);
          // Flush the writer so all encoded bytes reach the buffer before it is copied.
          sw.flush();
          container = new StreamDeliveryContainer(new ByteArrayInputStream(os.toByteArray()), resourceInfo);
       }
       catch (Exception e) {
          // Keep the original exception as the cause so the failure can be diagnosed.
          throw new IOException(String.format("Subtitle file for media item %s cannot be converted", new Object[] { mediaItemId }), e);
       }
        finally {
           if (sr != null) {
              sr.close();
           }
           if (sw != null) {
              sw.close();
           }
        }
    }
    else {
       container = new StreamDeliveryContainer(new ByteArrayInputStream(FileUtils.readFileBytes(subtitleFile)), resourceInfo);
    }
       
    return container;
  }

  public ResourceInfo retrieveResourceInfo(Long mediaItemId, MediaFormatProfile selectedVersion, Client client)
    throws FileNotFoundException, UnsupportedDLNAMediaFileFormatException
  {
    File subtitleFile = SubtitlesService.findSubtitleFile(mediaItemId);
    if (subtitleFile == null) {
      throw new FileNotFoundException(String.format("Subtitle file for media item %s cannot be found", new Object[] { mediaItemId }));
    }

    log.debug(String.format("Retrieving info of Subtitles for media item with id %s", new Object[] { mediaItemId }));
    return retrieveResourceInfo(mediaItemId, subtitleFile, client);
  }

  private ResourceInfo retrieveResourceInfo(Long mediaItemId, File subtitleFile, Client client)
    throws FileNotFoundException
  {
    ResourceInfo resourceInfo = new SubtitlesInfo(mediaItemId, Long.valueOf(subtitleFile.length()), client.getRendererProfile().getSubtitlesMimeType());
    return resourceInfo;
  }
}
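
For comparison, the profile-driven variant proposed above could look roughly like the sketch below. It assumes the two attributes of the suggested <SubtitlesCharsetConversion> element have already been read from the profile; the class SubtitlesCharsetConversion and its methods are hypothetical and not part of Serviio.

```java
import java.nio.charset.Charset;

/** Hypothetical holder for the proposed profile element; not a Serviio class. */
final class SubtitlesCharsetConversion {
    private final Charset source;
    private final Charset destination;

    SubtitlesCharsetConversion(String sourceCharset, String destinationCharset) {
        // Charset.forName throws for illegal or unsupported names, so a
        // misconfigured profile fails early instead of at playback time.
        this.source = Charset.forName(sourceCharset);
        this.destination = Charset.forName(destinationCharset);
    }

    /** Decodes the raw subtitle bytes with the source charset and re-encodes them. */
    byte[] convert(byte[] subtitleBytes) {
        String text = new String(subtitleBytes, source);
        // Characters the destination charset cannot represent are replaced
        // with that charset's default replacement (typically '?').
        return text.getBytes(destination);
    }

    /** Converts only when the profile defines a conversion (null means "not defined"). */
    static byte[] apply(SubtitlesCharsetConversion conversionOrNull, byte[] subtitleBytes) {
        return conversionOrNull == null ? subtitleBytes : conversionOrNull.convert(subtitleBytes);
    }
}
```

If something like this were wired into the retrieval strategy, the size reported for the resource would probably need to come from the converted bytes, not from the length of the file on disk, since the two can differ once the content is modified.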

Re: Subtitle file charset on the fly conversion

PostPosted: Mon Jan 24, 2011 10:00 pm
by sefo
Also, in my code you can see that I'm stripping subtitle formatting tags ("<b>", "<i>", ...) because some TVs don't support them.
So maybe it would be good to add one more element to the TV profile for that.
For example: <SubtitlesFormatingSupported>true|false</SubtitlesFormatingSupported> or <SrtFormatingSupported>true|false</SrtFormatingSupported>

But this request is not as important to me as the first one. :)
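
A minimal sketch of how such a flag could drive the tag stripping, assuming the boolean comes from the proposed profile element. The class SubtitleTagStripper is hypothetical, and only the three tags handled in the patch above are covered.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; not a Serviio class. */
final class SubtitleTagStripper {
    // Matches opening and closing <b>, <i> and <u> tags, ignoring case.
    private static final Pattern BASIC_TAGS =
            Pattern.compile("</?[biu]>", Pattern.CASE_INSENSITIVE);

    /** Returns the text unchanged when the renderer supports formatting tags. */
    static String strip(String subtitleText, boolean formattingSupported) {
        if (formattingSupported) {
            return subtitleText;
        }
        return BASIC_TAGS.matcher(subtitleText).replaceAll("");
    }
}
```

Compared with the chain of replace calls in the patch, one case-insensitive pattern also catches upper-case variants such as <B>.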

Re: Subtitle file charset on the fly conversion

PostPosted: Mon Jan 24, 2011 10:11 pm
by zip
this would mean that all your files have to be in windows-1250, if you had just 1 file in other encoding it would not work, right?

Did you try encoding the files in UTF-8?

Re: Subtitle file charset on the fly conversion

PostPosted: Tue Jan 25, 2011 7:43 pm
by sefo
zip wrote:this would mean that all your files have to be in windows-1250, if you had just 1 file in other encoding it would not work, right?

Yes, you are right, but it's still better to have only 2-3 unreadable subtitle files (those in an encoding other than the default one) than to have all subtitle files unreadable.

zip wrote:Did you try encoding the files in UTF-8?

Yes, I did. It didn't work; the Panasonic Viera supports only Latin1 (ISO-8859-1), Latin2 (ISO-8859-2) and Cyrillic encodings.

Re: Subtitle file charset on the fly conversion

PostPosted: Sun Jan 31, 2016 2:44 pm
by lagunaa
I have a lot of subtitles with different character encoding, so a single character encoding value in the Serviio console is not enough...
If I change the subtitle character encoding in the Serviio console on-the-fly, then I have to stop the Serviio service to get the new character encoding active for the current video stream.
It would be useful if changing the subtitle character encoding stopped the current video stream and restarted it with the new settings.

Re: Subtitle file charset on the fly conversion

PostPosted: Tue Feb 16, 2016 8:54 pm
by lagunaa
According to http://serviio.org/index.php?option=com ... icle&id=33, we have to use the character encodings available for libiconv:
When using hardsubs, Serviio gives you, however, the option to specify the file's character encoding, so that it renders correctly on the video stream. Enter the character encoding to the Subtitle character encoding field (e.g. cp1256 for Arabic). You can find available encodings in the libiconv page.

It could be useful to make the Subtitle character encoding field an auto complete field to propose all these valid values.
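
The matching behind such a field could be as simple as the sketch below. It is only an illustration: EncodingSuggestions is a made-up name, and the demo uses the JVM's own charset names as a stand-in, whereas the real field would need the list of libiconv encoding names.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper; not a Serviio class. */
final class EncodingSuggestions {
    /** Returns the known encoding names that start with the typed text, ignoring case. */
    static List<String> suggest(String typed, Collection<String> knownEncodings) {
        String prefix = typed.trim().toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<String>();
        for (String name : knownEncodings) {
            if (name.toLowerCase(Locale.ROOT).startsWith(prefix)) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Stand-in list: charsets known to this JVM, not the libiconv names.
        System.out.println(suggest("windows-125", Charset.availableCharsets().keySet()));
    }
}
```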
Cheers.

Re: Subtitle file charset on the fly conversion

PostPosted: Wed Feb 17, 2016 7:45 pm
by zip

Re: Subtitle file charset on the fly conversion

PostPosted: Sun Mar 06, 2016 5:28 pm
by lagunaa
zip, any chance on this?
lagunaa wrote:If I change the subtitle character encoding in the Serviio console on-the-fly, then I have to stop the Serviio service to get the new character encoding active for the current video stream.

Thanks in advance.

Re: Subtitle file charset on the fly conversion

PostPosted: Mon Mar 07, 2016 8:56 pm
by zip
lagunaa wrote:zip, any chance on this?
lagunaa wrote:If I change the subtitle character encoding in the Serviio console on-the-fly, then I have to stop the Serviio service to get the new character encoding active for the current video stream.

Thanks in advance.

That should not be the case. It should pick up the current value stored in the console.

Re: Subtitle file charset on the fly conversion

PostPosted: Sat Mar 12, 2016 8:06 pm
by lagunaa
Nope... If I had an active video stream before I changed the character encoding in the console, then I have to stop the Serviio service to get the new character encoding picked up for that video stream as well.

Re: Subtitle file charset on the fly conversion

PostPosted: Mon Mar 14, 2016 12:30 pm
by zip
Ah ok... that's because the transcoded file has already been created I guess.

Re: Subtitle file charset on the fly conversion

PostPosted: Wed Mar 16, 2016 7:20 pm
by lagunaa
So, I would suggest an enhancement: stop the current video stream and restart it with the new subtitle character encoding when the user changes the setting.

Re: Subtitle file charset on the fly conversion

PostPosted: Thu Mar 17, 2016 9:07 am
by zip
In 1.6.1 you will be able to just click the Stop server button in the console, which will kill any running ffmpeg too. After you start the server again it will create a new file.

Re: Subtitle file charset on the fly conversion

PostPosted: Wed Jul 06, 2016 12:27 pm
by lagunaa
lagunaa wrote:It could be useful to make the Subtitle character encoding field an auto complete field to propose all these valid values.

zip wrote:Good idea, created ticket: https://bitbucket.org/xnejp03/serviio/i ... setting-in

I see ticket 872 related to subtitles character encoding has been resolved, but this similar ticket 929 has no version set yet.
Any ETA on this?
Thanks in advance.

Re: Subtitle file charset on the fly conversion

PostPosted: Wed Aug 03, 2016 9:40 am
by lagunaa
lagunaa wrote:I see ticket 872 related to subtitles character encoding has been resolved, but this similar ticket 929 has no version set yet.
Any ETA on this?

Now I see ticket 929 marked as WON'T FIX. In my opinion, it could still be a useful feature for users.
On that ticket zip wrote:
zip wrote:With the automatic detection of encoding this should be pretty useless now
https://bitbucket.org/xnejp03/serviio/issues/929/typeahead-for-subtitle-encoding-setting-in#comment-29026296

but we still have to use only the character encodings available for libiconv...
Cheers.

Re: Subtitle file charset on the fly conversion

PostPosted: Thu Sep 15, 2016 10:05 am
by lagunaa
Bump after v1.7 released.

Re: Subtitle file charset on the fly conversion

PostPosted: Wed Mar 14, 2018 9:34 am
by lagunaa
Every time I need to change the subtitle character encoding I wonder why ticket 929 was closed... :roll:

Re: Subtitle file charset on the fly conversion

PostPosted: Wed Mar 21, 2018 2:59 pm
by zip
Does it not guess the encoding automatically?
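
How Serviio guesses the encoding is not described in this thread. As a point of reference only, one common and very limited technique is to check whether the bytes are valid UTF-8 and otherwise fall back to a configured charset. The sketch below shows that technique; SubtitleCharsetGuess is a hypothetical name and this is not Serviio's detection code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

/** Hypothetical helper; not a Serviio class. */
final class SubtitleCharsetGuess {
    /** Returns UTF-8 if the data decodes strictly as UTF-8, otherwise the fallback. */
    static Charset guess(byte[] data, Charset fallback) {
        Charset utf8 = Charset.forName("UTF-8");
        try {
            // REPORT makes the decoder throw on invalid input instead of
            // silently substituting replacement characters.
            utf8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(data));
            return utf8;
        } catch (CharacterCodingException e) {
            return fallback;
        }
    }
}
```

A check like this cannot tell single-byte charsets such as windows-1250 and ISO-8859-2 apart, because almost any byte sequence is valid in both. That is one reason a manual encoding setting can still be needed when a guess turns out wrong.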