DVB subtitle conversion

Started by HeartWare, November 05, 2016, 23:02:55

HeartWare

Being a long-time (registered) user of TS Doctor 1.x, I was thrilled when I discovered v2.0, and even more thrilled when I saw the DVB->SRT subtitle conversion. I have just finished my first pass of such a file. It looked like it processed four subtitle tracks and converted them to text, but once all four tracks were processed, the program seemed to get stuck. After a while, I clicked the "X" in the upper right corner of the DVB->SRT conversion dialog (the "Cancel" button doesn't work). TS Doctor then told me that the processing was successful, but I am unable to locate the .SRT files - where are they stored?

Also, would it be possible to include a DVB->PGS subtitle option in addition to the SRT option? If so, could you post-process the DVB subs before saving them as PGS by cropping them to the area that actually contains non-transparent pixels (with a small safety border)? Canal+/Canal Digital use full-screen DVB subs with only the lower part of the screen showing any actual subtitle, so scan from the top down to find the first scan line with non-transparent pixels (and do the same from the bottom, left and right), then subtract X pixels from the top and left borders and add X pixels to the bottom and right borders, and save the result as PGS (remembering to "shift" the left and top positions by the amount cropped).
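
The cropping described above could be sketched roughly as follows. This is only an illustration in Python, not TS Doctor code: the names `crop_box` and `crop_subtitle`, the representation of the bitmap as rows of alpha values (0 = fully transparent), and the default margin of 8 pixels are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Bitmap = List[List[int]]  # rows of alpha values, 0 = fully transparent (assumed layout)


def crop_box(alpha: Bitmap, margin: int = 8) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) around all non-transparent pixels.

    right and bottom are exclusive. The box is widened by `margin` pixels on
    every side and clamped to the bitmap. Returns None if nothing is visible.
    """
    height = len(alpha)
    width = len(alpha[0]) if height else 0

    # Scan lines that contain at least one non-transparent pixel.
    rows = [y for y, row in enumerate(alpha) if any(row)]
    if not rows:
        return None
    # Columns that contain at least one non-transparent pixel.
    cols = [x for x in range(width) if any(alpha[y][x] for y in rows)]

    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + 1 + margin, height)
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + 1 + margin, width)
    return left, top, right, bottom


def crop_subtitle(alpha: Bitmap, origin_x: int, origin_y: int, margin: int = 8):
    """Crop the bitmap and shift its on-screen origin by the amount cropped."""
    box = crop_box(alpha, margin)
    if box is None:
        return None  # empty subtitle, nothing to save
    left, top, right, bottom = box
    cropped = [row[left:right] for row in alpha[top:bottom]]
    return cropped, origin_x + left, origin_y + top
```

The shifted origin returned by `crop_subtitle` is what keeps the cropped subtitle at the same place on screen as the original full-screen one.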

Also: When TS Doctor goes on to process the subtitle tracks, the "DVB Subtitle OCR" dialog is shown with "DVB Language" and "OCR Language" as "German" (there's no German language subtitle track) and nothing happens (there's no update in the window) until I click it (to make it the active window), then it "springs into action" and starts displaying the parsing (that has been going on all along, as I can see when it starts updating the dialog, it has already processed the first DVB subtitle track and is working on the second one).

Finally: It seems like there's a background thread that's stuck. The program uses a full CPU core at 100% (25% CPU load on my quad core), and I can't shut down the program (all clicks on the "X" are ignored, and so is the File->Close command).

Cypheros

What language do you use for the GUI, and what language are your subtitles in?

The subtitles are stored in the same directory as the final .ts file.

HeartWare

Quote from: Cypheros on November 07, 2016, 00:52:38
What language do you use for the GUI, and what language are your subtitles in?

GUI is English. Subtitles are Danish, Norwegian, Swedish and Finnish. TS file is from channel C More First HD, recorded on a DreamBox DM8000 with - I believe - Enigma firmware.

Quote from: Cypheros on November 07, 2016, 00:52:38
The subtitles are stored in the same directory as the final .ts file.

There are no subtitles stored in that directory after processing (no .srt files with a time stamp from "today"). Are the subtitles saved "on the fly" (i.e. saved after each language is processed) or only at the very end of successful processing? If the former, there should at least be subtitles for the first three languages, even though the process stalls at the end, but there aren't.

Cypheros

You have to activate the DVB subtitle extraction under Settings/Preferences/DVB subtitles.

HeartWare

Quote from: Cypheros on November 07, 2016, 13:28:35
You have to activate the DVB subtitle extraction under Settings/Preferences/DVB subtitles.

That is already done - otherwise, I presume it wouldn't display the "DVB Subtitle OCR" dialog where I can see it scans through the subtitles, as described in my first message...

Cypheros

Do you have a short sample to reproduce the problem?

HeartWare

Quote from: Cypheros on November 07, 2016, 23:59:53
Do you have a short sample to reproduce the problem?

PM with file link sent...

Cypheros

I have no problem with this file. TS-Doctor creates 4 subtitle files.

Here is the content of the file "TimeSeekBug(PTS&PCR Wrap)_NoPatch_fixed.dan.srt":
1
00:00:07,030 --> 00:00:10,030
Jeg vidste at Leo G. Carrnll
havde mistet al kontrol

2
00:00:12,553 --> 00:00:15,553
Da edderkopperne stak a*

3
00:00:17,640 --> 00:00:20,640
Og det føltes helt råt
da jeg så Jeanette Scott

4
00:00:22,710 --> 00:00:25,710
Slås i "Trifidernes dag"

5
00:00:27,719 --> 00:00:30,719
Dana Andrews så rødt
og følte sig træt

6
00:00:33,165 --> 00:00:36,165
Da dæmonerne fik oveItag

7
00:00:37,876 --> 00:00:40,876
George Pal sagde til sin ven
at nårjorden styrter igen

8
00:00:42,760 --> 00:00:45,760
Skal jeg gi' dig
ristil din egen bag

9
00:00:54,115 --> 00:00:57,115
Anne Jensen
Subtitlers, Languageland

10
00:02:41,840 --> 00:02:44,840
Herknmmerde.

11
00:02:56,114 --> 00:02:57,114
Den nærmeste familie.



or "TimeSeekBug(PTS&PCR Wrap)_NoPatch_fixed.fin.srt":
1
00:00:07,030 --> 00:00:10,030
Leo G. Carrollin
Maha meni sekaisin

2
00:00:12,553 --> 00:00:15,553
Kun tarantula karkasi Iabrastaan

3
00:00:17,640 --> 00:00:20,640
Sanoin: "Onpa hyvät keuhkot"
Kun upea Janet Scott

4
00:00:22,710 --> 00:00:25,710
Taisteli myrkkyä sylkeviä
Triffideiä vastaan

5
00:00:27,719 --> 00:00:30,719
Dana Andrews huolta kantoi
Sillä oudot riimutioku antoi

6
00:00:33,165 --> 00:00:36,165
Paha oli päästä hänen vallastaan

7
00:00:37,876 --> 00:00:40,876
Kun maailmattörmää ryskyen
George PaI sanoo yIpeiIIen:

8
00:00:42,760 --> 00:00:45,760
"Ihmiset ovat aivan kauhuissaan"

9
00:02:41,840 --> 00:02:44,840
Nyt he tulevat.

10
00:02:56,114 --> 00:02:57,114
Koko perhe kuvaan.



These are the settings I used:
[attached image]

HeartWare

Quote from: Cypheros on November 08, 2016, 19:23:11
I have no problem with this file. TS-Doctor creates 4 subtitle files.

Even with those exact same settings (I had different settings previously, such as .123.srt and UTF-8 and both TeleText and DVB subs checked) I still get the same problem - it stalls at the end of processing the DVB subs, and I have to click the "X" on the dialog (the "Cancel" button doesn't work), and no .srt files are created. I have attached my .log file in case there's anything in there that can help you diagnose the problem.

But I can live without the OCR pass if I can get a DVB->PGS conversion instead :D

HeartWare

Would it be possible to get a DVB->PGS subtitle conversion in an upcoming TSDoctor? It would help me a lot...

Cypheros

Sorry, not at the moment.

I guess SRT is the best supported format as it's easy to read and display. PGS and DVB subtitles are hard to decode.

You are the first one reporting this OCR issue. The question is, why does the OCR task hang on your computer?
Do you have the problem with the short example you uploaded?

Did you try to repair the file first, before you cut and convert the subtitles?
You named the file "TimeSeekBug(PTS&PCR Wrap)_NoPatch_fixed.ts" so I guess there is a PCR wrap in the file. Maybe that's the reason for the problem.
Try to deactivate DVB subtitle extraction, open the file without commercial detection, cutting or any other changes and click on "Save new file". The new file should have the PCR wrap removed and if you reactivate DVB subtitle extraction, it should work without hanging on the OCR task.
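
For background on the wrap mentioned above: in MPEG transport streams, the PTS and the PCR base are 33-bit counters running at 90 kHz, so they roll over to zero after roughly 26.5 hours. The following Python sketch shows one common way to "unwrap" such timestamps and turn them into SRT-style times. It is only an illustration; the function names are made up for the example, it is not how TS Doctor handles the wrap, and whether a wrap is actually behind the hang is not established in this thread.

```python
from typing import Iterable, List

TS_MODULUS = 1 << 33   # PTS and the PCR base are 33-bit counters
TS_HZ = 90_000         # both tick at 90 kHz


def unwrap(timestamps: Iterable[int]) -> List[int]:
    """Make raw 33-bit timestamps monotonic across counter wraps.

    A large backwards jump (more than half the counter range) is treated
    as a wrap; smaller backwards steps are left alone.
    """
    out: List[int] = []
    offset = 0
    prev = None
    for ts in timestamps:
        if prev is not None and prev - ts > TS_MODULUS // 2:
            offset += TS_MODULUS
        prev = ts
        out.append(ts + offset)
    return out


def srt_timestamp(ticks: int) -> str:
    """Format a 90 kHz tick count as an SRT time, HH:MM:SS,mmm."""
    ms = ticks * 1000 // TS_HZ
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"
```

For example, a timestamp 632700 ticks after the start of the file formats as `00:00:07,030`.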



HeartWare

Quote from: Cypheros on November 12, 2016, 11:53:05
I guess SRT is the best supported format as it's easy to read and display. PGS and DVB subtitles are hard to decode.

My players (VLC and PopCorn Hour A-500) have no problems decoding PGS subtitles. At the moment, I use TMPEG DVD Author to do the conversion, but this requires the entire movie to be re-encoded just to convert the subs (I don't use the re-encoded video in the final file - just the converted PGS subs), so an option to have the PGS subs created without the need to spend time (needlessly) re-encoding the entire movie would be greatly appreciated.

The OCR is not faultless - In the clip I put online, the OCR you produced contained several errors. That's also a reason to use the graphical subtitles directly - they are always correct.

Quote from: Cypheros on November 12, 2016, 11:53:05
Do you have the problem with the short example you uploaded?

Yes - I verified that the problem persisted on the exact file I uploaded.

Quote from: Cypheros on November 12, 2016, 11:53:05
Did you try to repair the file first, before you cut and convert the subtitles?

The extracted portion was made with TS Doctor 1.2, and then I tried extracting the subtitles with TS Doctor 2.x.

Quote from: Cypheros on November 12, 2016, 11:53:05
You named the file "TimeSeekBug(PTS&PCR Wrap)_NoPatch_fixed.ts"

No, I didn't. The file is called "RockyHorror.ts". Perhaps you tried your test run on a different file than the one I put up? The text looks like the right one (with OCR errors), but the file name is wrong.

Quote from: Cypheros on November 12, 2016, 11:53:05
Try to deactivate DVB subtitle extraction, open the file without commercial detection, cutting or any other changes and click on "Save new file". The new file should have the PCR wrap removed and if you reactivate DVB subtitle extraction, it should work without hanging on the OCR task.

That's already what I did (more or less). I took the original 7+ GB .ts file and cut out a sample of a few minutes using TS Doctor 1.2, then verified that the problem persisted with that clip on my TS Doctor 2.0, and then put the exact problem file online for you to download.

Mam

Quote from: HeartWare on November 12, 2016, 21:06:37
That's also a reason to use the graphical subtitles directly - they are always correct.

But there is a much more important reason NOT to use graphical subs at all!

If you plan to process the video later on, for instance by changing the resolution, THEY DON'T WORK ANYMORE!!!
GSubs are not scalable, that's why everybody uses textual representations when possible.

GSubs were only invented for "stupid" playback devices without much intelligence. They are a pain in the ass elsewhere.


HeartWare

Quote from: HeartWare on November 12, 2016, 21:06:37
That's also a reason to use the graphical subtitles directly - they are always correct.

Quote from: Mam on November 13, 2016, 09:26:15
But there is a much more important reason NOT to use graphical subs at all!

If you plan to process the video later on, for instance by changing the resolution, THEY DON'T WORK ANYMORE!!!
GSubs are not scalable, that's why everybody uses textual representations when possible.

I don't know which players you are using, but both my PopCorn Hour A-500 (and my previous C-200) and VLC can play back PGS subs whose resolution differs from that of the video. They are scaled by the playback device to fit the video.

Also, BDSup2Sub can scale (resize) PGS subs from 1080p to 720p, 576p or 480p without any problems.

So I have absolutely no problem with using PGS subs in my recordings. They work 100% perfectly on all my playback devices.
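
To illustrate the scaling point, here is a rough Python sketch of how the on-screen rectangle of a graphical subtitle can be mapped from one video resolution to another. It is not BDSup2Sub's actual algorithm; the `Rect` type and `scale_rect` function are hypothetical names for this example, and a real tool or player would also have to resample the bitmap itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


def _scale(value: int, src: int, dst: int) -> int:
    """Scale one coordinate from src to dst with integer rounding."""
    return (value * dst + src // 2) // src


def scale_rect(rect: Rect, src_size: Tuple[int, int], dst_size: Tuple[int, int]) -> Rect:
    """Map a subtitle rectangle from src_size to dst_size, both (width, height)."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    return Rect(
        x=_scale(rect.x, src_w, dst_w),
        y=_scale(rect.y, src_h, dst_h),
        width=max(1, _scale(rect.width, src_w, dst_w)),    # never collapse to zero
        height=max(1, _scale(rect.height, src_h, dst_h)),
    )
```

For instance, `scale_rect(Rect(480, 900, 960, 120), (1920, 1080), (1280, 720))` gives `Rect(x=320, y=600, width=640, height=80)`.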

