
Luma Scale Considerations and the "Full Luma Range" option.

TheDiveO
Registered Member
What is also interesting to me: although I have a company logo with RGB 255,255,255, the final rendered output is correctly within the safe zone, even when only other SVG clips are involved.
Inapickle
Registered Member
TheDiveO wrote:Inapickle, thank you very much! I followed your instructions on a Kubuntu 15.10 installation and got everything up and running.


You're welcome. I wanted to get VapourSynth up and running anyway as it has other potential uses. Hope the procedure is clear enough for anyone else wanting to use it. What had me befuddled when trying to set it up first time around (and before I discovered that ppa) were the directory paths. The key piece in the jigsaw was exporting the PYTHONPATH environment variables. Now that I'm into it, it's quite similar to working with AVISynth, so familiar territory.

TheDiveO wrote: Now if we could only get a YUV histogram as another scope in Kdenlive!


I'll come back to you on that and your last post. I've just been looking at what KDenLive currently has to offer and running some tests. I'll elaborate on that some more, but my time is limited today.
Edit: Might not be for a day or two; domestic demands this weekend.
Inapickle
Registered Member
Unfortunately the domestic demands turned to domestic crisis yesterday as my truck abruptly lost power and locked - alternator failure, again. Fortunately not on the highway and no injuries, but the city taxi drivers were, to say the least, not pleased that I had come to rest in their taxi rank lane, hindering their principal source of income. Having done all I could and awaiting arrival of the meandering tow-truck and my consternated wife, I contemplated the tranquil subject of Kdenlive histograms to ease my troubled thoughts.

And my thoughts are these:

As mentioned earlier, the reason why these YUV Histograms proved so helpful in elucidating the luma scaling behavior of the different inputs and resulting outputs is because the AVISynth source filters (effectively decode/indexing frame servers) they use output raw, full scale (0-255) YV12. So the YUV Histograms show the actual luma values as they fall on that scale.
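To make that concrete, here is a minimal sketch (my own illustration, not AVISynth's actual code) of what such a full-scale luma histogram does: it simply tallies the raw 0-255 byte values of the Y plane, with no RGB conversion in between.

```python
from collections import Counter

def raw_luma_histogram(y_plane):
    """Tally each 8-bit luma value exactly as stored - no RGB conversion."""
    counts = Counter(y_plane)
    return [counts.get(v, 0) for v in range(256)]

# A made-up Y plane containing values outside the 16-235 "limited" range:
y_plane = [16, 235, 255, 0, 128, 128, 16, 255]
hist = raw_luma_histogram(y_plane)
```

Because nothing is remapped, super-white (236-255) and super-black (0-15) values show up in the histogram exactly where they live on the scale.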

As far as I can see, there are currently two ‘tools’ available in KDenlive that specifically derive and display YPbPr (Y'CbCr) data... what we have been loosely calling ‘YUV’.

One is the Vectorscope monitor:

http://i.imgur.com/aBvG9cD.png

The other is the PrOfile effect listed under the Analysis and Data effects, which displays the selected channel values, including Y, Pb and Pr, as mean line plots.

http://i.imgur.com/oi4lviL.png

Being an “effect”, it burns its plots into the video on rendering. But when applied on the timeline it can be used as a monitor of sorts; personally I don’t find the information provided by these graphs that useful, but they give some clues as to how the YPbPr data might be derived.

What I did was to convert my Canon HF-G10.mts test clip (with the 16-255 YUV range ) to lossless UTVideo in YV12 space using AVISynth, which preserves the original scaling. I then created a second lossless clip this time first pre-converting the YV12 to RGB and back to YV12 using Rec709 coefficients. As expected, this clip now showed clamped 16-235 YV12 scaling - “TV Levels” as it is often referred to.

I then loaded these two clips on the KDenlive timeline side by side, and to each applied the Motion: Freeze effect, applying the freeze at exactly the same frame. Pulling up the Vectorscope monitor, I scrubbed back and forth between the two 'frozen' clips to see if there was any difference in the Vectorscope displays. I could see none. Yet on rendering out to Matroska.mkv, the resulting AVISynth YUV Histogram showed that the respective 16-255 and 16-235 scaling of the two clips was preserved. Which is what I expected; as already established, the Freeze effect does not affect the YUV profile in any way - it is “passed through”.

I then did the same, this time applying (on top of the Freeze effect) the PrOfile effect, configured to plot the Y, Pb and Pr channel values, with the color matrix set to CCIR Rec709. Again, I could see no difference in the plots. But when I came to examine the AVISynth YUV Histogram of the (Matroska.mkv) render, this time the luma had been uniformly clamped to 16-235.

http://i.imgur.com/yFawQpE.jpg

Note: in that particular example, I selected just the Y channel plot.

So what do these observations reveal? Well, to my mind they provide conclusive evidence that the YPbPr data for the Vectorscope and PrOfile displays is being derived from Rec709 conversion to 16-235 scaling. In other words, if the YPbPr values were being taken directly from the input clips one would expect to see some differences in the respective Vectorscope and PrOfile displays, but there were none. Furthermore, the fact that the applied PrOfile effect resulted in uniform clamping to 16-235 suggests that the effect “requires RGB” and derives the 16-235 scale YPbPr values by Rec709 conversion.
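The clamping can be illustrated with a little arithmetic. This is a hedged sketch of my own, assuming standard BT.709 studio-range scaling and a gray (chroma-neutral) pixel, for which the full matrix conversion reduces to a simple range remap:

```python
# Illustration only (not MLT's actual code): a studio-range BT.709
# round trip YUV -> RGB -> YUV for a gray pixel, where R'=G'=B'.
def yuv_to_rgb_gray(y):
    # Expand 16-235 studio range onto 0-255 RGB, clipping overflow.
    r = round((y - 16) * 255 / 219)
    return min(255, max(0, r))

def rgb_to_yuv_gray(r):
    # Compress 0-255 RGB back down to the 16-235 studio range.
    return round(16 + r * 219 / 255)

# Super-white luma (236-255) overflows in RGB, is clipped, and comes
# back pinned at 235:
clamped = [rgb_to_yuv_gray(yuv_to_rgb_gray(y)) for y in (16, 128, 235, 255)]
```

Values already inside 16-235 survive the round trip, while everything above 235 collapses onto 235 - consistent with the uniform clamping seen after applying the PrOfile effect.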

Again, if I’m totally wrong about that, I would be more than happy to be put straight.

In conclusion, I agree entirely that it would be helpful to have other representations of YPbPr data in KDenLive. The Vectorscope is a very useful reference monitor for color correction and grading, for skin tones especially, but it is primarily a chroma tool. The PrOfile effect, again, might be meaningful to some, but personally I’d find a true Y' channel WFM of more value.

So, yes, I too support the request for a YPbPr Histogram. But if the desire is that it provides an equivalent of the AVISynth/VapourSynth YUV Histogram, I can’t see that happening unless the MLT developers have some way of referencing raw 0-255 scale YPbPr values. Edit: thinks - if there's no scope in MLT, would it be possible to pipe out/from to VapourSynth in some way to derive the raw full scale YPbPr data?

But wait, aren’t we already getting at least the “Y” component in the existing Histogram monitor? That shows 0-255 scaling:

http://i.imgur.com/2MXGSUO.png

But looking at the “Y” histogram in that screen-shot with the HFG10 clip: if that were mapping actual “Y” values, surely the Histogram would reflect the 16-255 range, or else a limited 16-235 range if the values were derived by Rec709 conversion. But they are not - the profile extends across the entire 0-255 scale. So what does this “Y” value truly represent? Is it simply taking the perceived “black” and “white” points and scaling/mapping out to 0-255 for the sake of uniform presentation? Possibly. Or is it not “Y Luma” at all, but really an RGB value, i.e. what is variously referred to in other graphics programs as “Luminance”, "Luminosity" or “Value”? I’m still really not sure, and it’s more difficult to deduce.

Same goes for the “Luma” curve in the Curves and Bezier Curve Effect, especially with the arbitrary scaling units. And again there, when the effects are applied, the resulting AVISynth YUV Histograms show a clamped 16-235 luma range.

So, I would find it helpful if the KDenLive/MLT developers could explain exactly what these “Y” and “Luma” values actually represent and how they are derived.

Edit: I see you posted the idea of a YUV Histogram in the forum Development section:

viewtopic.php?f=279&t=130705

Would seem superfluous to duplicate all of the above there, so I've just posted a link to this post.
Inapickle
Registered Member
Aha, just came across this article which gives more insight into the way the Y value is derived in the KDenLive Histogram, or at least how it was in 0.7.8:

https://kdenlive.org/users/granjow/intr ... -histogram.

So there we have it:

Quote:

Histogram options

In kdenlive 0.7.8 the histogram can be adjusted as follows:

Components – They can be enabled individually. For example, you might only want to see the Luma component, or you want to hide the Sum display.
Y or Luma is the best known Histogram. Every digital camera shows it, digikam, GIMP, etc. know it. See below how it is calculated.
Sum is basically a quick overview over the individual RGB channels. If it shows e.g. 5 as the minimum value, you know that none of the RGB components goes lower than 5.

RGB show the Histogram for the individual channels.
Unscaled (Context menu) – Does not scale the width of the histogram (unless the widget size is smaller). Just a goodie if you want to have it 256 px wide.
Luma mode (Context menu) – This option defines how the Luma value of a pixel is calculated. Two options are available:
Rec. 601 uses the formula Y' = 0.299 R' + 0.587 G' + 0.114 B'
Rec. 709 uses Y' = 0.2126 R' + 0.7152 G' + 0.0722 B'

Most of the time you will want to use Rec. 709 which is, as far as I know, mostly used in digital video today.
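The two formulas quoted above can be written out directly (my own paraphrase of the quoted coefficients, not Kdenlive's code):

```python
# The two luma modes from the article, applied to 0-255 R'G'B' values.
def luma_rec601(r, g, b):
    return 0.299 * r + 0.587 * g + 0.114 * b

def luma_rec709(r, g, b):
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

# Pure green weighs in noticeably brighter under Rec. 709:
y601_green = luma_rec601(0, 255, 0)
y709_green = luma_rec709(0, 255, 0)
```

Note how green dominates more strongly under Rec. 709 (0.7152 vs 0.587), so the same RGB frame produces slightly different luma histograms under the two modes.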


So, this brings some clarification on the points I made in the last post:

Inapickle wrote: But wait, aren’t we already getting at least the “Y” component in the existing Histogram monitor? That shows 0-255 scaling:

http://i.imgur.com/2MXGSUO.png

But looking at the “Y” histogram in that screen-shot with the HFG10 clip: if that were mapping actual “Y” values, surely the Histogram would reflect the 16-255 range, or else a limited 16-235 range if the values were derived by Rec709 conversion. But they are not - the profile extends across the entire 0-255 scale. So what does this “Y” value truly represent? Is it simply taking the perceived “black” and “white” points and scaling/mapping out to 0-255 for the sake of uniform presentation? Possibly. Or is it not “Y Luma” at all, but really an RGB value, i.e. what is variously referred to in other graphics programs as “Luminance”, "Luminosity" or “Value”? I’m still really not sure, and it’s more difficult to deduce.


So, as I thought, the Y (luma) component Histogram is derived from RGB through the Rec709 coefficients. And it is the RGB "Sum" component that corresponds to what is variously called "Luminance", "Luminosity" or "Value" in other graphics programs. Fine with that.

But still, why is the scaling of the Y component Histogram in KDenlive 15.12.0 different from that shown there for 0.7.8?

http://kdenlive.org/sites/default/files ... togram.png

There the Y histogram looks to display dynamic ('live') minimum and maximum luma range values, which I would find infinitely more useful than the (? stretched) 0-255 scaling that I'm seeing in 15.12.0. The "Unscaled" option (right-click on Histogram view) referred to just squashes down the Histogram and is still scaled 0-255? Is there some other setting in 15.12.0 that allows reverting to the Y scaling used in the 0.7.8 Histogram? If not, please bring it back - it would be far more useful.

As for the KDenLive "Waveform" monitor: well, the pertinent article in that series, "Introducing Color Scopes", does refer to that as displaying "Luma".

https://kdenlive.org/users/granjow/intr ... rgb-parade

So it's reasonable to assume that "Luma" there is derived in the same way as the "Y" component in the Histogram, from Rec709 conversion of RGB. But again, why in this example is it showing the luma range of the HFG10 clip spanning the entire 0-255 scale when by Rec709 conversion it should be limited to 16-235?

http://i.imgur.com/TmAqnvj.png

One can only assume again that it has been 'stretched' to 0-255 scale, which I find to be misleading and not at all helpful.
TheDiveO
Registered Member
Incredibly good digging! Please keep on. I don't think there is scaling going on inside the histogram itself; it might be happening elsewhere, perhaps inside MLT. Why?

The source for the histogram calculation can be found in histogram.cpp. I don't notice any scaling here as such, but I notice that the histogram always takes a QImage, which seems to always be RGB. Maybe the scaling thus kicks in during the YUV to RGB conversion? We need to dig deeper.
Inapickle
Registered Member
Ah, I see what's happening now.

I took my (16-255 range) HFG10 clip, deliberately compressed it down to 32-220 (with AVISynth) and transcoded, as before, to UTVideo (YV12).avi. I loaded that transcode in KdenLive and pulled up the Histogram in Y (Luma) mode. Here's the screen shot:

http://i.imgur.com/tPYQSj3.png

But look at the stated minimum and maximum luma values - 3 and 239. So what was 32-220 is being displayed as 3-239.

So here's what's happening:

As already established, the Y (Luma) values for the Histogram are derived by Rec709 conversion from RGB, which will always result in a 'limited' 16-235 range. So, when the native HFG10 (16-255 range) clip is put on the timeline, the derived Y Luma range should be (clamped) 16-235. But it's not - it's presented as 0-255.
When the compressed 32-220 transcode is placed on the timeline, what we should be seeing is just that, because the source 32-220 luma range already falls within the limited 16-235 range. Instead it is presented as 3-239. Why? Because the Histogram literally takes what should always be 16-235 scaling and presents it as if it were 0-255. And there is no further scaling out to 0-255 going on. The AVISynth YUV Histogram of the source 32-220 transcode looks exactly the same (I neglected to take a frame shot), apart from the brown bars at the sides that delineate the 'limited' 16 and 235 boundary points.

So the problem is simply that the Histogram is presenting the luma range with the wrong scaling units - the scaling for the luma component should be 16-235, not 0-255. This is the source of the confusion, I'm sure of it. Needless to say this is very misleading. The only case for presenting the Luma Histogram with a 0-255 scale would be if it is actually deriving raw full scale luma directly from the source input (as the AVISynth/VapourSynth YUV Histogram does). But it doesn't, it's always Rec709 limited 16-235 luma.

Last edited by Inapickle on Mon Feb 08, 2016 8:25 pm, edited 1 time in total.
TheDiveO
Registered Member
The histogram code actually takes the luma formula from Rec.709 but not the representation range. This is the important code line from histogramgenerator.cpp:
Code:
y[(int)floor(.2125*qRed(col) + .7154*qGreen(col) + .0721*qBlue(col))]++;
Here, qRed etc are the RGB components taken from an individual image pixel. So this function simply calculates Y in RGB space with the 0-255 range in the source channels. The y array is 0..255.
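In Python terms, the accumulation amounts to the following sketch (hypothetical pixel data; qRed/qGreen/qBlue just select the channels of each RGB pixel):

```python
import math

def luma_histogram(pixels):
    """Paraphrase of the C++ loop: bin each pixel's luma into y[0..255]."""
    y = [0] * 256
    for r, g, b in pixels:
        # Same coefficients as histogramgenerator.cpp; floor truncates
        # the fractional luma down to an integer bin index.
        y[int(math.floor(0.2125 * r + 0.7154 * g + 0.0721 * b))] += 1
    return y

hist = luma_histogram([(255, 255, 255), (0, 0, 0), (255, 0, 0)])
```

Since the inputs are full-range 0-255 RGB channels and the coefficients sum to 1, the resulting bins naturally span the whole 0-255 axis, with no 16-235 representation anywhere in the loop.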
TheDiveO
Registered Member
BTW, MLT uses this beast here under its hood for image scaling and colorspace conversion: https://www.ffmpeg.org/libswscale.html
Inapickle
Registered Member
TheDiveO wrote:The histogram code actually takes the luma formula from Rec.709 but not the representation range. This is the important code line from histogramgenerator.cpp:
Code:
y[(int)floor(.2125*qRed(col) + .7154*qGreen(col) + .0721*qBlue(col))]++;
Here, qRed etc are the RGB components taken from an individual image pixel. So this function simply calculates Y in RGB space with the 0-255 range in the source channels. The y array is 0..255.


Well we're getting closer.... I think....or have we already got there ??

I recognize those as Rec709 matrix coefficients, but I can't say I really follow all of the computations preceding and following that line in the code to figure out how they are being applied, so I'll take your word on that.

I can only make my best interpretation of what's going on based on what I see and some knowledge of the calculations involved in the different YUV<>RGB transforms. Earlier I made reference to what I called special or modified Rec709 coefficients for converting full scale (0-255) YUV sources, so-called "PC.709". Actually both use the same Rec709 coefficients (as shown in that line of code). It's just that in the modified "PC.709" transforms a different set of calculations is applied to those coefficients to accommodate the 0-255 scaling. This AVISynth documentation sets out the calculations involved, which are in turn based on the Rec709 (BT.709) standards:

http://avisynth.nl/index.php/Color_conversions

And I would assume that MLT is Rec709 compliant. But just where those calculations are being applied in the code is beyond me. What's the "floor" for, for instance?
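For what it's worth, here is my own paraphrase of the distinction as that AVISynth documentation describes it - the same BT.709 coefficients, differing only in output scaling (illustrative only, not MLT's code):

```python
KR, KG, KB = 0.2126, 0.7152, 0.0722  # BT.709 luma coefficients

def y_pc709(r, g, b):
    """'PC.709' flavour: full-range output, Y' spans 0-255."""
    return KR * r + KG * g + KB * b

def y_rec709(r, g, b):
    """Standard Rec.709 flavour: studio-range output, Y' spans 16-235."""
    return 16 + (219 / 255) * (KR * r + KG * g + KB * b)

white_full = y_pc709(255, 255, 255)      # lands at 255
white_studio = y_rec709(255, 255, 255)   # lands at 235
```

Same matrix, two different output ranges - which is exactly the ambiguity the simulations below try to tease apart.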

Best I can do is try to simulate the different options and look at the evidence, and that's what I was doing when you posted, so I might as well report it.

Taking the HFG10 (16-255) clip that I compressed to 32-220 and transcoded to UTVideo (YV12), here's what the AVISynth YUV Histogram looks like:
(Edit: replaced with correct frame)
http://i.imgur.com/heDzfeY.jpg

And here's the screenshot of the Kdenlive Luma Histogram again, with that transcode loaded on the timeline. I think it's the same frame:

http://i.imgur.com/tPYQSj3.png

Taking the compressed clip again, I then converted it (with AVISynth) to RGB and then back to YV12 using different combinations of the standard Rec709 and PC.709 computations.

1. converted to RGB with Rec709 and back to YV12 with Rec709:
http://i.imgur.com/Ufl0tuE.jpg

2. Here, converted to RGB with Rec709 and back to YV12 with PC.709:
http://i.imgur.com/bYFWH8j.jpg

3. Here, converted to RGB with PC.709 and back to YV12 with PC.709:
http://i.imgur.com/UFYM8MH.jpg

4. converted to RGB with PC.709 and back to YV12 with Rec709:
http://i.imgur.com/tRA2Nr9.jpg

Clearly not #4, but which of the other three do you think is the most likely scenario given the possibilities that the KDenLive Histogram is either:
a) Deriving "full scale" Y values from RGB and so the 0-255 scaling is valid - which would apply to #2
or, as I suggested before:
b) Deriving limited (16-235) Y values from RGB but presenting them (incorrectly) on a 0-255 scale - which would apply to #1 and #3

I guess it's a bit of a redundant exercise now if you are certain about what the code reveals... I'm just not entirely clear what you are saying the conclusions are? So the "array" merely refers to the "representation" scaling then?

Last edited by Inapickle on Tue Feb 09, 2016 5:57 am, edited 1 time in total.
Inapickle
Registered Member
TheDiveO wrote:BTW, MLT uses this beast here under its hood for image scaling and colorspace conversion: https://www.ffmpeg.org/libswscale.html


Thanks, yes, another facet of MLT/FFMPEG I've yet to acquaint myself with, except to learn that the full scale pix_format (yuvj420p, yuvj422p) issues touched on before stem from it.

Actually, VapourSynth recently abandoned libswscale and now uses zimg as its default core resizer library:

http://www.vapoursynth.com/2015/12/r29- ... o-swscale/

And the plugin fmtconv offers further options for colourspace and bit depth conversion with dithering. Having got VapourSynth up and running I aim to look at some of the other things we discussed earlier - compression with dithering to minimize 'luma' banding etc. But for now, getting to grips with the basics - scripting syntax, colour-space conversions and piping out to FFMPEG.
TheDiveO
Registered Member
It would be interesting to see if zimg does better scaling... because scaling has in some part been a weak spot of MLT, and thus of Kdenlive, that people have complained about.
Inapickle
Registered Member
Yes, that's what I aim to find out, and also whether there's any way to get full scale transcodes (other than RawYV12/YUY2 and x264) out of ffmpeg that avoid this yuvj420p pix_format issue. Consensus in the VapourSynth camp is to avoid letting ffmpeg (swscale) do the format conversions if you can, as zimg or fmtconv will do it better. We'll see. Although similar to AVISynth in some ways, it uses syntax for format conversions that is a bit of a learning curve, and I don't want to bug people too much about newbie stuff. What I'm finding though is that some fairly complex scripted AVISynth routines that I've used for a long time can be simplified in VapourSynth, as long as you get the syntax correct. But I'll get there.

And your thoughts about the previous post? I'd go on to do more tests with other effects (curves, levels) to see if I could deduce more about the behaviour, but if it's already clear from the Histogram code, it would seem a bit pointless. I'm just not clear whether your conclusion from the code is that full scale luma values are being derived from RGB, or that it derives limited 16-235 scale luma values and presents them on the Histogram as though 0-255. As seen in those simulations I did, both derivations produce similar luma profiles (similar also to the Kdenlive Histogram), just with different scaling.

Edit: Just one comment though about that KDenLive Histogram screen shot I put up before:

http://i.imgur.com/tPYQSj3.png

One might question why we are seeing a minimum luma value of 3 there. Surely, if it is the 16-235 range being presented as 0-255, one would expect the minimum luma value of that compressed 32-220 range transcode to be up around 20-24. Probably what we are seeing there are trace roll-off values from the scaling. Looking at the equivalent Luma WFM:

http://i.imgur.com/S3frr1X.png

If you move the hairline cursor, which gives the luma value on the vertical scale, up to the floor of the waveform, it's around 20, but you can make out those trace luma values falling below it. I can't take a screen shot with the hairline positioned as both require the cursor.
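For the record, here is the arithmetic behind that expectation - a hedged sketch of my own, assuming a simple studio-to-full range expansion of a gray pixel (not a trace of MLT's actual code path):

```python
def studio_to_full(y):
    """Expand a studio-range (16-235) luma value onto the full 0-255 scale."""
    return (y - 16) * 255 / 219

# The compressed clip's 32-220 range would land at roughly 19-238 for
# gray pixels; anything below that (like the displayed minimum of 3)
# would have to come from chroma excursions or resampling roll-off.
low = studio_to_full(32)
high = studio_to_full(220)
```

The predicted floor of ~19 sits right where the waveform floor of ~20 is observed, with only trace values beneath it.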
Inapickle
Registered Member
Just a note about the VapourSynth script for generating the YUV Histogram that I posted earlier:

Inapickle wrote:
Code:
import vapoursynth as vs
core = vs.get_core()
clip = core.ffms2.Source("Test.mov")
clip = core.hist.Levels(clip)
#clip = core.resize.Spline36(clip,1632,810)
clip.set_output()



Last week, I contacted the maintainer of the VapourSynth (for Ubuntu) ppa:
Inapickle wrote:
Code:
ppa:djcj/vapoursynth

From here:
https://launchpad.net/~djcj/+archive/ubuntu/vapoursynth


I requested that he include another source plugin, L-Smash Source, in the Extra Plugins package. I didn't receive a response, so I set about figuring out how to compile and install it from the git packages; quite finicky to say the least. Anyhow, he notified me yesterday that he'd updated the Extra Plugins package with L-Smash Source included. So if you installed VapourSynth from that ppa (as per my earlier instructions) you will likely have received an update prompt for the VapourSynth Extra Plugins and NNEDI3 plugin packages. I've just installed and tested the updates; whilst the L-Smash Source plugin auto-loads and works as expected, there is now a bug loading the ffms2 source plugin, which is the one used in that YUV Histogram script. I've contacted him about it.

Meanwhile, if you have installed the update and likewise find that you cannot load the YUV Histogram script, then use LWLibavSource (one of the two video source filters in the L-Smash Source plugin) for now, in place of ffms2.

Here's the amended script:

Code:
import vapoursynth as vs
core = vs.get_core()
clip = core.lsmas.LWLibavSource(source=r"Path......./Test.mov")
clip = core.hist.Levels(clip)
#clip = core.resize.Spline36(clip,1632,810)
clip.set_output()

Last edited by Inapickle on Wed Feb 24, 2016 5:33 pm, edited 1 time in total.
Inapickle
Registered Member
OK, as regards:

Inapickle wrote:Yes, that's what I aim to find out and also whether there's any way to get full scale transcodes (other than RawYV12/YU2 and x264) out of ffmpeg that avoid this yuvj420p pix_format issue.


Here are some routines for transcoding native HD-AVC DSLR/camcorder clips (typically .mov or .mp4) recorded with full luma range, in a manner that avoids the whole yuvj420p and yuvj422p flag (swscale) fiasco and without recourse to the use of AviSynth via Wine (not that I could get that to work anyway).

For transcoding to YV12 formats it's actually quite straightforward using the ffmpeg mergeplanes filter. Configured appropriately, the filter extracts and recombines the YUV planes, allowing pass-through of the full luma scale without deference to the source clip's pesky yuvj420p flag. Here's an example, transcoding say a .mov clip from a Nikon/Canon DSLR to UTVideo YV12 and retaining the original PCM audio.
Code:
ffmpeg -i Path..../FSVideo.mov -vf mergeplanes=0x000102:yuv420p -vcodec utvideo -r 30000/1001  -s 1920x1080 -colorspace bt709 -acodec copy -y path......./FSVideo_UTVYV12_PCM.mkv
(Edit: Just remembered that when encoding HD to UTVideo with ffmpeg the color matrix (colorspace) needs to be specified as bt709, otherwise it will default to bt601)
So now we have a lossless transcode acceptable for input into KdenLive. The Clip Properties profile will show it to have a yuv420p pixel format but the luma range will be full scale, as per the source. And the behaviour will be the same as that described earlier for 'atypical' transcodes created using the VFW Utvideo codec. Leaving the “Full Luma Range” option off (default), the full luma scaling will be passed through except where an “RGB requiring” effect/transition is applied, in which case the luma will be clamped (limited) to 16-235 range. With the “Full Luma Range” option selected however, it will behave just like the native mov clip – the luma range will be compressed to 16-235, irrespective of whether an effect is applied or not.

As for transcoding to YUY2 (4:2:2) edit intermediates like DNxHD: unfortunately, the ffmpeg mergeplanes filter does not support YUY2. One obvious workaround would be to transcode in two stages, first to UTVideo (YV12) using mergeplanes and thence to DNxHD - which will work, but means creating large intermediate files in the process. Not very efficient.

Fortunately there is another approach using our friend VapourSynth. So for this you'll need to install VapourSynth and the VapourSynth Editor as per the procedure given earlier for generating the YUV Histogram.

VapourSynth Script Set-up


As before, we open up VSEditor and create a script, as per this example:
Code:
import vapoursynth as vs
core = vs.get_core()
#clip = core.ffms2.Source("Path......../FullScaleVideo.mov")
clip = core.lsmas.LWLibavSource(source=r"/Path......../FullScaleVideo.mov")
clip = core.fmtc.resample (clip, css="422")
clip = core.fmtc.bitdepth (clip, bits=8)
clip = core.std.AssumeFPS(clip, fpsnum=30000,fpsden=1001)
#clip = core.hist.Levels(clip)
clip.set_output()

Recall that a # at the beginning of a line comments out (hides) that line, and deleting the # activates it. In this way you could set up a generic template script that can be modified as needs require.

Breaking the script down, we have the input, specifying the source filter used and the path and name of the clip in question:
Code:
clip = core.lsmas.LWLibavSource(source=r"/Path......../FullScaleVideo.mov")

So again, in this example we're loading a fullscale 1080/29.970p HD-AVC.mov clip from say a Canon or Nikon DSLR. I've included there the options of using the ffms2 source plugin or the LWLibavSource filter from the L-SmashSource plugin. See my comment in the last post about that.

Next, the colorspace conversion (resampling) of 8-bit YV12 (4:2:0) to YUY2 (4:2:2). For this we're using the fmtconv plugin which performs the resampling via 16-bit. The second line converts (dithers) back to 8-bit:
Code:
clip = core.fmtc.resample (clip, css="422")
clip = core.fmtc.bitdepth (clip, bits=8)

It goes without saying that if these lines are omitted, the script will output YV12. So you could also use this approach for transcoding to YV12 formats, especially if you want to make use of Vapoursynth for other pre-filtering (denoising, sharpening).

Then a line that asserts the frame-rate of the video source. So, in this example:
Code:
clip = core.std.AssumeFPS(clip, fpsnum=30000,fpsden=1001)

Amend this as appropriate for the source, e.g.
fpsnum=25, fpsden=1 for 25fps
fpsnum=50, fpsden=1 for 50fps
fpsnum=60000, fpsden=1001 for 59.94fps

(Note this is not for changing frame-rates. There are other (scripted) functions for that)
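As an aside on why AssumeFPS takes a numerator/denominator pair: the NTSC-family rates are exact rationals that a decimal like 29.97 only approximates. A quick stdlib check:

```python
from fractions import Fraction

# 29.97fps is really 30000/1001, which is not exactly 29.97
ntsc = Fraction(30000, 1001)
drift = abs(float(ntsc) - 29.97)  # small but non-zero
```

That tiny drift is why frame-accurate tools prefer the rational form - over a long clip a rounded float rate would accumulate into audible A/V desync.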

Once set-up, open up the VSEditor Preview to check the script works. There will be error messages in the VSEditor log area if there's an issue. When the script is loaded for the first time (opening Preview) there will be a lag while the index file is generated. So for long clips there can be a bit of a wait before the Preview opens. The index file (.ffindex for ffms2 and .lwi for LWLibavSource) will be created at the same location as the source clip. If for any reason it gets deleted, a new one will be generated when the script is loaded again. And of course a new index file will be generated for a different input clip.

If you also want to check the YUV Histogram at this time then un-hide the line:
Code:
#clip = core.hist.Levels(clip)

Be sure to hide it again, otherwise it will be included with the output.

Once you are happy with the script, name and save it as a .vpy file in whatever Home folder (/Documents or whatever). I've named it FullScale.vpy for the procedures that follow. Next step is to pipe the output to ffmpeg for transcoding to the desired format.

Piping out to FFMPEG

I'll give a few examples here assuming that the intent is to either retain the source audio format or encode it to another audio format compatible with the target video format.

Continuing with the example of a Canon or Nikon 1080/29.97p .mov clip recorded with PCM audio: we want to encode it to DNxHD (8-bit, 220Mbps) in YUY2 colorspace, keeping the original audio, and save it as a .mov file.
FFmpeg command line:
Code:
vspipe -y Path....../Fullscale.vpy - | ffmpeg -f yuv4mpegpipe -i - -i Path......./FullScaleVideo.mov -map 0:v -map 1:1 -c:v dnxhd -b:v 220M -c:a copy -y Path......./FullScaleVideo_DNxHD.mov

Where:
Path.../Fullscale.vpy is the path/name of the VapourSynth script.
Path.../FullScaleVideo.mov is the path/name of the source video, needed to map and remux the audio.
Path.../FullScaleVideo_DNxHD.mov is the path and name of the encoded video file.

And as a second example, let's say we have an HD-AVC 1080/25p .mp4 clip from a Panasonic DSLR, recorded with AAC (LC, 48kHz, stereo) audio, and we want to encode it to DNxHD (8-bit, 185Mbps) with PCM (48kHz, 16-bit, stereo):
Code:
vspipe -y Path....../Fullscale.vpy - | ffmpeg -f yuv4mpegpipe -i - -i Path......./FullScaleVideo.mp4 -map 0:v -map 1:1 -c:v dnxhd  -b:v 185M -c:a pcm_s16le -y Path......./FullScaleVideo_DnxHD_PCM.mov

So then it's just a question of loading the command into a terminal and hopefully, if all has been set up correctly, you should end up with the target transcode file in the destination folder. Some ffmpeg warnings that may arise during the encode are:
Code:
[yuv4mpegpipe @ 0xc52ec0] Stream #0: not enough frames to estimate rate; consider increasing probesize

This is something I'm still looking to resolve, at least when running these encodes in Kubuntu 15.10 AMD64 (wily) with ffmpeg version 2.7.6-0ubuntu0.15.10.1. The encode will complete, but you will not see the continuous update of the encoding rate. Running the same routines in Mint KDE 17.3 I'm not seeing this warning. Since Mint 17.3 doesn't come with ffmpeg, I installed it from: https://launchpad.net/~mc3man/+archive/ ... usty-media and currently have ffmpeg version 7:3.0.0+git~trusty. So maybe an update of ffmpeg version 2.7.6 to 3.0.0 is what's needed. Meanwhile, are there any FFMPEG experts out there who could suggest a workaround - how/where does one increase the probesize?

And you may also see at points in the encoding:
Code:
[yuv4mpegpipe @ 0xc52ec0] Thread message queue blocking; consider raising the thread_queue_size option (current value: 8)

Again, the encode will complete, but to avoid these bottlenecks try what the message advises. I found a thread_queue_size value of 512 to be enough, e.g.

Code: Select all
vspipe -y Path....../Fullscale.vpy - | ffmpeg -thread_queue_size 512 -f yuv4mpegpipe -i - -i Path......./FullScaleVideo.mov -map 0:v -map 1:1 -c:v dnxhd -b:v 220M -c:a copy -y Path......./FullScaleVideo_DNxHD.mov

So there we have it. Sorry if this is pretty elementary stuff for people experienced with ffmpeg, but I thought it would be helpful for those who don't have that experience. I had never done any ffmpeg piping before this, so it was a learning experience for me too.

Still to follow - using VapourSynth for pre-compressing full scale luma sources.

Last edited by Inapickle on Mon Mar 07, 2016 3:07 pm, edited 5 times in total.
Inapickle
Registered Member
Posts
157
Karma
3
OK, referring back to the observations made here:
viewtopic.php?f=272&t=130641&start=15#p350551
and further comments made on this subject:
viewtopic.php?f=272&t=130641&start=45#p351236
viewtopic.php?f=272&t=130641&start=45#p351245
Quote:
TheDiveO wrote:It would be interesting to see if zimage does better scaling ... because that has in some part been a weak spot of MLT and thus Kdenlive people have complained about.


Here is a VapourSynth script for compressing full range luma (0-255) YUV to "limited" range luma (16-235) YUV, or so-called "TV levels".
Taking again the example of, say, a "full range" 1080/29.97p HD-AVC .mov clip from a Nikon/Canon DSLR:
Code: Select all
import vapoursynth as vs
core = vs.get_core()
#clip = core.ffms2.Source("Path......../FullScaleVideo.mov")
clip = core.lsmas.LWLibavSource(source=r"/Path......../FullScaleVideo.mov")
#
#Convert Full range to Limited range:
#clip = core.resize.Point(clip, range_in_s="full", range_s="limited")
clip = core.fmtc.bitdepth(clip, fulls=True, fulld=False)
#
#Convert YV12 to YUY2
#clip = core.fmtc.resample (clip, css="422")
#clip = core.fmtc.bitdepth (clip, bits=8)
clip = core.std.AssumeFPS(clip, fpsnum=30000,fpsden=1001)
#clip = core.hist.Levels(clip)
clip.set_output()

Included are the two options for the "full range" to "limited range" conversion:

The first using VapourSynth's built-in "resizers", which now use the zimg library:
Code: Select all
clip = core.resize.Point(clip, range_in_s="full", range_s="limited")

And the second using the fmtconv plugin:
Code: Select all
clip = core.fmtc.bitdepth(clip, fulls=True, fulld=False)

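Numerically, the full-to-limited compression is just a linear remap of the 8-bit luma codes. Here is a minimal pure-Python sketch of the bare arithmetic only (not the actual fmtconv/zimg code, which works at higher internal precision and can dither) - it shows why naive 8-bit rounding invites banding:

```python
# Naive 8-bit full-range (0-255) to limited ("TV", 16-235) luma remap.
# Bare arithmetic only; fmtconv/zimg use higher internal precision
# and optional dithering, which is why their results can differ.

def full_to_limited(y):
    """Map a full-range 8-bit luma code to limited range."""
    return int(16 + y * 219 / 255 + 0.5)  # round half up

outputs = [full_to_limited(y) for y in range(256)]

print(outputs[0], outputs[-1])  # endpoints: 16 235
print(len(set(outputs)))        # 256 input codes collapse onto 220: banding
```

Since 256 input codes land on only 220 output codes, 36 pairs of neighbouring input levels merge into one, which is the "luma banding" visible in the histograms below; dithering spreads that error around instead of leaving hard steps.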
So let's see how they fare compared to MLT/ffmpeg (using swscale).

Taking the full range YV12 "greyscale gradient" clip that I used in the earlier tests
viewtopic.php?f=272&t=130641&start=15#p350551
Here are the YUV Histograms:

1. Original Full Range source:
http://i.imgur.com/WewbdLl.png

2. Encoded to Matroska.mkv (HuffYuv-YUY2) using the KDenLive transcode pre-set:
http://i.imgur.com/t66g6YD.png

3. Encoded to HuffYuv (YUY2) with ffmpeg CLI:
http://i.imgur.com/PDGP3dQ.png

4. Converted with Vapoursynth to "limited range" YV12 (using built-in resize) and then YUY2 (for fair comparison with 1 & 2):
http://i.imgur.com/yyd3LeL.png

5. Converted with Vapoursynth to "limited range" YV12 (using fmtconv) and then YUY2:
http://i.imgur.com/vhzq2sl.png

And the winner, I would suggest, is VapourSynth using fmtconv. I can't see that the built-in resizer (using the zimg library) really fares any better than MLT/ffmpeg (swscale) when it comes to "luma banding".

OK, so these observations suggest that when "compression" is the better option for editing in KDenLive, pre-conversion with VapourSynth (using fmtconv) may be a better choice than letting MLT/ffmpeg do it.

But what about the other end of the process? I can see no render option in KDenLive for specifying full scale output, and really that should be automatic if the source has been marked as "full range" by virtue of the yuvj420p flag or the forced "Full Luma Range" flag. That's what NLEs like Sony Vegas do: 8-bit YUV sources marked as "full range" are assigned (by default) to what Vegas calls "Studio RGB" - basically the equivalent of the "PC.709" that I referred to earlier; 0-255 range YUV (Y'CbCr) values are mapped to [0,0,0]-[255,255,255] RGB and then mapped back to 0-255 range YUV (Y'CbCr). In this way the original 0-255 scaling is preserved.

What's happening in KDenLive, in effect, is that 0-255 range YUV (Y'CbCr) is mapped to [0,0,0]-[255,255,255] RGB, but mapped back out to (limited) 16-235 range YUV. To my mind, there should at least be an option in the render settings (equivalent to the "Full Luma Range" input flag) to force full range YUV output in such cases. As it stands, the only workaround I can see is to render out to one of the lossless YUV formats and re-scale to full range afterwards. Here is a VapourSynth script for doing just that - basically just the first script in reverse:
Code: Select all
import vapoursynth as vs
core = vs.get_core()
#clip = core.ffms2.Source("Path......../LimitedScaleVideo.mkv")
clip = core.lsmas.LWLibavSource(source=r"/Path......../LimitedScaleVideo.mkv")
#
#Convert Limited range to Full range:
#clip = core.resize.Point(clip, range_in_s="limited", range_s="full")
clip = core.fmtc.bitdepth(clip, fulls=False, fulld=True)
#clip = core.hist.Levels(clip)
clip.set_output()

And for interest's sake, here are the YUV histograms when we do a 'round trip' conversion - full-range YV12 > limited range > YUY2 > full-range YUY2 - using VapourSynth:

1. Using the built-in (zimg) resizer:
http://i.imgur.com/RiBEoV8.png
Ouch!!

2. Using fmtconv:
http://i.imgur.com/Spd1JlL.png
A lot better, no?
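That "Ouch!!" has a simple arithmetic core: with a plain, undithered point conversion, the codes merged on the way down cannot be recovered on the way up, so the re-expanded histogram is full of gaps. A small pure-Python sketch of the naive round trip (again, just the bare remap, with none of fmtconv's dithering):

```python
# Naive, undithered 8-bit luma round trip:
# full (0-255) -> limited (16-235) -> back to full (0-255).

def compress(y):
    return int(16 + y * 219 / 255 + 0.5)

def expand(y):
    return int((y - 16) * 255 / 219 + 0.5)

survivors = {expand(compress(y)) for y in range(256)}
print(len(survivors))  # only 220 of the 256 codes come back; the rest are gaps
```

fmtconv fares better precisely because it dithers, trading those hard histogram gaps for noise that is far less visible.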

One final note. Please don't apply the above compression routines to 16-255 range sources; it will result in this:

http://i.imgur.com/9F53RlI.jpg
Compressed to 32-235 range - not good.

As we observed earlier, these 16-255 sources (native clips and their transcodes) behave differently from full-scale sources in KDenLive. When no effects are applied, the 16-255 range is "passed through" and is only "clamped" (not compressed) to 16-235 when an "RGB-requiring" effect/transition is applied.

If it is considered desirable to uniformly "pre-clamp" to limited range using VapourSynth, use this conversion line instead:
Code: Select all
clip = core.std.Limiter(clip, min=16, max=235, planes=0)
This will produce the desired result and the original luma gradient will be preserved:
http://i.imgur.com/Alc6xoJ.jpg
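The clamp/compress distinction is worth spelling out in numbers: clamping leaves every in-range code untouched and only pins the out-of-range ones, whereas compression rescales everything. A pure-Python illustration of what the Limiter line above does per luma sample (a sketch of the arithmetic, not the plugin's actual implementation):

```python
# Clamp ("limit") luma to 16-235: in-range values pass through
# untouched, out-of-range values are pinned to the nearest bound.

def clamp(y):
    return min(max(y, 16), 235)

print(clamp(10), clamp(100), clamp(240))  # -> 16 100 235
```

This is why clamping preserves the 16-235 portion of a 16-255 gradient exactly, discarding only the super-white detail, while the compression routine above squeezes the whole gradient downward.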

As discussed earlier:
viewtopic.php?f=272&t=130641&start=15#p350047
Really, the only case for compressing (downscaling) 16-255 range luma down to 16-235 is in special situations where "bringing down" highlight detail that may be lurking in the "super-white" range in some way 'saves' or significantly improves a particular shot/scene. The penalty is some loss of contrast (flattening), as the slope of the luma curve is lowered. Note that this cannot be done with levels or curves once the clip is on the timeline - at that point it is too late, as the "super-whites" will already have been clipped (clamped) by the conversion to RGB. It needs to be done by a prior levels adjustment in YUV (Y'CbCr) colorspace, and in VapourSynth that can be done using the built-in Levels function, like so:
Code: Select all
clip = core.std.Levels(clip, min_in=16, gamma=1.0, max_in=255, min_out=16, max_out=235, planes=0)
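The Levels call above is a straight linear remap of the 16-255 input range onto 16-235 (gamma 1.0), so the slope drops from 1.0 to 219/239, roughly 0.92 - that is the contrast flattening just mentioned. As bare arithmetic (a sketch of the mapping only, not of how std.Levels rounds or scales internally):

```python
# Linear Levels remap: input range 16-255 -> output range 16-235,
# gamma 1.0. Slope is 219/239 < 1, hence the slight loss of contrast.

def levels(y, min_in=16, max_in=255, min_out=16, max_out=235):
    scale = (max_out - min_out) / (max_in - min_in)
    return int(min_out + (y - min_in) * scale + 0.5)

print(levels(16), levels(255))  # endpoints: 16 235
print(levels(235))              # the old legal white point lands at 217
```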
And the result:
http://i.imgur.com/U0zsV3c.jpg
Well, I guess to the naked eye it's OK, but notice that luma spiking in the YUV histogram again. Fortunately, there is another way, using a scripted import of the AviSynth SmoothLevels filter:
http://i.imgur.com/eg1pjuh.jpg
Now that's better. But it requires a bit more setup. I'll come back to that shortly.

Last edited by Inapickle on Wed Mar 09, 2016 7:50 pm, edited 3 times in total.

