60 FPS is not fast enough to be accurate in these tests. You will be out by 16.6 ms at best, so you wouldn't see the 4 ms difference that can occur between them. I am talking about the UI-to-move split I mentioned, not the input delay as a whole.
And I tested movement, not shooting, with the same result: the gap can be 33-50 ms (your capture speed is likely adding the extra 16 ms you get here). The game (demo) also varies with the game load and the action in question.
Just so I am clear: you intercept the device feed, add a coloured pixel line when input is sent, show it on screen, and then count in 16 ms steps until the screen updates?
The way the device works is pretty much that. However, depending on how far down the screen the coloured bar initially appears, you can use that to determine timings less than one frame.
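To illustrate the sub-frame idea, here is a rough sketch of the arithmetic. It assumes the display is drawn top to bottom at a constant rate, ignores vertical blanking, and assumes the game's response is visible from the top of the frame in which it first appears. The function names, the 1080-row frame height and the 60 Hz refresh rate are placeholders for illustration, not details of the actual device.

```python
def subframe_offset_ms(bar_row, total_rows=1080, refresh_hz=60.0):
    """Estimate how far into the frame the input arrived, from the row
    where the coloured bar starts (row 0 is the top of the screen).

    Assumes a constant top-to-bottom scan and ignores vertical blanking.
    """
    if not 0 <= bar_row < total_rows:
        raise ValueError("bar_row must lie inside the frame")
    frame_ms = 1000.0 / refresh_hz
    return frame_ms * bar_row / total_rows


def latency_ms(frames_until_response, bar_row, total_rows=1080, refresh_hz=60.0):
    """Estimate input latency: whole frames counted from the frame where the
    bar appears to the frame showing the response, minus the part of the
    first frame that had already been drawn when the input arrived.
    """
    frame_ms = 1000.0 / refresh_hz
    offset = subframe_offset_ms(bar_row, total_rows, refresh_hz)
    return frames_until_response * frame_ms - offset


if __name__ == "__main__":
    # Bar starts halfway down a 1080-row frame, response seen 4 frames later.
    print(round(subframe_offset_ms(540), 2))
    print(round(latency_ms(4, 540), 2))
```

With the bar starting halfway down the screen, the offset comes out at roughly 8.3 ms, so a response seen four frames later works out to roughly 58.3 ms rather than a flat 66.7 ms.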
The camera method was a quick way to demonstrate a significant difference. I would not use that setup to do sub-frame accurate recordings. However, I do think that a 60fps recording is good enough to demonstrate a 66ms difference in a 30fps game, particularly in support of similar data from another method.
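As a quick sanity check on that reasoning, the sketch below converts a time difference into a number of capture frames. The "resolvable" test is my own simplification (a difference counts as visible only if it is longer than one capture frame), not an established criterion, and the function names are made up for illustration.

```python
def frames_spanned(delta_ms, capture_fps=60.0):
    """Number of capture frames that a time difference covers."""
    return delta_ms * capture_fps / 1000.0


def resolvable(delta_ms, capture_fps=60.0):
    """Simplified check: the difference is longer than one capture frame."""
    return delta_ms > 1000.0 / capture_fps


if __name__ == "__main__":
    for delta in (66.0, 4.0):
        print(delta, round(frames_spanned(delta), 2), resolvable(delta))
```

A 66 ms difference covers about four frames of a 60fps recording, so it shows up even with an error of a frame either way. A 4 ms difference covers about a quarter of a frame, which is why the camera setup is not suitable for sub-frame work.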
The way I have it set up does not give me easy access to movement, hence the reason I tested three other things that are attached to button presses. The digital nature also removes any possible difficulty in determining at what point a button is pressed because of dead zones in analogue sticks.
Perhaps you could go back and test the results you get for shooting?