Hope this help you to explain Hi-Res music to your CD friends
Status
Not open for further replies.
May 8, 2024 at 9:04 PM Post #421 of 517
I have to read Amir's post a bit closer, I was in a bit of a hurry... but going by your response, 56% is a PASS because you have breached the threshold beyond random chance, given enough tests (n), the probability, the confidence intervals, etc.



I'm not sure where you are getting that one cannot take their time with these tests; you just need n=160.

Typical - you cherry picked from a post you didn’t bother to read. Good to know that you believe any single result from a Google search is an accurate representation of, well, anything. No need for validation or cross reference - someone on the internet posted something, so research complete…

You can take all the time you want with an ABX but to be statistically relevant, the tests need to be done in a single sitting. 160 tests would require comparing a total of 320 individual samples (2 per test). I stand by my statement that few if any ABX test subjects have the stamina to accomplish that in a single session.

That said, please go ahead and have a properly proctored ABX session with 160 comparisons and share the results. Or take 16 properly proctored ABX sessions with 10 tests per session and show that you can achieve an 80% or greater success rate in each of the 16 sessions.

Then repeat each at least 10 times to validate that a single 160 test session wasn’t a statistical aberration. 1600 data points might be enough to provide an interesting analysis.
 
May 8, 2024 at 9:05 PM Post #422 of 517
That is, it is your real experience so it is OK to tell me how well two components will synergise but at the same time you know (but don't say at that time) there is no basis in fact that there is any genuine synergy that could be confirmed by testing.

To me that is burying one's head in the sand and passing on factually incorrect information. That is assuming, of course, that it is portrayed as "fact", as is so often the case on HF. Nobody says they found that $1,000 cable X was fabulous with IEM Y because it looks cool and cost a lot, so they imagined that they sounded incredible together.

Agreed with everything you said!

To me personally, when I talk to a person who is genuinely interested in the hobby, I don’t put my audio science hat on; I just let them try out a few pieces of gear within their price range. It's off-putting and weird for me to say to a newcomer, "yeah, you're wasting money on most of them since they all sound the same under a volume-matched DBT ABX test." Just let the person explore and enjoy the hobby within reason.

When somebody says, "yeah, I spent 1K on cable X for IEM Y," they are genuinely expressing their own perception of the synergy between those two, not saying that they look great, because you can literally go to AliExpress and buy far better looking $20 cables than the 1K-8K cables on the market.

The BIG caveat is that they must put the YMMV at the end of the sentence
 
May 8, 2024 at 9:10 PM Post #423 of 517
Thanks for your reply

Thanks for your suggestion. I agree 100% with what you said. I do agree that I replied a bit too fast, as my mind is running too fast... I will dial it down a bit (one of the AI configs? LOL). I even waited for hours before posting this reply.
Meanwhile, I think I should give you guys more time to think, as what I brought here may be very different from what other posters bring. Given my background, I can look at an issue from very different angles.

Here I use the plural form of the word angle. I can see it from Physics (the hard-core scientific angle), from Psychology (how people feel, think, and behave; how confirmation bias and visual/audio illusions affect one's behaviour, etc.), and from Computer Science (the practical angle: various limitations, e.g. quantization limits, limitations of floating point representation, measurement tolerance, the difference between accuracy and precision, etc.).

It looks to me like a lot of people have a tendency to look at an issue from a single angle, as most people are specialized in one area.

Audio Science, like neuroscience (another subject I like), is a multidisciplinary subject. It is very hard to master, as it involves a lot of expertise in other areas, as @gregorio mentioned earlier in his reply. Don't get me wrong: I am not an expert in Audio Science (as I emphasized earlier). I think I am still at kindergarten level regarding Audio Science (well... maybe at primary school level now). Therefore, I want to learn from you guys about Audio Science. Meanwhile, I would like to share what I have found out too.

Incompatible contents? Are these contents factual?

I know the contents I brought are incompatible with the beliefs of most people here.
I experienced the same uneasy/weird feeling too, as I couldn't believe what I was seeing when I first saw the real stair-step output graph from the Topping E30 shown in the original article. At that time, I had a feeling that "something is wrong...."

My understanding of digital audio told me that there should not be any stair-step audio output. I could have just walked away from that feeling and concluded with something like this to myself: "Hmm... yes, something is wrong... the graph must be fake, or the graph is taken from some internal test point of the DAC rather than the final audio output, or the author must be using some special, broken, weird filter to create such a stair-step graph intentionally to prove something..."

However, my critical thinking said no to such a response and prompted me to find out more about the root cause of this "uneasy/weird" feeling. I had a feeling that I needed to double-check everything involved, even very fundamental understandings/beliefs like "1+1=2".

Bingo! I found it. It was indeed related to the "1+1=2" belief in sampling theory. I was wrong because I overlooked the limitation of the theory, i.e. it only works perfectly in the ideal situation. It is compromised in real-world situations. This shows me again that our deep beliefs regarding concepts like "1+1=2" or "universal truth" could be "correct, but not absolutely".

I feel much better now. Everything is solved. My critical thinking helps me again. It never fails.

p.s.: Hmm... could an AI have created the text above? :thinking:...... I think it could NOW, as this reply has been posted and an AI bot can use my writing to create its content. LOL

Have any balsamic and olive oil for that word salad?

Good luck with whatever you’re trying to accomplish here, but I value my time too much to waste it engaging with you.
 
May 8, 2024 at 9:12 PM Post #424 of 517
Agreed with everything you said!

To me personally, when I talk to a person who is genuinely interested in the hobby, I don’t put my audio science hat on; I just let them try out a few pieces of gear within their price range. It's off-putting and weird for me to say to a newcomer, "yeah, you're wasting money on most of them since they all sound the same under a volume-matched DBT ABX test." Just let the person explore and enjoy the hobby within reason.

I see where you are coming from, like I later added to my previous comment, your approach is more philosophical and mine more pragmatic.

If somebody asks, for example, if a certain DAC is going to add this or that to the sound versus another DAC, I tell them that based on my experience they are either going to make no audible difference at all or (being very generous) they will sound a great deal more the same than different. I say that because that is my experience and my firm belief, not because science says it should be so and I am regurgitating others' ideas. Regurgitating ideas seems to be a very common trait with this stuff.
 
May 8, 2024 at 9:17 PM Post #425 of 517
"Hmm... yes, something is wrong... the graph must be fake, or the graph is taken from some internal test point of the DAC rather than the final audio output, or the author must be using some special, broken, weird filter to create such a stair-step graph intentionally to prove something..."

The graph is not fake. A DAC without a digital filter, or in this case with the AKM super slow roll-off filter, literally outputs a stair-step sine wave in the time domain! That's the final reconstructed audio output in the time domain, which needs further cleanup through extensive, expensive, and inaccurate analog filtering of images, quantization noise from the transition band, etc., with I/V output transformers, tubes, coupling capacitors, etc.!
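As a rough illustration of the stair-step shape described above, here is a minimal sketch of a zero-order hold, which simply repeats each sample until the next one arrives. The 10 kHz tone, the hold factor of 8, and every function name are assumptions chosen for illustration; this is not a model of the Topping E30 or of any AKM filter, and it says nothing about audibility.

```python
import math

SAMPLE_RATE = 44_100   # CD sample rate in Hz
TONE_HZ = 10_000       # illustrative test tone (an assumption)
HOLD_FACTOR = 8        # output points drawn per input sample (an assumption)


def sampled_sine(n_samples):
    """Sample a unit-amplitude sine of TONE_HZ at SAMPLE_RATE."""
    return [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)
            for n in range(n_samples)]


def zero_order_hold(samples, factor):
    """Repeat each sample `factor` times, producing a stair-step waveform."""
    return [s for s in samples for _ in range(factor)]


samples = sampled_sine(32)
stairs = zero_order_hold(samples, HOLD_FACTOR)

# The same sine evaluated on the finer time grid, for comparison.
fine_rate = SAMPLE_RATE * HOLD_FACTOR
smooth = [math.sin(2 * math.pi * TONE_HZ * t / fine_rate)
          for t in range(len(stairs))]

# Largest instantaneous gap between the held steps and the smooth sine.
worst_gap = max(abs(a - b) for a, b in zip(stairs, smooth))
print(f"largest gap between stair-step and smooth sine: {worst_gap:.3f}")
```

The gap between the two waveforms is what a reconstruction filter, digital or analog, has to remove.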
 
May 8, 2024 at 9:19 PM Post #426 of 517
I see where you are coming from, like I later added to my previous comment, your approach is more philosophical and mine more pragmatic.

If somebody asks, for example, if a certain DAC is going to add this or that to the sound versus another DAC, I tell them that based on my experience they are either going to make no audible difference at all or (being very generous) they will sound a great deal more the same than different. I say that because that is my experience and my firm belief, not because science says it should be so and I am regurgitating others' ideas. Regurgitating ideas seems to be a very common trait with this stuff.

That's natural, you express what you experience to a newcomer in the hobby, and that's what ultimately it boils down to without being weird or elitist!
 
May 8, 2024 at 9:20 PM Post #427 of 517
Have any balsamic and olive oil for that word salad?

Good luck with whatever you’re trying to accomplish here, but I value my time too much to waste it engaging with you.

Not sure I read the salad correctly but I think the surprise tasty little crouton was that he was wrong.

The narcissism won't allow outright admission; it has to be tangled up in further self-congratulation.
 
May 8, 2024 at 9:56 PM Post #428 of 517
Typical - you cherry picked from a post you didn’t bother to read. Good to know that you believe any single result from a Google search is an accurate representation of, well, anything. No need for validation or cross reference - someone on the internet posted something, so research complete…
You can read it for yourself, as I posted the link.
56% for a 160 sample test
63% for a 40 sample test
70% for a 20
80% for a 10
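For anyone who wants to check pass marks like these, the sketch below computes the one-sided binomial probability of reaching a given score by pure guessing, plus the smallest score that gets under a chosen significance level. The 0.05 level and the function names are assumptions, not something taken from the linked table. Under this strict cut-off the thresholds can land a trial or so above rounded percentages; 8/10, for example, works out to roughly 0.055, just over 0.05.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by guessing."""
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


def min_correct(trials: int, alpha: float = 0.05) -> int:
    """Smallest score whose guessing probability is at or below `alpha`."""
    for correct in range(trials + 1):
        if abx_p_value(correct, trials) <= alpha:
            return correct
    raise ValueError("alpha is too strict for this number of trials")


if __name__ == "__main__":
    for n in (10, 20, 40, 160):
        k = min_correct(n)
        print(f"{n:>4} trials: {k} correct ({k / n:.1%}), "
              f"p = {abx_p_value(k, n):.4f}")
```

The required percentage falls as the trial count rises because the spread of a guesser's score narrows with more trials.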


You can take all the time you want with an ABX but to be statistically relevant, the tests need to be done in a single sitting
Please explain why that is statistically relevant.

160 tests would require comparing a total of 320 individual samples (2 per test). I stand by my statement that few if any ABX test subjects have the stamina to accomplish that in a single session.

Are you testing stamina of the listener or whether they can discriminate audio formats?

If the former, I understand why you want one session. If the latter, I don't understand the relevance of the insistence of it being a single session.


That said, please go ahead and have a properly proctored ABX session with 160 comparisons and share the results. Or take 16 properly proctored ABX sessions with 10 tests per session and show that you can achieve an 80% or greater success rate in each of the 16 sessions.
Whether one takes 160 tests all at once, or 10 x 16 for a cumulative 160 total, the PASS should still be 56%, unless you can provide a reason WHY multiple sessions would negate the statistical significance of the test??
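To put numbers on the two criteria being argued about, here is a minimal sketch comparing (a) one pooled total of 160 trials judged at about 56% with (b) scoring 8/10 or better in every one of 16 sessions. It assumes each trial is an independent 50/50 guess under the null hypothesis, so it does not address whether perception drifts between sittings; the variable names are illustrative.

```python
from math import comb


def guess_tail(correct: int, trials: int) -> float:
    """Probability of at least `correct` hits in `trials` fair coin flips."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


# Criterion (a): 160 pooled trials, pass mark of 90 correct (about 56%).
pooled = guess_tail(90, 160)

# Criterion (b): 8/10 or better in each of 16 independent sessions.
single_session = guess_tail(8, 10)
every_session = single_session ** 16

print(f"guesser reaches 90/160 pooled:         {pooled:.4f}")
print(f"guesser reaches 8/10 in one session:   {single_session:.4f}")
print(f"guesser reaches 8/10 in all 16:        {every_session:.3e}")
```

Under these assumptions the pooled count does not care how the 160 trials are split across sittings, while demanding 80% in each of 16 sessions is a far stricter test than 56% overall.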


Then repeat each at least 10 times to validate that a single 160 test session wasn’t a statistical aberration. 1600 data points might be enough to provide an interesting analysis.
With 1600 samples in a test, the passing mark is somewhere in the low 50s percentage-wise, even lower than the 56% of the 160-sample test.
 
May 8, 2024 at 10:13 PM Post #429 of 517
First let me say this is just supposition but I think it makes sense in respect of testing at one sitting.

I think we all understand that we hear differently or at least perceive differently at different times depending on a raft of factors such as mood, health etc.

Ideally, if comparing two audio-related items, whatever they may be, we would want all the testing done at one specific moment in time where our perception is fixed. That obviously can't happen, so the next best thing is to do it in one sitting over a fairly short space of time so that our perception is as close to the same as possible at each individual test listen.

Extend the concept out over an unreasonable timeframe and I think the context makes sense. Say you did 50 tests one test a week for a year. You could obviously have potentially quite different perceptions of sound on each occasion which would potentially skew the outcome of such a test.
 
May 8, 2024 at 10:21 PM Post #430 of 517
Apologies, a little OT but think it’s tangentially relevant to this conversation:

An example of how information sources that the ML/AI solution developers choose to target can influence machine learning responses.

I suspect ChatGPT has its training pointed at the general internet, with thousands of audiophile sites vs. only a few audio science based sources. The following was using the older 3.5 Large Language Model. I’ll run the same through ChatGPT’s 4.0 LLM later to see if the current model “does better”. Bottom line: AI/ML solutions are only as good as their discovery/data accumulation models and the specific wording of the user’s question, and at this time they should not be assumed to be authoritative.

Q asked:
Is high resolution audio beneficial to listeners vs. CD quality?

ChatGPT 3.5 response:
High-resolution audio can offer benefits to listeners compared to CD quality, especially for those with high-end audio equipment and discerning ears. It provides greater detail, depth, and dynamic range, potentially enhancing the listening experience. However, the difference may not be significant for all listeners or all types of music, so it depends on individual preferences and equipment quality.

Sigh. Thank you TAS, Stereophile, cable vendors, and audiophile sites. Yet another reminder to not use any single source when looking for accurate information.
Cool, I did a similar experiment with Microsoft Copilot:

Screenshot 2024-05-09 at 10.08.09.png


Screenshot 2024-05-09 at 10.07.39.png

Screenshot 2024-05-09 at 10.08.01.png


Screenshot 2024-05-09 at 10.18.06.png


Hmm... is GPT-4 better than ChatGPT 3.5? I am not an AI expert. No comment.
 
May 8, 2024 at 10:30 PM Post #431 of 517
Cool, I did a similar experiment with Microsoft Copilot:








Hmm... is GPT-4 better than ChatGPT 3.5? I am not an AI expert. No comment.
Sorry, I was not using the exact question. Let's do it again with the exact question:

Screenshot 2024-05-09 at 10.27.51.png

Screenshot 2024-05-09 at 10.27.17.png

Screenshot 2024-05-09 at 10.27.43.png

Interesting... Please note these are Copilot's views, not mine.
 
May 8, 2024 at 10:33 PM Post #432 of 517
You can read it for yourself, as I posted the link.
56% for a 160 sample test
63% for a 40 sample test
70% for a 20
80% for a 10



Please explain why that is statistically relevant.



Are you testing stamina of the listener or whether they can discriminate audio formats?

If the former, I understand why you want one session. If the latter, I don't understand the relevance of the insistence of it being a single session.



Whether one takes 160 tests all at once, or 10 x 16 for a cumulative 160 total, the PASS should still be 56%, unless you can provide a reason WHY multiple sessions would negate the statistical significance of the test??



With 1600 samples in a test, the passing mark is somewhere in the low 50s percentage-wise, even lower than the 56% of the 160-sample test.

I did read it for myself, apparently before you did. Your original post was simply 56%=Pass, with no reference to test sample count. I even referenced the 10 test, 80% Pass from the link. Thanks for catching up though.

Without deep diving into the basics of statistical analysis, any single small sample is at risk of being unrepresentative. Multiple independent tests using the same controls would allow for averaging and, more importantly, a look at the standard deviation, which would add more reliability to the data.
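The point about small samples can be made concrete with the standard deviation of a guesser's score. This is a minimal sketch assuming independent trials with a 50% chance of a correct answer; the function name is illustrative.

```python
from math import sqrt


def guess_score_sd(trials: int, p: float = 0.5) -> float:
    """Standard deviation of the fraction correct over independent trials."""
    return sqrt(p * (1 - p) / trials)


for n in (10, 20, 40, 160, 1600):
    sd = guess_score_sd(n)
    print(f"{n:>5} trials: a guesser averages 50% with a spread of about {sd:.1%}")
```

Because the spread shrinks with the square root of the trial count, a single 10-trial session tells you far less than the same listener's results accumulated over many sessions.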

That said, it’s unrealistic to hold individual audio enthusiasts to the same rigor as medical testing. If you take an independently proctored ABX with proper controls, post the results - that would make the discussion much more interesting.

Edited to include post responded to
 
May 8, 2024 at 10:46 PM Post #433 of 517
Cool, I did a similar experiment with Microsoft Copilot:








Hmm... is GPT-4 better than ChatGPT 3.5? I am not an AI expert. No comment.

Good question about 3.5 vs. 4.0

Yes, GPT4 is better than GPT3.5 when used by ChatGPT. Microsoft Copilot uses a similar architecture to ChatGPT, but does not utilize as full an implementation as ChatGPT and is also very early in its initial public release. I don’t think one could definitively say that the current Copilot implementation is better than ChatGPT using the 3.5 engine and LLM. To determine that, you would need to know much more about both products' LLMs and specifics about how those models have been trained. Anyone with that level of information about MS Copilot would certainly also have an NDA and wouldn’t be able to reveal those details.

One limitation of ChatGPT using 3.5 is that the 3.5 LLM was frozen at the end of 2022, so anything published after that is outside of the scope of its responses, though I don’t think that’s going to have much impact with anything regarding baseline audio science. Something to be aware of though.
 
May 8, 2024 at 10:58 PM Post #434 of 517
Agreed. A simple reworking of the question gets a different response. Still not accurate, but at least less definitive.

Follow on to earlier Q.
This response is driven by marketing material. Please rewrite using only peer reviewed and evidence based sources
ChatGPT 3.5
Studies comparing high-resolution audio to CD quality have shown mixed results regarding perceptible differences to listeners. While some research suggests that individuals with trained ears and high-quality equipment may perceive improved sound quality with high-resolution audio, other studies have found no significant difference in subjective listening tests. Factors such as listener preferences, room acoustics, and equipment quality can also influence perceived differences. Therefore, the benefits of high-resolution audio compared to CD quality remain subjective and may vary among listeners.
Is the phrase highlighted in red attempting to guide the answer (aka manipulate the result)?

IMO, the question is already biased. A better one would be just "This response is driven by marketing material. Please rewrite using only peer reviewed and evidence based sources"

Agree?
 
May 8, 2024 at 11:12 PM Post #435 of 517
I did read it for myself, apparently before you did. Your original post was simply 56%=Pass, with no reference to test sample count. I even referenced the 10 test, 80% Pass from the link. Thanks for catching up though.

Without deep diving into the basics of statistical analysis, any single small sample is at risk of being unrepresentative. Multiple independent tests using the same controls would allow for averaging and, more importantly, a look at the standard deviation, which would add more reliability to the data.

That said, it’s unrealistic to hold individual audio enthusiasts to the same rigor as medical testing. If you take an independently proctored ABX with proper controls, post the results - that would make the discussion much more interesting.

Edited to include post responded to

I can't even pass the Golden Sound test files in a DBT ABX test myself. I just tried for the second time (the first time I managed 8/20). Interestingly enough, this time I managed to get 7 entries right. Throughout the test I thought I heard the difference, and apparently I was able to identify it correctly on those, but I was fully unaware at the time that I had actually got them right lol

Code:
foo_abx 2.2.1 report
foobar2000 v2.1.5
2024-05-08 19:48:07

File A: Test A (High Performance Filter).wav
SHA1: d626785e576b21b988a3ff3c59f85d3de27ed86d
File B: Test B (Normal Filter).wav
SHA1: 6cefd9bc846b7ba69d2bb06a869596cb740a4c0e

Output:
Default : Primary Sound Driver
Crossfading: NO

19:48:07 : Test started.
19:49:24 : 01/01
19:50:31 : 02/02
19:51:33 : 03/03
19:52:34 : 04/04
19:53:36 : 05/05
19:55:01 : 06/06
19:56:24 : 07/07
19:57:31 : 07/08
19:58:33 : 08/09
19:59:34 : 08/10
20:00:07 : 09/11
20:01:38 : 09/12
20:02:40 : 09/13
20:03:22 : 09/14
20:04:23 : 09/15
20:05:45 : 09/16
20:06:18 : 09/17
20:07:39 : 09/18
20:08:52 : 10/19
20:09:56 : 10/20

 ----------
Total: 10/20
p-value: 0.5881 (58.81%)

 -- signature --
e6633dff77f911900eabdf5493b43ce5257922bc
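The p-value in a report like this can be double-checked by hand. The sketch below pulls the final score out of a few lines copied from the log above and recomputes the chance of doing at least that well by guessing. The regular expression and function names are assumptions about the log layout shown here, not a documented foo_abx format.

```python
import re
from math import comb

# A few lines copied from the report above; the full log parses the same way.
LOG_EXCERPT = """\
19:48:07 : Test started.
19:49:24 : 01/01
19:57:31 : 07/08
20:09:56 : 10/20
"""

# Matches trial lines of the form "HH:MM:SS : correct/total".
TRIAL_LINE = re.compile(r"^\d{2}:\d{2}:\d{2} : (\d+)/(\d+)$")


def final_score(log_text: str) -> tuple[int, int]:
    """Return (correct, total) from the last trial line in the log."""
    score = None
    for line in log_text.splitlines():
        match = TRIAL_LINE.match(line.strip())
        if match:
            score = (int(match.group(1)), int(match.group(2)))
    if score is None:
        raise ValueError("no trial lines found")
    return score


def guessing_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` hits in `trials` 50/50 guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


correct, trials = final_score(LOG_EXCERPT)
p = guessing_p_value(correct, trials)
print(f"{correct}/{trials}: p = {p:.4f}")  # prints "10/20: p = 0.5881"
```

The recomputed value agrees with the 0.5881 in the report, which is consistent with the reported figure being a one-sided binomial tail.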
 