Software induced jitter: how long is a piece of string? (Part 2)

Part 1 gives some jitter basics and introduces the value chain (“string”) of audio playback. Jitter’s domination ends at the analogue output within DAC chips. Likewise, jitter distortion during recording ends within the ADC chip. That is, during recording no further jitter distortion is added after an analogue signal is digitized. Audio processing applied to this digital data (e.g. dithering or gain) will affect sound quality, but such (audio data) processing does not add jitter (as long as everything is kept in the digital domain).

Treatments that act to reduce playback jitter fully apply to recording and vice versa. A clean power supply, no vibrations, a constant temperature, etc. all matter. Software (the OS together with the player/recorder) also impacts both by way of jitter induced distortion. What is not part of this discussion is audio data changed by software (intentionally or not) in a way that affects sound quality. The issue covered here is why the exact same audio data, presented to the DAC and sourced from the same location but streamed through different software and OS, can result in sound differences.

Before delving into jitter effects of software, it’s important to explore the value chain further. Part 1 covers devices acting to shield the DAC from incoming jitter distortion. Transformer coupling, buffering and reclocking are done with audio data commonly sourced from USB. Such designs do yield good results but suffer from a 96k input limitation and a performance ceiling, namely their own intrinsic jitter. Any superior signal (with lower jitter) sent to such a device will bring only minor benefit. Hence “I don’t hear any differences when making changes to my PC” comments from owners of such devices are common.

A common alternative is audio playback from an HDD using a regular soundcard. Streaming to the DAC (via a soundcard connected by PCI, PCIe, Firewire or USB) takes the following path:

HDD (SATA/IDE/RAID) > Chipset > RAM > CPU (software) > RAM > Chipset > Soundcard (XO) > DAC


Network playback offers:

Ethernet > Chipset > RAM > CPU (software+netware) > RAM > Chipset > Soundcard (XO) > DAC


Both HDD and Network playback methods act to impede streaming of audio data: whilst the soundcard retrieves audio data, the playback software is concurrently reading the next audio data using the same resources (RAM and Chipset). Network playback incurs additional OS software (netware) and device/component overheads.

A memory player’s advantage is avoiding this resource conflict:

RAM > CPU (software) > RAM > Chipset > Soundcard (XO) > DAC


It also offers a shorter playback chain with fewer devices and less associated software. This design is optimal and lends itself to a direct 192k I2S DAC chip interface, avoiding Firewire/USB, SPDIF or AES. It’s also the platform that is most revealing of software changes.
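The memory player idea can be sketched in a few lines of Python. This is only an illustration of the principle (read everything into RAM first, then stream from RAM alone), not how any particular player does it. The function names, the 10,000 byte dummy payload and the 4096 byte chunk size are all assumptions made for the example.

```python
import os
import tempfile


def load_into_ram(path):
    """Read the whole file once so later streaming never touches the disk."""
    with open(path, "rb") as f:
        return f.read()


def stream_from_ram(data, chunk_bytes):
    """Yield fixed-size chunks from the in-memory copy (the last may be shorter)."""
    view = memoryview(data)
    for start in range(0, len(view), chunk_bytes):
        yield bytes(view[start:start + chunk_bytes])


def demo():
    # Hypothetical stand-in for an audio file: 10,000 bytes of dummy data.
    payload = bytes(range(256)) * 39 + bytes(16)
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        tmp.write(payload)
        tmp.close()
        data = load_into_ram(tmp.name)
    finally:
        os.remove(tmp.name)
    # The file is gone by now; everything below is served from RAM only.
    return list(stream_from_ram(data, 4096))


chunks = demo()
print(len(chunks), [len(c) for c in chunks])
```

The point of deleting the file before streaming is simply to show that the HDD plays no further part once the data is loaded.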

A “headless” PC is applicable to all options and aims to remove the video/display, keyboard and mouse of the streaming PC. The most optimal headless option (but cumbersome) is to achieve control via the PC’s printer (parallel) port (using a simple text based device or another PC). Serial ports and Ethernet should be avoided due to their continuous polling nature. Event driven designs are better.

Understanding Software

The CPU (Central Processing Unit) is the most complex part of the equation and its optimization has profound bearing on sound. “CPU (software)” in the above context includes the FSB & Memory Controller allowing for RAM I/O. Highly optimal systems have CPUs consuming ~50% or less of total power. Less optimal (and most common) setups will have CPUs (together with GPUs) consuming in excess of 80%! Heat generation is yet another indicator: poor setups require fan based cooling. Over-clockers resort to expensive water based cooling solutions. Super-computers have used liquid helium or nitrogen cooling. The holy grail of computing is high temperature superconductivity.

Software (through instructions) controls what the CPU does, thus dictating how much power, resource arbitration, error handling and signaling (including signal reflection complexities) is needed. Modern CPUs such as Intel’s Core Duo contain in excess of 100 discrete components, e.g. L1 (Data & Instruction) Cache and L2 Shared Cache. Lesser known components are things like GPRs (general purpose registers, up to 16 per core), Control Registers, Execution Units, Loop Cache, other front-end components (instruction prefetcher, decoder, etc.) and TLBs (translation lookaside buffers).

At the physical level each software instruction is decoded into one or more µops (micro-ops) that are understood and processed by specialized Execution Units. Each clock cycle is able to complete 3-4 µops concurrently (i.e. a single core acts like 3-4 cores). Aggressive instruction pipelining is done together with “out of order execution” and macro-fusion (combining pairs of instructions into a single µop) to achieve amazing levels of performance and concurrency. Software in execution should be seen as a dynamic ever-changing electrical circuit at work.

The act of launching a program (for example, double-clicking a desktop icon) results in the OS (Linux, Windows, Mac, etc.) loading the program into RAM, creating a new process with its own task (thread) and adding it to the CPU Dispatcher’s list. The CPU Dispatcher allocates a fixed amount of CPU time in a round robin fashion to each and every active thread in the system. Interrupt processing (including the deferred work Windows calls DPCs) has highest priority, followed by threads, which are grouped into priority classes. The requested program eventually gets CPU “airtime” and is able to perform its intended task, e.g. prompt the user for a media file to be played.
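A toy model makes the round robin point concrete. The sketch below is a deliberately simplified simulation, not a real OS scheduler: it ignores priorities and interrupts entirely, and the thread names, workloads and time slice are made up for illustration.

```python
from collections import deque


def round_robin(work, quantum):
    """Simulate a round robin dispatcher.

    work maps a thread name to the CPU time units it still needs.
    Returns the order in which slices were handed out as (name, units) pairs.
    """
    queue = deque(work.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Not finished: back to the end of the queue to wait its turn.
            queue.append((name, remaining - ran))
    return timeline


# Hypothetical workload: a player competing with two background threads.
timeline = round_robin({"player": 5, "antivirus": 3, "indexer": 2}, quantum=2)
for name, units in timeline:
    print(name, units)
```

In this run the player needs only 5 units of CPU time but its work is chopped into three separate slices and does not finish until all 10 units have elapsed. With no other active threads it would run its slices back to back.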

Given such an elaborate architecture, it makes sense to ensure an optimal playback solution has:

  1. Least amount of active threads (resulting in reduced Dispatcher overheads and ensuring less competition for CPU and RAM)
  2. Maximum available RAM (avoids paging to HDD, fragmentation and maximizes amount of audio data that can be RAM loaded)
  3. Least amount of device intrusion by way of reduced interrupt processing and associated software handling. Besides, devices themselves are polluters and must be removed if unneeded


This type of optimization is referred to as “reducing the OS runtime footprint”. Left unchecked, a large runtime footprint creates an unfriendly environment that has a direct bearing on jitter. One can see this by considering an audio playback program in flight being randomly interrupted due to other active threads and/or device interrupts. Even worse, the CPU dispatcher may schedule the program across different CPU cores, causing L1 cache misses and unwanted “snoops”. Such unwanted activity causes additional power supply noise, ground noise and signaling overheads which directly impact audio data being streamed to the soundcard. This affects XO stability, hence jitter. Overall output signal quality to the DAC deteriorates. Conversely, a friendly environment seeks to create an ultra low-stress circuit wherein the XO delivers its free-running jitter performance at the DAC chip.
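One way to stop a program being moved between cores is to set its CPU affinity. The sketch below uses Python's os.sched_setaffinity, which is only available on Linux, so it returns None elsewhere. The helper name pin_to_core and the choice of core 0 are assumptions for illustration, and this says nothing about how any given player handles affinity.

```python
import os


def pin_to_core(core):
    """Try to restrict the current process to a single CPU core.

    Returns the set of cores the process may run on afterwards,
    or None where the call is unsupported or refused.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # not Linux: no affinity API in the os module
    try:
        allowed = os.sched_getaffinity(0)  # 0 means "this process"
        if core not in allowed:
            core = min(allowed)  # fall back to a core we are allowed to use
        os.sched_setaffinity(0, {core})
        return os.sched_getaffinity(0)
    except OSError:
        return None


result = pin_to_core(0)
print(result)
```

Once pinned, the dispatcher can still interrupt the process, but it will not migrate it to another core.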

Audio Player

Having an optimal environment creates an ideal platform for revealing an audio player’s impact on sound. Yes, sound differences are significant and readily observed. Every audio player is slaved to the soundcard’s XO. That is, data streaming is a regular event and the XO determines exactly when audio buffer refills are needed. Hence we have a critical timing dependency. A periodic jitter relationship is established.
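The regularity of buffer refills is simple arithmetic: a buffer of N frames drained at the sample rate must be refilled every N / rate seconds. The sketch below works this out. The 512 frame buffer size is an assumed figure for illustration, as real soundcard buffer sizes vary.

```python
def refill_interval_ms(buffer_frames, sample_rate_hz):
    """Time in milliseconds the soundcard takes to drain one buffer."""
    return 1000.0 * buffer_frames / sample_rate_hz


def refills_per_second(buffer_frames, sample_rate_hz):
    """How many times per second the player must refill the buffer."""
    return sample_rate_hz / buffer_frames


# Hypothetical 512 frame buffer at two common sample rates.
for rate in (44100, 192000):
    print(rate,
          round(refill_interval_ms(512, rate), 3),
          round(refills_per_second(512, rate), 1))
```

At 192k the same buffer has to be refilled more than four times as often as at 44.1k, so the periodic demand on the CPU rises with sample rate unless the buffer is enlarged.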

Audio playback at the CPU level is a sequence of software instructions in flight. At the physical level, these instructions are executed at a furious pace that translates into a dynamic electrical circuit. Whilst it’s easy to see a poorly designed circuit and its consequences, the same is not so with software. Poorly designed software causes excessive RAM I/O, inter-core L1 cache snooping, excessive (& expensive) pipeline stalls and cache thrashing, and is generally inefficient (often a forced outcome as a result of extreme flexibility). Such poor designs add to jitter distortion through electrical interference that destabilises the XO and reduces signal quality. Added power & ground noise, nasty current spikes, excessive signaling stress and timing variations are responsible for this.

Given that the CPU is such a crucial part of the audio equation and is directly controlled by the OS and player, its optimal usage is essential for best performance. Software when viewed as a dynamic electrical circuit has large implications for resource utilization and its consequences. Reducing CPU stress through software is an art. There are many paths to achieving this and it requires a deep understanding of the CPU. Every player unleashes a different circuit causing jitter induced sound differences. Hence sound differences do occur even when the exact same bits are presented to the DAC chip. This is not a new scientific phenomenon: it’s well understood physics at work.

Conclusion

Whilst every other component in the chain receives optimal treatment, why shouldn’t the CPU? Not optimizing the CPU is downright stupid. One achieves good improvement through moderating the CPU’s excessive power consumption (reducing EMI) by under-clocking (which also reduces RFI) and under-volting. Its optimization however doesn’t end here. Software is not without consequences. Ignore it at your peril.
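Why under-clocking and under-volting help can be seen from the usual first-order approximation for CMOS switching power, P ≈ C·V²·f. The sketch below uses that approximation only. It ignores leakage and everything else a real CPU does, and the scale factors are made-up examples rather than measurements.

```python
def relative_dynamic_power(voltage_scale, frequency_scale):
    """Switching power relative to stock, using P proportional to V^2 * f.

    voltage_scale and frequency_scale are fractions of the stock values,
    e.g. 0.9 means running at 90% of stock.
    """
    return voltage_scale ** 2 * frequency_scale


# Hypothetical settings: 10% under-volt combined with 20% under-clock.
print(round(relative_dynamic_power(0.9, 0.8), 3))
```

Because voltage enters squared, a modest under-volt gives a proportionally larger saving than the same percentage of under-clock.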

Highest performance is achieved when classical notions of DAC as “Master” or “Slave” are superseded with the design principle of creating a partnership wherein both DAC chip and Computer Transport are slaved to the XO. Jitter is tamed when our “string” is shortest and audio is streamed under minimal stress.




Topic - Software induced jitter: how long is a piece of string? (Part 2) - cics 05:01:22 07/05/09 (61)
