Introduction and definitions
In the discussion and experiment that follows, I propose to define an Information Process as something that takes an input stream of information, if any, and produces an output stream of information, if any.
I imagine 5 basic types of Information Process:
Lossless, where more or better input, no matter how small, is reflected in more or better output.
Truncated, where the tiniest changes in input may make no difference in output.
Paradoxical, where less information can sometimes produce more or better output (such as tiny amounts of information being worse than none).
Independent, Where input if any is completely ignored in producing output.
Dumb, Where no output is produced.
Paradoxical sounds strange but may be the most common one. It seems to be a description of human beings, and, I would have thought, most computer programs of any significant complexity.
I was shocked to find out that a fairly complicated computer program I have been developing over the past few years is actually Lossless. No matter how little the input information it is given, even amidst increasing misinformation, it still performs better than with no information.
The only ways I could think to make it Paradoxical require that there be be information loss in the program itself. I can't say I've proven this but it looks to be correct for all such processes. Information lost within the process interacts with the quantity or quality of input information to produce the Paradoxical effect.
But if information loss causes Paradoxical behavior, what causes Truncating behavior? Well, the only answer for my program appears to be the injection of random numbers. Something like this appears to be generally correct, and I note that digital audio processors add dither when lowering resolution. I cannot prove that it always requires noise or random numbers, but it is an interesting conjecture that some sort of active operation has to take place rather than just being a passive loss where information just isn't copied, for example, because passive losses produce Paradoxical and and not Truncating.
Methods
The program MakePlaylist program randomly selects songs or albums for playlists to play as background music. I've worked on things like this for about 30 years ago and realized that you must use Random Without Replacement (RWOR) operation instead of Random With Replacement (RWR) to prevent a great and largely undesired disparity in how much things are played.
But I have been concerned with two different but similar problems. In the first situation, I was continuously playing songs in a background process until I feel like listening to music, then I turn the amplifier on and hear them. But I might only listed to the background process a few hours a day, and the rest of the time the songs are being marked as 'played' by the RWOR system even though they haven't actually been listened to. That was the kind of thing I was doing in the late 1990's, when I scarcely imagined such things as playlists, I scarcely imagined them until 2021 when I got Roon and an Oppo player I could use it with.
In the present situation, with MakePlaylist, I still might not actually listen to all the songs in a particular playlist before replacing it with a new one. Around 2023 this always happened with my playback program (Roon) after I shut off the streamer remotely. Roon would hang and when restarted it would start the playlist from the beginning. So that was a big problem then, but Roon has fixed those problems now, but it can still happen anyway that I don't listen to every song in every playlist because I decide to listen to a bunch of other individually selected items in the interregnum, then my place in the previous playlist may still be lost because Roon doesn't keep your place in a playlist if you start listening to other things. (Inconveniently, Roon never shows the number of the item in a playlist that you are playing either.) I can imagine it's pretty commonly done that way too, Roon is a High End player program that requires a subscription or large initial payment and generally has all the features available in other programs.
And then the question becomes, in cases like these, where the information about the songs played is combined with and perhaps even dwarfed by misinformation about songs being played that weren't actually played, is it possible that RWR performs better than RWOR ? And I can define this performance in a singular and objective way: how many playlists do I need to create to listen to every song in a library?
(Though one could also look at the distributions of times things were played and details like that, it's much easier for the present experimental purposes to boil the question down to one easily described and measured number, which obviously is related to the distribution too but it doesn't define the spread or kurtosis of the distribution which can vary among methods, but in all cases I have seen, RWR has higher kurtosis than RWOR meaning longer 'tails'.)
With a properly working RWOR system, and when you listen to every item in every playlist, it will take ceil (L / S) playlists to listen to everything where L is the size of the library and S is the size of each playlist and ceil is the ceiling operator which rounds up fractional bits. I call this number of playlists, or the number of songs listened to (depending on context) in such a count of playlists, one epoch.
I realize that the problems' I am investigating could be solved analytically with probability formulas, but I chose to solve them experimentally by building test shells around Make Playlist: first TestPlaylist and now TestPlaylistRange, because it was a lot of fun. Ultimately my results can be compared with the true means from analytic probability theory, though it is more difficult to get the distributions and other ancilliary information like that. Then my statistics such as confidence intervals can be assessed.
Tests show that MakePlaylist works correctly in both RWOR and RWR modes.
For convenience here, I have stuck with a library size of 1000 and playlists sized 100 items each, though that is longer than most people make playlists (20-30 is more typical) the even numbers are easier to think about for experimental purposes).
I run experiments in which the last item actually listened to in the playlist (<LP>) can be set to any number between 1 and the playlist size, and playlists are created over and over until every item has been played. This may require making many thousands of playlists.
Since this is all based on random numbers (via the pseudorandom number generator built into Tcl, which is fairly good but simple) the number of playlists can vary a lot from one experiment to the next.
So, each such experiment is repeated from scratch all over again for many repetitions and averaged to get stable and presumably accurate results. I have found that I can get reasonable stable results in most cases with 1000 repetitions, but there is still some "noise" in the graphs, so for the really important tests I do 10,000 repetitions. I compute statistics such as 95% confidence interval, 82% confidence interval (useful for comparing two numbers) and residual kurtosis.
MakePlaylist and all the companion programs including TestPlaylist and TestPlaylistRange are available as Free Software under a GPL 3+ at SourceForge. What is described here requires Version 1.1 or greater.
Algorithms
RESULTS
![]() |
| RWR vs RWOR, 100-1 LP, 1000 reps |
Playing only 99 files out of a hundred makes RWOR substantially worse, requiring 2 epochs to play all the files. Still many times better than RWR. Each additional unplayed file has a lesser effect than the previous one. Even just playing 1 file out of 100 in the playlist, with a library size of 1000, it was still better, ever so slightly, than RWR.
This is the Lossless behavior.
For RWR, under the test conditions of L=1000 and S=100 it takes an average of an astonishing 7.506 epochs to select all the titles (95% confidence interval is 0.003, but the high kurtosis also means that 0.003 is probably an underestimate of the confidence interval so I can't be sure it's even above 7.5). Some titles are played over a dozen times before all files are played once. I knew RWR was bad, but I didn't realize it was this bad.
Redefining the epoch as the number of files played in an epoch of playlists. the RWR stays virtually constant as fewer items are played in each playlist. It should be this way as each RWR draw is precisely the same, completely independent of all played histories. In this regards, RWR is an Independent process, always the same regardless of input.
I felt like I had discovered the magic point where the misinformation outweighed the information and felt jubilant that I had discovered something important. But I knew I needed way more test runs, and a test infrastructure that computed important statistics so I could determine whether the difference from RWR was significant or not.
At that time, I was making actual playlists from folders of files, storing results to the played history, and running the -stats option of MakePlaylist (which also loaded the history and folders of files) to see how complete it was. I was not able to run more than a small number of the toughest cases.
To make the tests run faster, I made an option to eliminate all playlist and database output. Everything is done inside the TestPlaylist shell, which textually includes the current MakePlaylist as a procedure. And instead of selecting actual files in folders, it simply selects from integers numbered 1...<library size>. These refinements make it possible to create the needed millions of playlists even using a pre-compiled language (Tcl) without any parallelism. I also added statistics to the output. This was not all done at once but in a series of separate refinements.
At first, the result with 10000,100,1 being the magic point held up when I did 10 test runs of each.
With more improvements, I was able to increase the number of test runs to 100 the result still held, but with hardly any statistical significance. But with even more improvements, I was able to increase the number of test runs to 1000, and then the result failed, RWOR was still better than RWR even when playing only one title out of 100 from a library of 10000. The 'magic point' disappeared. The statistical significance of this difference is quite a bit shy of 95% confidence.
RWOR playlists required for 1/100/10000: 96820
RWR playlists required for 1/100/10000: 97774
Actual difference: 954
Difference required for 95% confidence of difference: 1106
The difference required for 95% confidence the means are being sampled from different distributions is the sum of the two 83% confidence intervals. The calculation is now done by TestPlaylist but these calculations were done by hand.
I was disappointed that 95% confidence was not achieved but rather unwilling to repeat these tests with, say, 10 times more repetitions than the already 1000 repetitions because these runs each took a week and a half already.
But if we ask the question of whether the true values favor RWR over RWOR, my original conjecture, that is probably greater than 95% at this point, because there are 3 possibilities:
1. RWOR better than RWR
2. They are the same.
3. RWR better than RWOR
#3 is the farthest from the existing mean values from large number of tests. Given that, #2 is more likely than #3. #3 is the least likely of all, so it is possible that RWR is worse not better or same is already at a 95% confidence interval. I don't know how to do those calculations, but I think my original idea that RWR is better specifically at 1/100/10000 is likely disconfirmed near 95% confidence, possibly better.
And things aren't hetter (and likely worse) with a smaller library size of 1000 either:
RWOR playlists required for 1/100/1000: 7307
RWR playlists required for 1/100/1000: 7507
Actual difference: 200
Difference required for 95% confidence of difference: 342
This suggests that the library size either isn't a factor, or goes in the reverse direction from what I conjectured, larger libraries show better results for RWOR down to one 1 unit of valid data out of 100. So it looks like I don't have to keep trying larger and larger libraries to find the pivot point. And that's a very good thing because each tenfold increase in the playlist size causes a tenfold increase. And they were already taking over a week.
BTW, when RWR at 1/100/10000 says 77774 playlists required, that's for each run. Do that 1000 times and it means 77,774,000 virtual playlists need to be created. That's why it takes so long. I think my Tcl code is actually very fast.
So, even with only 1 file played out of 100, RWOR still achieves better results than RWR at every library size I can try. And there's no indication that even larger libraries would break this.
That is why I am calling my RWOR algorithm Lossless. The performance with less and less information is monotonically asymptotic to No Information (RWR) so it never gets worse than No Information. You could in principle work backwards from the number of playlists required to play all the files to the number of items that are being skipped. You'd just have to create millions and millions of playlists to be sure.
At this point, the only way I could figure out to makeMakePlaylist Paradoxical was to make it lose some of the information it is supposed to save. I decided to add a PARADOX <LP> option to MakePlaylist which causes it to fail to record the first LP items in the played history. This then produces the Paradoxical result that fewer than LP items are played, there is only misinformation and no correct information at all.
The only way I can see to achieve a Truncating result is to replace the first items with RWR truly independent items. I plan to add a TRUNCATE <TP> option in which the first TP items are randomly chosen with RWR. By default, history is still recorded for them (ie, there is no information lost) but that's irrelevant for RWR, it is independent of any history. One could also combine the PARADOX and TRUNCATE options. I suspect it would still produce truncating results but with lower epoch counts than just TRUNCATE alone. The virtual epochs the RWOR system 'sees' are shorter, making things repeated more often, but it barely matters to the truncating result which is identical with RWR at TP and below.
![]() |
| Paradox 4, 15-1 LP, 10000 reps vs RWR avg 100000 |


No comments:
Post a Comment