When sorting a “StringList” is very costly

The following code looks innocuous but slows down a program significantly:

type
  TJCHListSortCompare = function(Item1, Item2: Integer): Integer of object;
  TCheckListBoxWithHints = class(TCheckListBox)
  private
    procedure QuickSort(L, R: Integer; SCompare: TJCHListSortCompare);

// [...]
procedure TCheckListBoxWithHints.QuickSort(L, R: Integer; SCompare: TJCHListSortCompare);
var
  I, J, P: Integer;
  tmpObj: TObject;
  tmpStr: string;
  tmpChecked: Boolean;
begin
  repeat
    I := L;
    J := R;
    P := (L + R) shr 1;
    repeat
      while SCompare(I, P) < 0 do Inc(I);
      while SCompare(J, P) > 0 do Dec(J);
      if I <= J then
      begin
        // exchange I and J
        tmpStr           := Items[I];
        tmpObj           := Items.Objects[I];
        tmpChecked       := Self.Checked[I];

        Items[I]         := Items[J];
        Items.Objects[I] := Items.Objects[J];
        Self.Checked[I]  := Self.Checked[J];

        Items[J]         := tmpStr;
        Items.Objects[J] := tmpObj;
        Self.Checked[J]  := tmpChecked;
        if P = I then
          P := J
        else if P = J then
          P := I;

        Inc(I);
        Dec(J);
      end;
    until I > J;
    if L < J then QuickSort(L, J, SCompare);
    L := I;
  until I >= R;
end;

Yes it’s Quicksort and it sorts strings in a TCheckListBox’s Items property, swapping not only the strings but also the objects and the Checked values.

Now, run this with, lets say 100 entries. That shouldn’t be any problem for Quicksort, should it? But it takes about 2 seconds on my computer which is muuuuuuch longer than I expected. The same code running on a simple TStringList takes less than 1/10 of a second. Why is that?

It’s because accessing the strings and changing them each results in a Windows message to be sent, handled and checked.

TCheckListBox inherits its Items property from TCustomListBox which declares it as:

    property Items: TStrings read FItems write SetItems;

Still looks innocuous? Now, let’s see how it is actually instantiated:

  FItems := TListBoxStrings.Create;
  TListBoxStrings(FItems).ListBox := Self;

So, what is TListBoxStrings? It’s a class that descends from TStrings and provides access to the strings stored in a TCustomListBox using Windows messages. E.g.:

function TListBoxStrings.Get(Index: Integer): string;
var
  Len: Integer;
begin
  // [...]
  begin
    Len := SendMessage(ListBox.Handle, LB_GETTEXTLEN, Index, 0);
    if Len = LB_ERR then Error(SListIndexError, Index);
    SetLength(Result, Len);
    if Len <> 0 then
    begin
      Len := SendMessage(ListBox.Handle, LB_GETTEXT, Index, Longint(PChar(Result)));
      SetLength(Result, Len);
    end;
  end;
end;

To get a string, it sends two messages to the Listbox’s handle and interprets its results.

Or:

procedure TListBoxStrings.Put(Index: Integer; const S: string);
var
  I: Integer;
  TempData: Longint;
begin
  I := ListBox.ItemIndex;
  TempData := ListBox.InternalGetItemData(Index);
  ListBox.InternalSetItemData(Index, 0);
  Delete(Index);
  InsertObject(Index, S, nil);
  ListBox.InternalSetItemData(Index, TempData);
  ListBox.ItemIndex := I;
end;

In order to set a string to a new value, it first deletes it and then inserts it again.

Want to guess what Delete() does? It sends a message to the listbox’s handle. And what does InsertObject do? It sends a message to the listbox’s handle.

While all this is an ingenious way to provide simple access to the strings normally only available using the mentioned messages, it’s far from efficient when you do a lot of comparisons and some swapping, which is exactly what a sorting algorithm does.

So, what can be done?

First, don’t work on the Items property directly but take a copy of it. Also, don’t swap the Checked property values (which also use messages) directly but take a copy of these too (in particular since some of the Compare functions passed to the sorting code also test the Checked property). Then sort the copy and assign it back to the listbox’s Items and Checked properties:

var
  cnt: Integer;
  tmpList: TStringList;
  ChkArr: TBoolArray;
  i: Integer;
//[...]
  tmpList := TStringList.Create;
  try
    tmpList.AddStrings(Items);
    SetLength(ChkArr, cnt);
    for i := 0 to cnt - 1 do
      ChkArr[i] := Checked[i];
    QuickSort(tmpList, ChkArr, 0, cnt - 1, Compare);
    Items.BeginUpdate;
    try
      Items := tmpList;
      for i := 0 to cnt - 1 do
        Checked[i] := ChkArr[i];
    finally
      Items.EndUpdate;
    end;
  finally
    tmpList.Free;
  end;

This code is from the GExperts Project Option Sets expert. This is one of the experts I had never used before and was shocked that, when I opened the dialog, it took several seconds before anything was shown. The reason turned out that the sorting described above was executed not only once but even twice in the FormShow event. After the changes I outlined above and reducing it to sort only once, it was down to less than half a second. That still felt like eternity, but was a significant improvement.

After I added some more tweaks, e.g. use a lookup list into an array of several hundred entries rather than linear search to find a particular string, the dialog now opens nearly instantly (on my computer).

If you want to discuss this article, you can do so in the corresponding post in the international Delphi Praxis forum.