Skip to content
September 20, 2008 / Abe Pralle

Removing list items during iteration

The problem

There’s a classic problem that comes along with forEach loops: how do you handle removing list items as you iterate – both from a language design standpoint and from an algorithmic standpoint?

The naive approach is to not worry about it and let people add and remove things to and from lists while iterating through them.  But then several common situations would introduce subtle errors – for example, removing an item from the beginning of a list during the middle of iteration would cause the iterator to skip over the next item (since item i it was on just became item i-1), and you definitely don’t want to worry about randomly skipping things you might be looking for.

So first you want to prevent programmers from modifying the list in the naive way and then you want to figure out how to let them do it a different way.

Java’s approach to restricting changes is simple and elegant.  Each list has a structural modification count of how many times the list has grown or shrunk.  When you run a for-each on the list, a separate iterator object is created with a link to the list, a current index, and a copy of the modification count.  If the list’s modification count ever changes from the iterator’s copy, then the iterator knows somebody else just messed with the list and it throws a ConcurrentModificationException (CME).

Everyone gripes (including me) when they first start encountering CME’s – so I tried leaving them out in Slag at first.  Several good students were then bewildered by runtime errors as they would remove an item during a nested double iteration and then crash when the iterator tried to access the last index during the original iteration.  So I put them in to Slag and everyone’s mostly happy.

How do you remove items then?  Java’s approach is to allow its iterator objects to remove() the item that they just iterated over.  This is a reasonable thought but falls short in three ways: 1) it’s irregular having a remove operation in something that’s ostensibly focused on reading, 2) it forces you to create an iterator object separately, instead of using the “for-each” loop, so you lose much of the convenience, and 3) every remove operation causes all higher list items to be shifted down one spot in an ArrayList, so removing almost every item as you iterate through a list of n elements is about O(n2/2), where removing 1,000 items requires about 500,000 operations.

There’s two other ways of removing list items that come to mind that work in Java or Slag: 1) you can “manually” iterate through the list in reverse order and remove the dead ones, which still presents some problems for large lists – if you ended up removing the first 500 out of 1,000 items, you’d end up doing O((n/2)2) = 250,000 operations.  Or 2) have a separate work list that starts out with the same capacity as your main list.  As you examine each item, copy the living items to the work list.  Then swap the roles of the two lists.  This is pretty efficient – O(n) – but is still kinda clunky to work with (as in, inelegant to the point of prohibiting casual employment).

A new solution

The other day I randomly thought of a new solution: copy the surviving list items back into the list itself as you go, closing gaps, and then discard the tail of the list where the number of discarded items equals the number of “removed” items.  In code:

  local Int32 pos = 0
  forEach (element in list)
    if (element.still_alive) list[pos] = element; pos++
  list.discard(pos)  # discards everything from "pos" up to the end

This code is O(n) no matter how many items are discarded!  I think the reason it took me so long to think of was that I’m in the mindset of “you can’t change the list”, but that’s not true.  You can change the contents of locations, you just can’t add or remove any locations as you’re iterating.

I’m quite pleased with this pattern; the only thing left is to integrate it into Slag to make it totally painless.  Here’s my thought:

  removeEach (element in list)
    if (element.still_alive) preserve element

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: