PDF In-Depth

JavaScript - Unduplicating a List of Names

February 01, 2001


This is such an unbelievably good tip, I couldn't wait to share it with you. It has to do with removing duplicates from a list. Tuck this one away for later; you WILL need it. It's just a matter of WHEN.

Let’s talk about associative arrays (hashes, in Perl parlance), because this is essential to understanding the Power Tip for this column. JavaScript, as you may recall, can do associative arrays, i.e., arrays of a type where the index into the array is a string rather than a number:

var MovieStars = new Object;
MovieStars['Robert Downey Jr.'] = 'drug offender';

In this example, we use the string 'Robert Downey Jr.' as the index into what amounts to an array (although in JavaScript, it's just a generic Object). The value at that index is 'drug offender'. You could (alternatively) assign a numeric value to MovieStars['Robert Downey Jr.'], or, space permitting, you could assign a very long string containing the young star's entire rap sheet for drug arrests and parole violations. (I doubt if JavaScript allows that much string storage, frankly.)
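To see string indexes in action, here is a minimal sketch. The Extensions object and the names and values in it are made up purely for illustration; nothing here comes from a real list.

```javascript
// Hypothetical example data; the names and values are illustrative only.
var Extensions = {};                   // equivalent to: new Object
Extensions['Alice Smith'] = 'x1234';   // string index, string value
Extensions['Bob Jones'] = 5678;        // string index, numeric value

// Bracket lookup retrieves the value stored under a string index.
var alice = Extensions['Alice Smith'];

// An index that was never assigned yields undefined.
var missing = Extensions['Carol White'];

// The "in" operator tests whether an index exists on the object.
var hasBob = 'Bob Jones' in Extensions;
```

Note that the value can be a string or a number; only the index is always treated as a string.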

Now comes the tip I want to share with you.

If you've ever done much work revolving around mailing list maintenance (or any kind of database maintenance), you know what a pain it can be to unduplicate (remove duplicate entries from) a long list. You invariably start by sorting the list, which by itself can take a long time depending on the size of the list and the stupidness of the sort algorithm; then you go through and whack out adjacent identical entries.
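For comparison, the sort-then-whack approach described above might look something like this in JavaScript. This is only a sketch: the function name undupeBySorting and the sample data are my own inventions, and it assumes the list holds plain strings.

```javascript
// Sort-then-scan unduplication: sort a copy of the list, then keep only
// the entries that differ from the entry just before them.
// (undupeBySorting and the sample data are hypothetical, for illustration.)
function undupeBySorting(list) {
    var sorted = list.slice().sort();   // copy first so the caller's array is untouched
    var result = [];
    for (var i = 0; i < sorted.length; i++) {
        // After sorting, duplicates sit next to each other.
        if (i === 0 || sorted[i] !== sorted[i - 1]) {
            result.push(sorted[i]);
        }
    }
    return result;
}

var sample = ['Smith', 'Jones', 'Smith', 'Adams', 'Jones'];
var sortedUniques = undupeBySorting(sample);
// sortedUniques is ['Adams', 'Jones', 'Smith']
```

It works, but you pay for the sort, and the original order of the list is lost.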

Well, there's a super-easy way to unduplicate lists in JavaScript (and a corresponding technique in Perl), relying on associative array properties. Suppose you have a long list that needs dupes removed, stored as an array called Names. Here's how to undupe it:

var unduped = new Object;
for (var i = 0; i < Names.length; i++) {
    unduped[Names[i]] = Names[i];
}

That's it. The unduped object now holds a list of names, with duplicate entries removed. How do you get the names back out? Simple:

var uniques = new Array;
for (var k in unduped) {
    uniques.push(unduped[k]);   // copy each stored name into the array
}

Now uniques is an Array containing the names, with dupes removed.
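If you want both steps in one reusable piece, something along these lines should do. Treat it as a sketch under a few assumptions: the function name undupe and the sample Names list are mine, the entries are assumed to be ordinary strings, and the hasOwnProperty check is an extra precaution I've added in case something has attached enumerable properties to Object.prototype.

```javascript
// Unduplicate a list of strings by using them as indexes into an Object.
// (undupe and the sample list are hypothetical names for illustration.)
function undupe(names) {
    var seen = {};
    for (var i = 0; i < names.length; i++) {
        // A repeated name simply overwrites its own earlier entry.
        seen[names[i]] = names[i];
    }

    var result = [];
    for (var k in seen) {
        // Skip anything inherited from the prototype chain.
        if (Object.prototype.hasOwnProperty.call(seen, k)) {
            result.push(seen[k]);
        }
    }
    return result;
}

var Names = ['Smith', 'Jones', 'Smith', 'Adams', 'Jones'];
var uniques = undupe(Names);   // three entries: Smith, Jones, Adams
```

No sorting is needed, and the input array is left alone. One caveat: because indexes are always converted to strings, a list mixing the number 1 and the string '1' would treat them as the same entry.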

The reason this trick works is that in JavaScript (and Perl, too) an associative array can only hold one value per index. That is, if you do:

Hues['PMS 179'] = 'brick red';
Hues['PMS 179'] = 'dark red';

then Hues['PMS 179'] will contain 'dark red', because you overwrote 'brick red' with it. Simple, right? You can't store two different values in one array slot simultaneously in any language that I know of.

Incidentally, if the syntax for (var k in unduped) looked strange to you, this is a legitimate JavaScript looping syntax for Objects. It lets you enumerate through the complete list of attached object properties. See p. 98 of Flanagan's JavaScript book.
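Here is a small sketch of that loop on its own, reusing the Hues object from earlier. The second entry ('PMS 286' with the value 'blue') is just an illustrative value I've added. Since for...in also visits enumerable properties inherited through the prototype chain, the sketch includes a hasOwnProperty check as a precaution.

```javascript
var Hues = {};
Hues['PMS 179'] = 'brick red';
Hues['PMS 286'] = 'blue';   // illustrative value, not from any official chart

var lines = [];
for (var k in Hues) {
    // k is the index (a string); Hues[k] is the value stored there.
    if (Object.prototype.hasOwnProperty.call(Hues, k)) {
        lines.push(k + ' = ' + Hues[k]);
    }
}
// lines now holds one "index = value" string per property of Hues
```

The loop variable receives each index in turn, not the value, so you look the value up with Hues[k].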
