logo To Foot
© J R Stockton, ≥ 2011-03-12

JavaScript Sorting and Order.

No-Frame * Framed Index * Frame This
Links within this site :-

See "About This JavaScript Site" in JavaScript Index and Introduction.

Material on this new page is in part collected from elsewhere on the site.

Sorting

General

In any language, for any process such as sorting or shuffling, where data are repeatedly repositioned, one should avoid repeatedly moving large items. It is quicker to move indexes or adjust pointers instead.

Note that sorting is not always guaranteed stable; the order of equally-valued items may be changed.

The simple "Bubble" sort, comparing adjacent items, takes time O(N2) for sorting N items. Better methods take time O(N×ln(N)) or O(Nk) for k around 1.3; but only for special cases can time O(N) be achieved.

See also my JavaScript Shuffle, Deal, and Draw, and in Wikipedia.

JavaScript

JavaScript uses Objects throughout, so "moving" operations basically only change pointers and do not actually move large stored data.

Array Objects already have an efficient .sort method. With no parameter, the elements are converted to Strings for comparison. For numbers of varying length or mixed sign, that does not give numerical order. Otherwise, the parameter is a comparison function.

ECMA-262 does not require stable sorting, but some implementations may give it (in stable sorting, items that compare equal retain their order).

Comparison Functions

To obtain the required order, it is often necessary to provide a comparison function, for example when sorting into numerical order or sorting an array of objects using a specific field as a key. The parameter for .sort() is a reference to a function Fn(a, b) returning negative for a<b, zero for a=b and positive for a>b.

For numeric comparison, use function NumCmp below. It should work for any type which has a suitable default conversion to Number.

Function GenCmp may be the shortest comparison function working on all types that allow the "<" operator.

Demonstration

Function RandArry generates an array apparently of random integer numbers in the range from 0 to 20 inclusive, but necessarily including three particular values.

Some possible comparison functions :-

function compareNum(a, b) { return a - b } // for Numbers

function compareStrnum(a, b) {	// compare numeric part of string
  return a.replace(/[^.0-9]/g, "") - b.replace(/[^.0-9]/g, "") }

function compareStr1(a, b) {	// compare string
  return a>b ? 1 : a<b ? -1 : 0 }

function compareStr2(a, b) {	// compare string
  if (a<b) return -1
  if (a>b) return +1
  return 0 }

function compareStr3(a, b) {	// compare string
  return a<b - a>b }

Work matching function compareStrnum really calls for the use of keys
Functions compareStr1 compareStr2 are substantially equivalent
Function compareStr3 always performs both comparisons

Comparison Efficiency

If the comparison function is not simple, it will generally be representable as a transformation of each variable (preferably by a function) followed by a comparison of transforms. Frequently it will be better to precompute keys for the items, and sort by key comparison.

Note that a sort function will be called >O(N) times when sorting an N-element array. For large arrays, therefore, it is worth doing O(N) pre/post-processing to simplify the sort function or to render one unnecessary.

Demonstration
 
For speed test, chose product of several thousand or more
  No more than 20 items will be listed

The first three items are preset.

On my PII/300 Win 98 IE4 system, pre-processing to allow use of the default .sort method showed an improvement at about 25 items and above, in this case. For 5000 items, the gain exceeded a factor of 3. On my P4/3G WinXP IE6 system, 7 items and a factor of 6.

Sorting by Date/Time

Arrays of strings starting with date/time in an ISO 8601 format such as YYYY-MM-DD or hhmmss can be sorted directly, since alphabetical order is chronological order. Just use the .sort() / .sort().reverse() methods.

If an Array of Date Objects is processed with .sort() the dates will be put into alphabetical order of toString(). Instead, use .sort(NumCmp) where NumCmp is a comparison function as above, where the subtraction forces conversion to Number (which is trivial, as that's what a Date Object actually contains).

Build on NumCmp to sort an Array of something-including-dates.

Date Objects in the year range 2002-2285 can be converted to fixed-length strings by S = String(+D) and, after sorting, converted back by D = new Date(+S). Alternatively, use S as a key.

UK Postcode Sorting

How can valid UK postcodes be processed so that the result behaves well (though not geographically) in a string sort?

The formats considered are X# #XX, X## #XX, X#X #XX, XX# #XX, XX## #XX, XX#X #XX ; are any others in use? Later ... I have been told that XXX #XX and XXX XX# exist; ... TDCU 1ZZ is Tristan da Cunha (Risks 24.66).

Add your own choices, and  
 

Tested only with the above good code formats; code not optimised.

I have chosen to sort 1-letter regions before 2-letter ones.

A subset of the above should suffice for pattern validation.

Read Postcodes in the United Kingdom, ukgovtalk.

Order

The actual order of items in an Object is not guaranteed. It is likely to be related to the order of creation. Probably, for (J in Obj) reflects the actual order of items in the Object.

	MSIE8	FF3.0	Op 10	Sf 4	Cr 4
  S1:	0,2,1	0,1,2	0,2,1	0,1,2	0,1,2
  S2:	0,2,1	0,1,2	0,2,1	0,1,2	0,1,2
  S3:	1,2,0	0,1,2	1,2,0	0,1,2	0,1,2

The above applies to all Objects, including Arrays; Arrays are not ordered, but can be addressed in order.

Disorder

See JavaScript Random Numbers and JavaScript Shuffle, Deal, and Draw.

More


Rows and columns of tables, although indexable, are not arrays, and have
no sort method.

Since the elements can be both read and written, all normal sort
algorithms (which does not imply sort methods) can be programmed to work
on tables in situ.

I have an instinctive feeling that handling table elements may be slower
than handling elements of a corresponding array.  If speed matters,
check that.

The built-in sort method is likely to be faster than a home-brew, if
given a well-considered sort function or if not needing one.

If you have to sore something unhelpful, such as US-format date strings
or dates containing letters, every comparison will need two of those to
be rendered comparable, and sorting tends to take N log N or more
comparisons.  Therefore, in such cases, do a preliminary pass to
generate directly comparable sort keys.

Assume that rows are to be stable, and sorted into the order of a
particular column.  I think that if you represent the entire structure
as an array of rows, with the rows being arrays, with the zero element
of a row holding the sort key, and the rest of the array holding the
table data however convenient, then you can use the built-in sort after
setting the keys to suit the order desired.

The art of sorting lies in not moving the actual data, just adjusting
what are in effect pointers.  That's easy in JavaScript.

Be aware that there is no need to sort on demand.  The page can be
fetched containing, for example, an "2-D" array of rows, and other
simple arrays indexing the data in all orders on offer.  Then, when the
user "demands" a sort, just write the table in the desired order using
the appropriate one of those arrays.

If the user can input data, so that it cannot all be pre-sorted at
download, you can insert a pointer to the new data at the right position
of each list while the user is looking for the next data line.

You can also sort all the downloaded data in JavaScript onload, with the
sort buttons temporarily disabled, distracting the user until the
buttons can be enabled,

-----------------------

Actually, (I think) your new TBODY is the old TBODY with its data rows
in a different order.

So : one can construct a new Array ARRY whose elements are the DOM rows
of the old TBODY - they will in fact be pointers to their rows.

AFAIK there is nothing to stop a TROW from having a property KEY as well
as its collection of TD/TH elements, and a WHEREITWAS element which is
the index of the row in the original TBODY.  So, when a sort is called
for, compute the sort keys, sort ARRY with its native sort method and a
s1mple comparison function which compares the KEY property.

Now construct an empty TBODY, insert the rows in the required order,
insert this second TBODY, and remove or hide the first.

That does what ought to be the minimum of rewriting of anything, visible
or not.  And, if the browser is smart enough, it has the rendering
information of the old row to hand when the new row is written, so it
only needs to copy the block of pixels.  And a browser which liked to
draw things while the script was still running would not find itself
redrawing anything unnecessarily.

For each new type of sort, you need only write a new sort function and
bind it to a strategically placed clickable element.  And the sort
functions can have at the end a *+1/*-1 controlled by a checkbox
"reverse sort".
Home Page
Mail: no HTML
© Dr J R Stockton, near London, UK.
All Rights Reserved.
These pages are tested mainly with Firefox and W3's Tidy.
This site, http://www.merlyn.demon.co.uk/, is maintained by me.
Head.