PrePAN

Profile

User's Modules

Sort::Naturally::ICU Perl extension for human-friendly ("natural") sort order, which uses the ICU library for locale-aware sorting.

DESCRIPTION

See: http://search.cpan.org/~serval/Sort-Naturally-XS-0.7.8/lib/Sort/Naturally/XS.pm#DESCRIPTION

LOCALE-AWARE SORTING

The following example demonstrates the default way to do locale-aware sorting:

use POSIX;
use locale;  # required: without this pragma, sort ignores the current locale

my @list = ('a'..'c', 'A'..'C');

setlocale(POSIX::LC_ALL, 'en_US.utf8');
my @result = sort @list;
# @result contains  a, A, b, B, c, C

setlocale(POSIX::LC_ALL, 'en_CA.utf8');
@result = sort @list;
# @result contains  A, a, B, b, C, c

The problem is that not all Unix-like operating systems support POSIX locale collation equally well; in practice, this approach works reliably only on Linux. Therefore you can't rely on it on Mac OS or FreeBSD. This module is designed to solve this issue.

To sort a list with an arbitrary locale on any platform, use the sorted function with a locale keyword argument. The locale value should be an LDML locale identifier:

use Sort::Naturally::ICU qw/sorted/;

my $list = ['a'..'c', 'A'..'C'];

my $result_us = sorted($list, locale => 'en-US-u-va-posix');
# $result_us contains A, B, C, a, b, c

my $result_ca = sorted($list, locale => 'en-CA-u-va-posix');
# $result_ca contains a, A, b, B, c, C

the_serval@twitter 0 comments

Sort::Naturally::XS Perl extension for human-friendly ("natural") sort order

Description

Natural sort order is an ordering of mixed strings (consisting of characters and digits) in alphabetical order, except that digit parts are ordered as numbers.

For example, a standard machine-oriented alphabetical sort of the list:

test21 test20 test10 test11 test2 test1

results in:

test1 test10 test11 test2 test20 test21

This isn't human-friendly, because test10 and test11 come before test2. Natural sort order suggests the following:

test1 test2 test10 test11 test20 test21
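
The difference between the two orderings can be reproduced with a minimal pure-Perl sketch. It is only an illustration of the idea, not the implementation used by Sort::Naturally::XS (which is written in C and XS); the helper names chunks and natural_cmp are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Sort::Naturally::XS): split a string into
# alternating runs of digits and non-digits, e.g. "test10" -> ("test", "10").
sub chunks { return $_[0] =~ /(\d+|\D+)/g }

# Hypothetical comparator: compare chunk by chunk, numerically when both
# chunks are digit runs, as plain strings otherwise.
sub natural_cmp {
    my ($left, $right) = @_;
    my @l = chunks($left);
    my @r = chunks($right);
    while (@l && @r) {
        my ($x, $y) = (shift @l, shift @r);
        my $c = ($x =~ /^\d/ && $y =~ /^\d/) ? $x <=> $y : $x cmp $y;
        return $c if $c;
    }
    # All shared chunks are equal: the string with chunks left over sorts last.
    return @l <=> @r;
}

my @list    = qw(test21 test20 test10 test11 test2 test1);
my @machine = sort @list;                          # plain string comparison
my @natural = sort { natural_cmp($a, $b) } @list;  # digit runs compared as numbers

print "@machine\n";
print "@natural\n";
```

The first printed line is the machine-oriented order and the second is the natural order. This sketch ignores case, locale, and very long digit runs, all of which a real implementation has to handle.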

Advantages

  • Written in C and XS, so it's really fast
  • Supports the existing Sort::Naturally module API
  • Fixes some Sort::Naturally deviations from normal sort behavior, such as "foobar" coming before "foo13"

Benchmark

See the synopsis section.

the_serval@twitter 3 comments