It missed several big towns because of a wrong condition
Data is downloaded from geonames.org and processed with an AWK script and a Perl script. The result is part of the distribution, so the average user (or a packager) doesn't have to download that much data.
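
For illustration only, here is a minimal Python sketch of the kind of filtering step described above. It is not the project's AWK or Perl code, and it does not reproduce the condition that was fixed. It assumes the tab-separated layout of the geonames.org dump files (name in column 2, feature class in column 7, population in column 15). The function name `big_towns` and the 100,000 threshold are hypothetical.

```python
import csv

# 0-based column positions, assuming the standard geonames.org dump layout.
NAME, FEATURE_CLASS, POPULATION = 1, 6, 14


def big_towns(lines, min_population=100_000):
    """Yield (name, population) for populated places at or above the threshold.

    `lines` is any iterable of tab-separated text lines.
    The threshold is a hypothetical value chosen for this sketch.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= POPULATION:
            continue  # skip short or malformed rows
        if row[FEATURE_CLASS] != "P":
            continue  # keep only populated places (cities, towns, villages)
        population = int(row[POPULATION] or 0)  # empty field counts as 0
        if population >= min_population:
            yield row[NAME], population
```

The function could be fed an open file handle for a downloaded dump, for example `big_towns(open(path, encoding="utf-8"))`, where `path` points at whichever geonames file was fetched.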