Yann Esposito (Yogsototh) 48d94fee6d benchmarks		2018-09-13 12:55:27 +02:00
dictionaries	added some dictionaries	2018-09-02 23:58:33 +02:00
src/HFIG	fixed the doctests	2018-09-13 12:13:11 +02:00
src-benchmark	benchmarks	2018-09-13 12:55:27 +02:00
src-doctest	fixed the doctests	2018-09-13 12:13:11 +02:00
src-exe	fixed the doctests	2018-09-13 12:13:11 +02:00
src-test	added command line and lovecraftian gen	2018-09-04 10:53:34 +02:00
.dir-locals.el	benchmarks	2018-09-13 12:55:27 +02:00
.gitignore	initial commit	2018-09-02 13:02:08 +02:00
.hlint.yaml	initial commit	2018-09-02 13:02:08 +02:00
.travis.yml	initial commit	2018-09-02 13:02:08 +02:00
CHANGELOG.md	initial commit	2018-09-02 13:02:08 +02:00
human-friendly-id-gen.cabal	benchmarks	2018-09-13 12:55:27 +02:00
LICENSE	initial commit	2018-09-02 13:02:08 +02:00
package.yaml	benchmarks	2018-09-13 12:55:27 +02:00
README.md	Externalize dictionaries (for now)	2018-09-04 15:20:20 +02:00
Setup.hs	initial commit	2018-09-02 13:02:08 +02:00
shell.nix	fixed the doctests	2018-09-13 12:13:11 +02:00
stack.yaml	initial commit	2018-09-02 13:02:08 +02:00

README.md

human-friendly-id-gen

New Haskell project to generate Human Friendly Ids.

These ids should be easier to read, write, and remember than classic random base64 ids.

The package provides both a library and an executable, hfig (for Human Friendly Identifier Generator).

Strategies

There are different strategies depending on your preferences.

Short strategy

We generate random phonemes that should not be too hard to pronounce. At the same time, the phonemes are varied enough that words can stay fairly short without risking collisions.

rupomdovi
waziridro
moplaloxo
kankujochplu
drubrusadka
dripuxmopbi
jotchibluzuv
plotabrprabudr
zopranblokplab
tirbrozprakow
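
As a rough illustration of the idea, here is a minimal Haskell sketch that concatenates pseudo-random phonemes. Everything in it is an assumption made for the example: the `onsets`, `vowels`, and `codas` lists, the names `phonemes`, `nextSeed`, and `shortId`, and the small linear congruential generator used in place of a real random source. The actual phoneme set and generator used by hfig are not shown in this README and may differ.

```haskell
-- Hypothetical building blocks; not the phoneme set used by hfig.
onsets :: [String]
onsets = ["b", "d", "j", "k", "l", "m", "n", "p", "r", "t", "v", "w", "z",
          "bl", "br", "dr", "pl", "pr", "ch"]

vowels :: [String]
vowels = ["a", "i", "o", "u"]

codas :: [String]
codas = ["", "m", "n", "p", "r", "x"]

-- Every onset + vowel + optional coda combination counts as one phoneme.
phonemes :: [String]
phonemes = [o ++ v ++ c | o <- onsets, v <- vowels, c <- codas]

-- One step of a small linear congruential generator (illustration only;
-- a real program would use a proper random number library).
nextSeed :: Int -> Int
nextSeed s = (s * 1103515245 + 12345) `mod` 2147483648

-- Build a word from k phonemes, starting from the given seed.
-- The high bits of each generator state select the phoneme.
shortId :: Int -> Int -> String
shortId k seed = concatMap pick (take k (drop 1 (iterate nextSeed seed)))
  where
    pick s = phonemes !! ((s `div` 65536) `mod` length phonemes)
```

In GHCi, `shortId 4 42` returns a word made of four phonemes, and the same seed always returns the same word. With this sketch the number of possible words is `length phonemes ^ k`, which is why a larger phoneme set lets words stay short.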

Here is the probability of collision if you generate a sample of n of those words:

n	probability
1k	2.5e-8
10k	2.5e-6
100k	2.5e-4
1M	2.5e-2

You can also choose how many phonemes to use. With only 2 phonemes, the generated words look like:

blilwa
wirpa
winupl
tani
ludu
probrip
pichprox
joprux
drudibl
zibrku

The probability of collision becomes:

n	probability
10	1e-5
100	1e-3
1k	0.11
10k	1.0

Lovecraftian strategy

My nickname isn't yogsototh for nothing, so why not generate names as if Lovecraft could have invented them?

ymhiovhotl
zhaobritl
v'odher
neltha
ucnouthlaxr
kola
adavhig
ctuthrilbh
yakthembru
athoubr'murh

The collision probability table looks like:

n	probability
10	6.7e-8
100	6.7e-6
1k	6.7e-4
10k	6.7e-2
100k	1.0

If you generate two names for an id, you should be safe:

n	probability
10	8.8e-17
100	8.8e-15
1k	8.8e-13
10k	8.8e-11
100k	8.8e-9
1M	8.8e-7

Dictionary Strategy

You can read any file; each line is treated as a word. We then pick a few random words.

You can find some word lists to use in this repository.

There is a default English dictionary with approximately 370k words.

Here is an example:

shuckins-digitinerved-microspectrophotometrical
indeterminableness-getaways-sceloporus
diverts-okayed-cast
semirhythmically-thasian-thrawart
smashups-phototherapeutics-swollenness
bindingness-phoenicia-ringy
execs-axes-barotaxis
monimiaceous-presutural-submembers
heterodyned-pourparley-zecchino
fragmentate-contrude-taeniae
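
The line-per-word idea can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions: the names `parseWords`, `idFromIndices`, and `loadWords` are hypothetical and are not taken from the hfig library, and the random indices are left to the caller so the core function stays pure.

```haskell
import Data.List (intercalate)

-- Treat every non-empty line of the input as one dictionary word.
parseWords :: String -> [String]
parseWords = filter (not . null) . lines

-- Build an id from the words at the given positions, joined with '-'.
-- Indices wrap around the word list, so any Int is accepted.
-- The caller is expected to supply indices from a random source.
idFromIndices :: [String] -> [Int] -> String
idFromIndices [] _ = ""
idFromIndices ws ixs = intercalate "-" [ws !! (i `mod` length ws) | i <- ixs]

-- Hypothetical helper: load a word list from a file chosen by the caller.
loadWords :: FilePath -> IO [String]
loadWords path = parseWords <$> readFile path
```

For example, with the word list `["diverts", "okayed", "cast"]`, the indices `[0, 1, 2]` give `diverts-okayed-cast`.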

And here are the collision probability tables.

Use 1 word to make the identifier:

n	probability
10	1.3e-4
100	1.3e-2
1k	1.0

Combine 2 words to make the identifier:

n	probability
10	3.6e-10
100	3.6e-8
1k	3.6e-6
10k	3.6e-4
100k	3.6e-2
1M	1.0

Combine 3 words to make the identifier:

n	probability
10	9.8e-16
100	9.8e-14
1k	9.8e-12
10k	9.8e-10
100k	9.8e-8
1M	9.8e-6
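
The figures in the three dictionary tables are consistent with the birthday-bound approximation n^2 / (2 * N), capped at 1, where N is the number of possible identifiers. Read that way, the values are probabilities between 0 and 1. This is an inference from the numbers, not a statement of how hfig computes them. The Haskell sketch below reproduces the estimate; `collisionEstimate`, `idSpace`, and `assumedDictSize` are hypothetical names, and the dictionary size of exactly 370000 is an assumption based on the "approximately 370k" figure.

```haskell
-- Birthday-bound approximation: drawing n ids uniformly from a space of
-- `size` possible ids collides with probability about n^2 / (2 * size).
-- The approximation overshoots for large n, so it is capped at 1.
collisionEstimate :: Double -> Double -> Double
collisionEstimate size n = min 1 (n * n / (2 * size))

-- Number of distinct ids when joining k words from a dictionary,
-- assuming the words are drawn independently and uniformly.
idSpace :: Double -> Int -> Double
idSpace dictSize k = dictSize ^ k

-- Assumed dictionary size (the README only says "approximately 370k").
assumedDictSize :: Double
assumedDictSize = 370000
```

For instance, `collisionEstimate (idSpace assumedDictSize 2) 1000` gives a value close to the 3.6e-6 entry of the 2-word table. Each extra word multiplies the id space by the dictionary size, which is why the 3-word probabilities are so much smaller.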