-----
# Custom
isHidden: false
menupriority: 1
kind: article
created_at: 2010-02-23T10:09:52+02:00
title: When regexp is not the best solution
multiTitle:
  fr: When regexp is not the best solution
  en: When regexp is not the best solution
multiDescription:
  fr: pas de description.
  en: no description.
tags:
  - programming
  - regexp
  - regular expression
  - extension
  - file
-----

Regular expressions are really useful. Unfortunately, they are not always the best way of doing things, particularly when the transformation you want to make is simple.

I wanted to find the fastest way to get the file extension from a filename. There are three natural ways of doing this:
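# Results shown for str = '/accounts/user.json'
# regexp: everything after the last dot ("json")
str.match(/[^.]*$/)
ext = $&
# split: last piece after splitting on dots ("json")
ext = str.split('.')[-1]
# File module: the extension with its leading dot (".json")
ext = File.extname(str)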
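The three approaches do not return exactly the same thing, which matters before comparing their speed. The sketch below wraps each one in a small method and prints the results for a few paths. The method names (`ext_regexp`, `ext_split`, `ext_file`) and the extra sample paths are assumptions made for this illustration; they are not part of the benchmark.

```ruby
# Hypothetical helpers wrapping the three approaches, for illustration only.
def ext_regexp(str)
  str[/[^.]*$/]            # same pattern as above, using String#[]
end

def ext_split(str)
  str.split('.')[-1]
end

def ext_file(str)
  File.extname(str)
end

# Sample paths chosen for this sketch (only the first one is in the benchmark).
samples = ['/accounts/user.json', 'archive.tar.gz', '/accounts/user', '/home/me/.bashrc']

samples.each do |s|
  puts "#{s}: regexp=#{ext_regexp(s).inspect} " \
       "split=#{ext_split(s).inspect} File=#{ext_file(s).inspect}"
end
```

For a path without any dot, the regexp and `split` versions return the whole path, while `File.extname` returns an empty string. `File.extname` also treats a leading-dot name such as `.bashrc` as having no extension, so the dedicated function is not only a speed question.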
#!/usr/bin/env ruby
require 'benchmark'

n = 80000
tab = [ '/accounts/user.json',
        '/accounts/user.xml',
        '/user/titi/blog/toto.json',
        '/user/titi/blog/toto.xml' ]

puts "Get extname"
Benchmark.bm do |x|
  # regexp: match everything after the last dot
  x.report("regexp:") { n.times do
    str = tab[rand(4)]
    str.match(/[^.]*$/)
    ext = $&
  end }
  # split: cut on dots and keep the last piece
  x.report(" split:") { n.times do
    str = tab[rand(4)]
    ext = str.split('.')[-1]
  end }
  # File module: dedicated function
  x.report("  File:") { n.times do
    str = tab[rand(4)]
    ext = File.extname(str)
  end }
end
Get extname (times in seconds):

| | user | system | total | real |
|---|---|---|---|---|
| regexp | 2.550000 | 0.020000 | 2.570000 | 2.693407 |
| split | 1.080000 | 0.050000 | 1.130000 | 1.190408 |
| File | 0.640000 | 0.030000 | 0.670000 | 0.717748 |

The conclusion of this benchmark: a dedicated function is usually faster than a hand-rolled solution. Here `File.extname` is almost four times faster than the regexp.

## File path without the extension
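#!/usr/bin/env ruby
require 'benchmark'

n = 80000
tab = [ '/accounts/user.json',
        '/accounts/user.xml',
        '/user/titi/blog/toto.json',
        '/user/titi/blog/toto.xml' ]

puts "remove extension"
Benchmark.bm do |x|
  # File module only: three chained calls.
  # Caveat: every str here is absolute, so expand_path ignores its second
  # argument and returns str with the extension still attached.
  x.report(" File:") { n.times do
    str = tab[rand(4)]
    path = File.expand_path(str, File.basename(str, File.extname(str)))
  end }
  # chomp: find the extension, then strip it from the end of the string
  x.report("chomp:") { n.times do
    str = tab[rand(4)]
    ext = File.extname(str)
    path = str.chomp(ext)
  end }
end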
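remove extension (times in seconds):

| | user | system | total | real |
|---|---|---|---|---|
| File | 0.970000 | 0.060000 | 1.030000 | 1.081398 |
| chomp | 0.820000 | 0.040000 | 0.860000 | 0.947432 |

The conclusion of the second benchmark: one simple function such as `chomp` is better than chaining three dedicated functions, although the gap is small here (0.95 s against 1.08 s). No surprise, but it is good to know.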
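One caveat about the `File` variant in the benchmark: when its first argument is already an absolute path, `File.expand_path` ignores the second argument, so that call returns the path unchanged. The sketch below shows one way to strip the extension using only `File` functions and checks that it agrees with the `chomp` version on absolute paths. The method names `strip_ext_file` and `strip_ext_chomp` are hypothetical, and `File.join` with `File.dirname` is a substitution made for this sketch, not the code that was timed.

```ruby
# Hypothetical helpers, for illustration only; not the benchmarked code.
def strip_ext_file(str)
  # Rebuild the path from its directory and its basename without extension.
  File.join(File.dirname(str), File.basename(str, File.extname(str)))
end

def strip_ext_chomp(str)
  # Remove the extension from the end of the string.
  str.chomp(File.extname(str))
end

paths = ['/accounts/user.json', '/user/titi/blog/toto.xml', '/accounts/user']

paths.each do |p|
  puts "#{p} -> File: #{strip_ext_file(p)} | chomp: #{strip_ext_chomp(p)}"
end
```

The two helpers only agree when the path has a directory part. For a bare filename such as `user.json`, `File.dirname` returns `.`, so the `File` version gives `./user` while `chomp` gives `user`.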