isHidden | menupriority | kind | created_at | title | multiTitle | multiDescription | tags
---|---|---|---|---|---|---|---
false | 1 | article | 2010-02-23T10:09:52+02:00 | When regexp is not the best solution | | |
Regular expressions are really useful. Unfortunately, they are not always the best way of doing things, particularly when the transformation you want to make is simple.
I wanted to know the fastest possible way to get the file extension from a filename. There are three natural ways of doing this:
# regexp
str.match(/[^.]*$/)
ext = $&
# split
ext = str.split('.')[-1]
# File module
ext = File.extname(str)
At first sight I believed the regexp would be faster than the split, because there could be many '.' characters in a filename. But in reality, most of the time there is only one dot, and I realized the split would be faster. It is still not the fastest way, though: there is a function dedicated to this job in the File module.
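The three approaches are not strictly interchangeable, which is worth keeping in mind when reading the timings. Here is a minimal sketch of what each one returns. The helper names (ext_regexp, ext_split, ext_file) and the extra sample paths are made up for illustration and are not part of the benchmark:

```ruby
# Hypothetical helpers wrapping the three approaches from the text.
def ext_regexp(str)
  str.match(/[^.]*$/)
  $& # text matched by the last successful match in this method
end

def ext_split(str)
  str.split('.')[-1]
end

def ext_file(str)
  File.extname(str)
end

# Illustrative paths (assumptions, not taken from the benchmark).
samples = ['/accounts/user.json', 'archive.tar.gz', '/home/user.name/README']

samples.each do |path|
  puts "#{path}: regexp=#{ext_regexp(path).inspect} " \
       "split=#{ext_split(path).inspect} File=#{ext_file(path).inspect}"
end
```

For '/accounts/user.json' the regexp and the split both give "json", while File.extname gives ".json" with the leading dot. For a path such as '/home/user.name/README', where the only dot is in a directory name, the first two return "name/README" and File.extname returns an empty string.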
Here is the Ruby benchmark code:
#!/usr/bin/env ruby
require 'benchmark'

n = 80000 # iterations per report
# sample paths, one is picked at random on each iteration
tab = [ '/accounts/user.json',
        '/accounts/user.xml',
        '/user/titi/blog/toto.json',
        '/user/titi/blog/toto.xml' ]

puts "Get extname"
Benchmark.bm do |x|
  x.report("regexp:") { n.times do
    str = tab[rand(4)]
    str.match(/[^.]*$/)
    ext = $& # text of the last match, e.g. "json"
  end }
  x.report(" split:") { n.times do
    str = tab[rand(4)]
    ext = str.split('.')[-1] # last dot-separated piece, e.g. "json"
  end }
  x.report(" File:") { n.times do
    str = tab[rand(4)]
    ext = File.extname(str) # keeps the leading dot, e.g. ".json"
  end }
end
And here is the result:
Get extname
             user     system      total        real
regexp:  2.550000   0.020000   2.570000 (  2.693407)
 split:  1.080000   0.050000   1.130000 (  1.190408)
  File:  0.640000   0.030000   0.670000 (  0.717748)
The conclusion of this benchmark: dedicated functions are better than your own way of doing things (most of the time).
A second benchmark: getting the file path without the extension.
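#!/usr/bin/env ruby
require 'benchmark'

n = 80000 # iterations per report
tab = [ '/accounts/user.json',
        '/accounts/user.xml',
        '/user/titi/blog/toto.json',
        '/user/titi/blog/toto.xml' ]

puts "remove extension"
Benchmark.bm do |x|
  x.report(" File:") { n.times do
    str = tab[rand(4)]
    # Three File calls. Note: str is absolute here, so on a Unix-like system
    # expand_path ignores its second argument and returns str unchanged,
    # extension included.
    path = File.expand_path(str, File.basename(str, File.extname(str)))
  end }
  x.report("chomp:") { n.times do
    str = tab[rand(4)]
    ext = File.extname(str)
    path = str.chomp(ext) # drop the extension from the end of the string
  end }
end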
And here is the result:
remove extension
            user     system      total        real
 File:  0.970000   0.060000   1.030000 (  1.081398)
chomp:  0.820000   0.040000   0.860000 (  0.947432)
The conclusion of the second benchmark: one simple function is better than three dedicated functions. No surprise, but it is good to know.
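One caveat about the File variant in the second benchmark: File.expand_path takes the file name first and the base directory second, and an absolute first argument is returned as-is. The following is a sketch, not the benchmarked code, of a File-only version that does strip the extension, next to the chomp version. The method names strip_ext_file and strip_ext_chomp are hypothetical, and no timings are claimed for them:

```ruby
# Hypothetical File-only variant: rebuild the path from its directory
# and its basename with the extension removed.
def strip_ext_file(str)
  File.join(File.dirname(str), File.basename(str, File.extname(str)))
end

# Same idea as the "chomp" report in the benchmark.
def strip_ext_chomp(str)
  str.chomp(File.extname(str))
end

path = '/user/titi/blog/toto.json'
puts strip_ext_file(path)   # /user/titi/blog/toto
puts strip_ext_chomp(path)  # /user/titi/blog/toto

# On a Unix-like system the absolute path comes back untouched,
# whatever the second argument is.
puts File.expand_path(path, 'toto') # /user/titi/blog/toto.json
```

The two helpers agree on absolute paths like the ones in the benchmark. They differ on a bare file name: for 'user.json' the File-only version returns './user' because File.dirname yields '.', while the chomp version returns 'user'.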