
split a file by keyword

Strangely enough, I didn’t find any built-in tool to split a file by keyword, so I wrote one myself in awk. I’m putting it here mostly for myself, but it could also help someone else. The following code splits a file at each line containing the word UTC: every such line starts a new output file, named fic.001, fic.002, and so on.

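#!/usr/bin/awk -f
# Split the input into fic.001, fic.002, ... at each line containing UTC.
# The awk path is an assumption; adjust it to your system.
BEGIN { i = 0; FIC = "fic.000" }   # fic.000 receives any lines before the first UTC line
/UTC/ {
    close(FIC);                    # release the previous file so many splits don't exhaust open files
    i += 1;
    FIC = sprintf("fic.%03d", i);
}
{ print $0 >> FIC }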
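To check what the split produces, here is a small sketch that runs the core rules of the script on a made-up sample. The directory name awk_split_demo, the file sample.log, and its contents are assumptions for illustration only. The awk program is passed inline rather than through a script file.

```shell
#!/bin/sh
# Demo only: directory name, file name, and sample lines are invented for illustration.
set -e
demo_dir=awk_split_demo
mkdir -p "$demo_dir"
rm -f "$demo_dir"/fic.*            # the script appends, so clear output from earlier runs

# Three blocks, each starting with a line that contains UTC.
cat > "$demo_dir/sample.log" <<'EOF'
Mon Dec  7 10:32:30 UTC 2009
first block
Mon Dec  7 18:00:00 UTC 2009
second block
Tue Dec  8 09:15:00 UTC 2009
third block
EOF

(
    cd "$demo_dir"
    # Every line matching UTC bumps the counter and switches the output file.
    awk '
    /UTC/ { i += 1; FIC = sprintf("fic.%03d", i) }
    { print $0 >> FIC }
    ' sample.log
    ls fic.*                       # lists fic.001, fic.002 and fic.003
)
```

Each UTC line opens a new file here, so the two blocks from December 7 end up in separate files.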

In my real-world case, I wanted one file per day. Each line containing UTC had the following format:

Mon Dec  7 10:32:30 UTC 2009

I ended up with the following code, which starts a new file only when the date part of the line (its first three fields) changes:

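#!/usr/bin/awk -f
# Start a new output file only when the day changes.
# The awk path is an assumption; adjust it to your system.
BEGIN { i = 0; FIC = "fic.000" }   # fic.000 receives any lines before the first UTC line
/UTC/ {
    date = $1 $2 $3;               # e.g. "Mon", "Dec", "7" -> "MonDec7" (the year is not included)
    if (date != olddate) {
        olddate = date;
        close(FIC);                # release the previous day's file
        i += 1;
        FIC = sprintf("fic.%03d", i);
    }
}
{ print $0 >> FIC }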
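As a rough check of the per-day behavior, the sketch below runs the same date comparison on a made-up sample with two timestamps on December 7 and one on December 8. The directory awk_split_by_day_demo, the file sample.log, and its contents are assumptions for illustration only.

```shell
#!/bin/sh
# Demo only: directory name, file name, and sample lines are invented for illustration.
set -e
demo_dir=awk_split_by_day_demo
mkdir -p "$demo_dir"
rm -f "$demo_dir"/fic.*            # the script appends, so clear output from earlier runs

cat > "$demo_dir/sample.log" <<'EOF'
Mon Dec  7 10:32:30 UTC 2009
first block
Mon Dec  7 18:00:00 UTC 2009
second block
Tue Dec  8 09:15:00 UTC 2009
third block
EOF

(
    cd "$demo_dir"
    # Switch output file only when weekday, month and day differ from the previous UTC line.
    awk '
    /UTC/ {
        date = $1 $2 $3
        if (date != olddate) { olddate = date; i += 1; FIC = sprintf("fic.%03d", i) }
    }
    { print $0 >> FIC }
    ' sample.log
    ls fic.*                       # lists fic.001 and fic.002
)
```

Both December 7 blocks land in fic.001 and the December 8 block goes to fic.002. Since the key is built from the weekday, month, and day only, appending $6 (the year) to it would be a reasonable variation for logs that span several years.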


Copyright ©, Yann Esposito
Created: 02/18/2010 Modified: 05/09/2010
Entirely done with Vim and nanoc