Smashing Theory: AWK


Saturday, September 10, 2016

seq & AWK: Generating Sequences of Even and Odd Numbers Between 1 and 10

Command (odd):

$ seq 1 10 | awk '$1%2==1 {print $1}'

Result:

1
3
5
7
9

Command (even):

$ seq 1 10 | awk '$1%2==0 {print $1}'

Result:

2
4
6
8
10

AWK & grep: Comparison and Similarity

Command:

$ seq 1 10

Result:

1
2
3
4
5
6
7
8
9
10

Command:

$ seq 1 10 | grep '1'

Result:

1
10

Command:

$ seq 1 10 | awk '/1/'

Result:

1
10
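
grep '1' and awk '/1/' both run a regular-expression match against the whole line, which is why the two results agree. Where awk differs is that it can also compare a field as a value. A small sketch of that difference (contains_one and equals_one are made-up helper names, not part of the original post):

```shell
# Regex match: every line that contains the character "1" (same as grep '1').
contains_one() { seq 1 10 | awk '/1/'; }

# Numeric comparison on the first field: only the line whose value is 1.
# The closest grep equivalent is a whole-line match, grep -x '1'.
equals_one() { seq 1 10 | awk '$1 == 1'; }

contains_one   # prints 1 and 10
equals_one     # prints 1
```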

Thursday, September 1, 2016

AWK: Hashtag Safe String Function

Command:

$ cat twitter_hash_safe_string.awk

Result:

# Turn a string into text that can be used as Twitter hashtags.
function twitter_hash_safe_string(hashStr)
{
        # Remove leading digits.
        gsub(/^[0-9]+/,"", hashStr);

        # Remove or spell out characters that are not wanted in a hashtag.
        gsub(" ", "", hashStr);
        gsub("\\.", "", hashStr);
        gsub("'", "", hashStr);
        gsub("%", "", hashStr);
        gsub(/"/, "", hashStr);
        gsub("\$", "", hashStr);
        gsub("&", "And", hashStr);
        gsub("-", "", hashStr);
        gsub("=", "", hashStr);
        gsub(">", "", hashStr);
        gsub("<", "", hashStr);

        gsub("\\$", "S", hashStr);

        # Treat /, comma, colon, and semicolon as separators that start a new hashtag.
        gsub("\\/", " #", hashStr);
        gsub(",", " #", hashStr);
        gsub(":", " #", hashStr);
        gsub(";", " #", hashStr);

        return hashStr;
}
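
To see what the function does to a string, here is a rough usage sketch. It is not the original script: hash_safe is a cut-down copy with only four of the rules, to_hashtags is a made-up wrapper name, and prefixing the result with "#" is my assumption about how the return value is meant to be used.

```shell
# to_hashtags STRING: print STRING converted to hashtag-safe text.
to_hashtags() {
  printf '%s\n' "$1" | awk '
    # Cut-down copy of twitter_hash_safe_string with four of its rules.
    function hash_safe(s) {
      gsub(/^[0-9]+/, "", s)   # drop leading digits
      gsub(" ", "", s)         # drop spaces
      gsub("&", "And", s)      # spell out ampersands
      gsub(",", " #", s)       # a comma starts a new hashtag
      return s
    }
    { print "#" hash_safe($0) }'
}

to_hashtags '2016 Rock & Roll, Pop'   # prints: #RockAndRoll #Pop
```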

Wednesday, May 11, 2016

Homebrew: Installing GNU Awk and GNU sed

Command:

$ brew install gawk gnu-sed

Tuesday, February 9, 2016

AWK: Printing CSV file

Printing a CSV file:

csv.awk:

{
        readcsv($0);
        print $2" by "$3;
}

function readcsv(string)
{
        # Field number
        n=1;

        # Split on commas
        m=split(string, array , ",");

        # Whether the previous piece ended a complete field (1) or we are inside a quoted field (0)
        aflag=1;

        for(i=1;i<=m;i++){
                # Complete field that is not enclosed in double quotes.
                if( !(array[i] ~ /^\"/) && !(array[i] ~/\"$/) && aflag){
                        # Unescape doubled double quotes
                        gsub(/\"\"/,"\"", array[i]);

                        $n=array[i];
                        n++;
                        continue;
                }

                # Complete field that is enclosed in double quotes
                if(( array[i] ~ /^\".*\"$/ ) && aflag){
                        # Remove the leading and trailing double quotes
                        gsub(/^\"/,"", array[i]);
                        gsub(/\"$/,"", array[i]);
                        # Unescape doubled double quotes
                        gsub(/\"\"/,"\"", array[i]);

                        $n=array[i];
                        n++;
                        continue;
                }

                # Partial field that begins with a double quote
                if(array[i] ~ /^\"/){
                        # Remove the leading double quote
                        gsub(/^\"/,"", array[i]);
                        # Unescape doubled double quotes
                        gsub(/\"\"/,"\"", array[i]);

                        aflag=0;
                        $n=array[i];
                        continue;
                }

                # Continuation of a partial field, with no double quote at either end.
                if(!(array[i] ~ /^\"/) && !(array[i] ~/\"$/) && aflag==0){
                        # Unescape doubled double quotes
                        gsub(/\"\"/,"\"", array[i]);

                        $n = $n "," array[i];
                        continue;
                }

                # Partial field that ends with a double quote
                if((array[i] ~ /\"$/) && aflag==0){
                        # Remove the trailing double quote
                        gsub(/\"$/,"", array[i]);
                        # Unescape doubled double quotes
                        gsub(/\"\"/,"\"", array[i]);

                        $n = $n "," array[i];
                        aflag=1;
                        n++;
                        continue;
                }
        }
}

Bash command:

$ cat list.csv | awk -f ./csv.awk
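
For context on why csv.awk stitches pieces back together instead of just using -F, here is a small sketch of what a plain comma split does to a quoted field. The sample record is made up for illustration and is not from list.csv.

```shell
# A record whose second field contains a comma inside double quotes.
sample='1,"Hello, World",Bob'

# Splitting on every comma yields 4 fields instead of the intended 3,
# because the quoted field is cut in two.
naive_count=$(printf '%s\n' "$sample" | awk -F, '{ print NF }')
echo "$naive_count"
```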

AWK: FS, RS, OFS, ORS

Testing FS (field separator), RS (record separator), OFS (output field separator), ORS (output record separator):

Command:

$ echo -n '"test,test",testtest,test,test' | awk 'BEGIN{FS=",";RS="\n";OFS="-";ORS="+"}{print $1, $2, $3, $4, $5, $6}'

Output:

"test-test"-testtest-test-test-+

AWK: split Function

Testing the behavior of the split function in awk:

$ echo -n 'test,test,test' | awk '{m=split($0,array,",")}{print m}'

3
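
The return value is the number of pieces, and the pieces themselves land in the array starting at index 1. A minimal sketch that prints both (list_parts is a made-up helper name):

```shell
# list_parts STRING: split STRING on commas and print each piece with its index.
list_parts() {
  printf '%s\n' "$1" | awk '{
    n = split($0, parts, ",")                 # n is the number of pieces
    for (i = 1; i <= n; i++) print i ": " parts[i]
  }'
}

list_parts 'red,green,blue'
```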

Friday, February 5, 2016

Version of AWK

$ awk --version

awk version 20070501

Thursday, February 4, 2016

AWK: Number of Words

Printing the number of words in a file, without a trailing newline:

$ awk 'BEGIN {words=0}{words+=NF} END { printf "%i", words}' ./lines.txt

AWK: Number of Lines

Printing the number of lines in a file, without a trailing newline:

$ awk 'BEGIN {} END { printf "%i", NR }' ./lines.txt
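
A quick way to check both counters is to run them on a small known file. The sketch below uses a temporary two-line file of my own rather than lines.txt, and writes %d, which is the same integer conversion as %i.

```shell
# Build a throwaway sample: 2 lines, 5 words in total.
tmp=$(mktemp)
printf 'one two\nthree four five\n' > "$tmp"

# NF is the number of fields (words) on the current line.
words=$(awk '{ words += NF } END { printf "%d", words }' "$tmp")
# In END, NR holds the number of records (lines) read.
lines=$(awk 'END { printf "%d", NR }' "$tmp")

echo "words=$words lines=$lines"
rm -f "$tmp"
```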

Wednesday, February 3, 2016

Tweeting Random Line in CSV, Formatted Using AWK

Tweeting a random entry from list.txt. The 2nd, 3rd, and 4th columns of each comma-separated line are used:

$ head -n "$(( RANDOM % $(wc -l < ./list.txt) + 1 ))" ./list.txt | tail -n 1 | awk -F, '{print $4"("$2")"$3}' | ruby ./tweet.rb
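
The random pick can also be done inside awk, which avoids the wc, head, and tail steps. This is only a sketch of an alternative, not what the pipeline above does: random_line is a made-up name, and because srand() seeds from the clock, two runs within the same second can pick the same line.

```shell
# random_line FILE: print one line of FILE chosen uniformly at random.
random_line() {
  awk '
    BEGIN { srand() }                 # seed from the current time
    rand() * NR < 1 { pick = $0 }     # keep line NR with probability 1/NR
    END { print pick }
  ' "$1"
}
```

It could then replace the first two stages, as in random_line ./list.txt | awk -F, '{print $4"("$2")"$3}'.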

Tuesday, February 2, 2016

AWK: Calling Bash Version of echo Command with \c Option

I don't know how to use the Bourne shell's echo with the \c option inside an AWK script, so I worked around it this way.

echo.bash:

#!/bin/bash

# Print each argument without a trailing newline.
for i; do
        echo -n "$i"
done

An AWK script that calls the echo.bash script above, so that bash's version of echo is used:

$ awk -F, '{
        cmd="./echo.bash \""$2"\" | openssl dgst -sha256"
        while ((cmd | getline line) > 0){
                print line
        }
        close(cmd)
}' ./test.txt

Monday, February 1, 2016

AWK: Hashing Arbitrary Columns of Every Line Using SHA-2 (SHA256) and MD5

*This code has a bug and may not work in your environment.
*It can give the wrong SHA-2 or MD5 hash value.
*The echo command run from inside awk does not seem to honor the -n option. awk runs the command through sh, so it is sh's echo that gets used.
*The echo command under the Bourne shell shows the same behaviour.

Hashing the second column of every line of a comma-separated file with SHA-2 (SHA256):

awk -F, '{
        cmd="echo -n \""$2"\" | openssl dgst -sha256"
        while ((cmd | getline line) > 0){
                print line
        }
        close(cmd)
}' ./test.txt

Hashing the second column of every line of a comma-separated file with MD5 (message-digest algorithm):

awk -F, '{
        cmd="echo -n \""$2"\" | md5"
        while ((cmd | getline line) > 0){
                print line
        }
        close(cmd)
}' ./test.txt
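
One possible way around the echo -n problem is to not use echo at all and let awk's own printf write the field into the hashing command. This is a sketch under my own assumptions rather than a tested fix for the scripts above: hash_second_column is a made-up name, and the optional second argument only exists so that a different command (md5, for example) can be swapped in.

```shell
# hash_second_column FILE [DIGEST_CMD]
# DIGEST_CMD defaults to the openssl invocation used in this post.
hash_second_column() {
  awk -F, -v digest="${2:-openssl dgst -sha256}" '{
    # awk writes the field itself, with no trailing newline, into the command.
    printf "%s", $2 | digest
    # Closing the pipe ends the input, so each line gets its own digest.
    close(digest)
  }' "$1"
}
```

Usage would look like hash_second_column ./test.txt, or hash_second_column ./test.txt md5 for the MD5 variant. Because the value never passes through a shell command line, quotes or dollar signs in the column do not need escaping.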

Sunday, January 31, 2016

AWK: Extracting the Values of Specified Columns Separated by a Delimiter

Extracting the second column from lines whose columns are separated by the '|' delimiter:

awk -F'|' '{print $2}' ./lines.txt

Saturday, January 30, 2016

AWK: Appending Characters to Each Line

Appending a '|' character to the end of each line:

awk '{print $0"|"}' ./input.txt > ./output.txt

Friday, January 29, 2016

Getting Random Line Number -1

Bash line to generate a random number from 0 to max - 1, where max is the number of lines in lines.txt:

echo $(ruby -e 'print rand') $(wc -l < "./lines.txt") | awk '{printf("%d\n", $1*$2)}'
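
If a line number that can be used directly (1 to max) is wanted, awk's own rand() can stand in for the ruby call. A minimal sketch, assuming a non-empty file; random_line_number is a made-up name, and srand() seeds from the clock, so calls within the same second return the same number.

```shell
# random_line_number FILE: print a random integer from 1 to the number of lines.
random_line_number() {
  # In END, NR is the line count; int(rand() * NR) is 0..NR-1, so add 1.
  awk 'END { srand(); print int(rand() * NR) + 1 }' "$1"
}
```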
