Showing posts with label Comma-separated values. Show all posts
Showing posts with label Comma-separated values. Show all posts

Friday, June 9, 2017

macOS Sierra: Bash (Unix shell): Associative Array Example: Reading from CSV into Array

Command:

$ cat associative_array3.sh


Result:

#!/bin/bash

aa=()

while IFS=, read -ra line; do if [ -n "${line[0]}" ]; then aa+=("${line[0]}::${line[3]}");fi;done < List\ of\ organisms\ by\ population.csv

total=0

for k in "${aa[@]}"
do
value="${k##*::}"
total=$((total + value))
done

echo "Total population: $total"

LC_CTYPE=C; MIN=1; MAX=$total; while rnd=$(cat /dev/urandom | tr -dc 0-9 | fold -w${#MAX} | head -1 | sed 's/^0*//;'); [ -z $rnd ] && rnd=0; (( $rnd < $MIN || $rnd > $MAX)); do :; done;

echo "Random index: $rnd"

index=$rnd

current=0

for k in "${aa[@]}"
do
value="${k##*::}"
current=$((current + value))

if [ $index -le $current ]
then
name="${k%%::*}"
population=$value
break
fi
done

echo "$name (population: $population)"
printf "Probability: "
echo "$population/$total" | bc -l


Command:

$ ./associative_array3.sh


Result:

Total population: 13958220136
Random index: 8955227583
Domestic pig (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 4227209587
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 7805864096
Cattle (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 8511144278
Domestic pig (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 3892081176
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 2589979307
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 7946872317
Cattle (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 372258826
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 13908122684
Short-eared owl (population: 2000000)
Probability: .00014328474408006714

Result:

Total population: 13958220136
Random index: 7247815301
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 9352855300
Domestic sheep (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 9642135454
Domestic sheep (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 11711491958
Cat (population: 600000000)
Probability: .04298542322402014307

Result:

Total population: 13958220136
Random index: 3990365909
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 272907397
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 9041133545
Domestic pig (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 7777621570
Cattle (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 9017444984
Domestic pig (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 4118816981
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 3797333834
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 12061584807
Mourning dove (population: 475000000)
Probability: .03403012671901594660

Result:

Total population: 13958220136
Random index: 12170053267
Mourning dove (population: 475000000)
Probability: .03403012671901594660

Result:

Total population: 13958220136
Random index: 531916373
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 10657946193
Domestic goat (population: 850000000)
Probability: .06089601623402853602

Result:

Total population: 13958220136
Random index: 13481628295
Collared trogon (population: 50000000)
Probability: .00358211860200167858

Result:

Total population: 13958220136
Random index: 9130914826
Domestic pig (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 7627418145
Cattle (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 12225242832
Mourning dove (population: 475000000)
Probability: .03403012671901594660

Result:

Total population: 13958220136
Random index: 3731595661
Human (population: 7350000000)
Probability: .52657143449424675272

Result:

Total population: 13958220136
Random index: 11796334858
Cat (population: 600000000)
Probability: .04298542322402014307

Result:

Total population: 13958220136
Random index: 7477886491
Cattle (population: 1000000000)
Probability: .07164237204003357179

Result:

Total population: 13958220136
Random index: 760089938
Human (population: 7350000000)
Probability: .52657143449424675272

Tuesday, February 9, 2016

Perl: iTunes XML to CSV (Without XML Parser)

itunes.pl

use constant MODE_NONE => 0;
use constant MODE_FIND_BRAKET_NAME => 1;
use constant MODE_FIND_VALUE =>2;
use constant FALSE =>0;
use constant TRUE => 1;

$mode = MODE_NONE;
$tagName = "";
$value = "";
$keyValue = "";
$openTag = FALSE;
$closeTag = FALSE;
$isTrack = FALSE;

@itunestags = ("Track ID","Name","Artist","Album Artist","Composer","Album","Grouping","Kind","Size","Total Time","Disc Number","Disc Count","Track Number","Track Count","Year","Date Modified","Date Added","Bit Rate","Sample Rate","Play Count","Play Date","Play Date UTC","Skip Count","Skip Date","Release Date","Rating","Album Rating","Album Rating Computed","Normalization","Artwork Count","Persistent ID","Track Type","Protected","Purchased","Location","File Folder Count","Library Folder Count");

print "Track ID,Name,Artist,Album Artist,Composer,Album,Year\n";

while (defined($char = getc())){

        if ($char eq "<"){
                $mode = MODE_FIND_BRAKET_NAME;
                $tagName = "";
        }elsif ($char eq ">"){
                if ($closeTag)
                {
                        if ($value eq "Tracks"){
                                $isTrack = TRUE;
                        }elsif ($value eq "Playlists"){
                                $isTrack = FALSE;
                        }elsif ($value ~~ @itunestags){
                                $keyValue = $value;
                        }else{
                                $value =~ s/"/""/g;

                                if ($value =~ /[,"]/){
                                        $value = "\"" . $value . "\"";
                                }

                                $value =~ s/&#38;/&/g;

                                if ($keyValue ~~ @itunestags){
                                        $track{$keyValue} = $value;
                                }
                        }

                        if ($isTrack && $tagName eq "dict"){
                                printf "%s,%s,%s,%s,%s,%s,%s\n", $track{'Track ID'}, $track{'Name'}, $track{'Artist'}, $track{'Album Artist'}, $track{'Composer'}, $track{'Album'},$track{'Year'};
                                $keyValue ="";
                                %track=();

                        }

                        $value = "";
                        $closeTag = FALSE;
                }else{
                        $mode = MODE_FIND_VALUE;
                        $value = "";
                }
        }else{
                if ($mode == MODE_FIND_BRAKET_NAME){
                        if ($char eq "/"){
                                $closeTag = TRUE;
                        }else{
                                $tagName .= $char;
                        }
                }elsif($mode == MODE_FIND_VALUE){
                        $value .= $char;
                }
        }
}

Bash command:

$ cat ./itunes.xml | perl ./itunes.pl

AWK: Printing CSV file

printing CSV file:

csv.awk:

{
        readcsv($0);
        print $2" by "$3;
}

function readcsv(string)
{
        #フィールド番号
        n=1;

        #コンマ区切り
        m=split(string, array , ",");

        #完全フィールドか否か
        aflag=1;

        for(i=1;i<=m;i++){
                # "  "で括られていない完全フィールド。
                if( !(array[i] ~ /^\"/) && !(array[i] ~/\"$/) && aflag){
                        #ダブルクオートのエスケープ文字を削除
                        gsub(/\"\"/,"\"", array[i]);

                        $n=array[i];
                        n++;
                        continue;
                }

                # "  "で括られている完全フィールド
                if(( array[i] ~ /^\".*\"$/ ) && aflag){
                        #先頭と末端にあるダブルクオートを削除する
                        gsub(/^\"/,"", array[i]);
                        gsub(/\"$/,"", array[i]);
                        #ダブルクオートのエスケープ文字を削除
                        gsub(/\"\"/,"\"", array[i]);

                        $n=array[i];
                        n++;
                        continue;
                }

                # 先頭が "の部分フィールド
                if(array [i] ~ /^\"/){
                        #先頭のダブルクオートを削除
                        gsub(/^\"/,"", array[i]);
                        #ダブルクオートのエスケープ文字を削除
                        gsub(/\"\"/,"\"", array[i]);

                        aflag=0;
                        $n=array[i];
                        continue;
                }

                # "  "で括られていない、続きの部分フィールド。
                if(!(array[i] ~ /^\"/) && !(array[i] ~/\"$/) && aflag==0){
                        #ダブルクオートのエスケープ文字を削除
                        gsub(/\"\"/,"\"", array[i]);

                        $n = $n "," array[i];
                        continue;
                }

                # 末端が "の部分フィールド
                if((array [i] ~ /\"$/) && aflag==0){
                        #末端のダブルクオートを削除
                        gsub(/\"$/,"", array[i]);
                        #ダブルクオートのエスケープ文字を削除
                        gsub(/\"\"/,"\"", array[i]);

                        $n = $n "," array[i];
                        aflag=1;
                        n++;
                        continue;
                }
        }
}

Bash command:

$ cat list.csv | awk -f ./csv.awk

Thursday, February 4, 2016

Get a Random Line from CSV file

First line in a CSV file contains column names, so I ignore the line and get a line after it:

$ head -$[$[${RANDOM} % $[`wc -l < ./lines.csv` - 1]] + 2] ./lines.csv | tail -1

Wednesday, February 3, 2016

Tweeting Random Line in CSV, Formatted Using AWK

Tweeting random entry in list.txt. 2nd, 3rd, and 4th column in comma-separated value is used:

$ head -$[${RANDOM} % `wc -l < ./list.txt` + 1] ./list.txt | tail -1 | awk -F, '{print $4"("$2")"$3}' | ruby ./tweet.rb

Monday, February 1, 2016

AWK: Hashing Arbitrary Columns of Every Lines Using SHA-2 (SHA256) and MD5

*This code has a bug and may not work in your environment.
*This gives wrong SHA-2 or MD5 hash value.
*The echo command inside awk seems not aware of -n option.
*echo command under Bourne shell also has this behaviour.

SHA-2 (SHA256) hash every second column values in comma separated values:

awk -F, '{
cmd="echo -n \""$2"\" | openssl dgst -sha256"
while (cmd | getline line){
print line
}
close(cmd)
}' ./test.txt


MD5 (message-digest algorithm) hash every second column values in comma-separated values:

awk -F, '{
cmd="echo -n \""$2"\" | md5"
while (cmd | getline line){
print line
}
close(cmd)
}' ./test.txt

Sunday, January 31, 2016

Sed: Replacing Delimiter Characters to Create Comma-Separated Values

Replacing all vertical lines '|' with comma ',':

cat ./lines.txt | sed s/\|/,/g