Vishwanath Krishnamurthi's blog

A blog on Java EE, clean code, open source and TDD

Fun with Unix commands (Bash & imdb-api)

I’ve been a Windows user for a long time. I had wanted to switch to Linux and learn a good bit of the commands and shell scripting, but never really started until about a month back. So with a fresh install of Ubuntu, while I was trying out sed and awk, I thought I could put these newly learned commands to some fun use. And I remembered how I had wished for the 1000GB of movies (my brother has them on an external hard disk) to be sorted by their IMDb ratings.

Geekish work starts:

There’s a great RESTful API put up at http://www.imdbapi.com/

So all that was required was:

  • Scan the filesystem for video files
  • Strip away unwanted words from each movie’s filename
  • Encode the filenames (especially spaces and apostrophes)
  • Send a request to imdbapi for each movie
  • Parse the result

So what I ended up with was some bash scripts run like this:

sh lsMovies.sh | ./encode.sh > mlist.txt

sh askImdb.sh  (this saves the JSON responses in a file, mresponse.txt)

Not much need to parse it. Just replace every } with a newline character and save it as a CSV file.

Open the file with the separator character set to : and voila! Movie names, ratings, votes, descriptions etc. go into different columns.

Ha, so now I can do all sorts of sorting 😉

Here’s the code:

Scanning the filesystem for video files, trimming away needless words from filenames

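#!/bin/bash
# lsMovies.sh - Scan the directory given as argument. Else scan the current directory for videos.

# Status messages go to stderr so they don't end up in the piped movie list
if [ -z "$1" ]
then
echo 'Directory parameter not provided. Searching in current dir' >&2
dir='.'
else
echo "Searching in $1" >&2
dir=$1
fi

# Quote the pattern so the shell doesn't expand *.avi before find sees it
find "$dir" -iname '*.avi' |
sed "s/.*\///g" | #remove any path info
sed "s/\./ /g"| #replace . with space
sed "s/\[.*//g; s/(.*//g; s/DVD.*//i; s/xvid.*//i; s/rip.*//i; s/ avi$//i; s/ *$//" #drop tags, the extension and trailing spaces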
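To see what the sed chain does, here is a small sketch that runs the same kind of cleanup on two made-up filenames. The function name clean_name and the sample paths are mine, purely for illustration, and it assumes GNU sed (the case-insensitive i flag is a GNU extension).

```shell
#!/bin/bash
# clean_name: hypothetical helper that applies the cleanup rules
# to filenames read from stdin, one per line.
# Assumes GNU sed, since the "i" flag is a GNU extension.
clean_name() {
  sed 's/.*\///' |     # drop any leading path
  sed 's/\./ /g' |     # dots become spaces
  sed 's/\[.*//; s/(.*//; s/dvd.*//i; s/xvid.*//i; s/rip.*//i; s/ avi$//i; s/ *$//'
}

# Made-up sample filenames
printf '%s\n' \
  '/media/movies/The.Godfather.DVDRip.XviD.avi' \
  './Fight.Club.[1999].avi' |
clean_name
```

This prints "The Godfather" and "Fight Club". The rules are blunt, though: s/rip.*//i also cuts a title such as "Round Trip" down to "Round T", so the list is worth a quick look before querying.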

Then the encoding part

encode.sh


#!/bin/bash
sed "s/ /%20/g" |
sed "s/'/%27/g"
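encode.sh only handles spaces and apostrophes, which was enough for my list. A title with an & or a # in it would still break the query string. A more general version could look like the sketch below. The name urlencode is made up, and it assumes plain ASCII titles; multi-byte characters would need more care.

```shell
#!/bin/bash
# urlencode: hypothetical helper that percent-encodes every character
# except the unreserved ones (letters, digits, "-", "_", ".", "~").
# Assumes ASCII input. The body runs in a subshell, written as ( ... ),
# so its variables don't leak into the caller.
urlencode() (
  LC_ALL=C                 # keep the bracket ranges below ASCII-only
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}            # everything after the first character
    c=${s%"$rest"}         # the first character itself
    case $c in
      [a-zA-Z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;   # "'x" gives x's character code
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

urlencode "Schindler's List"
```

The last line prints Schindler%27s%20List, the same result encode.sh gives for that title.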

Sending request: askImdb.sh


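#!/bin/bash
: > mresponse.txt #clear contents initially
count=1
while IFS= read -r line
do
echo "Movie #$count: requesting imdb for info about $line"
# Quote the URL so the shell leaves the ? alone; -s hides curl's progress meter
curl -s "http://www.imdbapi.com/?t=$line" >>mresponse.txt
echo "info on $count movies retrieved"
count=$((count+1))
done < mlist.txt #read the encoded names saved earlier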
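An alternative I could have used: curl can do the percent-encoding itself. With -G, a --data-urlencode pair is appended to the URL as a query string, so the raw titles can be sent without going through encode.sh. A rough sketch, where fetch_movie and ask_all are made-up names and the URL is the same one as above:

```shell
#!/bin/bash
# fetch_movie: ask the API about one raw (unencoded) title.
# -G turns the --data-urlencode pair into a query string, so curl
# does the percent-encoding itself.
fetch_movie() {
  curl -s -G --data-urlencode "t=$1" "http://www.imdbapi.com/"
}

# ask_all: read raw titles from stdin, one per line, and print one
# response per line on stdout. Progress messages go to stderr.
ask_all() {
  count=0
  while IFS= read -r title; do
    [ -n "$title" ] || continue    # skip blank lines
    count=$((count + 1))
    echo "Movie #$count: $title" >&2
    fetch_movie "$title"
    echo                           # newline between responses
  done
}
```

With a final line calling ask_all, the script would take the output of lsMovies.sh directly on stdin and write the responses to stdout, ready to be redirected into mresponse.txt.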

So, to cheat without really parsing, just insert the newlines:


sed "s/}/\n/g" mresponse.txt

Saving this output as a CSV file and opening it with : as the separator is all that’s left!
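Splitting on : is a cheat, and columns can shift when a value itself contains a colon (a URL, for example). If all you want is the ratings sorted, a slightly less hacky sketch is to pull out just two fields with sed and sort on them. The name top_rated is made up, and the "Title" and "Rating" key names are an assumption here, so check them against what the API actually returns.

```shell
#!/bin/bash
# top_rated: hypothetical helper. Reads API responses on stdin, one JSON
# object per line, and prints "rating title" lines, best rated first.
# ASSUMPTION: every object has "Title" and "Rating" string fields, with
# "Title" appearing first. Titles containing escaped quotes will not
# survive this; a real JSON parser is the robust route.
top_rated() {
  sed -n 's/.*"Title":"\([^"]*\)".*"Rating":"\([^"]*\)".*/\2 \1/p' |
  sort -rn
}

# Made-up sample responses, not real API output
printf '%s\n' \
  '{"Title":"Movie A","Year":"2001","Rating":"6.1"}' \
  '{"Title":"Movie B","Year":"2005","Rating":"9.0"}' |
top_rated
```

On the sample lines this prints "9.0 Movie B" followed by "6.1 Movie A". The pattern does not need the closing }, so it should also work on the output of the sed "s/}/\n/g" step.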

Now I’m walking down the list, watching the best-rated movies! Fun 🙂

[Update: I later wrote a Java Swing application that scans the filesystem for .avi files, queries imdbapi.com for the IMDb info on each movie, parses the result using Jackson and creates a CSV report of it using OpenCsv. Try it out at https://github.com/stratwine/movie-insight. The Download section has the executable JAR.]

Written by Vishwanath Krishnamurthi

October 26, 2011 at 6:36 pm

Posted in Uncategorized

2 Responses

  1. Nice way to have fun…great post vishwa

    sivaramom

    November 5, 2011 at 3:31 am

