Fun with Unix commands (Bash & imdb-api)
I’ve been a Windows user for a long time. I had wanted to switch to Linux and learn a good bit of commands and shell scripting, but never really started until about a month back. So with a fresh install of Ubuntu, as I was trying out sed and awk, I thought I could put these newly learned commands to fun use. And I remembered how I had wished for the 1000GB of movies (my brother has them on an external hard disk) to be sorted by IMDb rating.
Geekish work starts:
There’s a great RESTful API put up at http://www.imdbapi.com/
So all that was required was:
- Scanning the filesystem for video files
- Stripping away unwanted words from each movie’s filename
- Encoding the filenames (especially spaces and apostrophes)
- Sending a request to imdbapi
- Parsing the result
So what I ended up with was some bash scripts run like this:
    sh lsMovies.sh | ./encode.sh > mlist.txt
    sh askImdb.sh < mlist.txt
(askImdb.sh reads the encoded titles from standard input and saves the JSON responses in a file, mresponse.txt)
Not much need to parse it. Just replace every } with a newline character and save it as a CSV file.
Open the file with the separator character set to : and voila! Movie names, ratings, votes, description etc. go into different columns.
Ha, so now I can do all sorts of sorting.
Here’s the code:
Scanning the filesystem for video files, trimming away needless words from filenames
    #!/bin/bash
    # lsMovies.sh - Scan the directory given as argument. Else scan the current directory for videos.
    if [ -z "$1" ]; then
        # Status messages go to stderr so they don't end up in the piped movie list
        echo 'Directory parameter not provided. Searching in current dir' >&2
        dir='.'
    else
        echo "Searching in $1" >&2
        dir=$1
    fi
    find "$dir" -iname '*.avi' |
        sed "s/.*\///g" |    # remove any path info
        sed "s/\./ /g" |     # replace . with space
        sed "s/\[.*//g; s/(.*//g; s/DVD.*//i; s/xvid.*//i; s/rip.*//i; s/ avi$//i"
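To see what the cleanup does to a filename, here is a small sketch that runs roughly the same sed expressions over two made-up paths. The function name `clean_titles`, the sample paths, the anchored `avi` match and the final trailing-space trim are all assumptions for illustration, not part of the original script. The `i` flag needs GNU sed.

```shell
#!/bin/sh
# clean_titles: hypothetical wrapper around the lsMovies.sh sed expressions.
# Reads file paths on stdin and prints one cleaned title per line.
clean_titles() {
    sed 's/.*\///' |     # remove any path info
    sed 's/\./ /g' |     # replace . with space
    # cut everything from the first [, (, DVD, xvid or rip onwards,
    # drop a trailing " avi", then trim trailing spaces (the trim is an addition)
    sed 's/\[.*//; s/(.*//; s/DVD.*//i; s/xvid.*//i; s/rip.*//i; s/ avi$//i; s/ *$//'
}

# Made-up sample paths, just to show the effect
printf '%s\n' \
    '/media/disk/The.Matrix.1999.DVDRip.XviD.avi' \
    "/media/disk/Schindler's List (1993).avi" |
    clean_titles
# prints:
#   The Matrix 1999
#   Schindler's List
```

Note that the year survives in the first title, and that a blunt pattern like rip.* would also cut a title that merely contains those letters, so the list is worth a quick look before sending it off.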
Then the encoding part
    #!/bin/bash
    # encode.sh - percent-encode spaces and apostrophes in each line read from stdin
    sed "s/ /%20/g" | sed "s/'/%27/g"
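Spaces and apostrophes cover most titles, but a few other characters (&, +, #, %) also have a special meaning inside a URL. A minimal sketch of a slightly broader encoder follows. The function name `encode_title` and the choice of extra characters are assumptions, and it is still nowhere near a complete URL encoder.

```shell
#!/bin/sh
# encode_title: hypothetical helper that percent-encodes a few more characters
# than encode.sh. '%' is handled first so that the escapes produced by the
# later expressions are not encoded a second time.
encode_title() {
    sed -e 's/%/%25/g' \
        -e 's/ /%20/g' \
        -e "s/'/%27/g" \
        -e 's/&/%26/g' \
        -e 's/+/%2B/g' \
        -e 's/#/%23/g'
}

echo "Schindler's List" | encode_title    # prints Schindler%27s%20List
echo "Fast & Furious" | encode_title      # prints Fast%20%26%20Furious
```

Another option is to let curl do the encoding itself with -G --data-urlencode "t=$line", in which case the list would be kept unencoded.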
Sending request: askImdb.sh
    #!/bin/bash
    # askImdb.sh - read encoded movie titles from stdin, one per line,
    # and append each JSON response to mresponse.txt
    : > mresponse.txt    # clear contents initially
    count=1
    while IFS= read -r line
    do
        echo "Movie #$count: requesting imdb for info about $line"
        curl "http://www.imdbapi.com/?t=$line" >> mresponse.txt
        echo "info on $count movies retrieved"
        count=$((count + 1))
    done
So, to cheat without parsing, insert the newlines:
    sed "s/}/\n/g" mresponse.txt
Save the output as a CSV file and open it with : as the separator, and that's all that's left!
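For a quick ranking without a spreadsheet, the sorting can also be sketched in the shell. This assumes each JSON object carries string fields named "Title" and "Rating", in that order, and that titles contain no escaped quotes. Those field names are an assumption about the API's response and may need adjusting, and `rank_movies` is a hypothetical name.

```shell
#!/bin/sh
# rank_movies: read concatenated JSON objects on stdin and print
# "rating title" lines, highest rating first.
# ASSUMPTION: every object looks like {..."Title":"...",..."Rating":"..."...}
rank_movies() {
    tr '}' '\n' |    # one JSON object per line, same trick as above
    sed -n 's/.*"Title":"\([^"]*\)".*"Rating":"\([^"]*\)".*/\2 \1/p' |
    LC_ALL=C sort -rn    # numeric sort on the leading rating, descending
}

# Made-up sample data in the assumed format
printf '%s' '{"Title":"Alpha","Rating":"7.1"}{"Title":"Beta","Rating":"8.9"}' |
    rank_movies
# prints:
#   8.9 Beta
#   7.1 Alpha
```

With the real file this would be run as rank_movies < mresponse.txt. Objects without both fields are silently skipped.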
I'm walking down the list, watching the best-rated movies! Fun.
[Update: I later wrote a Java Swing application that scans the filesystem for .avi files, queries imdbapi.com for IMDb info on each movie, parses the result using Jackson, and creates a CSV report of it using OpenCsv. Try it out at https://github.com/stratwine/movie-insight. The Download section has the executable JAR.]