Vishwanath Krishnamurthi's blog

A blog on Java EE, clean code, open source and TDD

Why learn algorithms and data structures ?

leave a comment »

The question is – Why learn algorithms and data-structures ? 

and here are some thoughts. Probably you’d find this question answered in the preface of any algorithms book but anyway, here’s my take.

1) To not be constrained by the programming language for data-structure

Without a good knowledge of various data structures, it is easy to be constrained to thinking for solutions in terms of the data structures directly provided by the language.

If you were a Java programmer, you’d probably be thinking on solutions just in terms of what is provided in java.lang.util package.

Well, there’s lots more. Take for instance a simple need:  In a low end, basic phone, as the user types something, you’d like to present the user with “auto complete” feature. Thinking in terms of the data structures readily provided by the language /libraries doesn’t help much.

A simple trie  would help here. For another example, to think of any networking website that we use-

How does Facebook suggest you friends, How does LinkedIn tell personX is a 2nd level contact ?

A graph data structure helps here.

Or when there’s a steady stream of numbers coming in – and you want to maintain the largest 10 numbers. (You don’t have the
luxury of storing all the numbers and sorting them)  Max heap is the one to go to.

The list goes on..

2) Well, algorithms is “technology” too

Think about an air ticket reservation system. There are plenty of websites that search various airlines, for the dates required and presents with the results. A quick response back to the user, is what we’d like to have. Sure, for this we’d like to add all the required hardware..  Wherever there’s scope for pre-processing, (say a scheduled-job that pre-computes some for some common queries and stores it in database) we’d make use of it. Wherever there’s possibility of caching we’d make use of it. To search for various airlines, probably we’d be making web service calls.

We use various technologies throughout – “webservices / cron jobs / caching / good hardware / load balancing…” and these are all interesting – but not without having efficient algorithms (to find path / find shortest path / cheapest path etc / paths without break)

No amount of additional hardware is going to compensate for an inefficient algorithm. Super computers could take years to solve with an inefficient algorithm, what a PC might take just minutes with an efficient algorithm. (I’ll add some stats on this later)

3) You’ll come across some very interesting, non intuitive ways of solving problems

Given an array with n/2 distinct elements and n/2 copies of another element. Find the missing element.

Well, the normal solution we  would all think of is to do a linear scan through the array and to find the missing element after reading through  n/2+2  elements at the worst case. [ Worst case = all n/2 distinct numbers appear first in the array and then the number-which-repeats appears n/2 times]

Interestingly, this could be solved faster this way:


while(true) do

i = getRandomNumberWithinRange(1 to n)

j = getRandomNumberWithinRange(1 to n)

if(i!=j) and a[i]=a[j] then return i;

And the probability that the algorithm does not quit in 10 iterations is 0.1074 !

“Randomization” turns out to be an amazing practical way of solving some section of problems.

And to give a more “real-world”ish application for “randomization” -The randomized version of quicksort is one of the most practical/ efficient sorting algorithms !

Get ready to be surprised by lots of cool, non-intuitive-yet-amazing ways of solving problems

4) Refresh a little bit on probability, logarithm and some fun math

It’s interesting to get back to some math, – specifically on probability, expectations, logarithm..

5) Learn some interesting paradigms

“Divide and Conquer” is quite a common/popular paradigm. But so are “dynamic programming”, “backtracking” etc..

It’s fun to be exposed to these..

For starters -

The basic fibonacci(n) algorithm/code which we’ve been exposed to while learning recursion is practically useless.

With ‘dynamic programming’ however, it could be optimized with just a few more lines of code.

6) It’s a skill

Certain things are just “knowledge” – that could be learnt-along-the-way when a need arises.

For example: There’s not really a need to know all the “MVC” frameworks out there. Knowing one is just enough. There’s a need to use some other MVC framework  ? Well that could be learnt along the way..

But algorithms and data structures don’t fall into that category. It’s a skill, to deduce an algorithm for a problem  / It’s a skill to analyse the complexity and tell which algorithm is the better one / It’s good to learn in advance. Acquiring a skill is interesting, right ?

7) Quadratic complexity when O(n) solution exists

Without some learning of algorithms, it’s quite possible that we come up with inefficient solutions when an efficient one already exists !

8)  Understanding what’s behind the everyday tools..

It’s fun to know what’s behind the tools/framework that we use..

To give an example, for any DI framework, all the dependencies have to be resolved first.

BeanA depends on BeanB which in turn depends on BeanC ?

In what order should the resolution be ? BeanC, BeanB, BeanA  - yes.

Obviously real-world dependencies are not so simple and ‘topological sort’ is right there for this situation.

With some learning, its fun to think and guess the algorithms that are being used in our everyday tools

- Well, that’s it.. from the top of my head.


Written by Vishwanath Krishnamurthi

March 27, 2013 at 10:27 pm

Embracing assertive programming

leave a comment »

I am not using this term “assertive programming” strictly on the basis of  using “assert” keyword, but a little loosely, to  fail-fast and expose bugs early.It is a great technique for programmers to shorten their debugging-time, but still I find that not many  embrace “assertive programming”

Languages like Eiffel contain constructs for preconditions and postconditions and goes a step further in helping to program assertively, but at least we’ve got an “assert” keyword in Java.

Here’s a very good article on “failing fast“.

There’s a choice on the way an assertion is done.

Choice1: The assert keyword is handy, throws error on failing and could be switched off at production.

Choice2: The other way is to go towards an API (Spring-Assert, GoogleGuavaPreconditions) or a custom-class,that throws an Exception

My personal preference is to go with Choice2, just so I could catch the exception in a global exception handler and still have my application end gracefully. But regardless of the way, ‘asserting a condition that must be true’, is a powerful tool, not to be disregarded.

It is not always about ‘asserts‘ either. For example, when using transaction attributes, you could ensure that a transaction must be present using ‘mandatory’ rather than using a ‘required’ and stay away from a vague transaction attribute like ‘supports’

While working with read-only values, you could use something from the programming language to ensures it. Marking references as, ‘final’, better yet, using Immutable objects, would go towards it.

There’s a lot of good to be assertively programming, then why not embrace it ?

Good Reads:

To assert or not to assert

Written by Vishwanath Krishnamurthi

December 9, 2012 at 11:57 am

Where to place the Java EE descriptors ?

leave a comment »

No intro required – these are some of the standard descriptors used in a JavaEE application. However placing these descriptors in the wrong location is a common mistake.

So here’s a list of some descriptors and the correct location they are expected in – Might save some time searching around the net.

ejb-jar.xml

In a Jar, it would go in /META-INF directory
In a War, it would go in /WEB-INF directory

beans.xml

In a Jar, it would go in /META-INF directory
In a War, it would go in /WEB-INF directory.

But note that just because you have WEB-INF/beans.xml, the JARs in WEB-INF/lib do not become bean-archives.
The JAR would need a META-INF/beans.xml separately

persistence.xml

In a Jar, it would go in /META-INF directory
It would go under WEB-INF/classes/META-INF in a WAR

application.xml

In /META-INF directory of Ear

Written by Vishwanath Krishnamurthi

August 25, 2012 at 11:08 am

Spring Webflow – Tips

with one comment

Just a random list of tips – handy if you are using SWF

Passing a String in an evaluate expression:

To pass a string literal, enclose the string in single quotes ‘ ‘ as in


<evaluate expression="searchService.fetchSearchResult('hello')  result="flowScope.searchResult" />

Set a boolean 

Likewise to set a boolean value,

<set name="flowScope.isTicketAvailable" value=" 'true' "/>

Flush flow changes without restarting the server

While developing, do set, development=true as such and save yourself lots of time

<flow:flow-builder-services id="flowBuilderServices"
 view-factory-creator="viewFactoryCreator" development="true"/>


The RequestControlContext object

 On the rare cases where you need a requestControlContext in your bean, you can get hold of it by passing ‘flowRequestContext’ from the flow.

<evaluate expression="someBean.someMethod( flowRequestContext)">

public void someMethod(RequestControlContext context){ //your operation
}

The start state

During development, if you wanted to skip ahead a few states, just define the start-state in the <flow> element.

Want to trace the flow?

Its pretty simple to add a couple of methods in your listener that extends FlowExecutionListenerAdapter

@Override
 public void stateEntered(RequestContext context,
 StateDefinition previousState, StateDefinition newState) {
logger.debug("Entered State:" + newState.getId());
 }
  @Override
 public void transitionExecuting(RequestContext context, TransitionDefinition transition)
 {
 logger.debug("Executing transition:" + transition.getId());
 }

and trace the flow.

Written by Vishwanath Krishnamurthi

May 19, 2012 at 1:37 am

Posted in Spring 3

Tagged with , ,

A few things to avoid while writing unit tests

with 2 comments

Long back I wrote a post on some good practices to follow while writing unit tests.  I thought I’d add one on mistakes/practices to avoid while writing unit tests.

1) Reading a configuration file:

If for writing a unit test, you access a file, then strictly speaking, it is not a unit test.

For example,

class CarTest
{
Engine engine;
// your test for engine object
}

If you created the object under test by doing,

Engine engine = new Engine();

that’s neat. Whereas if DI was used to create the ‘engine’ object, then it wouldn’t have been a unit test. (DI involves reading an external file/ scanning for annotations )

Likewise, if for test data, you retrieved the data from a file, then its not really a unit test.

the main drawback being ?

Reading files takes time. A unit test has to be really fast. Something that’s executed very often.

2) Mocking the wrong objects:

Mocking frameworks are great.  You could easily mock a ‘dao’ or a ‘service’ call and write a unit-test quickly. Mocking a ‘value-object’ like the one above is needless.

public void  AccountBalanceTest
{
@Mock AccountInfo accountInfo;
@Test
public void  debitShouldDecuctTheGivenAmount()
{
when(accountInfo.getCurrentBalance()).thenReturn(500);
Account account = new Account(accountInfo);
account.debit(100);
}
}

It turns out to be messy when a lot of such stub values are defined.

‘Value-Objects’ being easily constructible with a ‘new’ should rather be used like this:

@Test
public void  debitShouldDecuctTheGivenAmount()
{
AccountInfo accountInfo = new AccountInfo();
accountInfo.setCurrentBalance(500);
Account account = new Account(accountInfo);
account.debit(100);
}
}

This reads a lot better, doesn’t it? And when there are a lot of test data to be prepared, there’s of course the very useful builder pattern http://nat.truemesh.com/archives/000714.html

3) Ignoring the feedback

A great thing about writing tests is that you get “feedback” about your code.

For example,  if you started writing a test like this,

@Test
public void testValidateAddressShouldCheckForMandatoryPostCode()
{
Customer customer = new Customer();
Address address = new Address();
address.setPostCode("123456");
customer.setAddress(address);
Validator validator = new Validator();
validator.validateAddress(customer);
}

you should be seeing a “warning” here. You’re spending some effort constructing the object required for test,

and this “warning” should hint you that, the validateAddress() method should be refactored to take an address parameter and not a customer parameter.

It is easy to set up the object like above and get the test passing – but that’s not the point. The idea should be to take the feedback and refactor.

@Test
public void testValidateAddressShouldCheckForMandatoryPostCode()
{
Address address = new Address();
address.setPostCode("123456");
Validator validator = new Validator();
validator.validateAddress(address); // refactored method.
}

Yes, the above one was a simplified example, but hope the idea is clear.

Similarly, when a piece of code is hard to test, it is possible to overcome the pain by using some advanced mocking frameworks and finishing up the test. But “being hard to test” might be a feedback telling  you to take efforts to refactor the code and make it testable/loosely-coupled.

Here’s a great post http://martinfowler.com/articles/modernMockingTools.html that explains it.

So to put it again, “the feedback should not be ignored”

 

Good TDD resources

http://misko.hevery.com/
http://www.growing-object-oriented-software.com/
http://vimeo.com/10569751

 

Written by Vishwanath Krishnamurthi

April 9, 2012 at 11:49 pm

Posted in better code, Java EE, TDD

Tagged with , ,

Contributing to open source projects

with one comment

Recently I was voted as a committer at the lovely OpenEJB project !  Am I excited ? You bet !
For a long time I had wanted to contribute to Open Source projects that I love. Like everything takes a start somewhere, my start was at the OpenEJB project a few months back. Thanks to the friendly folks there who are great at mentoring, I’ve come to understand a few things  about “contributing to open source projects” in general, which I thought I’d share.

So which project to contribute to ?

The answer is pretty simple. Whichever project/projects that interest you.

But the project looks daunting.

Yes it might be. But there’s no rush to understand everything all at once. There’s always a small piece of work, to start from. Building an understanding is a gradual process…

What are the first things?

Get in touch with the folks at the project.

Almost all projects have a mailing list. Subscribe to the mailing list and observe what’s going on.. But without being shy or hidden for long, pop out a “Hi” to the folks there, just introducing yourself. You could ask them to suggest simple pieces that you could start with.

Mental Todos versus Saying a Hi ?

We make mental todos like “Wow.. This project is so good. I must get involved and contribute..” but never really start at it. I think saying a “Hi” and getting in touch,  works better than making mental todos.

Mailing lists are great and essential because members work from different timezones and there’s a need for asynchronous communication.

But that’s not the only way to get in touch. Most projects have an IRC chat,.. so you could chat up !

Contribute ? But I am not an expert… 

You don’t have to be !  Whether you are a beginner or an expert in that technology is not a factor for contributing to an open source project.

Just ,

1) Your interest

2) Time you could afford

would be the factors.

Don’t have hours to spend ? No big deal…

It’s of course great if you could have a few hours everyday to spend at the project. But even if you don’t, there are things you could to with the limited time. Like

1) Answer a user’s question

2) Document something

3) Blog a “how-to”

4) Read up some project docs and get better.

It’s not all about patches and code !

A must read…

I hope to add more to this post later on, but check out the awesome  ”contribution tips” OpenEJB doc.   Excellent tips there !!

***End of post. WordPress Ads may follow***

Written by Vishwanath Krishnamurthi

December 11, 2011 at 5:09 pm

Fun with unix commands -(Bash & imdb-api )

with 2 comments

I’ve been a Windows user for long. I had wanted to switch to Linux, learn a good bit of commands and shell scripting, but never really started until about a month back. So with a fresh install of Ubuntu,  as I was trying with Seds and awks, I just thought I could put these newly learned commands into fun use. And I remembered how I wished for the 1000GB of movies (my brother had that in an external harddisk) sorted based on imdb ratings.

Geekish work starts:

There’s a great RESTful API put up at http://www.imdbapi.com/

So all that was required was:

  • Scanning the filesystem for video files
  • Strip away unwanted words from movie’s filename
  • Encode the filenames (esp, spaces and apostrophes)
  • Send request to imdbapi
  • Parse the result 

So what I ended up with was some bash scripts run like this:

sh lsMovies.sh | ./encode.sh > mlist.txt

sh askImdb.sh  (that saves JSON response in a file mresponse.txt)

Not much need to parse it. Just replace every } with a newline character and save it as a CSV file.

Open the file with the seperator character set as : and voila ! Movie names, ratings, votes, decription etc go into different columns.

Ha, so can do all sorts of sorting ;)

Here’s the code:

Scanning the filesystem for video files, trimming away needless words from filenames

#lsMovies.sh -Scan directory given as argument. Else scan current directory for videos.
#!/bin/bash

if [ -z $1];
then
echo 'Directory parameter not provided. Searching in current dir'
dir='.'
else
echo 'Searching in' $1
dir=$1
fi

find $dir -iname *.avi |
sed "s/.*\///g" | #remove any path info
sed "s/\./ /g"| #replace . with space
sed "s/\[.*//g; s/(.*//g; s/DVD.*//i; s/xvid.*//i; s/rip.*//i; s/.avi//g"

Then the encoding part

encode.sh


#!/bin/bash
sed "s/ /%20/g" |
sed "s/'/%27/g"

Sending request: askImdb.sh


echo "" > mresponse.txt #clear contents initially
count=1
while IFS= read -r line
do
echo "Movie #"$count "requesting imdb for info about"$line
curl http://www.imdbapi.com/?t="$line" >>mresponse.txt
count=$(($count+1))
echo "info on" $count "movies retrieved"
done

So to cheat without parsing, the insertion of newlines


 cat mresponse.txt |
sed "s/}/\n/g"

plus saving this as csv and opening with : as separator is all that's left !

I'm walking down the list watching the best rated movies ! Fun :)

[Update: Did a Java Swing application later, that scans the filesystem for .avi files, queries imdbapi.com for imdb info for each movie, parses the result using Jackson and creates  a CSV report of it using OpenCsv. Try out https://github.com/stratwine/movie-insight.  Download section has the executable JAR ]

***End of post. WordPress Ads may follow***

Written by Vishwanath Krishnamurthi

October 26, 2011 at 6:36 pm

Posted in linux

Tagged with , , ,

Follow

Get every new post delivered to your Inbox.

Join 102 other followers